| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Current SDK version is 0.17.5 | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Configure stats pid to 424659 | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Loading settings from /home/yangyaodong/.config/wandb/settings | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Loading settings from /aifs4su/yaodong/projects/hantao/dev_cham/align-anything/scripts/wandb/settings | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Loading settings from environment variables: {'api_key': '***REDACTED***'} | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False} | |
| 2024-09-22 18:34:18,526 WARNING MainThread:424659 [wandb_setup.py:_flush():76] Could not find program at -m align_anything.trainers.tiv_to_t.ppo | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m align_anything.trainers.tiv_to_t.ppo'} | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_setup.py:_flush():76] Applying login settings: {} | |
| 2024-09-22 18:34:18,526 INFO MainThread:424659 [wandb_init.py:_log_setup():529] Logging user logs to ../outputs/ppo_qwen2vl_10k_baseline/wandb/run-20240922_183418-smhpt648/logs/debug.log | |
| 2024-09-22 18:34:18,527 INFO MainThread:424659 [wandb_init.py:_log_setup():530] Logging internal logs to ../outputs/ppo_qwen2vl_10k_baseline/wandb/run-20240922_183418-smhpt648/logs/debug-internal.log | |
| 2024-09-22 18:34:18,527 INFO MainThread:424659 [wandb_init.py:init():569] calling init triggers | |
| 2024-09-22 18:34:18,527 INFO MainThread:424659 [wandb_init.py:init():576] wandb.init called with sweep_config: {} | |
| config: {'train_cfgs': {'ds_cfgs': 'ds_z3_config.json', 'epochs': 3, 'seed': 42, 'per_device_prompt_batch_size': 2, 'per_device_train_batch_size': 2, 'per_device_eval_batch_size': 2, 'gradient_accumulation_steps': 1, 'actor_gradient_checkpointing': True, 'critic_gradient_checkpointing': True, 'actor_lr': 5e-07, 'actor_lr_scheduler_type': 'cosine', 'actor_lr_warmup_ratio': 0.03, 'actor_weight_decay': 0.0, 'critic_lr': 5e-07, 'critic_lr_scheduler_type': 'constant', 'critic_lr_warmup_ratio': 0.03, 'critic_weight_decay': 0.0, 'adam_betas': [0.9, 0.95], 'bf16': True, 'fp16': False, 'eval_strategy': 'epoch', 'eval_interval': 10, 'kl_coeff': 0.02, 'clip_range_ratio': 0.2, 'clip_range_score': 50.0, 'clip_range_value': 5.0, 'ptx_coeff': 16.0, 'gamma': 1.0, 'gae_lambda': 0.95, 'normalize_reward': False, 'update_iters': 1, 'freeze_mm_proj': False, 'freeze_vision_tower': True, 'freeze_language_model': False}, 'data_cfgs': {'train_datasets': '/aifs4su/yaodong/datasets/aaa_dataset/TV2T-preference/extracted', 'train_template': 'NExTQA_preference', 'train_size': None, 'train_split': 'train', 'train_subset': None, 'train_data_files': 'extracted_preference_10k_washed.json', 'train_optional_args': [], 'eval_datasets': None, 'eval_template': None, 'eval_size': None, 'eval_split': None, 'eval_subset': None, 'eval_data_files': None, 'eval_optional_args': [], 'ptx_datasets': '/aifs4su/yaodong/datasets/ShareGPT4Video/extracted', 'ptx_template': 'NExTQA', 'ptx_size': 25000, 'ptx_subset': None, 'ptx_split': 'train', 'ptx_data_files': 'extracted_panda.json', 'ptx_optional_args': []}, 'logger_cfgs': {'log_type': 'wandb', 'log_project': 'align-anything', 'log_run_name': 'ppo', 'output_dir': '../outputs/ppo_qwen2vl_10k_baseline', 'cache_dir': None, 'save_interval': 300.0}, 'model_cfgs': {'actor_model_name_or_path': '/aifs4su/yaodong/models/Qwen2-VL-7B-Instruct', 'reward_model_name_or_path': '/aifs4su/yaodong/projects/hantao/dev_cham/align-anything/outputs/rm_tiv2t_10k_baseline', 'reward_critic_model_name_or_path': '/aifs4su/yaodong/projects/hantao/dev_cham/align-anything/outputs/rm_tiv2t_10k_baseline', 'trust_remote_code': True, 'model_max_length': 2048, 'temperature': 1.0, 'top_p': 1.0, 'repetition_penalty': 1.0}, 'special_tokens': None} | |
| 2024-09-22 18:34:18,527 INFO MainThread:424659 [wandb_init.py:init():619] starting backend | |
| 2024-09-22 18:34:18,527 INFO MainThread:424659 [wandb_init.py:init():623] setting up manager | |
| 2024-09-22 18:34:18,529 INFO MainThread:424659 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn | |
| 2024-09-22 18:34:18,532 INFO MainThread:424659 [wandb_init.py:init():631] backend started and connected | |
| 2024-09-22 18:34:18,535 INFO MainThread:424659 [wandb_init.py:init():720] updated telemetry | |
| 2024-09-22 18:34:18,556 INFO MainThread:424659 [wandb_init.py:init():753] communicating run to backend with 90.0 second timeout | |
| 2024-09-22 18:34:19,037 INFO MainThread:424659 [wandb_run.py:_on_init():2435] communicating current version | |
| 2024-09-22 18:34:19,230 INFO MainThread:424659 [wandb_run.py:_on_init():2444] got version response upgrade_message: "wandb version 0.18.1 is available! To upgrade, please run:\n $ pip install wandb --upgrade" | |
| 2024-09-22 18:34:19,230 INFO MainThread:424659 [wandb_init.py:init():804] starting run threads in backend | |
| 2024-09-22 18:34:25,437 INFO MainThread:424659 [wandb_run.py:_console_start():2413] atexit reg | |
| 2024-09-22 18:34:25,437 INFO MainThread:424659 [wandb_run.py:_redirect():2255] redirect: wrap_raw | |
| 2024-09-22 18:34:25,437 INFO MainThread:424659 [wandb_run.py:_redirect():2320] Wrapping output streams. | |
| 2024-09-22 18:34:25,437 INFO MainThread:424659 [wandb_run.py:_redirect():2345] Redirects installed. | |
| 2024-09-22 18:34:25,441 INFO MainThread:424659 [wandb_init.py:init():847] run started, returning control to user process | |
| 2024-09-23 11:15:07,137 INFO MainThread:424659 [wandb_run.py:_finish():2107] finishing run htlou/align-anything/smhpt648 | |
| 2024-09-23 11:15:07,139 INFO MainThread:424659 [wandb_run.py:_atexit_cleanup():2374] got exitcode: 0 | |
| 2024-09-23 11:15:07,156 INFO MainThread:424659 [wandb_run.py:_restore():2352] restore | |
| 2024-09-23 11:15:07,156 INFO MainThread:424659 [wandb_run.py:_restore():2358] restore done | |
| 2024-09-23 11:15:15,887 INFO MainThread:424659 [wandb_run.py:_footer_history_summary_info():4016] rendering history | |
| 2024-09-23 11:15:15,888 INFO MainThread:424659 [wandb_run.py:_footer_history_summary_info():4048] rendering summary | |
| 2024-09-23 11:15:15,897 INFO MainThread:424659 [wandb_run.py:_footer_sync_info():3975] logging synced files | |