|
|
nohup: ignoring input
|
|
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:70: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = GradScaler()
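The warning points at the deprecated `torch.cuda.amp` namespace. A minimal sketch of the migration, assuming nothing else in the training loop changes:

```python
import torch

# Deprecated (what line 70 currently does):
#   from torch.cuda.amp import GradScaler
#   self.scaler = GradScaler()

# Current API: name the device type explicitly.
scaler = torch.amp.GradScaler('cuda')
```

Since this run autocasts to BF16 rather than FP16, gradient scaling is generally unnecessary anyway: BF16 has the same exponent range as FP32, and GradScaler mainly guards against FP16 underflow.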
|
|
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:116: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details.)
  self.embeddings = torch.load(combined_path, map_location=self.device)
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:180: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details.)
  self.compressor.load_state_dict(torch.load('final_compressor_model.pth', map_location=self.device))
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:181: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details.)
  self.decompressor.load_state_dict(torch.load('final_decompressor_model.pth', map_location=self.device))
/data2/edwardsun/flow_home/cfg_dataset.py:253: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details.)
  self.embeddings = torch.load(combined_path, map_location='cpu')
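All four `torch.load` warnings have the same fix: these files hold only tensors and `state_dict`s, so they can be loaded with `weights_only=True`, which restricts unpickling to tensor data and avoids the arbitrary-code-execution risk the warning describes. A minimal sketch using paths from the log (the `device` choice is illustrative):

```python
import torch

device = torch.device('cuda:0')

# Embedding tensor: safe to load weights-only.
embeddings = torch.load(
    '/data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt',
    map_location=device, weights_only=True)

# Model checkpoints are state_dicts, also weights-only payloads.
compressor_state = torch.load('final_compressor_model.pth',
                              map_location=device, weights_only=True)
```

If a checkpoint ever contains non-tensor Python objects, `weights_only=True` raises an error instead of silently unpickling them.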
|
|
Starting optimized training with batch_size=384, epochs=2000
Using GPU 0 for optimized H100 training
  Mixed precision: True
  Batch size: 384
  Target epochs: 2000
  Learning rate: 0.0012 -> 0.0006
✓ Mixed precision training enabled (BF16)
Loading ALL AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/...
Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt...
✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280])
Computing preprocessing statistics...
✓ Statistics computed and saved:
  Total embeddings: 17,968
  Mean: -0.0005 ± 0.0897
  Std: 0.0869 ± 0.1168
  Range: [-9.1738, 3.2894]
Initializing models...
✓ Model compiled with torch.compile for speedup
✓ Models initialized:
  Compressor parameters: 78,817,360
  Decompressor parameters: 39,458,720
  Flow model parameters: 50,779,584
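The compile step reported above typically amounts to a single wrapper call. A minimal sketch; the stand-in module below is hypothetical, since the log does not show the flow model's architecture:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in; the real flow model has ~50.8M parameters per the log.
flow_model = nn.Sequential(
    nn.Linear(1280, 2048), nn.GELU(), nn.Linear(2048, 1280),
).to('cuda')

# First call triggers graph capture and code generation; later calls reuse it.
flow_model = torch.compile(flow_model)
```

This compilation warmup is likely why the first epoch below takes 49.8 s while later epochs settle around 3.5 s.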
|
|
Initializing datasets with FULL data...
Loading AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/...
Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt (FULL DATA)...
✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280])
Loading CFG data from FASTA: /home/edwardsun/flow/combined_final.fasta...
Parsing FASTA file: /home/edwardsun/flow/combined_final.fasta
  Label assignment: >AP = AMP (0), >sp = Non-AMP (1)
✓ Parsed 6983 valid sequences from FASTA
  AMP sequences: 3306
  Non-AMP sequences: 3677
  Masked for CFG: 698
Loaded 6983 CFG sequences
  Label distribution: [3306 3677]
  Masked 698 labels for CFG training
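The 698 masked labels are 10% of the 6,983 sequences, which matches the standard classifier-free guidance (CFG) recipe: randomly replace a fraction of conditioning labels with a null class so a single network learns both conditional and unconditional predictions. A minimal sketch, assuming the log's 0/1 labels and a null index of 2 (the drop rate and null index are assumptions, not shown in the log):

```python
import torch

def mask_labels_for_cfg(labels: torch.Tensor, p_drop: float = 0.1,
                        null_label: int = 2) -> torch.Tensor:
    """Randomly replace ~p_drop of labels with a null class for CFG."""
    drop = torch.rand(labels.shape, device=labels.device) < p_drop
    return torch.where(drop, torch.full_like(labels, null_label), labels)

labels = torch.randint(0, 2, (6983,))   # 0 = AMP, 1 = Non-AMP, per the log
masked = mask_labels_for_cfg(labels)    # ~698 entries become the null class
```

At sampling time the two predictions are combined, e.g. v = v_uncond + w * (v_cond - v_uncond) with guidance weight w.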
|
|
Aligning AMP embeddings with CFG data...
Aligned 6983 samples
CFG Flow Dataset initialized:
  AMP embeddings: torch.Size([17968, 50, 1280])
  CFG labels: 6983
  Aligned samples: 6983
✓ Dataset initialized with FULL data:
  Total samples: 6,983
  Batch size: 384
  Batches per epoch: 19
  Total training steps: 38,000
  Validation every: 10,000 steps
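The step budget is simple arithmetic on the numbers above: ceil(6983 / 384) = 19 batches per epoch, and 19 × 2000 epochs = 38,000 optimizer steps, matching the log.

```python
import math

samples, batch_size, epochs = 6983, 384, 2000
batches_per_epoch = math.ceil(samples / batch_size)   # 19 (last batch is partial)
total_steps = batches_per_epoch * epochs              # 38000
```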
|
|
Initializing optimizer and scheduler...
✓ Optimizer initialized:
  Base LR: 0.0012
  Min LR: 0.0006
  Warmup steps: 5000
  Weight decay: 0.01
  Gradient clip norm: 1.0
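The log reports the hyperparameters but not the optimizer class or schedule shape, so the following is a sketch under common assumptions: AdamW with the logged weight decay, linear warmup over 5,000 steps starting from 10% of the base LR (consistent with the logged LR of 1.20e-04 at step 1), then cosine decay from 1.2e-3 toward the 6e-4 floor. The model is a hypothetical stand-in.

```python
import math
import torch

base_lr, min_lr = 1.2e-3, 6e-4
warmup_steps, total_steps = 5000, 38000

model = torch.nn.Linear(1280, 1280)   # hypothetical stand-in
optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr, weight_decay=0.01)

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        # Linear warmup from 10% of base LR (matches the logged 1.20e-04 at step 1).
        return 0.1 + 0.9 * step / warmup_steps
    # Cosine decay from base_lr down to min_lr.
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    floor = min_lr / base_lr
    return floor + (1.0 - floor) * 0.5 * (1.0 + math.cos(math.pi * t))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# Per optimizer step, after loss.backward():
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```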
|
|
✓ Optimized Single GPU training setup complete with FULL DATA!
|
|
🚀 Starting Optimized Single GPU Flow Matching Training with FULL DATA
  GPU: 0
  Total iterations: 2000
  Batch size: 384
  Total samples: 6,983
  Mixed precision: True
  Estimated time: ~8-10 hours (overnight training with ALL data)
============================================================
|
|
Training Flow Model: 0%| | 0/2000 [00:00<?, ?it/s]
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with autocast(dtype=torch.bfloat16):
|
|
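This autocast FutureWarning repeats throughout the run and has the same one-line fix as the GradScaler one: construct the context manager from the `torch.amp` namespace with an explicit device type. A minimal sketch; the tensor shape mirrors the logged embeddings and the model is a hypothetical stand-in:

```python
import torch

x = torch.randn(384, 50, 1280, device='cuda')    # one batch, per the log's shapes
model = torch.nn.Linear(1280, 1280).to('cuda')   # hypothetical stand-in

# Deprecated: torch.cuda.amp.autocast(dtype=torch.bfloat16)
# Current API:
with torch.amp.autocast('cuda', dtype=torch.bfloat16):
    out = model(x)                            # matmuls run in BF16 under autocast
    loss = (out - x).float().pow(2).mean()    # illustrative loss, kept in FP32
```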
|
|
Training Flow Model: 0%| | 1/2000 [00:49<27:39:59, 49.82s/it]
Epoch 0 | Step 1/ 38000 | Loss: 2.290177 | LR: 1.20e-04 | Speed: 0.0 steps/s | ETA: 376.4h
Epoch 0 | Avg Loss: 1.109821 | LR: 1.24e-04 | Time: 49.8s | Samples: 6,983
|
|
|
|
Training Flow Model: 0%| | 2/2000 [00:55<13:22:45, 24.11s/it]
Epoch 1 | Step 20/ 38000 | Loss: 1.010002 | LR: 1.24e-04 | Speed: 0.4 steps/s | ETA: 27.0h
Epoch 1 | Avg Loss: 1.002409 | LR: 1.28e-04 | Time: 6.1s | Samples: 6,983
|
|
|
|
Training Flow Model: 0%| | 3/2000 [00:59<8:09:52, 14.72s/it]
Epoch 2 | Step 39/ 38000 | Loss: 0.998573 | LR: 1.28e-04 | Speed: 0.7 steps/s | ETA: 15.4h
Epoch 2 | Avg Loss: 0.910289 | LR: 1.32e-04 | Time: 3.5s | Samples: 6,983
Training Flow Model: 0%| | 4/2000 [01:02<5:42:30, 10.30s/it]
Epoch 3 | Step 58/ 38000 | Loss: 0.787784 | LR: 1.33e-04 | Speed: 1.0 steps/s | ETA: 11.1h
Epoch 3 | Avg Loss: 0.644033 | LR: 1.36e-04 | Time: 3.5s | Samples: 6,983
|
|
|