This is the second model in the ensemble for the MindsAI @ Tufa Labs team for the ARC Prize 2025 competition. It was originally based on Salesforce's CodeT5 model, modified to use 16 decoder layers instead of the original 24. Testing demonstrated that removing layers from the encoder was much more harmful to performance, while the model fully recovered when decoder layers were removed. The model has been trained for approximately 2 years on a v4-64 TPU. The Google TPU Research Cloud was very generous in providing TPUs for training and research; it would have been impossible to develop TTFT, AIRV, train the models, and do many other things without the generosity of Google and the TPU Research Cloud program.
- Span-Corruption Refinement (SCR): the model was trained with an additional pretraining objective I call SCR (named for the model's deep history of training with the span-corruption objective). The answer grid is noised with heavy span corruption and data augmentation (to mimic incorrect answers) and passed along with the prompt for refinement; the model outputs the full corrected grid. A sketch of how such an example could be constructed follows this list.
- Note: SCR did not improve performance when used during inference (it was only used during TTT).
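The exact noising recipe isn't documented here, so the following is a minimal sketch of how an SCR training input could be produced; the corruption fraction, span lengths, and color-swap augmentation are illustrative assumptions, not the values used in training.

```python
import random

def corrupt_answer(grid, corrupt_frac=0.3, swap_prob=0.2):
    """Hypothetical SCR noising: overwrite random horizontal spans of the
    answer grid and occasionally swap two colors globally so the result
    looks like a plausible-but-wrong prediction. All rates here are
    illustrative assumptions, not the values used in training."""
    noisy = [row[:] for row in grid]
    height, width = len(noisy), len(noisy[0])
    # Heavy span corruption: replace short runs of cells with random colors.
    cells_to_corrupt = int(corrupt_frac * height * width)
    corrupted = 0
    while corrupted < cells_to_corrupt:
        r, c = random.randrange(height), random.randrange(width)
        span = random.randint(1, min(4, width - c))
        color = random.randint(0, 9)
        noisy[r][c:c + span] = [color] * span
        corrupted += span
    # Augmentation: a global color swap mimics a systematically wrong answer
    # rather than random pixel noise.
    if random.random() < swap_prob:
        a, b = random.sample(range(10), 2)
        noisy = [[b if v == a else a if v == b else v for v in row]
                 for row in noisy]
    return noisy
```

The noised grid would then be serialized like any other grid (see ARC Data Formatting below), appended to the prompt, and the model trained to emit the full corrected grid.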
## ARC Data Formatting
- ARC tasks ship as JSON where each `task_id` contains `train` pairs and `test` inputs; every grid is a rectangular list of lists with integers 0-9. Dimensions follow the original 1×1–30×30 spec, though the evaluator accepts up to 50×50.
- Example task payload:
{ "task_id": { "train": [ {"input": [[0,0],[1,1]], "output": [[1,1],[1,1]]} ], "test": [ {"input": [[0,0,0],[0,1,0],[0,0,0]]} ] } } - Model prompts (
promptcolumn during training/TTT/inference) are serialized text strings:solve: train input1 <train_input> output1 <prefix><train_output>. … test tinput1 <test_input> toutput1. Each grid token<train_input>/<train_output>/<test_input>is produced bygrid_to_string, so rows are concatenated digits separated by spaces. Multiple train examples increment the index (input2,output2, etc.). - Prompt example:
solve: train input1 000 010 000 output1 11 3 3 10 111 101 111. input2 00 02 output2 5 2 2 20 22 20. test tinput1 0000 0300 0000 0000 toutput1 - Model targets (
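For reference, a minimal sketch of the serialization, assuming `grid_to_string` simply joins each row into a run of digits with single-space separators (the helper name `build_prompt` is mine; `format_target` is defined in the sketch after the target description below):

```python
def grid_to_string(grid):
    """Serialize a grid: each row becomes a run of digits, rows are
    separated by single spaces, e.g. [[0,0],[1,1]] -> "00 11"."""
    return " ".join("".join(str(cell) for cell in row) for row in grid)

def build_prompt(task):
    """Assemble a prompt in the format described above. Train outputs use
    the same prefixed target format as the decoder output, via
    format_target (defined in the next sketch)."""
    parts = ["solve: train"]
    for i, pair in enumerate(task["train"], start=1):
        parts.append(f"input{i} {grid_to_string(pair['input'])}")
        parts.append(f"output{i} {format_target(pair['output'])}.")
    parts.append("test")
    for i, pair in enumerate(task["test"], start=1):
        parts.append(f"tinput{i} {grid_to_string(pair['input'])}")
        parts.append(f"toutput{i}")
    return " ".join(parts)
```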
- Model targets (the `correct_answer` column and the expected decoder output before post-processing) follow `output_prefix` semantics: `{total_chars} {height} {width} {symbols} {row_strings}`. Here `total_chars = height*width + (height - 1)` and `symbols` is the deduplicated sequence of colors as they are first encountered when scanning the board row-major; that rule applies to every output grid we emit (training outputs inside the prompt and the predicted test `toutput`). Example target string for a 3×3 donut: `11 3 3 10 111 101 111`.
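For concreteness, a minimal sketch of the target formatter (the function name `format_target` is mine; the computation follows the `output_prefix` semantics above and reuses `grid_to_string` from the previous sketch). The assert reproduces the donut example:

```python
def format_target(grid):
    """Format an output grid with output_prefix semantics:
    {total_chars} {height} {width} {symbols} {row_strings}."""
    height, width = len(grid), len(grid[0])
    # Serialized length: height*width digits plus (height - 1) row separators.
    total_chars = height * width + (height - 1)
    # Deduplicated colors in first-encounter (row-major) order.
    symbols = "".join(dict.fromkeys(str(cell) for row in grid for cell in row))
    return f"{total_chars} {height} {width} {symbols} {grid_to_string(grid)}"

# The 3x3 donut from the example above:
donut = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
assert format_target(donut) == "11 3 3 10 111 101 111"
```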