Improve model card for AnalogSeeker_2025_07_10_3
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,34 +1,121 @@
 ---
 library_name: transformers
 license: other
-base_model: Qwen2.5-32B-Instruct
 tags:
 - llama-factory
 - full
 - generated_from_trainer
 model-index:
 - name: AnalogSeeker (Qwen2.5-32B-Instruct_nsc-sft)
   results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
-
 
-
 
 ## Model description
 
-More information needed
 
 ## Intended uses & limitations
 
-More information needed
 
 ## Training and evaluation data
 
-More information needed
 
 ## Training procedure
 
@@ -51,7 +138,17 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-More information needed
 
 ### Framework versions
 
@@ -59,3 +156,17 @@ The following hyperparameters were used during training:
 - Pytorch 2.5.1+cu124
 - Datasets 3.6.0
 - Tokenizers 0.21.1
README.md (updated):

---
base_model: Qwen2.5-32B-Instruct
library_name: transformers
license: other
tags:
- llama-factory
- full
- generated_from_trainer
- analog-circuit-design
pipeline_tag: text-generation
model-index:
- name: AnalogSeeker (Qwen2.5-32B-Instruct_nsc-sft)
  results: []
---

# AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design

This model, `AnalogSeeker_2025_07_10_3`, is a fine-tuned version of `Qwen2.5-32B-Instruct`. It was presented in the paper [AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design](https://huggingface.co/papers/2508.10409).

* **Project Page**: [https://huggingface.co/analogllm/analogseeker](https://huggingface.co/analogllm/analogseeker)
* **GitHub Repository**: [https://github.com/analogllm/AnalogSeeker](https://github.com/analogllm/AnalogSeeker)

## Model description

AnalogSeeker is an open-source foundation language model developed specifically for analog circuit design. Its objective is to integrate specialized domain knowledge and provide design assistance in this complex field. To address the inherent scarcity of data in analog circuit design, AnalogSeeker uses a corpus collection strategy in which high-quality, accessible textbooks across relevant subfields are systematically curated and cleaned into a textual domain corpus.

On top of this corpus, a granular domain knowledge distillation method decomposes the raw, unlabeled text into typical, granular learning nodes. A multi-agent framework then distills the implicit knowledge embedded in the unstructured text into question-answer pairs with detailed reasoning traces, yielding a fine-grained, learnable dataset for fine-tuning. Training follows a fine-tuning-centric paradigm with a neighborhood self-constrained supervised fine-tuning (NSC-SFT) algorithm, which improves training outcomes by constraining the perturbation magnitude between the model's output distributions.

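The exact NSC-SFT objective is defined in the paper; the snippet below is only a minimal sketch of the idea, assuming a KL-divergence penalty against a frozen reference copy of the model (the function name, `beta`, and the choice of KL are illustrative assumptions, not the authors' implementation):

```python
# Hypothetical sketch of a self-constrained SFT loss (not the authors' code):
# plain cross-entropy plus a penalty that keeps the fine-tuned model's output
# distribution close to a frozen reference copy of itself.
import torch
import torch.nn.functional as F

def nsc_sft_loss(model, ref_model, input_ids, attention_mask, labels, beta=0.1):
    # Standard next-token cross-entropy on the supervised targets.
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    ce = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )

    # The frozen reference model defines the "neighborhood" to stay close to.
    with torch.no_grad():
        ref_logits = ref_model(input_ids=input_ids, attention_mask=attention_mask).logits

    # Penalize the perturbation between the two output distributions
    # (a KL divergence is assumed here; the paper specifies the actual constraint).
    kl = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.log_softmax(ref_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    return ce + beta * kl
```
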
## Intended uses & limitations

**Intended uses:**

AnalogSeeker is intended for research use in analog circuit design. It aims to:

* Integrate domain knowledge for analog circuits.
* Provide design assistance and answer domain-specific questions.
* Support tasks such as operational amplifier design.
* Serve as a foundation for further research and development on LLMs for analog circuits.

**Limitations:**

AnalogSeeker is specialized for analog circuit design; its applicability and performance in unrelated domains may be limited. Like all language models, it may occasionally generate incorrect or nonsensical answers, especially for concepts that are novel or underrepresented in its training data.

## Training and evaluation data

**Training data:**

The model was fine-tuned on a corpus collected according to a domain knowledge framework for analog circuits, consisting of high-quality, accessible textbooks across relevant subfields, systematically curated and cleaned. The granular distillation method described above decomposed this raw text into learning nodes, and a multi-agent framework distilled the implicit knowledge into question-answer pairs with detailed reasoning for fine-tuning; a hypothetical sketch of that loop follows.

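Purely as an illustrative sketch of this distillation loop (the `LearningNode` schema, the agent prompts, and the `ask_llm` helper below are hypothetical, not the released pipeline):

```python
# Hypothetical sketch of granular knowledge distillation: carve the corpus
# into learning nodes, then use cooperating "agents" (LLM calls) to produce
# QA pairs with reasoning, plus a reviewer that filters bad pairs.
from dataclasses import dataclass

@dataclass
class LearningNode:
    topic: str         # e.g. "common-emitter amplifier: small-signal gain"
    source_text: str   # cleaned textbook excerpt the node was carved from

def ask_llm(role_prompt: str, content: str) -> str:
    """Placeholder for a call to a teacher LLM; implementation omitted."""
    raise NotImplementedError

def distill_node(node: LearningNode) -> dict:
    question = ask_llm("Write one exam-style question on this material.",
                       node.source_text)
    answer = ask_llm("Answer step by step, showing the reasoning.",
                     f"{node.source_text}\n\nQ: {question}")
    verdict = ask_llm("Check the answer against the source; reply PASS or FAIL.",
                      f"{node.source_text}\n\nQ: {question}\nA: {answer}")
    return {"question": question, "answer": answer,
            "keep": verdict.strip() == "PASS"}
```
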
**Evaluation data and performance:**

AnalogSeeker was evaluated on AMSBench-TQA, the analog circuit knowledge evaluation benchmark, where it achieved **85.04% accuracy**, a **15.67-percentage-point improvement** over the original Qwen2.5-32B-Instruct, and performance competitive with mainstream commercial models.

## Sample Usage

You can use this model with the Hugging Face `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_id = "analogllm/AnalogSeeker_2025_07_10_3"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Example chat interaction (Qwen2.5 Instruct format)
messages = [
    {"role": "user", "content": "What is the primary function of a common-emitter amplifier in analog circuits?"}
]

# Apply the chat template and prepare inputs
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Configure generation parameters
generation_config = GenerationConfig(
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    repetition_penalty=1.05,
    # Stop at Qwen's end-of-turn token as well as the generic EOS token.
    eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|im_end|>")],
)

# Generate and decode only the newly generated tokens
outputs = model.generate(
    inputs=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    generation_config=generation_config,
)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

# Another example: design assistance
messages_design = [
    {"role": "user", "content": "Explain the key considerations for designing a stable feedback amplifier."}
]
text_design = tokenizer.apply_chat_template(messages_design, tokenize=False, add_generation_prompt=True)
inputs_design = tokenizer(text_design, return_tensors="pt").to(model.device)
outputs_design = model.generate(
    inputs=inputs_design.input_ids,
    attention_mask=inputs_design.attention_mask,
    generation_config=generation_config,
)
response_design = tokenizer.decode(outputs_design[0][inputs_design.input_ids.shape[1]:], skip_special_tokens=True)
print(response_design)
```

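For quick experiments, the high-level `pipeline` API should also work (a sketch, assuming a recent `transformers` with chat-aware text-generation pipelines and enough GPU memory for the 32B weights):

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="analogllm/AnalogSeeker_2025_07_10_3",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
out = pipe(
    [{"role": "user", "content": "Sketch a two-stage Miller-compensated op-amp and its main trade-offs."}],
    max_new_tokens=256,
)
# With chat-style input, generated_text holds the whole conversation;
# the last message is the assistant's reply.
print(out[0]["generated_text"][-1]["content"])
```
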
## Training procedure

### Training results

```json
{
  "epoch": 1.0,
  "num_input_tokens_seen": 113180672,
  "total_flos": 759612479373312.0,
  "train_loss": 1.1406613362056237,
  "train_runtime": 17617.7573,
  "train_samples_per_second": 0.784,
  "train_steps_per_second": 0.012
}
```

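For orientation, these figures are mutually consistent; a quick back-of-envelope check (pure arithmetic on the values above, not additional reported metrics):

```python
runtime_s = 17617.7573           # train_runtime in seconds (~4.9 hours)
print(113_180_672 / runtime_s)   # ~6.4k input tokens processed per second
print(0.784 * runtime_s)         # ~13.8k training samples in the single epoch
print(0.012 * runtime_s)         # ~211 optimizer steps
```
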
### Framework versions

- Pytorch 2.5.1+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1

## Citation

If you find AnalogSeeker useful in your research, please consider citing the paper:

```bibtex
@article{analogseeker2025,
  title={AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design},
  author={AnalogSeeker Team},
  journal={arXiv preprint arXiv:2508.10409},
  year={2025},
  url={https://huggingface.co/papers/2508.10409}
}
```