Smaller Language Models Are Better Instruction Evolvers
Abstract
Smaller language models can generate more effective and diverse instructions than larger models, challenging the assumption that model size directly correlates with instruction synthesis quality.
Instruction tuning has been widely used to unleash the complete potential of large language models. Notably, complex and diverse instructions are of significant importance as they can effectively align models with various downstream tasks. However, current approaches to constructing large-scale instructions predominantly favour powerful models such as GPT-4 or those with over 70 billion parameters, under the empirical presumption that such larger language models (LLMs) inherently possess enhanced capabilities. In this study, we question this prevalent assumption and conduct an in-depth exploration into the potential of smaller language models (SLMs) in the context of instruction evolution. Extensive experiments across three scenarios of instruction evolution reveal that smaller language models (SLMs) can synthesize more effective instructions than LLMs. Further analysis demonstrates that SLMs possess a broader output space during instruction evolution, resulting in more complex and diverse variants. We also observe that the existing metrics fail to focus on the impact of the instructions. Thus, we propose Instruction Complex-Aware IFD (IC-IFD), which introduces instruction complexity in the original IFD score to evaluate the effectiveness of instruction data more accurately. Our source code is available at: https://github.com/HypherX/Evolution-Analysis{https://github.com/HypherX/Evolution-Analysis}
Community
Abstract
Instruction tuning has been widely used to unleash the complete potential of large language models. Notably, complex and diverse instructions are of significant importance as they can effectively align models with various downstream tasks. However, current approaches to constructing large-scale instructions predominantly favor powerful models such as GPT-4 or those with over 70 billion parameters, under the empirical presumption that such larger language models (LLMs) inherently possess enhanced capabilities. In this study, we question this prevalent assumption and conduct an in-depth exploration into the potential of smaller language models (SLMs) in the context of instruction evolution. Extensive experiments across three scenarios of instruction evolution reveal that smaller language models (SLMs) can synthesize more effective instructions than LLMs. Further analysis demonstrates that SLMs possess a broader output space during instruction evolution, resulting in more complex and diverse variants. We also observe that the existing metrics fail to focus on the impact of the instructions. Thus, we propose Instruction Complex-Aware IFD (IC-IFD), which introduces instruction complexity in the original IFD score to evaluate the effectiveness of instruction data more accurately. Our source code is available at: https://github.com/HypherX/Evolution-Analysis
Key Findings
- SLMs can evolve more complex and challenging instructions than LLMs. 
- SLMs can generate more diverse instructions than LLMs. 
- SLMs can automatically evolve more effective instructions than LLMs. 
- SLMs have a broader output space and are less likely to be overconfident than LLMs. 
Detailed Results
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Stronger Models are NOT Stronger Teachers for Instruction Tuning (2024)
- Constraint Back-translation Improves Complex Instruction Following of Large Language Models (2024)
- Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch (2024)
- LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios (2024)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration (2024)
- IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization (2024)
- Language Models can Self-Lengthen to Generate Long Texts (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
 You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: 
@librarian-bot
	 recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
 KABI
							KABI 
					 
					




 
						 
					 
					