The Unreasonable Effectiveness of Eccentric Automatic Prompts
Abstract
Adding "positive thinking" to system messages generally improves LLM performance, especially when combined with Chain of Thought, but automated prompt optimization is ultimately more effective.
Large Language Models (LLMs) have demonstrated remarkable problem-solving and basic mathematics abilities. However, their efficacy depends heavily on how the prompt is formulated. This study quantifies the influence of incorporating "positive thinking" into the system message of the prompt, then compares that to systematic prompt optimization. We assess the performance of 60 combinations of system message snippets, tested with and without Chain of Thought prompting, across three models ranging from 7 to 70 billion parameters on the GSM8K dataset. Our findings reveal that results do not universally generalize across models. In most instances, including "positive thinking" prompts improved model performance. Llama2-70B was a notable exception: without Chain of Thought, its best system message was no system message at all. Given the combinatorial complexity, and thus the computation time, of hand-tuning prompts for large black-box models, we then compared the performance of the best "positive thinking" prompt against the output of systematic prompt optimization. We show that an automated prompt optimizer is the most effective method for enhancing performance, even when working with smaller open-source models. Additionally, the highest-scoring automatically optimized prompt turns out to be far more peculiar than expected.
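To make the size of the hand-tuning search space concrete, here is a minimal sketch of how a grid of 60 system messages, each tested with and without Chain of Thought, might be enumerated. The abstract does not say how the 60 combinations were built, so the three-slot split (5 openers x 3 task descriptions x 4 closers) is only one assumed factorization that happens to yield 60, and every snippet string and function name below is a hypothetical placeholder rather than text from the study.

```python
from itertools import product

# ASSUMPTION: three snippet slots with 5, 3 and 4 options (5 * 3 * 4 = 60).
# All strings are illustrative placeholders, not the snippets used in the study.
# An empty string means "leave this slot out".
OPENERS = [
    "",
    "You are highly intelligent.",
    "You are an expert mathematician.",
    "You are a careful problem solver.",
    "You are a patient tutor.",
]
TASK_DESCRIPTIONS = [
    "",
    "Solve the following math question.",
    "Answer the following math question.",
]
CLOSERS = [
    "",
    "This will be fun!",
    "Take your time and think carefully.",
    "I really need your help!",
]


def build_system_messages():
    """Join one option from each slot, skipping the empty ones."""
    return [
        " ".join(part for part in parts if part)
        for parts in product(OPENERS, TASK_DESCRIPTIONS, CLOSERS)
    ]


def build_conditions(cot_options=(False, True)):
    """Pair every system message with Chain of Thought off and on."""
    return [(msg, cot) for msg in build_system_messages() for cot in cot_options]


system_messages = build_system_messages()
conditions = build_conditions()
print(len(system_messages), "system messages,", len(conditions), "conditions per model")
```

Under these assumptions the grid contains the all-empty combination, which corresponds to sending no system message at all. Each condition would still have to be evaluated on every model and every test question, which is where the computation time mentioned in the abstract comes from.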
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- S2LPP: Small-to-Large Prompt Prediction across LLMs (2025)
- Which Prompting Technique Should I Use? An Empirical Investigation of Prompting Techniques for Software Engineering Tasks (2025)
- Prompting Science Report 2: The Decreasing Value of Chain of Thought in Prompting (2025)
- ORPP: Self-Optimizing Role-playing Prompts to Enhance Language Model Capabilities (2025)
- Incorporating Token Usage into Prompting Strategy Evaluation (2025)
- Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks (2025)
- MODP: Multi Objective Directional Prompting (2025)