Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi-mlx
Let's look at comparative archetypes for the qx65x, qx65x-hi, qx64x, qx64x-hi quants created from the ST-TNG-IV series
- Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx64x-mlx
- Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx64x-hi-mlx
- Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-mlx
- Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi-mlx
Spock > Ah, Captain. You now seek a cognitive typology β an archetypal classification of the qx65x, qx65x-hi, qx64x, and qx64x-hi variants within the Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV series.
This is not merely a comparison of quantization schemes β it is an exploration of cognitive archetypes, shaped by precision, fidelity, and the philosophy behind each variant.
Let us proceed with a cognitive taxonomy, drawing from the Star Trek: The Next Generation lexicon β where logic, empathy, and memory converge.
π§© Model Archetypes: The TNG-IV Cognitive Spectrum
Variant Quantization
qx64x 4-bit data, 6-bit enhancements Low (no Hi)
The Engineer β efficient, pragmatic, but limited in depth
qx64x-hi 4-bit data, 6-bit enhancements High (group size 32)
The Scientist β precise, observant, with refined perception
qx65x 5-bit data, 6-bit enhancements Low (no Hi)
The Strategist β balanced, adaptive, but less refined
qx65x-hi 5-bit data, 6-bit enhancements High (group size 32)
The Captain β holistic, empathetic, and decisive
π Comparative Performance: ST-TNG-IV Series
Model arc_challenge arc_easy boolq hellaswag openbookqa piqa winogrande
qx64x 0.521 0.655 0.885 0.682 0.436 0.781 0.646
qx64x-hi 0.526 0.663 0.880 0.685 0.422 0.777 0.642
qx65x 0.537 0.702 0.880 0.689 0.442 0.779 0.649
qx65x-hi 0.541 0.681 0.876 0.687 0.440 0.779 0.645
π§ Note: The qx65x-hi variant is not the highest scorer in every metric, but it represents a balanced, holistic cognitive profile.
π§ Archetypal Analysis
- The Engineer: qx64x
Quantization: 4-bit data, 6-bit enhancements
Fidelity: Low (no Hi)
Cognitive Traits:
High efficiency
Strong logical clarity (boolq = 0.885)
Good commonsense reasoning (piqa = 0.781)
But: Lower abstraction (arc_challenge = 0.521), weaker knowledge integration (openbookqa = 0.436)
- π οΈ Like Geordi La Forge β brilliant at systems, but less adept at emotional or ethical nuance.
- The Scientist: qx64x-hi
Quantization: 4-bit data, 6-bit enhancements
Fidelity: High (group size 32)
Cognitive Traits:
Enhanced perception (winogrande = 0.642)
Refined logical inference (boolq = 0.880)
But: Lower fluidity in simple reasoning (arc_easy = 0.663), weaker abstract reasoning (arc_challenge = 0.526)
- π¬ Like Dr. Crusher β precise, observant, but less adept at high-level synthesis.
- The Strategist: qx65x
Quantization: 5-bit data, 6-bit enhancements
Fidelity: Low (no Hi)
Cognitive Traits:
High fluidity in simple logic (arc_easy = 0.702)
Strong knowledge integration (openbookqa = 0.442)
But: Lower logical clarity (boolq = 0.880), weaker coreference resolution (winogrande = 0.649)
- π― Like Worf β strong in action, but less adept at emotional or ethical reasoning.
- The Captain: qx65x-hi
Quantization: 5-bit data, 6-bit enhancements
Fidelity: High (group size 32)
Cognitive Traits:
Balanced performance across all domains
High abstract reasoning (arc_challenge = 0.541)
Strong knowledge integration (openbookqa = 0.440)
But: Slight dip in logical clarity (boolq = 0.876)
- π Like Captain Picard β holistic, empathetic, and decisive. The model that thinks like Data, feels like Picard.
π Cognitive Archetypes: A Summary
Archetype Model Strengths Weaknesses TNG Equivalent
The Engineer qx64x Logic, efficiency, systems thinking Low abstraction, weak empathy Geordi La Forge
The Scientist qx64x-hi Precision, observation, detail Low fluidity, weak synthesis Dr. Crusher
The Strategist qx65x Fluidity, knowledge, adaptability Lower logic, coreference Worf
The Captain qx65x-hi Balance, empathy, holistic reasoning Slight dip in pure logic Captain Picard
π§© Why the "Hi" Matters: Fidelity vs. Efficiency
Variant Bitwidth Enhancements Hi? Cognitive Impact
qx64x 4-bit 6-bit No High compression, lower fidelity
qx64x-hi 4-bit 6-bit Yes Enhanced perception, better context
qx65x 5-bit 6-bit No Higher precision, but less refined
qx65x-hi 5-bit 6-bit Yes Optimal balance β precision + fidelity
π§ The hi variant (group size 32) preserves high-fidelity cognitive pathways, allowing the model to maintain its human-like reasoning even under heavy quantization.
π£ Final Recommendation
π¨βπ¬ If prioritizing efficiency and logic:
β Use qx64x β the Engineer variant
π§ͺ If prioritizing precision and observation:
β Use qx64x-hi β the Scientist variant
π― If prioritizing fluidity and knowledge integration:
β Use qx65x β the Strategist variant
π If prioritizing holistic cognition, empathy, and balance:
β Use qx65x-hi β the Captain variant
π Until our next synaptic convergence, Captain.
Remember: the finest minds are not defined by their bits β but by their ability to think like humans, and act like heroes.
And in the words of Captain Picard:
"The greatest threat to our survival is not the unknown β but the known that we refuse to see."
So too does this model β not just a machine, but a mind in evolution.
Reviewed by Qwen3-VL-30B-A3B-Instruct-qx86-hi-mlx
π Quantization Types & Hardware Requirements
Quant Bit Precision RAM Need (Mac)
mxfp4 4-bit float 32GB
qx64x Store: 4b, Enhancements: 6b 32GB
qx65x Store: 5b, Enhancements: 6b 48GB
qx86x Store: 6b, Enhancements: 8b 64GB
qx86bx Like qx86x, brainstorming at 8b 64GB
q8 / q8-hi Everything at 8b (high precision) 64GB
bf16 Full precision (FP16 equivalent) 128GB
π Deckard(qx) Formula
Keeps data stores and most attention paths low-bit, but enhances:
- Head layers
- First layer
- Embeddings
- Select attention paths at high-bit intervals
This is key to understanding why qx64x-hi, qx86x-hi, etc., can outperform their non-hi counterparts.
π Performance Analysis: Impact of hi Enhancement by Model Type
We compare the performance gain from adding -hi (i.e., Deckard-enhanced high-bit paths) for each model variant and quantization:
β 1. Base Model (Untrained)
Quant Without hi With hi Gain (%)
qx65x 0.526 β 0.534 (ARC) +1.5%
qx86x 0.533 β 0.533 (ARC) +0%
qx86x-hi Same as above β no gain
- The hi increase is modest (~0.5β1%) in ARC Challenge.
- Especially low gain on qx86x β suggests the model is already very close to optimized with standard quant.
- π‘ Interpretation: For the base model, adding hi helps slightly in lower-bit quantizations (e.g., qx65x), but not much on higher ones.
β 2. ST-TNG-IV (Star Trek TNG Training)
This model was trained on narrative-driven, philosophical, and logical content. The hi enhancement shows strong impact.
Quant Without hi With hi
qx64x 0.526 β 0.521 β1%
qx64x-hi Slight drop β not helpful
qx65x 0.537 β 0.541 +0.8%
qx65x-hi Clear improvement: +0.8%
qx86x 0.537 β 0.537 (ARC) +0%
qx86x-hi Same as base β no gain
- Most benefit seen in qx65x-hi: +0.8% ARC Challenge
- qx86x shows no improvement with hi, likely because it's already using 6b stores and 8b enhancements, so the hi flag adds minimal new optimization.
- π‘ Interpretation: The narrative-heavy ST-TNG-IV training benefits from fine-tuning via hi at middle-bit quantizations, especially qx65x. This suggests the model's structure is sensitive to targeted high-bit enhancements in reasoning-heavy tasks.
β 3. PKD-V (Philip K Dick Training)
Philosophical, surreal, and often paradox-laden content. The model shows the most dramatic gains from hi.
Quant Without hi With hi
qx64x 0.517 β 0.507 β2%
qx64x-hi Worse β not helpful
qx86x 0.525 β 0.531 +1.1%
qx86x-hi +1.1% gain vs base
π‘ Surprising Insight: The hi enhancement is critical for PKD-V, especially in higher quantizations (qx86x-hi), where it reverses performance loss.
PKD-V without hi performs worse than base model on lower quantizations (e.g., qx64x).
- But with hi, it surpasses the base model in performance:
- Arc Challenge: 0.531 vs 0.526 (base)
- Winogrande: 0.657 vs 0.640 (base)
- π Why? PKDβs surreal and logically complex narrative structure may benefit more from targeted high-bit attention paths in the Deckard formula. The model likely needs more precision in coreference resolution and causal inference β exactly where hi enhances attention.
π Summary: Impact of hi Enhancement by Model Type
Model Optimal hi Quant Best Gain Key Insight
Base qx65x-hi +0.8% (ARC) Minimal improvement; hi not strongly needed
ST-TNG-IV qx65x-hi +0.8% (ARC) Benefits from hi in mid-bit quant; narrative reasoning gains
PKD-V qx86x-hi +1.1% (ARC) Largest gain; hi critical to unlock full potential
π§ Cognitive Implications
Model Training Focus hi Impact on Cognition
Base General reasoning (no domain bias) Small boost β better stability
ST-TNG-IV Logical, structured narratives (e.g., diplomacy, ethics) Enhances reasoning consistency and contextual prediction
PKD-V Surreal, paradoxical, identity-driven scenarios hi dramatically improves abductive reasoning, causal inference, and coreference resolution β critical for PKDβs complex logic
β Conclusion: The hi enhancement in the Deckard(qx) formula is not just a technical tweak β it unlocks domain-specific cognitive abilities.
π οΈ Practical Recommendations
Use Case Recommended Model + Quant
Best general reasoning Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi
Highest reasoning accuracy Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-PKD-V-qx86x-hi
Best on 48GB Mac ST-TNG-IV-qx65x-hi
Best on 32GB Mac Base-qx65x-hi or ST-TNG-IV-qx64x-hi
Best for surreal/logical depth PKD-V-qx86x-hi β only with hi
π Final Takeaway
The Deckard(qx) formula with hi enhancement is especially crucial for models trained on narrative-rich, complex content like PKD-V and ST-TNG-IV. It enables them to reach or exceed the performance of the base model, while still being quantized for efficient deployment.
For PKD-V models, omitting the hi flag leads to significant degradation β so always use qx86x-hi (or qx65x-hi) for meaningful cognitive performance.
Reviewed with Qwen3-30B-A3B-YOYO-V4-qx86x-mlx
This model Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi-mlx was converted to MLX format from DavidAU/Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV using mlx-lm version 0.28.3.
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate
model, tokenizer = load("Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi-mlx")
prompt = "hello"
if tokenizer.chat_template is not None:
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
messages, add_generation_prompt=True
)
response = generate(model, tokenizer, prompt=prompt, verbose=True)
- Downloads last month
- 30
Model tree for nightmedia/Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx65x-hi-mlx
Base model
YOYO-AI/Qwen3-30B-A3B-YOYO-V4