Adam: Instruction-Tuned Conversational AI

2B Parameters · Curious Architecture · Instruction-Tuned · 8K Context

πŸš€ Model Overview

Adam is a 2B-class language model (≈2.6B total parameters) built on the Curious architecture and instruction-tuned for conversational AI and task completion. It is designed to pair efficient inference with high-quality natural dialogue.

✨ Key Features

  • πŸ—οΈ Native Curious Architecture: Custom CuriousForCausalLM architecture with Curious-specific optimizations
  • 🎯 Instruction-Tuned: Fine-tuned for conversational AI and task completion
  • ⚑ Efficient: 2B parameters with optimized inference
  • πŸ’¬ Conversational: Specialized for natural dialogue and helpful responses
  • πŸ”§ Advanced Features: Sliding window attention, logit softcapping, and enhanced activations

πŸ“Š Model Specifications

Parameter        | Value
Architecture     | CuriousForCausalLM
Model Type       | curious_text
Parameters       | ~2.6B
Context Length   | 8,192 tokens
Vocabulary Size  | 256,000 tokens
Training         | Instruction-tuned
Curious Version  | 2.0

🎯 Capabilities

  • Natural Conversations: Engaging and contextually aware dialogue
  • Question Answering: Accurate responses to diverse queries
  • Creative Writing: Poetry, stories, and creative content generation
  • Code Assistance: Programming help and code generation
  • Mathematical Reasoning: Problem-solving and calculations
  • Instruction Following: Precise task execution and completion

πŸš€ Quick Start

Interactive Chat

pip install -r requirements.txt
# Use the included chat interface
python chat_with_adam.py
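
If you prefer to call the model directly rather than through the chat script, the sketch below uses the Hugging Face transformers API. It assumes the checkpoint bundles the custom CuriousForCausalLM code (hence trust_remote_code=True) and uses a placeholder path; point it at wherever the weights actually live.

# Minimal sketch: load Adam with transformers (assumes the custom
# Curious architecture code ships with the checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./adam-2b-it"  # placeholder path, not the published repo id

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # ~5 GB of weights; use float32 if bf16 is unavailable
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))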

πŸ—οΈ Curious Architecture Features

  • Enhanced Attention: Advanced attention mechanisms for better context understanding
  • Sliding Window: Efficient processing of long sequences
  • Logit Softcapping: Improved generation stability (see the sketch after this list)
  • Optimized Activations: GELU with the tanh approximation (gelu_pytorch_tanh) for better performance
  • Instruction Tuning: Specialized for conversational AI tasks
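
The card does not spell out how logit softcapping is implemented in Curious. As a rough illustration of the general technique (a common pattern in recent decoder models, not a confirmed detail of this model), the final logits are passed through a scaled tanh so that no single logit can grow without bound:

# Generic logit softcapping sketch; the cap value used by Curious is an assumption.
import torch

def softcap(logits: torch.Tensor, cap: float = 30.0) -> torch.Tensor:
    # Smoothly bounds every logit to (-cap, cap), which tends to stabilize generation.
    return cap * torch.tanh(logits / cap)

logits = torch.tensor([1.5, 12.0, 250.0])
print(softcap(logits))  # small logits pass through almost unchanged; large ones saturate near 30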

πŸ“ˆ Performance

  • Quality: High-quality instruction-tuned responses
  • Speed: Optimized for efficient inference
  • Memory: ~5 GB model size (see the rough estimate after this list)
  • Hardware: GPU recommended for best performance
  • Context: 8K token context window
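
A back-of-the-envelope check on the memory figure, assuming the ~2.6B parameter count from the specifications table: the ~5 GB number corresponds to storing the weights in 16-bit precision, while full float32 roughly doubles it.

# Rough size of the weights alone (no KV cache or activations).
params = 2.6e9  # ~2.6B parameters per the specifications table
print(f"float32:   {params * 4 / 1e9:.1f} GB")  # ~10.4 GB
print(f"bf16/fp16: {params * 2 / 1e9:.1f} GB")  # ~5.2 GB, matching the ~5 GB figure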

πŸ”§ Technical Details

Model Configuration

{
  "architectures": ["CuriousForCausalLM"],
  "model_type": "curious_text",
  "hidden_size": 2304,
  "num_attention_heads": 8,
  "num_hidden_layers": 26,
  "max_position_embeddings": 8192,
  "curious_version": "2.0",
  "curious_instruction_tuned": true
}

Generation Parameters
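
The shipped generation defaults are not listed in this card; the values below are illustrative settings commonly used with instruction-tuned chat models of this size, not Adam's confirmed configuration.

# Illustrative generation settings (assumed, not the model's published defaults).
generation_kwargs = dict(
    max_new_tokens=512,      # upper bound on reply length
    do_sample=True,          # sample instead of greedy decoding for more natural chat
    temperature=0.7,         # lower values make replies more deterministic
    top_p=0.9,               # nucleus sampling
    repetition_penalty=1.1,  # discourages repetitive loops in long replies
)

# outputs = model.generate(**inputs, **generation_kwargs)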

🎨 Use Cases

  • Chatbots: Conversational AI applications
  • Assistants: Task-oriented AI helpers
  • Creative Writing: Content generation and editing
  • Education: Tutoring and explanation
  • Coding: Programming assistance
  • Research: Information synthesis and analysis

⚠️ Limitations

  • Context Length: Limited to 8K tokens
  • Training Data: Knowledge is limited by the training data cutoff date
  • Bias: May reflect biases in training data
  • Factual Accuracy: Should be verified for critical applications

πŸ™ Acknowledgments

  • Built with the Curious Architecture Framework v2.0
  • Instruction-tuned for conversational AI

Adam: The Future of Conversational AI
Built with ❀️ using the Curious Architecture Framework