---
license: apache-2.0
tags:
  - pretrained
  - base-model
language:
  - en
  - ko
  - ja
pipeline_tag: text-generation
library_name: transformers
extra_gated_fields:
  Full Name: text
  Email: text
  Organization: text
---

# Tri-7B-Base

## Introduction

We present **Tri-7B-Base**, a foundation language model that serves as the pre-trained base for our Tri-7B model family. This model represents our commitment to efficient training while establishing a strong foundation for downstream fine-tuning and adaptation.

### Key Features

* **Foundation Architecture**: Decoder-only transformer with RoPE, SwiGLU, and RMSNorm
* **Multilingual Foundation**: Pre-trained on diverse data in Korean, English, and Japanese
* **Efficient Training**: Training methodology optimized for computational efficiency

### Model Specifications

#### Tri-7B-Base

- Type: Causal Language Model
- Training Stage: Pre-training
- Architecture: Transformer Decoder with RoPE, SwiGLU, RMSNorm
- Number of Parameters: 7.76B
- Number of Layers: 32
- Number of Attention Heads: 32
- Context Length: 4,096
- Vocab Size: 128,128
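## Quickstart

The snippet below is a minimal sketch of loading the model for raw text completion with Hugging Face Transformers. The repository id `trillionlabs/Tri-7B-Base` and the sampling settings are illustrative assumptions; adjust them to your checkpoint and hardware.

```python
# Minimal text-completion sketch for a base (non-instruction-tuned) model.
# NOTE: the repository id below is an assumption; replace it if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trillionlabs/Tri-7B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~15.5 GB of weights for 7.76B parameters
    device_map="auto",
)

# A base model continues text, so phrase the input as a prefix to complete,
# not as a question or an instruction.
prompt = "The history of the Korean alphabet begins with"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,  # keep prompt + output within the 4,096-token context
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```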

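### Fine-tuning sketch

As the Use Cases below note, a base checkpoint like this one is primarily a starting point for fine-tuning. The following is a minimal sketch of causal-LM fine-tuning with the Transformers `Trainer`; the dataset (wikitext as a stand-in), hyperparameters, and output path are placeholder assumptions, not recommendations.

```python
# Minimal causal-LM fine-tuning sketch; all hyperparameters are placeholders.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "trillionlabs/Tri-7B-Base"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding is needed for batching
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Any raw-text corpus works; wikitext is used purely as a stand-in here.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="tri-7b-base-sft",  # placeholder output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # mlm=False gives the causal-LM objective: labels are the inputs,
    # shifted internally by the model.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```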
## Use Cases

As a base model, Tri-7B-Base is designed to serve as a foundation for various downstream applications:

- **Fine-tuning**: Adapt to specific domains or tasks (see the sketch under Quickstart above)
- **Instruction Tuning**: Create chat or assistant models
- **Domain Specialization**: Customize for specific industries or use cases
- **Research**: Explore model behaviors and capabilities
- **Language Generation**: General text completion and generation tasks

## Limitations

- **Base Model Nature**: This is a pre-trained base model without instruction tuning or alignment. For chat or assistant capabilities, consider fine-tuned variants.
- **Language Support**: The model is optimized for English, Korean, and Japanese. Usage with other languages may result in degraded performance.
- **Knowledge Cutoff**: The model's information is limited to data available up to February 2025.
- **Generation Quality**: As a base model, outputs may require post-processing or filtering for production use cases.

## License

This model is licensed under the Apache License 2.0.

## Contact

For inquiries, please contact info@trillionlabs.co.