SportExtract NER Model

Model Description

SportExtract is a Named Entity Recognition (NER) model fine-tuned on Indonesian sports news articles, with a focus on football (soccer) content.

Base Model: IndoBERT (indobenchmark/indobert-base-p1)

Model Type: Multi-label token classification

Entities Detected

The model detects the following entity types in Indonesian sports articles:

ATLET - Athletes/Players
TIM - Teams
ORGANISASI - Organizations
KEWARGANEGARAAN - Nationality
POSISI - Player positions
UMUR - Age
AKSI - Actions in matches
PENGHARGAAN - Awards/achievements
STATISTIK - Statistics
SKOR - Match scores
TANGGAL - Dates
STADION - Stadiums
KEJUARAAN - Tournaments/competitions
ALASAN_PERISTIWA - Event reasons/context

Usage

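import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer

# Download the checkpoint file from the Hugging Face Hub (cached locally after the first call)
model_path = hf_hub_download(
    repo_id="george121212afasf/model",
    filename="best_model.pt",
)

# Load the checkpoint onto the CPU. Its internal layout is not documented here,
# so inspect the loaded object before building a model around it
checkpoint = torch.load(model_path, map_location="cpu")

# Load the tokenizer of the IndoBERT base model
tokenizer = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")

# Your model class and inference code here
# (AutoModel is imported so that the class can build the IndoBERT encoder)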
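
Because the model type is multi-label token classification, a single token may carry more than one entity label. The sketch below shows one common way such output could be decoded: apply a sigmoid to each per-label logit and keep every label whose probability reaches a threshold. The function name `decode_multilabel`, the 0.5 threshold, the three-label subset and the logits are all illustrative assumptions; the checkpoint's actual output head and label order are not documented here.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(tokens, logits, labels, threshold=0.5):
    """Pair each token with every label whose sigmoid probability >= threshold.

    tokens: list of token strings
    logits: one row per token, one raw score per label (same order as `labels`)
    labels: label names; the real label order of the checkpoint is an assumption
    """
    decoded = []
    for token, row in zip(tokens, logits):
        active = [name for name, z in zip(labels, row) if sigmoid(z) >= threshold]
        decoded.append((token, active))
    return decoded


# Made-up example: a label subset and hand-written logits, not real model output.
labels = ["ATLET", "TIM", "KEWARGANEGARAAN"]
tokens = ["Budi", "membela", "Indonesia"]
logits = [
    [2.0, -3.0, -4.0],   # only ATLET is above the threshold
    [-2.5, -3.0, -4.0],  # no label is above the threshold
    [-1.0, 3.1, 1.2],    # both TIM and KEWARGANEGARAAN are above the threshold
]

decoded = decode_multilabel(tokens, logits, labels)
for token, active in decoded:
    print(token, active)
```

Raising the threshold trades recall for precision. Since each label is scored independently, a token such as a country name can legitimately receive two labels at once.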

Training Data

Trained on annotated Indonesian sports news articles from various sources.

Model Size

Parameters: ~125M (IndoBERT base)
File size: ~1420 MB

Intended Use

This model is designed for extracting sports-related entities from Indonesian news articles, particularly for:

Sports journalism analysis
Automated content tagging
Information extraction from sports news
5W1H (Who, What, When, Where, Why, How) analysis
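
As an illustration of the 5W1H use case, extracted entities could be grouped into question buckets after inference. The mapping below is a hypothetical assignment of entity types to buckets, not something defined by the model. The name `FIVE_W_ONE_H`, the function `group_5w1h`, the "other" bucket and the sample entities are assumptions made for this sketch.

```python
# Hypothetical mapping from entity type to 5W1H bucket (an assumption, adjust as needed).
FIVE_W_ONE_H = {
    "ATLET": "who",
    "TIM": "who",
    "ORGANISASI": "who",
    "AKSI": "what",
    "KEJUARAAN": "what",
    "TANGGAL": "when",
    "STADION": "where",
    "ALASAN_PERISTIWA": "why",
    "SKOR": "how",
    "STATISTIK": "how",
}


def group_5w1h(entities):
    """Group (text, label) pairs by 5W1H bucket.

    Labels without a mapping (for example UMUR or POSISI) go to "other".
    """
    grouped = {}
    for text, label in entities:
        bucket = FIVE_W_ONE_H.get(label, "other")
        grouped.setdefault(bucket, []).append(text)
    return grouped


# Made-up entities standing in for model predictions.
entities = [
    ("Budi", "ATLET"),
    ("Garuda FC", "TIM"),
    ("mencetak gol", "AKSI"),
    ("2-1", "SKOR"),
    ("Stadion Contoh", "STADION"),
    ("27 tahun", "UMUR"),
]

summary = group_5w1h(entities)
for bucket, texts in summary.items():
    print(bucket, texts)
```

Keeping the mapping in a plain dictionary makes it easy to move an entity type to a different bucket without touching the grouping logic.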

Limitations

Optimized for Indonesian-language sports content
Best performance on football, basketball, and badminton articles
May not generalize well to other sports domains

Contact

For questions or feedback, please open an issue in the repository.
