SportExtract NER Model

Model Description

SportExtract is a Named Entity Recognition (NER) model fine-tuned on Indonesian sports news articles, primarily football (soccer) content.

Base Model: IndoBERT (indobenchmark/indobert-base-p1)

Model Type: Multi-label token classification

Entities Detected

The model can detect the following entities in Indonesian sports articles:

  • ATLET - Athletes/Players
  • TIM - Teams
  • ORGANISASI - Organizations
  • KEWARGANEGARAAN - Nationality
  • POSISI - Player positions
  • UMUR - Age
  • AKSI - Actions in matches
  • PENGHARGAAN - Awards/achievements
  • STATISTIK - Statistics
  • SKOR - Match scores
  • TANGGAL - Dates
  • STADION - Stadiums
  • KEJUARAAN - Tournaments/competitions
  • ALASAN_PERISTIWA - Event reasons/context
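
Because the model type is multi-label, a single token can carry more than one of the entity types above. The following is a minimal sketch of how per-token scores could be turned into labels, assuming the model emits one independent logit per entity type and that a sigmoid with a 0.5 threshold is applied. The label order, the threshold, and the names `ENTITY_LABELS`, `decode_token`, and `example_logits` are illustrative assumptions, not taken from the checkpoint.

```python
import math

# Entity types listed in this model card. The index order is an assumption;
# the order used by the actual checkpoint may differ.
ENTITY_LABELS = [
    "ATLET", "TIM", "ORGANISASI", "KEWARGANEGARAAN", "POSISI", "UMUR",
    "AKSI", "PENGHARGAAN", "STATISTIK", "SKOR", "TANGGAL", "STADION",
    "KEJUARAAN", "ALASAN_PERISTIWA",
]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_token(logits, threshold=0.5):
    """Return every label whose probability reaches the threshold for one token."""
    if len(logits) != len(ENTITY_LABELS):
        raise ValueError("expected one logit per entity label")
    return [
        label
        for label, z in zip(ENTITY_LABELS, logits)
        if sigmoid(z) >= threshold
    ]


# Hypothetical logits for one token: high scores for ATLET and KEWARGANEGARAAN.
example_logits = [4.0, -3.0, -5.0, 2.5] + [-4.0] * 10
print(decode_token(example_logits))
```

Unlike a softmax over labels, this scheme lets a token such as a player name that also implies a nationality receive both tags at once.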

Usage

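import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer

# Download the checkpoint file from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="george121212afasf/model",
    filename="best_model.pt",
)

# Load the checkpoint onto the CPU.
# Note: PyTorch 2.6+ defaults to weights_only=True; if the file stores more than
# tensors, loading may require weights_only=False (only for files you trust).
checkpoint = torch.load(model_path, map_location="cpu")

# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")

# Define your model class (AutoModel is imported for that purpose), load the
# checkpoint weights into it, and run inference here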
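
The layout of `best_model.pt` is not documented here, so it helps to inspect the loaded object before building a model around it. The sketch below assumes the checkpoint is either a bare state dict or a dict that wraps one under a conventional key such as `model_state_dict` or `state_dict`. Those key names and the helper `extract_state_dict` are assumptions for illustration, not confirmed properties of this file.

```python
def extract_state_dict(checkpoint):
    """Return the parameter mapping from a loaded checkpoint object.

    Assumes the checkpoint is a dict. Tries common wrapper keys first and
    otherwise treats the dict itself as the state dict.
    """
    if not isinstance(checkpoint, dict):
        raise TypeError(
            f"expected a dict-like checkpoint, got {type(checkpoint).__name__}"
        )
    # Key names below are common conventions, not confirmed for this model.
    for key in ("model_state_dict", "state_dict"):
        candidate = checkpoint.get(key)
        if isinstance(candidate, dict):
            return candidate
    return checkpoint


# Stand-in objects with the two assumed layouts (real values would be tensors).
wrapped = {"epoch": 3, "model_state_dict": {"classifier.weight": [0.1, 0.2]}}
bare = {"classifier.weight": [0.1, 0.2]}

print(sorted(extract_state_dict(wrapped)))
print(sorted(extract_state_dict(bare)))
```

Printing the top-level keys of the real `checkpoint` first is the quickest way to find out which layout, if either, applies.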

Training Data

The model was trained on annotated Indonesian sports news articles from various sources.

Model Size

  • Parameters: ~125M (IndoBERT base)
  • File size: ~1420 MB

Intended Use

This model is designed for extracting sports-related entities from Indonesian news articles, particularly for:

  • Sports journalism analysis
  • Automated content tagging
  • Information extraction from sports news
  • 5W1H (Who, What, When, Where, Why, How) analysis
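
For the 5W1H use case, extracted entities can be grouped into question slots. The sketch below shows one possible grouping. The mapping `SLOT_BY_LABEL`, the function `group_by_5w1h`, and the sample entities are hypothetical and not part of the model, and no entity type is mapped to "how" because none in the list clearly corresponds to it.

```python
from collections import defaultdict

# Hypothetical mapping from entity types to 5W1H slots. Types not listed here
# (for example UMUR or STATISTIK) are left out of the summary.
SLOT_BY_LABEL = {
    "ATLET": "who",
    "TIM": "who",
    "ORGANISASI": "who",
    "AKSI": "what",
    "SKOR": "what",
    "KEJUARAAN": "what",
    "TANGGAL": "when",
    "STADION": "where",
    "ALASAN_PERISTIWA": "why",
}


def group_by_5w1h(entities):
    """Group (text, label) pairs into 5W1H slots, skipping unmapped labels."""
    slots = defaultdict(list)
    for text, label in entities:
        slot = SLOT_BY_LABEL.get(label)
        if slot is not None:
            slots[slot].append(text)
    return dict(slots)


# Made-up entities, as they might come out of an NER pass over one article.
example_entities = [
    ("Persija", "TIM"),
    ("2-1", "SKOR"),
    ("Stadion Gelora Bung Karno", "STADION"),
    ("27 tahun", "UMUR"),
]
print(group_by_5w1h(example_entities))
```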

Limitations

  • Optimized for Indonesian-language sports content
  • Best performance on football, basketball, and badminton articles
  • May not generalize well to other sports domains

Contact

For questions or feedback, please open an issue in the repository.
