Semantic ID Recommender - Qwen3 8B (Video Games)

Model Description

This is a Qwen3 8B model fine-tuned for video game product recommendation using semantic IDs. The model has been trained to understand and generate hierarchical semantic identifiers that encode product relationships, enabling generative retrieval for recommendation systems.

See writeup and demo here: https://eugeneyan.com/writing/semantic-ids/

What are Semantic IDs?

Semantic IDs are learned hierarchical representations that encode product similarities and relationships in their structure. Unlike traditional IDs, semantic IDs carry meaning - similar products have similar ID prefixes.
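
For example, two closely related products might differ only in the last level of their IDs (the codes below are made up for illustration; the token format is described in the next section):

<|sid_start|><|sid_12|><|sid_87|><|sid_300|><|sid_768|><|sid_end|>
<|sid_start|><|sid_12|><|sid_87|><|sid_301|><|sid_768|><|sid_end|>

The two IDs share their first three levels, so the corresponding products sit in the same fine-grained neighborhood of the catalog.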

Special Tokens

The model uses special tokens to work with semantic IDs:

  • <|sid_start|>: Marks the beginning of a semantic ID
  • <|sid_X|>: Hierarchical level tokens where X ∈ [0, 1023]
  • <|sid_end|>: Marks the end of a semantic ID
  • <|rec|>: Trigger token for generating recommendations

Semantic ID Format

<|sid_start|><|sid_127|><|sid_45|><|sid_89|><|sid_12|><|sid_end|>

This represents a 4-level hierarchy where each level provides increasingly specific categorization.
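
If you need to build ID strings programmatically, a small helper like the one below works (the level codes here are arbitrary and only illustrate the format):

def make_semantic_id(levels):
    """Build a semantic ID string from a list of level codes in [0, 1023]."""
    return "<|sid_start|>" + "".join(f"<|sid_{code}|>" for code in levels) + "<|sid_end|>"

print(make_semantic_id([127, 45, 89, 12]))
# <|sid_start|><|sid_127|><|sid_45|><|sid_89|><|sid_12|><|sid_end|>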

Training Details

  • Base Model: Qwen3 8B
  • Fine-tuning Method: Supervised Fine-Tuning (SFT)
  • Dataset: Amazon Video Games reviews and metadata
  • Number of Products: 66,097
  • Training Epochs: 2
  • Task: Next item prediction and recommendation generation

Usage

Installation

pip install transformers torch datasets

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "eugeneyan/semantic-id-qwen3-8b-video-games"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Set padding for generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Generate recommendations
prompt = "User: <|sid_start|><|sid_8|><|sid_454|><|sid_630|><|sid_768|><|sid_end|>\n<|rec|>"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.3,
        top_p=0.7,
        top_k=20,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode only the generated portion
input_length = inputs["input_ids"].shape[1]
generated_tokens = outputs[:, input_length:]
response = tokenizer.decode(generated_tokens[0], skip_special_tokens=False)
print(response)

Advanced: Mapping Semantic IDs to Product Titles

from datasets import load_dataset
import pandas as pd
import re
from typing import List

# Load mapping dataset
dataset = load_dataset("eugeneyan/video-games-semantic-ids-mapping")
mapping_df = dataset['train'].to_pandas()

def parse_semantic_id(semantic_id: str) -> List[str]:
    """Parse semantic ID into component levels"""
    sid = semantic_id.replace("<|sid_start|>", "").replace("<|sid_end|>", "")
    pattern = r"<\|sid_\d+\|>"
    return re.findall(pattern, sid)

def map_semantic_id_to_titles(semantic_id_str: str, mapping_df: pd.DataFrame) -> dict:
    """
    Map semantic ID to titles with exact match and fallback.
    Returns dict with match_level, titles, count, and match_type.
    """
    levels = parse_semantic_id(semantic_id_str)

    if not levels:
        return {"match_level": 0, "titles": [], "count": 0, "match_type": "none"}

    # Try exact match first
    exact_matches = mapping_df[mapping_df["semantic_id"] == semantic_id_str]
    if len(exact_matches) > 0:
        titles = exact_matches["title"].tolist()
        return {"match_level": 4, "titles": titles, "count": len(titles), "match_type": "exact"}

    # Fallback to prefix matching
    for depth in range(min(3, len(levels)), 0, -1):
        prefix = "<|sid_start|>" + "".join(levels[:depth])
        matches = mapping_df[mapping_df["semantic_id"].str.startswith(prefix)]

        if len(matches) > 0:
            titles = matches["title"].tolist()
            return {
                "match_level": depth,
                "titles": titles[:5],
                "count": len(titles),
                "match_type": "prefix"
            }

    return {"match_level": 0, "titles": [], "count": 0, "match_type": "none"}

def extract_and_replace_semantic_ids(text: str, mapping_df: pd.DataFrame) -> str:
    """Replace all semantic IDs in text with product titles"""
    pattern = r"<\|sid_start\|>(?:<\|sid_\d+\|>)+<\|sid_end\|>"
    semantic_ids = re.findall(pattern, text)

    result = text
    for sid in semantic_ids:
        match_result = map_semantic_id_to_titles(sid, mapping_df)
        if match_result["count"] > 0:
            title = match_result["titles"][0]
            replacement = f'"{title}"'
            if match_result["match_type"] == "prefix":
                replacement += f' (L{match_result["match_level"]} match)'
            if match_result["count"] > 1:
                replacement += f' [+{match_result["count"]-1} similar]'
        else:
            replacement = "[Unknown Item]"
        result = result.replace(sid, replacement)

    return result
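
A typical use is to post-process a raw generation before showing it to a user. For example, with the recommendation string from the examples below (the exact title depends on the mapping dataset):

# Replace semantic IDs in a raw model output with product titles
raw_response = "<|sid_start|><|sid_205|><|sid_407|><|sid_586|><|sid_768|><|sid_end|>"
print(extract_and_replace_semantic_ids(raw_response, mapping_df))
# e.g. "Assassin's Creed 2 Deluxe Edition [Download]"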

Example Interactions

Single Item Recommendation

# Provide input of user past interactions and get a recommendation (uses the chat() helper defined under Multi-Turn Conversations below)
INPUT = """User: <|sid_start|><|sid_8|><|sid_454|><|sid_630|><|sid_768|><|sid_end|>, <|sid_start|><|sid_126|><|sid_501|><|sid_553|><|sid_768|><|sid_end|>, <|sid_start|><|sid_205|><|sid_370|><|sid_548|><|sid_768|><|sid_end|>
<|rec|>""".strip()
response = chat(INPUT)

# Output: Recommended product
<|sid_start|><|sid_205|><|sid_407|><|sid_586|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "Assassin's Creed 2 Deluxe Edition [Download]"
# Provide input of single past item and get similar item
INPUT = """Customers who bought <|sid_start|><|sid_201|><|sid_311|><|sid_758|><|sid_768|><|sid_end|> also bought:
<|rec|>""".strip()
response = chat(INPUT)

# Output: Recommended product
<|sid_start|><|sid_201|><|sid_396|><|sid_608|><|sid_769|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "The Legend of Zelda: Ocarina of Time 3D"

Natural Language with Semantic IDs

# Input: Natural language context
# Provide natural language chat input and get item recommendations
INPUT = """I like scifi and action games.
<|rec|>""".strip()
response = chat(INPUT)

# Output: Multiple relevant products
<|sid_start|><|sid_64|><|sid_313|><|sid_637|><|sid_768|><|sid_end|>, <|sid_start|><|sid_219|><|sid_463|><|sid_660|><|sid_768|><|sid_end|>, <|sid_start|><|sid_64|><|sid_313|><|sid_608|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "Halo 3 Limited Edition -Xbox 360", "Battlefield: Bad Company - Playstation 3", "Halo Reach - Limited Edition -Xbox 360"

Attribute-Steered Recommendations

# Steering recommendations given an item and attribute (Xbox)
INPUT = """Recommend Xbox games similar to <|sid_start|><|sid_201|><|sid_396|><|sid_608|><|sid_769|><|sid_end|>:
<|rec|>""".strip()
response = chat(INPUT)

# Output: Xbox-specific recommendations
<|sid_start|><|sid_64|><|sid_271|><|sid_576|><|sid_768|><|sid_end|>, <|sid_start|><|sid_64|><|sid_400|><|sid_594|><|sid_768|><|sid_end|>, <|sid_start|><|sid_167|><|sid_271|><|sid_578|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "Fallout: New Vegas - Xbox 360 Ultimate Edition", "Tales of Vesperia - Xbox 360", "Halo Reach - Legendary Edition
# Provide natural language chat input and get item recommendations
INPUT = """I like animal and cute games.
<|rec|>""".strip()
response = chat(INPUT)

# Output: Games matching the genre preference
<|sid_start|><|sid_173|><|sid_324|><|sid_764|><|sid_768|><|sid_end|>, <|sid_start|><|sid_201|><|sid_397|><|sid_738|><|sid_769|><|sid_end|>, <|sid_start|><|sid_173|><|sid_305|><|sid_670|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "Animal Crossing: New Leaf", "Disney Magical World - Nintendo 3DS", "Nintendogs + Cats: Golden Retriever and New Friends"

Explanatory Recommendations

# Provide item to get recommendation and explanation
INPUT = """I just finished <|sid_start|><|sid_125|><|sid_417|><|sid_656|><|sid_768|><|sid_end|>. Suggest another <|rec|> and explain why:""".strip()
response = chat(INPUT)

# Output: Recommendation with natural language explanation
<|sid_start|><|sid_139|><|sid_289|><|sid_534|><|sid_768|><|sid_end|>

If you liked Dragon Quest Heroes II, you might like Nights of Azure because both are action RPGs for the PlayStation 4 with a focus on combat and character progression. Both games offer a narrative-driven experience with a strong emphasis on combat mechanics, suggesting a shared appeal for players who enjoy this genre on the platform.<|im_end|>

# Output mapped
ASSISTANT: "Nights of Azure - PlayStation 4"

If you liked Dragon Quest Heroes II, you might like Nights of Azure because both are action RPGs for the PlayStation 4 with a focus on combat and character progression. Both games offer a narrative-driven experience with a strong emphasis on combat mechanics, suggesting a shared appeal for players who enjoy this genre on the platform.

Multi-Turn Conversations

The model supports multi-turn conversations with context preservation:

from transformers import TextStreamer

# Running conversation history; pass new_convo=False to keep prior turns as context
messages: list = []

def chat(text_input: str, new_convo: bool = True, stream: bool = True) -> str:
    """Interactive chat with the model; set new_convo=False to continue the previous conversation"""
    global messages
    if new_convo:
        messages = []

    messages.append({"role": "user", "content": text_input})

    # Apply chat template
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    # Stream output for better UX
    streamer = TextStreamer(tokenizer, skip_prompt=True) if stream else None

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=512,
            temperature=0.3,
            top_p=0.7,
            top_k=20,
            do_sample=True,
            streamer=streamer
        )

    # Extract only new tokens
    input_length = inputs["input_ids"].shape[1]
    generated = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)

    messages.append({"role": "assistant", "content": generated})
    return generated

# 1st turn: Ask for games similar to Mario Kart
INPUT = "I'm looking for games similar to Mario Kart. <|rec|>"
response1 = chat(INPUT)

# Output
<|sid_start|><|sid_131|><|sid_492|><|sid_639|><|sid_768|><|sid_end|>, <|sid_start|><|sid_145|><|sid_480|><|sid_617|><|sid_768|><|sid_end|>, <|sid_start|><|sid_145|><|sid_290|><|sid_620|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "CTR: Crash Team Racing", "Crazy Taxi 2 - Sega Dreamcast", "Mario Kart: Super Circuit"

# 2nd turn: Tweak it for Xbox
INPUT = "How about something similar but for Xbox? <|rec|>"
response2 = chat(INPUT, new_convo=False)

# Output
<|sid_start|><|sid_183|><|sid_461|><|sid_517|><|sid_768|><|sid_end|>, <|sid_start|><|sid_183|><|sid_313|><|sid_679|><|sid_769|><|sid_end|>, <|sid_start|><|sid_183|><|sid_313|><|sid_605|><|sid_768|><|sid_end|><|im_end|>

# Output mapped
ASSISTANT: "Need for Speed Carbon - Xbox 360", "Forza Motorsport 2 - Xbox 360", "NASCAR '14 - Xbox 360"

# 3rd turn: Ask for bundle name
INPUT = "Suggest a name and description for the bundle"
response3 = chat(INPUT, new_convo=False)

# Output
ASSISTANT: Xbox Racing Legends: NASCAR & Forza Collection

Performance

  • Model Size: ~16GB
  • Inference: Requires a GPU with at least 20GB of VRAM for 16-bit (bf16/fp16) inference
  • Quantization: Can run on 12GB of VRAM with 8-bit quantization (see the sketch after this list)
  • CPU Inference: Possible but slow; use MPS on Apple Silicon for better performance
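
A minimal sketch of 8-bit loading, assuming the bitsandbytes package is installed (the exact VRAM footprint depends on your setup):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "eugeneyan/semantic-id-qwen3-8b-video-games",
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,
)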

Category Information

This model is specifically trained for Video Games products:

  • Total products: 66,097
  • Hierarchy levels: 4
  • Tokens per level: 1024
  • Semantic similarity encoded by shared prefix depth: the more leading levels two IDs share, the more similar the products
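
As a rough sense of scale (simple arithmetic, not a measured property of the model): four levels with 1,024 codes each give 1024^4 ≈ 1.1 × 10^12 possible IDs, so the 66,097 products occupy only a tiny fraction of the available code space.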

Limitations

  • Trained specifically on video game products
  • Semantic IDs are fixed from training time
  • Requires mapping dataset to interpret semantic IDs
  • Performance may degrade on products very different from training data
  • May occasionally generate invalid semantic IDs (these can be filtered post-generation; see the sketch after this list)
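
A minimal post-generation filter, assuming the mapping dataset from the Advanced section above is loaded as mapping_df, is to keep only generated IDs that correspond to a known product:

import re

def filter_valid_semantic_ids(generated_text: str, mapping_df) -> list:
    """Keep only the generated semantic IDs that exist in the mapping dataset."""
    pattern = r"<\|sid_start\|>(?:<\|sid_\d+\|>)+<\|sid_end\|>"
    candidates = re.findall(pattern, generated_text)
    known_ids = set(mapping_df["semantic_id"])
    return [sid for sid in candidates if sid in known_ids]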

Citation

If you use this model, please cite:

@misc{semantic_id_qwen3_8b_video_games,
  author = {Eugene Yan},
  title = {Semantic ID Recommender - Qwen3 8B (Video Games)},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/eugeneyan/semantic-id-qwen3-8b-video-games}
}
