IncomeNet-13.1k-Embeddings-log
This model is a Multi-Layer Perceptron (MLP) that uses categorical embeddings to classify income levels (>50K or <=50K) from census data. It is a more expressive and robust alternative to the standard one-hot-encoded baseline models.
Model Description
- Architecture: MLP with specialized Embedding layers for categorical features (13.1k parameters).
- Hidden Layers: [128, 64] with ReLU activation.
- Key Feature: Instead of simple one-hot encoding, this model learns dense representations (embeddings) for categorical variables.
- Preprocessing: Includes Log-transformation for numerical stability and Standard Scaling.
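To make the embedding idea concrete, here is a minimal sketch of what such a model can look like in PyTorch. The class name, feature counts, and embedding-size heuristic below are illustrative assumptions; the actual definition lives in `model_architecture.py` as `IncomeNetEmbeddings`.

```python
import torch
import torch.nn as nn

class EmbeddingMLP(nn.Module):
    """Illustrative embedding-based MLP (not the exact IncomeNetEmbeddings class)."""

    def __init__(self, cat_cardinalities, num_numeric, hidden_dims=(128, 64)):
        super().__init__()
        # One embedding table per categorical feature. The embedding size
        # min(50, (cardinality + 1) // 2) is a common heuristic, assumed here.
        self.embeddings = nn.ModuleList(
            nn.Embedding(card, min(50, (card + 1) // 2))
            for card in cat_cardinalities
        )
        emb_dim = sum(e.embedding_dim for e in self.embeddings)
        dims = [emb_dim + num_numeric, *hidden_dims]
        layers = []
        for d_in, d_out in zip(dims, dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.ReLU()]
        layers.append(nn.Linear(dims[-1], 1))  # single logit: income >50K
        self.net = nn.Sequential(*layers)

    def forward(self, x_cat, x_num):
        # x_cat: (batch, n_cat) integer category codes
        # x_num: (batch, n_num) log-transformed, standardized floats
        embs = [emb(x_cat[:, i]) for i, emb in enumerate(self.embeddings)]
        return self.net(torch.cat(embs + [x_num], dim=1)).squeeze(-1)

# Toy forward pass: 3 categorical features, 4 numeric features, batch of 2.
model = EmbeddingMLP(cat_cardinalities=[9, 16, 7], num_numeric=4)
logits = model(torch.zeros(2, 3, dtype=torch.long), torch.zeros(2, 4))
```

Unlike one-hot encoding, each `nn.Embedding` table is trained jointly with the network, so categories that behave similarly end up with similar dense vectors.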
Performance
While its raw scores are slightly below those of the base models, the Embedding-log variant offers excellent generalization:
- Accuracy: 0.806
- F1-Score: 0.768
Experimental Comparison
In the scatter plot below, this model is represented by the grey circle/square:
Superior Generalization & Training Stability
The main strength of the Embedding architecture is its stability. As seen in the evaluation loss curves, the grey line for the 13.1k-Embeddings model remains much flatter and more controlled than the base models, which tend to overfit more aggressively:
How to Use
This model requires the model_architecture.py file (specifically the IncomeNetEmbeddings class) and the corresponding preprocessor.pkl.
```python
import torch
from safetensors.torch import load_model
from model_architecture import IncomeNetEmbeddings

# Initialize the architecture with the 13.1k-parameter settings
model = IncomeNetEmbeddings(input_dim=12, hidden_dims=[128, 64])
load_model(model, "model.safetensors")
model.eval()
```
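The model card does not state how raw outputs map to labels; assuming the usual single-logit binary head, predictions can be obtained with a sigmoid and a 0.5 threshold. The `predict_income` helper below is hypothetical, not part of `model_architecture.py`:

```python
import torch

def predict_income(logits, threshold=0.5):
    """Map raw logits to class labels, assuming a single-logit binary head."""
    probs = torch.sigmoid(logits)
    return [">50K" if p >= threshold else "<=50K" for p in probs]

print(predict_income(torch.tensor([-2.0, 3.0])))  # → ['<=50K', '>50K']
```

Remember to transform inputs with the fitted `preprocessor.pkl` (log-transform plus standard scaling) before calling the model, so that features match the training distribution.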