metadata
language: en
tags:
  - vulnerability-detection
  - code-analysis
  - autoencoder
  - anomaly-detection
library_name: pytorch
metrics:
  - mse

CATastrophe - Code Vulnerability Detector

CATastrophe is an autoencoder-based vulnerability detector for Python code. Each code snippet is converted to a TF-IDF vector and passed through the autoencoder; the reconstruction error serves as an anomaly score that may indicate a vulnerability.
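As a rough illustration of the feature step, the sketch below builds TF-IDF vectors from a few code snippets. It assumes the pickled vectorizer is a scikit-learn `TfidfVectorizer` capped at 2000 features, which this card does not confirm; the `corpus` contents are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training corpus: each entry is the text of one code snippet.
corpus = [
    "import os\nos.system(user_input)",
    "import subprocess\nsubprocess.run(cmd, shell=True)",
    "query = 'SELECT * FROM users WHERE id = ' + user_id",
]

# Assumption: vocabulary capped at 2000 terms to match the model's input size.
vectorizer = TfidfVectorizer(max_features=2000)
vectorizer.fit(corpus)

# transform() returns a sparse matrix; toarray() makes it dense for PyTorch.
features = vectorizer.transform(["eval(user_input)"]).toarray()
print(features.shape)  # (1, number of learned terms), never more than 2000 columns
```

With a corpus this small the vocabulary holds far fewer than 2000 terms; a vectorizer fitted on a real training set would be expected to reach the 2000-feature cap.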

Model Details

  • Architecture: Autoencoder (Input → 512 → 128 → 512 → Input)
  • Input Features: 2000 (TF-IDF)
  • Training Loss: 0.0005
  • Framework: PyTorch
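The layer sizes above could be realized roughly as follows. This is a minimal sketch, not the contents of the repository's `model.py`: the ReLU activations and the `encoder`/`decoder` attribute names are assumptions, so the published checkpoint may not load into this class as written.

```python
import torch
from torch import nn


class Autoencoder(nn.Module):
    """Sketch of an Input -> 512 -> 128 -> 512 -> Input autoencoder."""

    def __init__(self, input_dim: int = 2000):
        super().__init__()
        # Encoder compresses the TF-IDF vector to a 128-dimensional code.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(),  # assumption: the activation is not stated in the card
            nn.Linear(512, 128),
            nn.ReLU(),
        )
        # Decoder expands the code back to the input dimension.
        self.decoder = nn.Sequential(
            nn.Linear(128, 512),
            nn.ReLU(),
            nn.Linear(512, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


model = Autoencoder(input_dim=2000)
output = model(torch.zeros(4, 2000))
print(output.shape)  # torch.Size([4, 2000])
```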

Usage

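import torch
import pickle
from model import Autoencoder  # requires a model.py defining Autoencoder on the import path

# Load model weights (map_location lets a GPU-saved checkpoint load on CPU)
model = Autoencoder(input_dim=2000)
model.load_state_dict(torch.load('catastrophe_model.pth', map_location='cpu'))
model.eval()  # switch to inference mode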

# Load vectorizer
with open('vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

# Analyze code
code_text = "your code here"
features = vectorizer.transform([code_text]).toarray()
features_tensor = torch.tensor(features, dtype=torch.float32)

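# Reconstruct the input; the per-sample mean squared error is the anomaly score
with torch.no_grad():
    reconstructed = model(features_tensor)
    anomaly_score = torch.mean((features_tensor - reconstructed) ** 2, dim=1)

print(f"Anomaly score: {anomaly_score.item():.6f}")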

Training Configuration

  • Batch Size: 256
  • Epochs: 50
  • Learning Rate: 0.001
  • Optimizer: Adam
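A training loop using these settings might look like the sketch below. The function name `train_autoencoder`, the use of `nn.MSELoss`, and the shuffling are assumptions made for illustration; the card only states batch size, epochs, learning rate, and optimizer. The toy model and random data exist solely to show the call.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_autoencoder(model, features, epochs=50, batch_size=256, lr=0.001):
    """Train `model` to reconstruct `features`; return the last epoch's mean loss."""
    loader = DataLoader(TensorDataset(features), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # assumption: MSE, matching the card's listed metric
    model.train()
    epoch_loss = float("nan")
    for _ in range(epochs):
        total, count = 0.0, 0
        for (batch,) in loader:
            optimizer.zero_grad()
            loss = criterion(model(batch), batch)  # target is the input itself
            loss.backward()
            optimizer.step()
            total += loss.item() * batch.size(0)
            count += batch.size(0)
        epoch_loss = total / count
    return epoch_loss


# Tiny stand-in model and random data, only to demonstrate the call.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 20))
toy_features = torch.rand(64, 20)
final_loss = train_autoencoder(toy_model, toy_features, epochs=2)
print(f"final mean MSE: {final_loss:.4f}")
```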

Limitations

The model is trained on vulnerable commits only and uses reconstruction error as its anomaly score. High scores indicate potential vulnerabilities, but no decision threshold is provided, so scores are best used to prioritize code for manual review.
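One way to turn raw scores into a review queue is to compare them against a quantile of scores from code that has already been reviewed. The sketch below is hypothetical: the helper `flag_for_review`, the 0.95 quantile, and the numbers are illustrative choices, not part of this model.

```python
import torch


def flag_for_review(scores, reference_scores, quantile=0.95):
    """Return a boolean mask of scores above a quantile of the reference scores."""
    threshold = torch.quantile(reference_scores, quantile)
    return scores > threshold


# Made-up numbers for illustration only.
reference_scores = torch.tensor([0.0004, 0.0005, 0.0006, 0.0005, 0.0007])
new_scores = torch.tensor([0.0005, 0.0030])
flags = flag_for_review(new_scores, reference_scores)
print(flags)
```

Here only the second score exceeds the reference threshold, so only that snippet would be queued for manual review.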