metadata
language: en
tags:
- vulnerability-detection
- code-analysis
- autoencoder
- anomaly-detection
library_name: pytorch
metrics:
- mse
CATastrophe - Code Vulnerability Detector
This model is an autoencoder-based vulnerability detector for Python code. It uses TF-IDF vectorization and an autoencoder architecture to detect anomalies in code that may indicate vulnerabilities.
Model Details
- Architecture: Autoencoder (Input → 512 → 128 → 512 → Input)
- Input Features: 2000 (TF-IDF)
- Training Loss: 0.0005 (MSE)
- Framework: PyTorch
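The layer sizes listed above can be sketched in PyTorch as follows. This is a minimal illustration, not the released `model.py`: the ReLU activations, the lack of an output activation, and the attribute names `encoder` and `decoder` are assumptions, so the `state_dict` keys of this sketch may not match `catastrophe_model.pth`.

```python
import torch
from torch import nn


class Autoencoder(nn.Module):
    """Sketch of the Input -> 512 -> 128 -> 512 -> Input layout."""

    def __init__(self, input_dim: int = 2000):
        super().__init__()
        # Encoder: compress the TF-IDF vector down to a 128-dim bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(),  # assumption: the card does not state the activation
            nn.Linear(512, 128),
            nn.ReLU(),
        )
        # Decoder: mirror the encoder back up to the input width.
        self.decoder = nn.Sequential(
            nn.Linear(128, 512),
            nn.ReLU(),
            nn.Linear(512, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


# Shape check on random data: the output width equals the input width.
sketch_model = Autoencoder(input_dim=2000)
sketch_batch = torch.rand(4, 2000)
print(sketch_model(sketch_batch).shape)  # torch.Size([4, 2000])
```

The 128-unit bottleneck is what forces the network to learn a compressed representation; inputs that do not fit that representation come back with a larger reconstruction error.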
Usage
import torch
import pickle
from model import Autoencoder

# Load model weights (map_location lets a GPU-saved checkpoint load on CPU)
model = Autoencoder(input_dim=2000)
model.load_state_dict(torch.load('catastrophe_model.pth', map_location='cpu'))
model.eval()

# Load the fitted TF-IDF vectorizer (only unpickle files from a trusted source)
with open('vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

# Analyze code
code_text = "your code here"
features = vectorizer.transform([code_text]).toarray()
features_tensor = torch.tensor(features, dtype=torch.float32)

with torch.no_grad():
    reconstructed = model(features_tensor)
    # Per-sample reconstruction error (MSE); higher means more anomalous
    anomaly_score = torch.mean((features_tensor - reconstructed) ** 2, dim=1)
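To make the scoring step concrete, the sketch below applies the same per-sample mean squared error to hand-written tensors instead of real TF-IDF features. The helper name `reconstruction_error` and the toy values are illustrative only and are not part of the released code.

```python
import torch


def reconstruction_error(features: torch.Tensor, reconstructed: torch.Tensor) -> torch.Tensor:
    """Per-sample mean squared error: one score per input row."""
    return torch.mean((features - reconstructed) ** 2, dim=1)


# Two toy "feature vectors" of width 4 (the real model uses 2000).
features = torch.tensor([[1.0, 0.0, 0.0, 0.0],
                         [0.0, 2.0, 0.0, 0.0]])
# Pretend reconstruction: the first row is reproduced exactly,
# the second row comes back as all zeros.
reconstructed = torch.tensor([[1.0, 0.0, 0.0, 0.0],
                              [0.0, 0.0, 0.0, 0.0]])

scores = reconstruction_error(features, reconstructed)
print(scores)  # tensor([0., 1.])

# Order samples from highest to lowest score, e.g. to prioritize manual review.
ranking = torch.argsort(scores, descending=True)
print(ranking)  # tensor([1, 0])
```

Because the mean is taken over `dim=1`, a batch of N snippets yields N scores, so several snippets can be vectorized and scored in one forward pass.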
Training Configuration
- Batch Size: 256
- Epochs: 50
- Learning Rate: 0.001
- Optimizer: Adam
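These settings correspond to a training loop roughly like the one below. It is a sketch under stated assumptions, not the author's training script: the MSE reconstruction loss is inferred from the `mse` metric, the function name `train_autoencoder` is hypothetical, and a small stand-in network with random data replaces the real model and TF-IDF matrix.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_autoencoder(model, features, epochs=50, batch_size=256, lr=0.001):
    """Train `model` to reconstruct `features`; return the mean loss per epoch."""
    loader = DataLoader(TensorDataset(features), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # assumption: reconstruction loss is MSE
    history = []
    model.train()
    for _ in range(epochs):
        total = 0.0
        for (batch,) in loader:
            optimizer.zero_grad()
            loss = criterion(model(batch), batch)  # target is the input itself
            loss.backward()
            optimizer.step()
            total += loss.item() * batch.size(0)
        history.append(total / len(features))
    return history


# Stand-in model and data, only to show the call; not the released network.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 20))
toy_features = torch.rand(512, 20)
losses = train_autoencoder(toy_model, toy_features, epochs=3)
print(len(losses))  # 3
```

The defaults mirror the listed configuration (batch size 256, 50 epochs, learning rate 0.001, Adam); the toy call overrides `epochs` only to keep the example fast.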
Limitations
The model is trained only on vulnerable commits and uses per-sample reconstruction error as its anomaly score. High scores indicate potential vulnerabilities, not confirmed ones, so flagged code should be reviewed manually.