
# 💳 Credit Card Fraud Detection - HF Space (Calibrated RF Model)

This is an interactive **Gradio demo** of a calibrated Random Forest model for credit card fraud detection.  
The model was trained on the [Kaggle Credit Card Fraud dataset](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud).  
Its probabilities are calibrated, so a decision threshold can be chosen to match a business target such as a minimum precision.

---

## 🚀 How to Use
1. **Upload your CSV** with transaction rows.  
   - Required columns: `V1` … `V28`, `Amount`  
   - Either include engineered features, or just add `Time` (seconds from start)  
     → the app will automatically derive:
     - `_log_amount`
     - `Hour_from_start_mod24`
     - `is_night_proxy`
     - `is_business_hours_proxy`

2. **Adjust the decision threshold** with the slider.  
   - Default is set to the validation threshold for **Precision ≥90%** (`≈0.712`).  
   - Move it left/right to trade off between precision and recall.

3. **Preview results** (first 50 rows) or enable **Return all rows** for the full file.  
   - Each row includes:
     - `Fraud_Probability`
     - `Prediction (0 = normal, 1 = fraud)`

4. **Download results** as `predictions.csv`.
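
Steps 1 and 2 can be sketched in plain Python. The formulas are assumptions for illustration: this README names the derived columns but does not define them, so the `log1p` transform, the 22:00 to 06:00 night window, and the 09:00 to 17:00 business window are guesses, and `derive_features` and `predict_label` are hypothetical helper names. Only the default threshold of about 0.712 comes from the text above.

```python
import math


def derive_features(row):
    """Return a copy of `row` with the four engineered columns added.

    ASSUMPTION: these definitions are illustrative guesses, not the
    app's confirmed feature logic.
    """
    out = dict(row)
    # Time is seconds from the start of the dataset; fold it into an hour of day.
    hour = int(float(row["Time"]) // 3600) % 24
    out["_log_amount"] = math.log1p(float(row["Amount"]))
    out["Hour_from_start_mod24"] = hour
    out["is_night_proxy"] = int(hour >= 22 or hour < 6)      # assumed window
    out["is_business_hours_proxy"] = int(9 <= hour < 17)     # assumed window
    return out


def predict_label(fraud_probability, threshold=0.712):
    """Map a calibrated probability to 0 (normal) or 1 (fraud)."""
    return int(fraud_probability >= threshold)
```

Raising the threshold flags fewer rows (higher precision, lower recall); lowering it does the opposite.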

---

## 🧪 Try with Example Data
You don't need to bring your own data to test the app!  
Just click **Use Example** inside the app, and it will load the included `example_transactions.csv`.

This file mimics the required structure:
- 60 transactions
- Columns: `V1..V28`, `Amount`, `Time`
- Probabilities + predictions are generated live with the same calibrated RF model.
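
Before uploading your own file, it can help to check that its header matches this structure. The sketch below is a hypothetical pre-flight check, not part of the app: `missing_columns` and `header_of` are made-up names, and the rule "either `Time` or all four engineered columns" is my reading of the requirements in How to Use.

```python
import csv
import io

BASE_COLUMNS = [f"V{i}" for i in range(1, 29)] + ["Amount"]
ENGINEERED_COLUMNS = [
    "_log_amount",
    "Hour_from_start_mod24",
    "is_night_proxy",
    "is_business_hours_proxy",
]


def header_of(csv_text):
    """Return the first row of a CSV given as text."""
    return next(csv.reader(io.StringIO(csv_text)))


def missing_columns(header):
    """List what a header lacks; an empty list means it looks usable."""
    cols = set(header)
    missing = [c for c in BASE_COLUMNS if c not in cols]
    has_engineered = all(c in cols for c in ENGINEERED_COLUMNS)
    # ASSUMPTION: Time is only needed when the engineered columns are absent.
    if not has_engineered and "Time" not in cols:
        missing.append("Time (or all engineered features)")
    return missing
```

For a file on disk, read its text first and call `missing_columns(header_of(text))`.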

---

## 📊 Notes
- The model is calibrated with **Isotonic Regression** for probability reliability.  
- Default threshold corresponds to **Precision ≥90%**, aligning with fraud detection team priorities.  
- For production use, re-tune the threshold regularly, since data drift changes fraud prevalence and error costs.
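
As a rough illustration of re-tuning, the sketch below picks the lowest threshold that reaches a target precision on a labelled validation set. Because recall can only fall as the threshold rises, the lowest qualifying threshold also keeps recall as high as possible. This is an assumed procedure, not necessarily how the default of about 0.712 was obtained, and `threshold_for_precision` is a hypothetical name.

```python
def threshold_for_precision(y_true, scores, target=0.90):
    """Return the smallest score threshold whose precision meets `target`.

    y_true: iterable of 0/1 labels (1 = fraud).
    scores: calibrated fraud probabilities, same order as y_true.
    Returns None when no threshold reaches the target.
    """
    pairs = list(zip(y_true, scores))
    for t in sorted(set(scores)):
        # Rows with score >= t are predicted as fraud.
        tp = sum(1 for y, s in pairs if s >= t and y == 1)
        fp = sum(1 for y, s in pairs if s >= t and y == 0)
        if tp + fp > 0 and tp / (tp + fp) >= target:
            return t
    return None
```

The loop is quadratic in the number of rows, which is fine for a small validation sample; a sorted cumulative count would be the natural replacement for large sets.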

---

## 🔗 Related
- [Model repo on Hugging Face Hub](https://huggingface.co/TarekMasryo/CreditCard-fraud-detection-ML)  
- [Original Kaggle dataset](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud)