---
license: mit
datasets:
- PolyAI/banking77
language:
- en
tags:
- autoencoder
---

# VAE trained on Banking 77 Open Intent Classification Dataset

This is a Variational Autoencoder (VAE) trained on the [PolyAI/banking77](https://huggingface.co/datasets/PolyAI/banking77) dataset.

### Architecture

- **input_dim**: 768
- **hidden_dim**: 256
- **latent_dim**: 64

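#### Encoder

The encoder maps the input to a distribution over the latent space, parameterized by a mean (`mu`) and a log-variance (`logvar`).

```python
from torch import nn

# Dimensions as listed under Architecture
input_dim, hidden_dim, latent_dim = 768, 256, 64

# Shared hidden layer
encoder = nn.Sequential(
    nn.Linear(input_dim, hidden_dim),
    nn.ReLU()
)

# Two linear heads on top of the hidden layer
mu = nn.Linear(hidden_dim, latent_dim)
logvar = nn.Linear(hidden_dim, latent_dim)
```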
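
How the two heads turn into a latent sample is not shown in this card. The following is a minimal sketch, assuming the standard reparameterization trick; the names `mu_head`, `logvar_head`, `encode`, and `reparameterize`, as well as the random stand-in batch, are illustrative assumptions rather than the author's code.

```python
import torch
from torch import nn

input_dim, hidden_dim, latent_dim = 768, 256, 64

# Same layout as the encoder above; the heads are renamed here (assumption)
# so that `mu` and `logvar` can refer to their outputs.
encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
mu_head = nn.Linear(hidden_dim, latent_dim)
logvar_head = nn.Linear(hidden_dim, latent_dim)


def encode(x):
    """Return the mean and log-variance of the latent distribution for x."""
    h = encoder(x)
    return mu_head(h), logvar_head(h)


def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    std = torch.exp(0.5 * logvar)  # log-variance -> standard deviation
    eps = torch.randn_like(std)
    return mu + std * eps


x = torch.randn(4, input_dim)  # stand-in batch of 768-dimensional inputs
mu, logvar = encode(x)
z = reparameterize(mu, logvar)
print(z.shape)  # torch.Size([4, 64])
```

Sampling through `eps` keeps the operation differentiable with respect to `mu` and `logvar`, which is why this form is commonly used when training VAEs.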

#### Decoder

The decoder reconstructs the input from a sample drawn from the latent space.

```python
from torch import nn

# Dimensions as listed under Architecture
input_dim, hidden_dim, latent_dim = 768, 256, 64

# Maps a latent vector back to the input dimensionality
decoder = nn.Sequential(
    nn.Linear(latent_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, input_dim)
)
```
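
As a usage illustration, the sketch below passes latent vectors through a decoder of this shape. It assumes the latent vectors are drawn from the standard normal prior, and it uses randomly initialized weights, not the trained checkpoint, so the outputs only demonstrate the shapes involved.

```python
import torch
from torch import nn

input_dim, hidden_dim, latent_dim = 768, 256, 64

decoder = nn.Sequential(
    nn.Linear(latent_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, input_dim),
)

# Assumption: sample three latent vectors from N(0, I).
z = torch.randn(3, latent_dim)

# No gradients are needed for a pure forward pass.
with torch.no_grad():
    x_hat = decoder(z)

print(x_hat.shape)  # torch.Size([3, 768])
```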

#### Metrics

The model was trained and evaluated using the following metrics:

1. Training set: VAE loss, weighted as
    * 50% reconstruction loss between the original input and the reconstructed output
    * 50% KL divergence between the latent variable Z and the standard normal distribution
2. Validation set: 100% reconstruction loss, used to select the best model (the one with the lowest reconstruction loss)
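
The weighting above could be written roughly as follows. This is a sketch under stated assumptions: the card does not name the reconstruction loss, so mean squared error is assumed, and the KL term uses the closed form for a diagonal Gaussian against a standard normal, summed over latent dimensions and averaged over the batch. The function name `vae_loss` and the per-epoch validation values are hypothetical.

```python
import torch
import torch.nn.functional as F


def vae_loss(x, x_hat, mu, logvar, recon_weight=0.5, kl_weight=0.5):
    """Weighted sum of reconstruction loss and KL divergence to N(0, I)."""
    # Assumption: mean squared error as the reconstruction loss.
    recon = F.mse_loss(x_hat, x, reduction="mean")
    # Closed-form KL(N(mu, sigma^2) || N(0, I)): summed over latent
    # dimensions, then averaged over the batch.
    kl = 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1, dim=1).mean()
    return recon_weight * recon + kl_weight * kl, recon, kl


# Training objective: 50% reconstruction + 50% KL.
x = torch.zeros(2, 768)
x_hat = torch.ones(2, 768)
mu = torch.zeros(2, 64)
logvar = torch.zeros(2, 64)
loss, recon, kl = vae_loss(x, x_hat, mu, logvar)

# Model selection: keep the epoch with the lowest validation reconstruction
# loss. The values below are made up purely for illustration.
val_recon_per_epoch = [0.42, 0.31, 0.35]
best_epoch = min(range(len(val_recon_per_epoch)), key=val_recon_per_epoch.__getitem__)
```

Returning `recon` and `kl` alongside the weighted total makes it straightforward to log the training loss while selecting checkpoints on the reconstruction term alone.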