Fashion-MNIST VAE
This project implements a Variational Autoencoder (VAE) for the Fashion-MNIST dataset, showing how a generative model can synthesize new fashion items and learn meaningful latent representations.
Project Overview
The Fashion-MNIST dataset contains 70,000 grayscale images of fashion items across 10 categories. Using a VAE architecture, this project learns to encode these images into a lower-dimensional latent space and decode them back to generate new, similar fashion items.
Key Features
- Variational Autoencoder architecture with encoder and decoder networks
- Latent space visualization and interpolation
- Generation of new fashion items
- Reconstruction quality analysis
- Loss function combining reconstruction and KL divergence terms (sketched after this list)
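The notebook's exact loss is not reproduced here; the following is a minimal sketch assuming a Bernoulli reconstruction term (binary cross-entropy on pixel values in [0, 1]) and the closed-form KL divergence between a diagonal Gaussian posterior and a standard-normal prior. The names `x_hat`, `mu`, and `logvar` are illustrative:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term: how closely the decoded image matches the
    # input, summed over pixels and batch.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL term: closed-form divergence between N(mu, sigma^2) and N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Summing (rather than averaging) both terms keeps the reconstruction and KL contributions on the same scale, which is a common convention for VAEs.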
Technical Implementation
The VAE is implemented using PyTorch with the following architecture:
- Encoder: Convolutional layers that compress 28x28 images to a latent vector
- Latent Space: a 20-dimensional diagonal Gaussian, parameterized by the encoder's mean and log-variance outputs
- Decoder: Transposed convolutional layers that reconstruct images from latent vectors
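As a rough sketch of this layout (the filter counts, kernel sizes, and activations are assumptions, not the notebook's verbatim code):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Convolutional VAE for 28x28 grayscale images (illustrative sizes)."""
    def __init__(self, latent_dim=20):
        super().__init__()
        # Encoder: 1x28x28 -> 32x14x14 -> 64x7x7, then flatten.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(64 * 7 * 7, latent_dim)
        self.fc_logvar = nn.Linear(64 * 7 * 7, latent_dim)
        # Decoder: latent vector -> 64x7x7 -> 32x14x14 -> 1x28x28.
        self.fc_dec = nn.Linear(latent_dim, 64 * 7 * 7)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (64, 7, 7)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), keeping the
        # sampling step differentiable (the reparameterization trick).
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(self.fc_dec(z)), mu, logvar
```

The sigmoid output keeps pixel values in [0, 1], matching the binary cross-entropy reconstruction term sketched earlier.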
Results
The trained VAE successfully learns to:
- Reconstruct input images with high fidelity
- Generate diverse and realistic fashion items
- Create smooth interpolations between different fashion items (see the sketch after this list)
- Organize similar items in nearby regions of the latent space
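Once the model is trained, generation and interpolation come down to a few lines. This sketch reuses the hypothetical `VAE` class above and assumes `model` is a trained instance and `x1`, `x2` are two preprocessed 1x1x28x28 input tensors:

```python
import torch

model.eval()
with torch.no_grad():
    # New items: decode samples drawn from the standard-normal prior.
    z = torch.randn(16, 20)
    samples = model.decoder(model.fc_dec(z))  # 16 x 1 x 28 x 28 images

    # Interpolation: encode two images, then decode points on the
    # straight line between their latent means.
    z1 = model.fc_mu(model.encoder(x1))
    z2 = model.fc_mu(model.encoder(x2))
    steps = torch.linspace(0, 1, 8).view(-1, 1)
    frames = model.decoder(model.fc_dec((1 - steps) * z1 + steps * z2))
```

Because the KL term pulls the latent space toward a smooth standard-normal prior, points along this line decode to plausible in-between garments rather than noise.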
Technologies Used
- Python
- PyTorch
- NumPy
- Matplotlib
- Jupyter Notebook
Code and Documentation
The complete implementation is available in the Jupyter notebook, including detailed explanations of the VAE architecture, training process, and results visualization.