Fashion-MNIST VAE
This project implements a Variational Autoencoder (VAE) for the Fashion-MNIST dataset, showing how a generative model can synthesize new fashion items and learn meaningful latent representations.
Project Overview
The Fashion-MNIST dataset contains 70,000 grayscale images of fashion items across 10 categories. Using a VAE architecture, this project learns to encode these images into a lower-dimensional latent space and decode them back to generate new, similar fashion items.
Key Features
- Variational Autoencoder architecture with encoder and decoder networks
- Latent space visualization and interpolation
- Generation of new fashion items
- Reconstruction quality analysis
- Loss function combining reconstruction and KL divergence terms (sketched after this list)
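The notebook's exact loss is not reproduced here; the following is a minimal sketch assuming a Bernoulli reconstruction term (binary cross-entropy on pixel values in [0, 1]) and the closed-form KL divergence between a diagonal Gaussian posterior and a standard-normal prior. The names `x_hat`, `mu`, and `logvar` are illustrative:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term: how closely the decoded image matches the
    # input, summed over pixels and batch.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL term: closed-form divergence between N(mu, sigma^2) and N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Summing (rather than averaging) both terms keeps the reconstruction and KL contributions on the same scale, which is a common convention for VAEs.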
Technical Implementation
The VAE is implemented using PyTorch with the following architecture:
- Encoder: Convolutional layers that compress 28x28 images to a latent vector
- Latent Space: a 20-dimensional diagonal Gaussian, parameterized by the encoder's mean and log-variance outputs
- Decoder: Transposed convolutional layers that reconstruct images from latent vectors
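As a rough sketch of this layout (the filter counts, kernel sizes, and activations are assumptions, not the notebook's verbatim code):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Convolutional VAE for 28x28 grayscale images (illustrative sizes)."""
    def __init__(self, latent_dim=20):
        super().__init__()
        # Encoder: 1x28x28 -> 32x14x14 -> 64x7x7, then flatten.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(64 * 7 * 7, latent_dim)
        self.fc_logvar = nn.Linear(64 * 7 * 7, latent_dim)
        # Decoder: latent vector -> 64x7x7 -> 32x14x14 -> 1x28x28.
        self.fc_dec = nn.Linear(latent_dim, 64 * 7 * 7)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (64, 7, 7)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), keeping the
        # sampling step differentiable (the reparameterization trick).
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(self.fc_dec(z)), mu, logvar
```

The sigmoid output keeps pixel values in [0, 1], matching the binary cross-entropy reconstruction term sketched earlier.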
Results
The trained VAE successfully learns to:
- Reconstruct input images with high fidelity
- Generate diverse and realistic fashion items
- Create smooth interpolations between different fashion items (see the sketch after this list)
- Organize similar items in nearby regions of the latent space
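Once the model is trained, generation and interpolation come down to a few lines. This sketch reuses the hypothetical `VAE` class above and assumes `model` is a trained instance and `x1`, `x2` are two preprocessed 1x1x28x28 input tensors:

```python
import torch

model.eval()
with torch.no_grad():
    # New items: decode samples drawn from the standard-normal prior.
    z = torch.randn(16, 20)
    samples = model.decoder(model.fc_dec(z))  # 16 x 1 x 28 x 28 images

    # Interpolation: encode two images, then decode points on the
    # straight line between their latent means.
    z1 = model.fc_mu(model.encoder(x1))
    z2 = model.fc_mu(model.encoder(x2))
    steps = torch.linspace(0, 1, 8).view(-1, 1)
    frames = model.decoder(model.fc_dec((1 - steps) * z1 + steps * z2))
```

Because the KL term pulls the latent space toward a smooth standard-normal prior, points along this line decode to plausible in-between garments rather than noise.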
Technologies Used
- Python
- PyTorch
- NumPy
- Matplotlib
- Jupyter Notebook
Code and Documentation
The complete implementation is available in the Jupyter notebook, including detailed explanations of the VAE architecture, training process, and results visualization.