Goal: Learn the most common deep generative models and apply them to audio data.

Plan: Follow Stanford’s CS236 (Deep Generative Models) and implement each model from scratch in JAX.

Data: Million Song Dataset

Task: Audio reconstruction

Frameworks: JAX, Pyro
| Video | | Model | Papers |
|---|---|---|---|
| Autoregressive Models | - [ ] | WaveNet | - MADE - Masked Autoencoder for Audio Representation learning - WaveNet - WaveNet Autoencoder for Audio Synthesis - WaveNet Autoencoder for Representation Learning - WaveNet Autoencoder for Representation Learning II |
| VAEs | - [ ] | | |
| Normalizing Flows | - [ ] | | |
| GANs | - [ ] | | |
| EBMs | - [ ] | | |
| SBMs | - [ ] | | |
| Diffusion Models | - [ ] | | |