Goal: Learn and apply the most common deep generative models to audio data.

Plan: Follow Stanford’s CS236 Deep Generative Models course and implement each model from scratch in JAX.

Data: Million Song Dataset

Task: Audio reconstruction

Frameworks: JAX, Pyro

| Topic | Video | Model | Papers |
|---|---|---|---|
| Autoregressive Models | - [ ] | WaveNet | MADE; Masked Autoencoder for Audio Representation Learning; WaveNet; WaveNet Autoencoder for Audio Synthesis; WaveNet Autoencoder for Representation Learning; WaveNet Autoencoder for Representation Learning II |
| VAEs | - [ ] | | |
| Normalizing Flows | - [ ] | | |
| GANs | - [ ] | | |
| Energy-Based Models (EBMs) | - [ ] | | |
| Score-Based Models (SBMs) | - [ ] | | |
| Diffusion Models | - [ ] | | |
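
As a possible starting point for the WaveNet row, here is a minimal sketch of the causal dilated 1-D convolution that WaveNet-style autoregressive models are built from. The function name `causal_dilated_conv1d`, the `(batch, time, channels)` layout, and the toy kernel are illustrative assumptions, not part of the course material or of any library API. Only `jax.lax.conv_general_dilated` is an actual JAX call.

```python
import jax
import jax.numpy as jnp


def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution (hypothetical helper for illustration).

    x:      (batch, time, in_channels)
    kernel: (width, in_channels, out_channels)
    Output has the same time length as x, and output[t] depends only on
    x[t], x[t - dilation], ..., x[t - (width - 1) * dilation].
    """
    width = kernel.shape[0]
    # Pad only on the left so no output step can see future samples.
    left_pad = (width - 1) * dilation
    return jax.lax.conv_general_dilated(
        x,
        kernel,
        window_strides=(1,),
        padding=[(left_pad, 0)],
        rhs_dilation=(dilation,),
        dimension_numbers=("NWC", "WIO", "NWC"),
    )


# Toy usage: one mono signal of 16 samples, kernel width 2, dilation 2.
x = jax.random.normal(jax.random.PRNGKey(0), (1, 16, 1))
kernel = jnp.ones((2, 1, 1))  # assumed toy weights, not trained
y = causal_dilated_conv1d(x, kernel, dilation=2)

# Zeroing out samples from t = 10 onward must leave outputs before t = 10 unchanged.
x_future_zeroed = x.at[:, 10:, :].set(0.0)
y_future_zeroed = causal_dilated_conv1d(x_future_zeroed, kernel, dilation=2)
```

Stacking such layers with kernel width 2 and dilations 1, 2, 4, ..., 2^(L-1) gives a receptive field of 2^L samples, which is how this kind of model covers long audio contexts with few layers.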