Diffusion models explained. How does OpenAI's GLIDE work?

Diffusion models beat GANs in image synthesis, and GLIDE generates images from text descriptions, surpassing even DALL-E in photorealism! Check out this video to learn how diffusion models work. Enjoy the visuals!
SPONSOR: Weights & Biases 👉 https://wandb.me/ai-coffee-break

❓ Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community

Recommended videos:
📺 DALL-E video: https://youtu.be/mvG2FGF0TvM
📺 GAN explained video: https://youtu.be/_qB4B6ttXk8
📺 CLIP video: https://youtu.be/dh8Rxhf7cLU

Papers:
📜 GLIDE paper: Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. “GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models.” arXiv preprint arXiv:2112.10741 (2021). https://arxiv.org/abs/2112.10741
🔗 GLIDE mini, demo: https://huggingface.co/spaces/valhalla/glide-text2im
📜 Diffusion models for image generation: Dhariwal, Prafulla, and Alexander Nichol. “Diffusion models beat GANs on image synthesis.” Advances in Neural Information Processing Systems 34 (2021). https://arxiv.org/abs/2105.05233
📜 Original diffusion models paper: Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. “Deep unsupervised learning using nonequilibrium thermodynamics.” In International Conference on Machine Learning, pp. 2256-2265. PMLR, 2015. https://arxiv.org/abs/1503.03585
🔗 Check out this awesome blogpost by Lilian Weng: https://lilianweng.github.io/lil-log/2021/07/11/diffusion-models.html
🔗 Flow-based models: https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html
🔗 DALL-E blog post: https://openai.com/blog/dall-e/

Outline:
00:00 Diffusion models are cool
00:33 Weights & Biases (Sponsor)
01:51 4 types of generative models (in 2022)
05:13 Diffusion models explained
08:27 Why are diffusion models good at photorealism? – Diffusion models beat GANs
10:36 GLIDE explained
12:16 Classifier-guided diffusion, CLIP-guided diffusion
13:56 Classifier-free guidance

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, banana.dev — Kyle Morris, Joel Ang, Julián Salazar, Edvard Grødem

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

————————————
🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video contains the rock emoji designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0

Music 🎵: Tell Me That I Can’t (Instrumental) by NEFFEX