Machine Learning Fundamentals: Bias and Variance

Bias and Variance are two fundamental concepts for Machine Learning, and their intuition is just a little different from what you might have learned in your statistics class. Here I go through two examples that make these concepts super easy to understand.

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

If you’d like to support StatQuest, please consider…

Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF – https://statquest.gumroad.com/l/wvtmc
Paperback – https://www.amazon.com/dp/B09ZCKR4H6
Kindle eBook – https://www.amazon.com/dp/B09ZG79HXC

Patreon: https://www.patreon.com/statquest
…or…
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join

…a cool StatQuest t-shirt or sweatshirt:
https://shop.spreadshirt.com/statquest-with-josh-starmer/

…buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/

…or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer

0:00 Awesome song and introduction
0:29 The data and the “true” model
1:23 Splitting the data into training and testing sets
1:40 Least Squares Regression fit to the training data
2:16 Definition of Bias
2:33 Squiggly Line fit to the training data
3:40 Model performance with the testing dataset
4:06 Definition of Variance
5:10 Definition of Overfit

Correction:
4:06 I say that the difference in fits between the training dataset and the testing dataset is called Variance. However, I should have said that the difference is a _consequence_ of variance. Technically, variance refers to the amount by which the predictions would change if we fit the model to a different training dataset.
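
To make the technical definition in this correction concrete, here is a minimal Python sketch that refits two models to many different training datasets and measures how much their predictions at one point change. Everything in it is an assumption made for illustration and not taken from the video: the "true" curve (a sine), the noise level, the dataset sizes, and the use of a nearest-neighbor predictor as a stand-in for the Squiggly Line.

```python
import math
import random
import statistics


def true_curve(x):
    # Hypothetical "true" relationship, chosen only for illustration.
    return math.sin(x)


def fit_line(xs, ys):
    # Closed-form least squares fit of a straight line: y = slope * x + intercept.
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_nearest(xs, ys, x0):
    # Very flexible stand-in for a squiggly line: return the y value
    # of the training point whose x is closest to x0.
    _, y = min(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))
    return y


def simulate(n_sets=500, n_points=20, noise=0.3, x0=math.pi / 2, seed=0):
    # Draw many training datasets, refit both models on each one,
    # and record what each model predicts at the same point x0.
    rng = random.Random(seed)
    line_preds, nearest_preds = [], []
    for _ in range(n_sets):
        xs = [rng.uniform(0, math.pi) for _ in range(n_points)]
        ys = [true_curve(x) + rng.gauss(0, noise) for x in xs]
        slope, intercept = fit_line(xs, ys)
        line_preds.append(slope * x0 + intercept)
        nearest_preds.append(predict_nearest(xs, ys, x0))

    truth = true_curve(x0)
    # For each model: (bias, variance) of its predictions at x0.
    # Bias is the average prediction minus the true value;
    # variance is the spread of predictions across training datasets.
    return {
        "line": (statistics.mean(line_preds) - truth,
                 statistics.pvariance(line_preds)),
        "nearest": (statistics.mean(nearest_preds) - truth,
                    statistics.pvariance(nearest_preds)),
    }


if __name__ == "__main__":
    for name, (bias, variance) in simulate().items():
        print(f"{name}: bias={bias:.3f} variance={variance:.3f}")
```

Under these assumptions, the straight line should show the larger bias because it cannot bend to follow the curve, while the flexible predictor should show the larger variance because its prediction depends heavily on which training dataset it happened to see.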

#statquest #biasvariance #ML