Machine Learning Fundamentals: Bias and Variance
Bias and Variance are two fundamental concepts for Machine Learning, and their intuition is just a little different from what you might have learned in your statistics class. Here I go through two examples that make these concepts super easy to understand.
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - https://statquest.gumroad.com/l/wvtmc
Paperback - https://www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - https://www.amazon.com/dp/B09ZG79HXC
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...a cool StatQuest t-shirt or sweatshirt:
https://shop.spreadshirt.com/statquest-with-josh-starmer/
...buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer
0:00 Awesome song and introduction
0:29 The data and the "true" model
1:23 Splitting the data into training and testing sets
1:40 Linear Regression (Least Squares) fit to the training data
2:16 Definition of Bias
2:33 Squiggly Line fit to the training data
3:40 Model performance with the testing dataset
4:06 Definition of Variance
5:10 Definition of Overfit
Correction:
4:06 I say that the difference in fits between the training dataset and the testing dataset is called Variance. However, I should have said that the difference is a _consequence_ of variance. Technically, variance refers to the amount by which the predictions would change if we fit the model to a different training dataset.
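If you want to see that technical definition in action, here is a minimal sketch (not the code behind the video). It assumes a made-up "true" model (a sine curve), made-up noise, and a degree-5 polynomial standing in for the Squiggly Line; the names true_model and bias_and_variance are hypothetical. It fits each model to many different training datasets and measures how far the average prediction is from the truth (bias) and how much the predictions change from one training dataset to the next (variance).

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_model(x):
    # Hypothetical "true" relationship, chosen only for illustration.
    return np.sin(2 * np.pi * x)


def bias_and_variance(degree, n_datasets=200, n_points=30, noise_sd=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of the given degree."""
    rng = np.random.default_rng(seed)
    # Fixed points where every fitted model makes its predictions.
    x_eval = np.linspace(0.05, 0.95, 50)
    preds = np.empty((n_datasets, x_eval.size))

    for i in range(n_datasets):
        # Draw a brand new training dataset from the same "true" model plus noise.
        x_train = rng.uniform(0.0, 1.0, n_points)
        y_train = true_model(x_train) + rng.normal(0.0, noise_sd, n_points)
        # Least squares polynomial fit; degree 1 is a straight line.
        fit = Polynomial.fit(x_train, y_train, degree)
        preds[i] = fit(x_eval)

    # Bias: how far the average prediction is from the truth.
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - true_model(x_eval)) ** 2)
    # Variance: how much predictions change across training datasets.
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance


line_bias_sq, line_var = bias_and_variance(degree=1)
squiggle_bias_sq, squiggle_var = bias_and_variance(degree=5)

print(f"Straight line: bias^2 = {line_bias_sq:.4f}, variance = {line_var:.4f}")
print(f"Squiggly line: bias^2 = {squiggle_bias_sq:.4f}, variance = {squiggle_var:.4f}")
```

Under these assumptions, the straight line should show the larger bias and the flexible polynomial the larger variance. The exact numbers depend on the made-up curve, the noise level, and the random seed.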
#statquest #biasvariance #ML