The second thing I want to say in the introduction is: what is machine learning, and what is this course about? Let's talk about that briefly. If you open Wikipedia and just read the definition, the first sentence says that machine learning is the study of computer algorithms that improve automatically through experience. That sounds good. What this sentence brings to mind is something like a computer program playing chess or Go, or a self-driving car: it somehow experiences different chess games, for example, learns and improves its play, and can then beat human players. One way to think about this is to contrast it with more traditional approaches to solving this kind of problem. Traditionally, and by traditionally I mean years ago, if you wanted to program a chess engine, or if you wanted to program a computer so that you present it with an image and the algorithm says whether it's a cat or a dog, then in computer vision you would try to come up with rules that can solve the problem. Maybe a cat has whiskers and ears of a certain shape, so you would build a pattern detector that looks for these ears, and if they are in the image, then it's a cat. You put in some data and you put in some rules that you as a programmer come up with, and the software gives you the answers. In machine learning the same paradigm is turned upside down: you put in the data and you put in the goal, or the answers if you want. You give the computer an image and say "that's a cat", "here's a dog", you repeat this a million times, and hopefully afterwards your algorithm will be able to discriminate a cat from a dog without you ever specifically designing a rule that could do that. In some cases you may not even know how to distinguish a cat from a dog, or you cannot easily formulate it in words, and the computer will just learn to do it.
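To make this contrast a bit more concrete, here is a small sketch in Python. The features, thresholds, and the scikit-learn classifier are purely illustrative placeholders I am making up here, not any real cat/dog detector; the point is only where the rules come from in each paradigm.

```python
# Toy illustration of the two paradigms (features and rules are made up).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Traditional approach: data + hand-written rules -> answers
def rule_based_classifier(image_features):
    # image_features: hypothetical hand-crafted measurements of one image
    if image_features["whisker_length"] > 3.0 and image_features["ear_pointiness"] > 0.7:
        return "cat"
    return "dog"

# Machine learning approach: data + answers -> (implicit) rules
# Pretend each image is summarized by the same two numbers, but instead of
# rules we provide labeled examples.
rng = np.random.default_rng(0)
X_cats = rng.normal([4.0, 0.9], 0.5, size=(100, 2))   # features of "cat" images
X_dogs = rng.normal([1.5, 0.3], 0.5, size=(100, 2))   # features of "dog" images
X = np.vstack([X_cats, X_dogs])
y = np.array(["cat"] * 100 + ["dog"] * 100)            # the "answers"

model = LogisticRegression().fit(X, y)                 # the rules are learned
print(model.predict([[3.8, 0.8]]))                     # most likely ['cat']
```

In the first function the programmer wrote the rules by hand; in the second, the rules are implicit in the fitted parameters and were recovered from the labeled examples alone.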

Here is another definition, from a textbook called Machine Learning: the goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest. That sounds a bit more general, and then Murphy, the author, continues to say that machine learning is thus closely related to the field of statistics, but differs slightly in terms of emphasis. That is important, and I also want to discuss it briefly. Of course, if you think about what statistics does: well, it detects patterns in the data and then tries to predict some outcomes of interest, so the same definition, or almost the same, would apply to statistics as well. So what is this difference in emphasis between machine learning and statistics? I should say that this is a topic where you can find a lot of discussion, with some people saying that statistics and machine learning are the same, or that they intersect and the part of statistics that is not machine learning is useless while the part of machine learning that is not statistics is just engineering, and so on; it is a very hotly debated topic. Let me offer my perspective here.

Both statistics and machine learning aim to detect patterns in data, as in that quote, by building predictive models, but the emphasis is slightly different. Statistics uses this as a tool to learn something about the world; in statistics this is called statistical inference. We want to infer some properties of the world that reveal themselves through the noisy observations, through the data. The focus is therefore on simple, interpretable models, such that after you fit the model you can actually look at it and learn something. And since the models are relatively simple, one can develop a lot of theoretical analysis: work out how the estimation procedures behave under some assumptions, what the statistical guarantees are, and so on. If you are taking any statistics course in parallel with this one, and I know some of you are, then you will see how this applies to what you are doing in that course.

Machine learning uses the same approach, building predictive models to detect patterns in the data, but uses it as a tool to actually make useful predictions in challenging situations, and not so much to learn anything about the world. Therefore it usually focuses on complicated models, a deep neural network or something like that, and uses very large datasets. As long as your model works in practice, great, you have achieved your goal. You usually do not try to learn anything about cats and dogs by training a neural network to classify them; you just want your system to actually be useful when somebody does, you know, a Google image search or something like that. So to a large extent we are giving up here on statistical inference and focusing more on prediction. There is a nice paper that I reference here, from years ago, "Statistical Modeling: The Two Cultures" (Breiman, 2001); it is a very influential, more philosophical paper about these two different approaches. Even though years have passed and machine learning has changed entirely over this time, I think this conceptual distinction still holds. This of course also means, at least in my opinion, that there is no hard boundary; it is more like a spectrum, and you can arrange different predictive models on that spectrum, where on one side there is clearly statistics and on the other side there is clearly machine learning.
Just to give you an example: a clearly statistical tool would be a one-sample t-test, where you have a bunch of numbers and you want to test whether they are, let's say, positive, that is, whether their average is above or below zero, using a hypothesis test. In a way you are fitting a model: the model has one parameter, the average, and you want to infer whether this average is above or below zero (see the short sketch below). On the other side of the spectrum is GPT, the recent gigantic neural network trained on all the text available on the internet, which is famously able to generate very human-looking text, and it is very hard to say how exactly it works on the mechanistic level. At this point we are definitely not trying to learn anything about the text, we are just trying to generate human-looking text, so that is clearly machine learning. Everything in between, I would say, or at least a lot of things in between, belongs to both fields. This cartoon describes what the machine learning extreme looks like: you just have an algorithm that is completely non-transparent, you put in some data, you get some answers, but you are not doing any inference. Not because it is literally a black box, you know how the algorithm works, it is a neural network for example, but because it is so complicated that it does not give you any new knowledge about your data, and new knowledge is exactly what statistics is after. A lot of things, though, are in between, and for example one of the most well-known textbooks on machine learning is called The Elements of Statistical Learning, a title that even tries to combine statistics and machine learning.
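Here is the statistical end of that spectrum, the one-sample t-test, as a minimal Python sketch; the numbers are made up just for illustration.

```python
# Minimal one-sample t-test: is the mean of these numbers above or below zero?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.3, scale=1.0, size=30)   # made-up sample, true mean 0.3

# The "model" has a single parameter: the population mean.
t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)

print(f"sample mean = {x.mean():.3f}, t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value is evidence that the population mean differs from zero:
# statistical inference, i.e. learning a property of the world from noisy data.
```

The fitted "model" here is just the sample mean, and the p-value is the inferential statement about it; contrast that with the cat/dog sketch earlier, where we only cared about the predictions.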
