Cross Validation
In this section we discuss how to evaluate a classifier $M$ using some performance measure $\theta$. Typically, the input dataset $D$ is randomly split into a disjoint training set and testing set. The training set is used to learn the model $M$, and the testing set is used to evaluate the measure $\theta$. However, how confident can we be about the classification performance? The results may be an artifact of the random split; for example, by random chance the testing set may contain points that are particularly easy (or hard) to classify, leading to good (or poor) classifier performance. As such, a fixed, predefined partitioning of the dataset is not a good strategy for evaluating classifiers. Also note that, in general, $D$ is itself a $d$-dimensional multivariate random sample drawn from the true (unknown) joint probability density function $f(x)$ that represents the population of interest. Ideally, we would like to know the expected value $E[\theta]$ of the performance measure over all possible testing sets drawn from $f$.
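To make the discussion concrete, the sketch below estimates $E[\theta]$ (taking $\theta$ to be accuracy) by averaging over $K$ disjoint test folds rather than relying on a single random split. The nearest-centroid classifier and the synthetic two-class data are illustrative assumptions, not part of the text.

```python
import numpy as np

def kfold_accuracy(X, y, K=5, seed=0):
    """Estimate the expected accuracy E[theta] by averaging over K disjoint test folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))           # shuffle to avoid artifacts of data ordering
    folds = np.array_split(idx, K)          # K roughly equal, disjoint folds
    scores = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        # "Train": compute one centroid per class from the training fold.
        classes = np.unique(y[train])
        centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in classes])
        # "Test": assign each held-out point to the class of its nearest centroid.
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        y_pred = classes[np.argmin(dists, axis=1)]
        scores.append(np.mean(y_pred == y[test]))
    # Mean over folds estimates E[theta]; the spread hints at the split-to-split variation.
    return np.mean(scores), np.std(scores)

# Example: two well-separated Gaussian classes in d = 2 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.repeat([0, 1], 100)
mean_acc, std_acc = kfold_accuracy(X, y, K=5)
print(f"estimated E[theta] = {mean_acc:.3f} (+/- {std_acc:.3f})")
```

Averaging over several disjoint test folds, rather than scoring one fixed split, directly addresses the concern above: a single lucky (or unlucky) partition no longer determines the reported performance.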