Linear Regression
The linear regression algorithm works by trying to predict the class label directly given an object. It tries to fit a (hyper)plane as well as possible through all the training examples, so that the output $y$ of the hyperplane predicts the class label. As you may have noticed, the label $y$ is predicted as a continuous value on the hyperplane. This makes it a pretty poor fit for a (binary) classification problem, where the class label is discrete (see the sketch below).
2020-01-12
4 min read
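To make the mismatch concrete, here is a minimal sketch (with made-up 1-D data, not anything from the post) of least-squares linear regression applied to 0/1 labels. The fitted plane produces continuous outputs, so an ad-hoc threshold is needed to recover discrete labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic 1-D clusters labeled 0 and 1 (assumed data).
X = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(4.0, 1.0, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Design matrix with a bias column; solve the least-squares problem.
A = np.column_stack([np.ones_like(X), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

preds = A @ w
print(preds[:5])          # continuous values, not the discrete labels 0/1
print((preds > 0.5)[:5])  # an arbitrary threshold is needed to classify
```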
Naive Bayes
All generative algorithms mentioned above try to model the multivariate class-conditional distribution over all features. However, when the sheer number of features becomes too large, the number of parameters needed to model the distribution becomes infeasible in practice. A Gaussian distribution, for example, scales as $O(n + n^2) \simeq O(n^2)$ with the number of features. The Naive Bayes classifier therefore makes a very strong assumption: all $x_i$'s are considered conditionally independent given $y$ (sketched below).
2019-12-25
1 min read
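A minimal sketch of that assumption (the synthetic data here is an assumption, not from the post): under conditional independence, Gaussian Naive Bayes needs only a mean and variance per feature per class, $O(n)$ parameters, instead of a full $n \times n$ covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 10
X0 = rng.normal(0.0, 1.0, (100, n_features))  # class 0 samples (assumed)
X1 = rng.normal(1.0, 1.0, (100, n_features))  # class 1 samples (assumed)

# Per-class, per-feature mean and variance: O(n) parameters per class.
params = {c: (Xc.mean(axis=0), Xc.var(axis=0)) for c, Xc in ((0, X0), (1, X1))}

def log_likelihood(x, mu, var):
    # Sum of independent univariate Gaussian log-densities over features.
    return np.sum(-0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var))

x = X1[0]
scores = {c: log_likelihood(x, mu, var) for c, (mu, var) in params.items()}
print(max(scores, key=scores.get))  # predicted class (equal priors assumed)
```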
Non Parametric Algorithms
As we have seen with the QDA and LDA classifiers, it is possible to obtain the posterior probabilities by estimating the class-conditional probabilities using a (multivariate) Gaussian distribution, whose parameters $\mu$ and $\Sigma$ we estimated from a set of (unbiased) examples. Instead of this parametric approach, it is also possible to estimate the class-conditional probabilities using kernel density estimation: here the main idea is that you approximate the distribution by a mixture of continuous distributions called kernels (see the sketch below).
2019-12-22
3 min read
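A minimal sketch of the idea (the data and the bandwidth value are assumptions for illustration): a 1-D kernel density estimate built as an average of Gaussian kernels, one centered on each training sample.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, 200)  # 1-D training examples (assumed)
h = 0.3                              # bandwidth, i.e. kernel width (assumed)

def kde(x, data, h):
    # Average of Gaussian kernels centered at each data point:
    # p(x) = (1 / (n * h)) * sum_i K((x - x_i) / h)
    u = (x - data[:, None]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return k.mean(axis=0) / h

grid = np.linspace(-4, 4, 9)
print(kde(grid, samples, h))  # density estimate evaluated on a grid
```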