Posts

Hierarchical Clustering

Given $n$ points in a $d$-dimensional space, the goal of hierarchical clustering is to create a sequence of nested partitions, which can be conveniently visualized via a tree or hierarchy of clusters, also called the cluster dendrogram. The clusters in the hierarchy range from the fine-grained to the coarse-grained: the lowest level of the tree (the leaves) consists of each point in its own cluster, whereas the highest level (the root) consists of all points in one cluster. Both of these may be considered trivial clusterings. At some intermediate level, we may find meaningful clusters. If the user supplies $k$, the desired number of clusters, we can choose the level at which there are $k$ clusters. There are two main algorithmic approaches to mine hierarchical clusters: agglomerative and divisive. Agglomerative strategies work in a bottom-up manner. That is, starting with each of the $n$ points in a separate cluster, they repeatedly merge the most similar pair of clusters until only one cluster remains, or until the desired $k$ clusters are obtained.
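The bottom-up merging loop can be sketched in a few lines. This is a minimal single-linkage version (the choice of single linkage, Euclidean distance, and list-of-indices cluster representation are all assumptions for illustration, not the only options):

```python
import numpy as np

def agglomerative(points, k):
    """Bottom-up single-linkage clustering: start with each point in its
    own cluster and merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        # find the pair of clusters with the smallest inter-point distance
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)   # merge the most similar pair
    return clusters

# two well-separated pairs of points
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(agglomerative(pts, 2))
```

Running the full loop down to one cluster, and recording the distance at each merge, is exactly the information the dendrogram displays.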

Decision Trees-ID3 Algorithm

Decision trees are very popular for predictive modeling and perform both classification and regression. A decision tree is a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome. Decision trees are highly interpretable and provide a foundation for more complex algorithms, e.g., random forests. Example: suppose a candidate has a job offer and wants to decide whether or not to accept it. To solve this problem, the decision tree starts with the root node (the Salary attribute, chosen by an attribute selection measure, or ASM). The root node splits into the next decision node (Distance from the office) and one leaf node, based on the corresponding labels. That decision node splits further into one decision node (Cab facility) and one leaf node. Finally, this decision node splits into two leaf nodes (offer accepted and offer declined). Consider the diagram above. In the ID3 algorithm, the attribute selection measure is information gain: at each node, the attribute whose split yields the largest reduction in entropy is chosen.
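ID3's split criterion can be made concrete with a small computation. The sketch below defines entropy and information gain and evaluates them on a toy version of the job-offer data (the attribute values and labels are made up for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Reduction in entropy from splitting on attribute index attr
    (the selection measure used by ID3)."""
    splits = {}
    for row, y in zip(rows, labels):
        splits.setdefault(row[attr], []).append(y)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in splits.values())
    return entropy(labels) - remainder

# toy offer data (hypothetical): columns are [salary, distance]
rows = [["high", "near"], ["high", "far"], ["low", "near"], ["low", "far"]]
labels = ["accept", "accept", "decline", "decline"]

print(info_gain(rows, labels, 0))  # gain from splitting on salary
print(info_gain(rows, labels, 1))  # gain from splitting on distance
```

On this toy data, salary perfectly separates the labels while distance tells us nothing, so ID3 would place salary at the root, mirroring the example above.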

Density Estimation MLE and MAP

Density estimation is a statistical technique used to estimate the probability density function of a random variable from a set of observed data points. It helps in understanding the underlying distribution of the data. Maximum Likelihood Estimation (MLE) is a method used to find the parameters of a statistical model that maximize the likelihood function, which measures how well the model explains the observed data. In the context of density estimation, MLE aims to find the parameters that make the observed data most probable. Maximum A Posteriori estimation (MAP), on the other hand, incorporates prior knowledge about the parameters by using a prior distribution. It combines this prior information with the likelihood function to obtain a posterior distribution, and seeks the parameters that maximize that posterior probability given both the observed data and the prior. In summary, the key difference lies in the use of prior information: MLE relies solely on the observed data, while MAP additionally weighs a prior distribution over the parameters.
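The difference shows up cleanly when estimating the mean of a Gaussian with known variance. The MLE is the sample mean; with a conjugate Gaussian prior on the mean, the MAP estimate is a weighted blend of the sample mean and the prior mean (the specific prior $N(0, 1)$ here is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=20)  # samples from N(2, 1)

# MLE for the mean of a Gaussian with known variance: the sample mean
mu_mle = data.mean()

# MAP with a conjugate N(mu0, tau2) prior on the mean (prior is an assumption)
mu0, tau2, sigma2, n = 0.0, 1.0, 1.0, len(data)
mu_map = (tau2 * data.sum() + sigma2 * mu0) / (n * tau2 + sigma2)

print(mu_mle, mu_map)  # MAP is pulled from the sample mean toward mu0
```

As $n$ grows the likelihood dominates the prior and the two estimates converge, which is why MAP and MLE typically disagree most on small samples.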