Hierarchical Clustering
![Image](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkQXf60mQ2Qlv1RUCXXlj-ehtiU9CYCOuXO_tZMHplcxcpO9hjhGC7m5paGI569TV2uqzgt8svVR7s56LCoFnzcGcfOTmPhZ-2xh87yeIITtpipqL2i0TtDLdkHQOzkGkEWMW3-XXfUOP2OFdozg0vkv5wa1VnEAg79xerKYjfz6fPp8Q1k6dwHkfuJQ/w640-h606/ecld.png)
Given $n$ points in a $d$-dimensional space, the goal of hierarchical clustering is to create a sequence of nested partitions, which can be conveniently visualized via a tree or hierarchy of clusters, also called the cluster dendrogram. The clusters in the hierarchy range from the fine-grained to the coarse-grained: the lowest level of the tree (the leaves) consists of each point in its own cluster, whereas the highest level (the root) consists of all points in one cluster. Both of these may be considered trivial clusterings. At some intermediate level, we may find meaningful clusters. If the user supplies $k$, the desired number of clusters, we can choose the level at which there are $k$ clusters.

There are two main algorithmic approaches to mine hierarchical clusters: agglomerative and divisive. Agglomerative strategies work in a bottom-up manner. That is, starting with each of the $n$ points in a separate cluster, they repeatedly merge the most similar pair of clusters until all points belong to a single cluster. Divisive strategies, in contrast, work top-down: starting with all points in one cluster, they repeatedly split clusters until each point is in its own cluster.
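As a concrete illustration of the agglomerative approach, here is a minimal Python sketch using SciPy. The toy data, the choice of single linkage, and $k = 2$ are illustrative assumptions, not from the text: `linkage` records the full merge sequence (the dendrogram), and `fcluster` cuts it at the level that yields $k$ clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy data: n points in d dimensions (here n = 6, d = 2),
# forming two well-separated groups.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]])

# Agglomerative clustering: each point starts in its own cluster, and the
# most similar pair of clusters is merged repeatedly until one cluster
# remains. Z encodes the full merge sequence, i.e. the dendrogram.
Z = linkage(X, method='single')  # single linkage: min pairwise distance

# Cut the dendrogram at the level that yields k clusters.
k = 2
labels = fcluster(Z, t=k, criterion='maxclust')
print(labels)  # e.g. [1 1 1 2 2 2]
```

Single linkage defines cluster similarity via the closest pair of points across two clusters; other criteria (complete, average, Ward) define it differently and can produce different hierarchies on the same data.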