Cluster Weighted Modeling
Tutorial 1
4 Basic Types of Cluster Analysis used in Data Analytics
Notes:
Centroid Clustering
- Choose Number of Clusters (e.g. the number of segmentation categories wanted)
- Determine a Centroid for each defined cluster
- Assign Data Points to a centroid, based on proximity
- Generate distinct Clusters
- Recenter each centroid to the mean of its assigned points once the clusters are formed, then repeat the assignment until the clusters stop changing
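The centroid steps above can be sketched in plain Python. This is a minimal k-means-style sketch, not taken from the tutorial: the function name `kmeans`, the 2-D tuple representation, and the random choice of initial centroids are all assumptions made for illustration.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means sketch on a list of (x, y) tuples (illustrative only)."""
    rng = random.Random(seed)
    # Pick k of the data points as the starting centroids (an assumption;
    # other initialisation schemes exist).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                            + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Recenter: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

On this toy data the two tight groups end up in separate clusters, and each centroid lands on the mean of its group.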
Density Clustering
- Group Data Points based on their proximity to one another...distance from one point to another
- The more dense a grouping...the more likely that they belong in the same cluster
- Density clustering is able to define odd-shaped clusters that the Centroid Clustering method would be unable to identify
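The notes do not name a specific density algorithm, so the sketch below assumes DBSCAN, one common choice. The parameters `eps` (neighbourhood radius) and `min_pts` (minimum neighbours, counting the point itself) and the function name `dbscan` are illustrative assumptions.

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN-style sketch on (x, y) tuples; label -1 means noise."""
    NOISE = -1
    labels = [None] * len(points)  # None marks a point not yet visited

    def neighbors(i):
        # All points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # too sparse to start a cluster
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a dense point joins as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:
                queue.extend(nb)  # dense point: keep growing the cluster through it
        cluster += 1
    return labels

# An elongated chain, a second short chain, and one far-away outlier.
chain_points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
                (0, 10), (1, 10), (2, 10), (50, 50)]
chain_labels = dbscan(chain_points, eps=1.5, min_pts=2)
```

The long chain is kept as one cluster because each point is close to its neighbour, even though the ends are far apart, which is the kind of shape a centroid method tends to split.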
Distribution Clustering
- Looks at the probability that a Data Point belongs to a cluster
- Choose Number of Clusters
- Determine a Centroid for each defined cluster
- Distance from Centroid determines the probability of belonging to one of the clusters
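One way to turn "distance from centroid" into a probability is a Gaussian weighting. The sketch below assumes equal-weight clusters that share a single spherical spread `sigma`; that simplification, the function name, and the parameter are assumptions for illustration, not something the tutorial specifies.

```python
import math

def membership_probabilities(point, centroids, sigma=1.0):
    """Probability that `point` belongs to each cluster, assuming
    equal-weight spherical Gaussians with a shared spread `sigma`."""
    # Log of each (unnormalised) Gaussian weight: -d^2 / (2 * sigma^2).
    log_w = [
        -((point[0] - cx) ** 2 + (point[1] - cy) ** 2) / (2 * sigma ** 2)
        for cx, cy in centroids
    ]
    # Subtract the largest value before exponentiating to avoid underflow
    # when the point is far from every centroid.
    m = max(log_w)
    weights = [math.exp(lw - m) for lw in log_w]
    total = sum(weights)
    return [w / total for w in weights]

gm_centroids = [(0.0, 0.0), (4.0, 0.0)]
p_middle = membership_probabilities((2.0, 0.0), gm_centroids)  # equidistant point
p_near = membership_probabilities((0.0, 0.0), gm_centroids)    # sits on the first centroid
```

Unlike the hard assignment in centroid clustering, every point gets a probability for each cluster, and the probabilities sum to 1.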
Connectivity Clustering
- Each Data Point starts as its own cluster
- Determine how much one data point is related to another data point...based on the data point's 'behavior' or 'characteristics' or 'features'
- Clusters are merged step by step; where the merging stops ultimately depends on the desired number of clusters
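A bottom-up version of these steps can be sketched as follows. It assumes "relatedness" is measured by Euclidean distance between feature vectors and that two clusters are compared by their closest pair of points (single linkage); both choices, and the function name `agglomerative`, are assumptions made for illustration.

```python
def agglomerative(points, n_clusters):
    """Minimal single-linkage agglomerative sketch.
    Returns clusters as lists of indices into `points`."""
    # Every data point starts as its own cluster.
    clusters = [[i] for i in range(len(points))]

    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair across two clusters.
        return min(dist2(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the two most closely related clusters (b > a, so index a stays valid).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

link_points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
three = agglomerative(link_points, n_clusters=3)
two = agglomerative(link_points, n_clusters=2)
```

Asking for fewer clusters simply lets the merging run longer: with three clusters the two close pairs and the outlier stay separate, and with two the pairs are merged while the outlier remains alone.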
Bayesian Optimization
I found this short explanation useful. This video was also super helpful.
Joint Probability = The probability of 2 events occurring together.
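A small worked example of a joint probability, using two fair dice. The events chosen here (first die is even, the sum is 8) are made up for illustration; the sketch just counts outcomes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_even(o):
    return o[0] % 2 == 0

def sum_is_8(o):
    return o[0] + o[1] == 8

# Joint probability: both events occur together.
p_joint = prob(lambda o: first_even(o) and sum_is_8(o))  # 1/12
p_b = prob(sum_is_8)                                     # 5/36
# Conditional probability recovered from the joint: P(A | B) = P(A and B) / P(B).
p_a_given_b = p_joint / p_b                              # 3/5
```

The last line shows why the joint probability matters for Bayesian methods: conditional probabilities are defined through it.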