K-means & k-medoids clustering
Clustering is about grouping similar things.
K-means and k-medoids do this in different ways. This page walks through both, step by step, with simple visuals and short explanations.
One visual, two algorithms
See k-means and k-medoids side by side
Toggle the algorithm, change k, and step through the loop.
What you should notice
- Points switch colors during assignment.
- Centers move during update.
- K-medoids keeps a real example at the center.
The algorithm loop
Two repeating phases: assignment and update
Both algorithms follow the same rhythm. The difference is how the center updates.
Phase 1
Assignment
Each point is assigned to its nearest center. That's why colors shift first.
Phase 2
Update
K-means moves the center to the average. K-medoids picks the best representative point.
Callout
K-means is sensitive to outliers. K-medoids is more robust when one extreme point shows up.
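The two phases can be sketched in a few lines of NumPy. This is a minimal illustration under the assumption of 2-D numeric points with Euclidean distance; the function names (`assign`, `update_kmeans`, `update_kmedoids`) are made up for this sketch.

```python
import numpy as np

def assign(points, centers):
    # Phase 1: each point is assigned to its nearest center.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def update_kmeans(points, labels, k):
    # Phase 2 (k-means): each center moves to the mean of its points.
    return np.array([points[labels == j].mean(axis=0) for j in range(k)])

def update_kmedoids(points, labels, k):
    # Phase 2 (k-medoids): each center becomes the assigned point with
    # the smallest total distance to the rest of its cluster.
    centers = []
    for j in range(k):
        cluster = points[labels == j]
        costs = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :],
                               axis=2).sum(axis=1)
        centers.append(cluster[costs.argmin()])
    return np.array(centers)
```

Run both updates on a small cluster plus one extreme point and you can see the callout in action: the k-means center drifts toward the outlier, while the k-medoid stays on one of the ordinary points.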
When should you use each?
A quick decision guide
Use the method that matches your data and your tolerance for outliers.
K-means
- Numeric data only
- Distances are meaningful
- Faster and simpler
- Sensitive to outliers
K-medoids
- Works with arbitrary distances
- Centers are real points
- More robust to noise
- Slightly more expensive
Gower distance
Comparing mixed data, one feature at a time
Mixed datasets include numbers, categories, and rankings. Gower distance handles them by normalizing each feature and averaging the result.
| Row | Age | Plan type | Satisfaction | Renewal | Weekly usage |
|---|---|---|---|---|---|
| A | 22 | Basic | 3 | Yes | 18 |
| B | 45 | Pro | 5 | No | 44 |
| C | 31 | Basic | 4 | Yes | Missing |
| D | 29 | Team | 2 | No | 25 |
| E | 52 | Pro | 1 | Yes | 30 |
| F | 37 | Team | 5 | Yes | 36 |
Worked example: rows A and B
Missing values are ignored when averaging the per-feature distances.
Per-feature contributions (each normalized to the 0–1 range seen in the table):
- Age: |22 − 45| / (52 − 22) = 0.77
- Plan type: Basic vs Pro → 1.00
- Satisfaction: |3 − 5| / (5 − 1) = 0.50
- Renewal: Yes vs No → 1.00
- Weekly usage: |18 − 44| / (44 − 18) = 1.00
Final Gower distance
0.85 (between 0 and 1), the average of the five contributions.
Gower distance lets us compare mixed data in a consistent way.
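Here is a minimal sketch of that computation using the table above. The `gower` helper and the hand-written feature ranges are assumptions made for this example, not a library API; numeric features use a normalized absolute difference, categorical features contribute 0 or 1, and missing values are skipped.

```python
def gower(a, b, ranges):
    # a, b: dicts of feature -> value (None marks a missing value).
    # ranges: feature -> numeric range (max - min), or None for categorical.
    contribs = []
    for feat, rng in ranges.items():
        x, y = a.get(feat), b.get(feat)
        if x is None or y is None:
            continue  # missing values are ignored when averaging
        if rng is None:
            contribs.append(0.0 if x == y else 1.0)  # categorical match
        else:
            contribs.append(abs(x - y) / rng)        # normalized numeric gap
    return sum(contribs) / len(contribs)

# Ranges read off the table: Age 22-52, Satisfaction 1-5, Weekly usage 18-44.
ranges = {"age": 30, "plan": None, "satisfaction": 4, "renewal": None, "usage": 26}
row_a = {"age": 22, "plan": "Basic", "satisfaction": 3, "renewal": "Yes", "usage": 18}
row_b = {"age": 45, "plan": "Pro", "satisfaction": 5, "renewal": "No", "usage": 44}
print(round(gower(row_a, row_b, ranges), 2))  # 0.85
```

Row C, with its missing usage value, still gets a distance: the average is simply taken over the four features both rows have.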
Putting it together
Why k-medoids pairs well with Gower distance
K-medoids only needs a distance function. Gower gives a sensible distance for mixed data.
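A sketch of why that pairing works: k-medoids can run entirely off a precomputed distance matrix, so any distance (Euclidean, Gower, or something custom) plugs in unchanged. The `k_medoids` function below is illustrative, not a specific library's API; real projects might use an off-the-shelf implementation such as scikit-learn-extra's `KMedoids`.

```python
import numpy as np

def k_medoids(D, k, n_iter=100, seed=0):
    # D: (n, n) matrix of pairwise distances from ANY distance function.
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)       # assignment phase
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size == 0:
                continue  # keep the old medoid if its cluster emptied
            # Update phase: pick the member with the lowest total
            # distance to the rest of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[costs.argmin()]
        if np.array_equal(new_medoids, medoids):
            break                                    # converged
        medoids = new_medoids
    return medoids, D[:, medoids].argmin(axis=1)
```

Note that the loop never touches the raw features, only `D` — which is exactly why a Gower matrix over mixed data drops straight in.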
Choosing k
Too few vs too many clusters
Start simple, then adjust until the groups feel meaningful and stable.
As k increases, clusters get tighter. Look for the point where improvements slow down.
Too few clusters: different behaviors get lumped together.
Too many clusters: every small variation becomes its own group.
A good k balances clarity with usefulness. If you can explain each group in one sentence, you are close.
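That "look for where improvements slow down" heuristic is often called the elbow method: plot the within-cluster sum of squares for several values of k and watch for the bend. The sketch below is illustrative only; it uses a tiny 1-D Lloyd's loop with deterministic farthest-first initialization (both choices are assumptions made for this example, not a particular library).

```python
import numpy as np

def init_centers(points, k):
    # Farthest-first initialization: deterministic and well spread out.
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min(np.abs(points[:, None] - np.array(centers)[None, :]), axis=1)
        centers.append(points[d.argmax()])
    return np.array(centers)

def inertia(points, k, n_iter=50):
    # Within-cluster sum of squares after running Lloyd's algorithm.
    centers = init_centers(points, k)
    for _ in range(n_iter):
        labels = np.abs(points[:, None] - centers[None, :]).argmin(axis=1)
        centers = np.array([points[labels == j].mean() if (labels == j).any()
                            else centers[j] for j in range(k)])
    return float(((points - centers[labels]) ** 2).sum())

# Three obvious groups: inertia drops sharply up to k = 3, then flattens.
data = np.array([0.0, 0.2, 0.4, 5.0, 5.2, 5.4, 10.0, 10.2, 10.4])
for k in range(1, 6):
    print(k, round(inertia(data, k), 2))
```

On this data the k = 2 → 3 drop is large and the k = 3 → 4 drop is tiny, so k = 3 is the bend — matching the three groups you can see by eye.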