How do we use nominal (non-numeric or noncontinuous) categories as features?
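Convert each possible value to a number. For nominal categories, which have no natural order, a common approach is to give each category its own 0/1 indicator feature (one-hot encoding) so that the encoding does not imply an ordering.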
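As a rough illustration of that conversion, here is a minimal sketch of one-hot encoding in plain Python. The function name `one_hot_encode` and the sample `colors` list are hypothetical, chosen only for this example; it assumes the categories are strings and that sorting them gives an acceptable column order.

```python
def one_hot_encode(values):
    """Map each distinct category to a 0/1 indicator vector.

    Returns the sorted category list (the column order) and one
    encoded row per input value.
    """
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0.0] * len(categories)
        row[index[value]] = 1.0  # exactly one column is "on" per value
        encoded.append(row)
    return categories, encoded


# Hypothetical sample data for illustration.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded[0])  # [0.0, 0.0, 1.0]
```

Each category gets its own column, so no category ends up numerically "closer" to another than it should be.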
Why do we need to use scaling (normalization)?
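To put features on comparable ranges, so that a feature does not dominate distance calculations merely because of its units. The scale factors can also be chosen deliberately to indicate the relative importance of each feature.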
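To make the effect of scaling concrete, here is a minimal sketch of min-max scaling, one common normalization among several. The function name `min_max_scale` and the sample `data` are hypothetical; the sketch assumes each row is a list of numeric features and maps every column onto the range 0 to 1.

```python
def min_max_scale(rows):
    """Rescale each column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical data: the second feature is about 1000x larger than the first.
data = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]]
scaled = min_max_scale(data)
print(scaled)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

Before scaling, Euclidean distances between these rows are determined almost entirely by the second feature; after scaling, both features contribute on the same footing. Multiplying a scaled column by a weight would then express how much that feature should matter.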
How does k-means clustering work?
First, k points are chosen, randomly or otherwise, as the initial centroids, and every point is assigned to its nearest centroid. Each centroid is then recomputed as the mean of the points assigned to it. The assignment and update steps repeat until the difference between the current set of clusters and the previous set is insignificant.
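The loop described above can be sketched in plain Python as follows. This is an illustrative sketch, not a production implementation: the names `k_means` and `squared_distance`, the parameters `max_iters`, `tol`, and `seed`, and the sample `points` are all assumptions made for this example. It picks initial centroids at random from the data and stops when no centroid moves by more than `tol` (measured as squared distance).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iters=100, tol=1e-9, seed=0):
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct points drawn at random from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    [sum(dim) / len(members) for dim in zip(*members)]
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once no centroid has moved by more than the tolerance.
        shift = max(
            squared_distance(old, new)
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, assignments


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = k_means(points, k=2)
print(centroids)
print(assignments)
```

Because the starting centroids are random, results on less cleanly separated data can vary from run to run; a common remedy is to run the algorithm several times and keep the clustering with the smallest total within-cluster distance.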