Another K Means example to learn from
I am a fan of K-means approaches to clustering data, particularly when you have a theoretical reason to expect a certain number of clusters and you have a large data set. However, I think plotting the cluster means can be misleading. In his ggplot2 book, Hadley Wickham suggests the following approach, to which I add a few small changes.
# First we run the kmeans analysis. In brackets is the dataset used
# (in this case I only want variables 1 through 11, hence the [1:11])
# and the number of clusters I want produced (in this case 4).
cl <- kmeans(mydata[1:11], 4)
# We will need to add an id variable for later use. In this case I have called it .row.
clustT1WIN$.row <- rownames(clustT1WIN)
# At this stage I also make a new variable indicating…
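If you want to try the first two steps without my data, here is a minimal sketch on R's built-in iris data set. The data set, the object name dat, the choice of 3 clusters, and the set.seed and nstart settings are all my assumptions for illustration, not part of the original analysis.

```r
# Sketch only: iris stands in for the post's own data (an assumption).
set.seed(42)                # k-means starts from random centres, so fix the seed
dat <- iris[1:4]            # keep only the four numeric columns

# Ask for 3 clusters and keep the best of 25 random starts.
cl <- kmeans(dat, centers = 3, nstart = 25)

# Add an id variable for later use, mirroring the .row step above.
dat$.row <- rownames(dat)

table(cl$cluster)           # number of observations in each cluster
cl$centers                  # cluster means, one row per cluster
```

Because the starting centres are random, the cluster labels (1, 2, 3) can change between runs unless the seed is fixed, which matters if you later plot or compare clusters by number.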