I need to produce a visual representation of hierarchical clustering using complete linkage by plotting a dendrogram.
My data frame comes from the Eurostat database (CP00 - HICP) and, after some cleaning, looks like this:
dput(head(CP00))
structure(list(id = c("CP00", "CP00", "CP00", "CP00", "CP00",
"CP00"), country = c("Austria", "Austria", "Austria", "Austria",
"Austria", "Austria"), time = structure(c(10988, 11017, 11048,
11078, 11109, 11139), class = "Date"), values = c(1.9, 1.9, 1.8,
1.6, 2.4, 1.9)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
It has 7344 observations.
First, I computed the dissimilarity matrix and then ran the hierarchical clustering using complete linkage:
# Dissimilarity matrix
CP00_clst <- dist(CP00, method = "minkowski", p = 1.5)
# Hierarchical clustering using Complete Linkage
CP00_clst <- hclust(CP00_clst, method = "complete")
Finally, simply plotting with a title:
# Plot the obtained dendrogram
plot(CP00_clst, main = "Clustering Countries based on HICP")
However, the result is not what I need, which is a clear dendrogram. In addition, I need to divide the dendrogram into 4 clusters.
This is my result:
My Result
This is the outcome that I need:
Outcome needed
I am new to R and probably there is something wrong in the dissimilarity matrix. Thank you for your help!
1 answer
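Do you want to plot all 7344 entities in the dendrogram, or only a few countries?
If only a few countries: CP00 is in long format (one value per object per row), but the dist function needs wide format (one object per row, with multiple attributes as columns; see https://www.statology.org/long-vs-wide-data/). As written, dist treats each of the 7344 rows as a separate object, which is why the dendrogram is unreadable.
The rectangles can be added simply with the rect.hclust function from the stats package: https://rdrr.io/r/stats/rect.hclust.html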
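The steps described above could be sketched roughly as follows. This is a minimal illustration, not a definitive solution: CP00_toy is a made-up stand-in for the real CP00 data (random values, six arbitrarily chosen countries, twelve months), and the reshaping uses base R's tapply, which is only one of several ways to go from long to wide. It also assumes that every country has a value for every month; missing combinations would show up as NA in the wide table.

```r
# Toy stand-in for CP00 (made-up values, NOT Eurostat data): long format,
# one row per country and month.
set.seed(1)
countries <- c("Austria", "Belgium", "Germany", "Spain", "France", "Italy")
months <- seq(as.Date("2000-02-01"), by = "month", length.out = 12)
CP00_toy <- expand.grid(country = countries, time = months,
                        stringsAsFactors = FALSE)
CP00_toy$values <- round(rnorm(nrow(CP00_toy), mean = 2, sd = 0.5), 1)

# Long -> wide: one row per country, one column per month.
CP00_wide <- tapply(CP00_toy$values,
                    list(CP00_toy$country, as.character(CP00_toy$time)),
                    mean)

# Dissimilarity between countries, then complete-linkage clustering.
d <- dist(CP00_wide, method = "minkowski", p = 1.5)
hc <- hclust(d, method = "complete")

# Dendrogram with rectangles drawn around 4 clusters.
plot(hc, main = "Clustering Countries based on HICP", xlab = "", sub = "")
rect.hclust(hc, k = 4, border = 2:5)

# Cluster membership per country, if the groups are needed as data.
groups <- cutree(hc, k = 4)
```

With the real data, the same calls would be applied to CP00 instead of CP00_toy. Because the wide table has one row per country, the dendrogram gets one leaf per country, labelled with the row names.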