Cutting a graph in R -
i have following simple problem. have distance matrix number of nodes, , list of subsets of nodes such within each subset, every 2 nodes @ minimum distance dmin. is, each 2 nodes connected edge has associated value. want remove every edge value smaller dmin, , list resulting disconnected graphs.
essentially want clusters of data points close each other, not clustering algorithm, instead using threshold value distance.
my question is, naturally, how can accomplish in r. consider following matrix m:
b c d 1.0 0.9 0.2 0.3 b 0.9 1.0 0.4 0.1 c 0.2 0.4 1.0 0.7 d 0.3 0.1 0.7 1.0
there 4 nodes (a, b, c, d). search function or package given matrix (which in fact 1 - distance matrix) , threshold dmin, example dmin <- 0.5
, produce 2 sets: {a,b}
, {c,d}
. inefficient way of accomplishing follows:
clusters <- list() nodes <- colnames( m ) dmin <- 0.5 # loop on nodes for( n in nodes ) { found <- false # check whether node can associated 1 of existing # clusters for( c in names( clusters ) ) { if( any( m[ n, clusters[[c]] ] > 0.5 ) ) { clusters[[c]] <- c( clusters[[c]], n ) found <- true next } } # no luck? create new cluster node if( ! found ) clusters[[n]] <- c( n ) }
the result be
> clusters $a [1] "a" "b" $c [1] "c" "d"
from similarity matrix m
, can build adjacency matrix m > .5
, construct corresponding graph igraph
package , extract connected components.
m <- matrix(c(10,9,2,3, 9,10,4,1, 2,4,10,7, 3,1,7,10), 4, 4)/10 colnames(m) <- rownames(m) <- letters[1:4] library(igraph) g <- graph.adjacency( m > .5 ) plot(g) clusters(g)$membership # [1] 1 1 2 2 tapply(colnames(m), clusters(g)$membership, c) # $`1` # [1] "a" "b" # $`2` # [1] "c" "d"
Comments
Post a Comment