Graph partition algo with Neo4j graph database -
i know there has famous graph partition algo tools metis implemented karypis lab (http://glaros.dtc.umn.edu/gkhome/metis/metis/overview)
but wanna know there method partition graph stored in neo4j? or have dump neo4j's data , transform node , edge format manually fit metis input format?
regarding new-ish , interesting algorithms, no means exhaustive or state of art, these first places look:
specific algorithm: didic (distributed diffusive clustering) - used once in thesis (partitioning graph databases)
- you iterate on nodes, each node retrieve neighbors, in order spread of "some unit" neighbors
- easy implement.
- can made deterministic
- iterative - it's based on iterations (like super steps in pregel) can stop @ time. longer leave better result, in theory (though in cases, on graph shapes can unstable)
- when implemented ran 100 iterations on machine ~30gb ram, ~4 million nodes - took no more 2 days complete.
specific algorithm: evocut "finding sparse cuts locally using evolving sets" - local probabilistic algorithm microsoft - related these papers
- difficult implement
- local algorithm - bfs-like access patterns (random walks)
- it's been while since read paper, remember built on clean abstractions:
- evonibble (pluggable - decides how of neighborhood add current cluster
- evocut (calls evonibble multiple times find local cluster)
- evopartition (calls evocut repeatedly partition entire graph)
- not deterministic
general algorithm family: hierarchical graph clustering
from high level:
- coarsen graph collapsing nodes aggregate nodes
- coarsening strategy selectable
- find clusters in coarsened/smaller graph
- clustering strategy selectable
- incrementally decoarsen graph, refining @ clustering @ each step
- refining strategy selectable
notes:
- if graph changes (or results don't need right date) may possible coarsen once (or infrequently) work coarsened graph - save computation
- i don't know of specific algorithm recommend
general limitations - things few clustering algorithms do:
- node types not acknowledged - i.e., nodes treated equally
- relationship types not acknowledged - i.e., relationships treated equally
- relationship direction not acknowledged - i.e., relationships treated undirected
Comments
Post a Comment