Graph partition algo with Neo4j graph database -


i know there has famous graph partition algo tools metis implemented karypis lab (http://glaros.dtc.umn.edu/gkhome/metis/metis/overview)

but wanna know there method partition graph stored in neo4j? or have dump neo4j's data , transform node , edge format manually fit metis input format?

regarding new-ish , interesting algorithms, no means exhaustive or state of art, these first places look:

specific algorithm: didic (distributed diffusive clustering) - used once in thesis (partitioning graph databases)

  • you iterate on nodes, each node retrieve neighbors, in order spread of "some unit" neighbors
  • easy implement.
  • can made deterministic
  • iterative - it's based on iterations (like super steps in pregel) can stop @ time. longer leave better result, in theory (though in cases, on graph shapes can unstable)
  • when implemented ran 100 iterations on machine ~30gb ram, ~4 million nodes - took no more 2 days complete.

specific algorithm: evocut "finding sparse cuts locally using evolving sets" - local probabilistic algorithm microsoft - related these papers

  • difficult implement
  • local algorithm - bfs-like access patterns (random walks)
  • it's been while since read paper, remember built on clean abstractions:
    • evonibble (pluggable - decides how of neighborhood add current cluster
    • evocut (calls evonibble multiple times find local cluster)
    • evopartition (calls evocut repeatedly partition entire graph)
  • not deterministic

general algorithm family: hierarchical graph clustering

from high level:

  • coarsen graph collapsing nodes aggregate nodes
    • coarsening strategy selectable
  • find clusters in coarsened/smaller graph
    • clustering strategy selectable
  • incrementally decoarsen graph, refining @ clustering @ each step
    • refining strategy selectable

notes:

  • if graph changes (or results don't need right date) may possible coarsen once (or infrequently) work coarsened graph - save computation
  • i don't know of specific algorithm recommend

general limitations - things few clustering algorithms do:

  • node types not acknowledged - i.e., nodes treated equally
  • relationship types not acknowledged - i.e., relationships treated equally
  • relationship direction not acknowledged - i.e., relationships treated undirected

Comments

Popular posts from this blog

image - ClassNotFoundException when add a prebuilt apk into system.img in android -

I need to import mysql 5.1 to 5.5? -

Java, Hibernate, MySQL - store UTC date-time -