There are few statistical tools for exploratory data analysis in graph databases.
The basic observation is the graph, but it may also be a connected component, or simply a node, an edge, a cycle, a path, a concentric layer, etc. [2].
Many univariate and multivariate distributions may be generated from these populations.
One of the most useful distributions is the number of connected components having a given radius R and a given diameter D.
Recall that the eccentricity of a node is the maximum of the distances from that node to all the nodes of its connected component; the radius of the component is the minimum of the eccentricities of its nodes, and the diameter is the maximum.
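
The definitions above can be illustrated with a minimal sketch, assuming an unweighted, undirected connected component stored as an adjacency dictionary. The function names and the four-node path used as input are hypothetical, chosen only for illustration; distances are computed by breadth-first search.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def radius_and_diameter(adj):
    """adj maps each node of ONE connected component to its neighbours."""
    # Eccentricity of u: largest distance from u to any node of the component.
    ecc = {u: max(bfs_distances(adj, u).values()) for u in adj}
    # Radius: smallest eccentricity; diameter: largest eccentricity.
    return min(ecc.values()), max(ecc.values())

# Hypothetical example: the path a - b - c - d
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
R, D = radius_and_diameter(path)
print(R, D)  # 2 3
```

Running one breadth-first search per node costs time proportional to nodes times edges per component, which is acceptable for small components but is only one possible choice.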
It is known that D takes values between R and 2R, so that the bivariate distribution in the (R,D) plane lies within an angular sector bounded by the two lines D=R and D=2R.
Displaying the clusters in this bivariate distribution offers a schematic graphical summary of the population, which is called the Radius Diameter Diagram [6,7].
The quantity I=(D-R)/R, defined for R>0, takes values in [0,1]. It is used as a shape index, and its distribution can be plotted (see the example in [7]).
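
As a rough sketch of how the counts behind such a diagram and the shape index distribution might be tabulated, the snippet below starts from a list of (R, D) pairs, one per connected component. The pairs are made-up illustrative values, not data from the paper, and the names `rd_counts`, `shape_index` and `index_counts` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (R, D) pairs, one per connected component (illustrative only).
pairs = [(1, 1), (1, 2), (2, 3), (2, 3), (2, 4), (3, 3)]

# Every pair must lie in the sector bounded by D = R and D = 2R.
assert all(R <= D <= 2 * R for R, D in pairs)

# Bivariate distribution: number of components for each (R, D) cell.
rd_counts = Counter(pairs)

def shape_index(R, D):
    """I = (D - R) / R as an exact fraction; undefined (None) when R = 0."""
    if R == 0:
        return None  # a single isolated node has R = D = 0
    return Fraction(D - R, R)

# Univariate distribution of the shape index over the population.
index_counts = Counter(shape_index(R, D) for R, D in pairs)

for (R, D), n in sorted(rd_counts.items()):
    print(f"R={R} D={D} count={n}")
for i, n in sorted(index_counts.items()):
    print(f"I={i} count={n}")
```

Exact fractions are used here so that equal index values fall in the same bin; floating-point values with explicit binning would serve equally well for plotting.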