panther_similarity#
- panther_similarity(G, source, k=5, path_length=5, c=0.5, delta=0.1, eps=None)[source]#
Returns the Panther similarity of nodes in the graph
G
to nodev
.Panther is a similarity metric that says “two objects are considered to be similar if they frequently appear on the same paths.” [1].
- Parameters:
- GNetworkX graph
A NetworkX graph
- sourcenode
Source node for which to find the top
k
similar other nodes- kint (default = 5)
The number of most similar nodes to return
- path_lengthint (default = 5)
How long the randomly generated paths should be (
T
in [1])- cfloat (default = 0.5)
A universal positive constant used to scale the number of sample random paths to generate.
- deltafloat (default = 0.1)
The probability that the similarity \(S\) is not an epsilon-approximation to (R, phi), where \(R\) is the number of random paths and \(\phi\) is the probability that an element sampled from a set \(A \subseteq D\), where \(D\) is the domain.
- epsfloat or None (default = None)
The error bound. Per [1], a good value is
sqrt(1/|E|)
. Therefore, if no value is provided, the recommended computed value will be used.
- Returns:
- similaritydictionary
Dictionary of nodes to similarity scores (as floats). Note: the self-similarity (i.e.,
v
) will not be included in the returned dictionary.
References
[1] (1,2,3)Zhang, J., Tang, J., Ma, C., Tong, H., Jing, Y., & Li, J. Panther: Fast top-k similarity search on large networks. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. 2015-August, pp. 1445–1454). Association for Computing Machinery. https://doi.org/10.1145/2783258.2783267.
Examples
>>> G = nx.star_graph(10) >>> sim = nx.panther_similarity(G, 0)