Best answer
scipy.spatial.distance.pdist returns a condensed distance matrix. From the documentation:
Returns a condensed distance matrix Y. For each i and j (where i < j < n), the metric dist(u=X[i], v=X[j]) is computed and stored in entry ij.
I thought ij meant i*j. But I think I might be wrong. Consider
from numpy import array
from scipy.spatial.distance import pdist
X = array([[1,2], [1,2], [3,4]])
dist_matrix = pdist(X)
then, as I read the documentation, dist(X[0], X[2]) should be dist_matrix[0*2]. However, dist_matrix[0*2] is 0, not 2.8 as it should be.
What formula should I use to look up the distance between two vectors in the condensed matrix, given their indices i and j?
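For reference, here is a minimal sketch of one way to do the lookup. It assumes the condensed matrix stores the upper triangle row by row, as the pdist documentation describes. The helper name condensed_index is hypothetical, not part of SciPy. With n observations and i < j, the pair (i, j) lands at position n*i + j - ((i + 2)*(i + 1))//2, so "entry ij" is a pair label rather than the product i*j.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def condensed_index(i, j, n):
    """Hypothetical helper: position of the pair (i, j) in pdist's output.

    Assumes n observations and the row-major upper-triangle ordering
    (0,1), (0,2), ..., (0,n-1), (1,2), ... used by the condensed form.
    """
    if i == j:
        raise ValueError("the diagonal (i == j) is not stored; the distance is 0")
    if i > j:
        i, j = j, i  # distances are symmetric, so order the pair as i < j
    return n * i + j - ((i + 2) * (i + 1)) // 2


X = np.array([[1, 2], [1, 2], [3, 4]])
dist_matrix = pdist(X)   # condensed form: [d(0,1), d(0,2), d(1,2)]
n = X.shape[0]

k = condensed_index(0, 2, n)   # k is 1 here, not 0*2
d = dist_matrix[k]             # about 2.83

# Alternative: expand to a full n-by-n matrix and index it directly.
square = squareform(dist_matrix)
print(k, d, square[0, 2])
```

If memory is not a concern, squareform is usually the simpler choice. The index formula avoids building the full n-by-n matrix, which matters for large n.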