scipy.spatial.distance.pdist returns a condensed distance matrix. From the documentation:
Returns a condensed distance matrix Y. For each i and j (where i < j < n), the metric dist(u=X[i], v=X[j]) is computed and stored in entry ij.
I thought ij meant i*j. But I think I might be wrong. Consider
from numpy import array
from scipy.spatial.distance import pdist
X = array([[1,2], [1,2], [3,4]])
dist_matrix = pdist(X)
If ij meant i*j, then dist(X[0], X[2]) would be dist_matrix[0*2]. However, dist_matrix[0*2] is 0, not 2.8 as it should be.
What formula should I use to access the distance between two vectors, given i and j?
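For what it's worth, here is a minimal sketch of how a pair (i, j) could map to a position in the condensed vector, assuming the pairs are stored in row-major upper-triangle order: (0,1), (0,2), ..., (1,2), and so on. The helper name condensed_index is hypothetical and not part of scipy. To keep the sketch self-contained, the condensed vector is rebuilt here with the standard library in that same pair order rather than by calling pdist.

```python
from itertools import combinations
from math import dist

def condensed_index(i, j, n):
    """Position of the pair (i, j), i != j, in a condensed distance vector for n points.

    Hypothetical helper, not a scipy function.
    """
    if i == j:
        raise ValueError("the diagonal is not stored in the condensed form")
    if i > j:
        i, j = j, i  # distances are symmetric; only i < j is stored
    # n*i - i*(i+1)//2 entries belong to rows before i; j - i - 1 is the offset within row i
    return n * i - i * (i + 1) // 2 + (j - i - 1)

X = [[1, 2], [1, 2], [3, 4]]
n = len(X)

# Euclidean distances in the assumed condensed order: (0,1), (0,2), (1,2)
condensed = [dist(X[i], X[j]) for i, j in combinations(range(n), 2)]

k = condensed_index(0, 2, n)
print(k, round(condensed[k], 2))  # prints: 1 2.83
```

If the index arithmetic is not needed, scipy.spatial.distance.squareform(dist_matrix) expands the condensed vector into a full n-by-n matrix that can be indexed directly as [i, j], at the cost of extra memory.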