python - NumPy - Speed up iteration comparison?


I have the following use case:

I have a numpy matrix/array of a few thousand 2D points. Call it a. E.g.:

[1 2] [300 400] .. [123 242] 

I have another numpy matrix of a few 2D points like the ones above. Call it b.

Basically, I want to iterate through a, iterate through b, and compute the distance between a[i] and b[j], then assign the summed result to an array. Like this:

    for i, (x0, x1) in enumerate(zip(a[:, 0], a[:, 1])):
        weight_distance = 0
        # inner loop runs over the points of b, as described above
        for j, (p0, p1) in enumerate(zip(b[:, 0], b[:, 1])):
            weight_distance = weight_distance + distance((p0, p1), (x0, x1))
        weight_array[i] = weight_distance

But this is slow. Is there a numpy way to approach this?

What you're looking for is the code in scipy.spatial.distance, particularly the cdist function. It can efficiently compute pairwise distances between two arrays of points for a wide variety of metrics.

    import numpy as np
    from scipy.spatial.distance import cdist

    a = np.random.random((1000, 2))
    b = np.random.random((100, 2))

    d = cdist(a, b, metric='euclidean')
    print(d.shape)  # (1000, 100)

    weights = d.sum(1)
    print(weights.shape)  # (1000,)

Here euclidean is the standard root-sum-square distance you're used to, d[i, j] holds the distance between a[i] and b[j], and summing along axis 1 gives the desired weights.
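To tie this back to the loop in the question, here is a minimal sketch that checks the cdist result against an explicit double loop on small random inputs. The helper name loop_weights, the seed, and the array sizes are made up for illustration and are not from the question.

```python
import numpy as np
from scipy.spatial.distance import cdist


def loop_weights(a, b):
    """Slow reference: for each point in a, sum its Euclidean distances to every point in b."""
    weights = np.zeros(len(a))
    for i, (x0, x1) in enumerate(a):
        total = 0.0
        for p0, p1 in b:
            total += np.hypot(p0 - x0, p1 - x1)
        weights[i] = total
    return weights


rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
a = rng.random((50, 2))         # small sizes keep the Python loop fast
b = rng.random((8, 2))

fast = cdist(a, b, metric='euclidean').sum(axis=1)
slow = loop_weights(a, b)
print(np.allclose(fast, slow))
```

If the two approaches agree, this prints True, which is a cheap sanity check before replacing the loop on the real data.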

There are ways to do this via broadcasting directly in numpy, but that approach would use several large temporary arrays and will in general be slower than the scipy cdist approach.


Edit: I thought I might add a note on the numpy-only approach. It looks like this:

    d2 = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
    weights2 = d2.sum(1)
    np.allclose(weights, weights2)  # True

Let's break this down:

  • a[:, None, :] adds a new dimension to a, giving it shape [1000, 1, 2]. Similarly, b[None, :, :] becomes [1, 100, 2].
  • a[:, None, :] - b[None, :, :] is a broadcasting operation that results in an array of differences with shape [1000, 100, 2].
  • We square every element of the result.
  • The sum(-1) method on the result sums across the last dimension, resulting in an array of shape [1000, 100].
  • We take the square root of the result, which gives the distance matrix.
  • We sum along axis 1 to get the weights.
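The steps above can be sketched one at a time, printing the shape after each. The intermediate names (a3, b3, diff, sq) are only for illustration; the one-liner in the answer never names them.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((1000, 2))
b = rng.random((100, 2))

a3 = a[:, None, :]        # new middle axis: shape (1000, 1, 2)
b3 = b[None, :, :]        # new leading axis: shape (1, 100, 2)
diff = a3 - b3            # broadcast difference: shape (1000, 100, 2)
sq = diff ** 2            # squared coordinate differences, same shape
d2 = np.sqrt(sq.sum(-1))  # sum over the coordinate axis, then root: shape (1000, 100)
weights2 = d2.sum(1)      # sum over the points of b: shape (1000,)

for name, arr in [("a3", a3), ("b3", b3), ("diff", diff),
                  ("sq", sq), ("d2", d2), ("weights2", weights2)]:
    print(name, arr.shape)
```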

Notice that the broadcasting approach creates not one, but two temporary arrays of size 1000 * 100 * 2 along the way, which is why it is less efficient than a purpose-built compiled function like cdist.
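To put rough numbers on those temporaries, here is a small sketch that materializes them explicitly and reports their sizes. It assumes the default float64 dtype (8 bytes per element); the variable names are illustrative only.

```python
import numpy as np

a = np.zeros((1000, 2))  # float64 by default, 8 bytes per element
b = np.zeros((100, 2))

diff = a[:, None, :] - b[None, :, :]  # first temporary, shape (1000, 100, 2)
sq = diff ** 2                        # second temporary, same shape
d = np.sqrt(sq.sum(-1))               # final distance matrix, shape (1000, 100)

print(diff.nbytes, sq.nbytes, d.nbytes)  # 1600000 1600000 800000
```

For 2D points each temporary is twice the size of the final distance matrix, and that factor grows with the number of coordinates per point.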

