如何在numpy数组中找到最近的值?例子:

np.find_nearest(array, value)

当前回答

下面是@Ari Onasafari的scipy版本,回答“在向量数组中找到最近的向量”

In [1]: from scipy import spatial

In [2]: import numpy as np

In [3]: A = np.random.random((10,2))*100

In [4]: A
Out[4]:
array([[ 68.83402637,  38.07632221],
       [ 76.84704074,  24.9395109 ],
       [ 16.26715795,  98.52763827],
       [ 70.99411985,  67.31740151],
       [ 71.72452181,  24.13516764],
       [ 17.22707611,  20.65425362],
       [ 43.85122458,  21.50624882],
       [ 76.71987125,  44.95031274],
       [ 63.77341073,  78.87417774],
       [  8.45828909,  30.18426696]])

In [5]: pt = [6, 30]  # <-- the point to find

In [6]: A[spatial.KDTree(A).query(pt)[1]] # <-- the nearest point 
Out[6]: array([  8.45828909,  30.18426696])

#how it works!
In [7]: distance,index = spatial.KDTree(A).query(pt)

In [8]: distance # <-- The distances to the nearest neighbors
Out[8]: 2.4651855048258393

In [9]: index # <-- The locations of the neighbors
Out[9]: 9

#then 
In [10]: A[index]
Out[10]: array([  8.45828909,  30.18426696])

其他回答

对于那些搜索多个最接近的,修改接受的答案:

import numpy as np
def find_nearest(array, value, k):
    array = np.asarray(array)
    idx = np.argsort(abs(array - value))[:k]
    return array[idx]

看到的: https://stackoverflow.com/a/66937734/11671779

这是在向量数组中找到最近向量的扩展。

import numpy as np

def find_nearest_vector(array, value):
  idx = np.array([np.linalg.norm(x+y) for (x,y) in array-value]).argmin()
  return array[idx]

A = np.random.random((10,2))*100
""" A = array([[ 34.19762933,  43.14534123],
   [ 48.79558706,  47.79243283],
   [ 38.42774411,  84.87155478],
   [ 63.64371943,  50.7722317 ],
   [ 73.56362857,  27.87895698],
   [ 96.67790593,  77.76150486],
   [ 68.86202147,  21.38735169],
   [  5.21796467,  59.17051276],
   [ 82.92389467,  99.90387851],
   [  6.76626539,  30.50661753]])"""
pt = [6, 30]  
print find_nearest_vector(A,pt)
# array([  6.76626539,  30.50661753])

对于大型数组,@Demitri给出的(优秀)答案比目前标记为最佳的答案快得多。我从以下两个方面调整了他的精确算法:

不管输入数组是否排序,下面的函数都有效。 下面的函数返回与最接近的值对应的输入数组的索引,这有点更一般。

请注意,下面的函数还处理了一个特定的边缘情况,这将导致@Demitri编写的原始函数中的错误。否则,我的算法和他的一样。

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

如果你不想使用numpy,可以这样做:

def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]
import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

使用示例:

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

print(find_nearest(array, value=0.5))
# 0.568743859261