Is there a simple way to test whether each element of an array lies between two values?


Best answer

One solution is:

import numpy as np
a = np.array([1, 2, 3, 4, 5])
(a > 1) & (a < 5)
# array([False,  True,  True,  True, False])
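The same pattern extends to inclusive or half-open ranges by changing the comparison operators. The following is a minimal sketch; the names lower, upper, strict, inclusive, half_open, all_inside and selected are illustrative and not part of the answer above.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
lower, upper = 1, 5  # illustrative bounds, same values as in the answer

strict = (a > lower) & (a < upper)        # 1 < x < 5
inclusive = (a >= lower) & (a <= upper)   # 1 <= x <= 5
half_open = (a >= lower) & (a < upper)    # 1 <= x < 5

all_inside = bool(np.all(inclusive))  # True only if every element is in range
selected = a[strict]                  # elements strictly between the bounds
```

The boolean mask can be reduced with np.all when a single yes/no answer is wanted, or used as an index to pull out the matching elements.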


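Another approach is to use numpy.any. Note that this returns a single boolean telling you whether any element falls outside the range, not a per-element mask. Here is an example:

import numpy as np
a = np.array([1, 2, 3, 4, 5])
np.any((a < 1) | (a > 5))
# False (no element is below 1 or above 5)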
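To make the relationship between the two formulations explicit, here is a small sketch showing that "no element is outside" and "every element is inside" give the same answer. The variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

outside = (a < 1) | (a > 5)           # per-element: True where out of range
any_outside = bool(np.any(outside))   # single flag: is anything out of range?
all_inside = bool(np.all((a >= 1) & (a <= 5)))

# De Morgan: "nothing is outside" is the same statement as "everything is inside"
same = (not any_outside) == all_inside
```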


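You can also center the array on the midpoint of the range and use the distance to 0:

import numpy as np
upper_limit = 5
lower_limit = 1
a = np.array([1, 2, 3, 4, 5])
your_mask = np.abs(a - 0.5 * (upper_limit + lower_limit)) < 0.5 * (upper_limit - lower_limit)

One thing to keep in mind is that the comparison is symmetric on both sides, so it can express 1 < x < 5 or 1 <= x <= 5, but not 1 <= x < 5.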
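As a rough illustration of the symmetric behaviour, the sketch below computes both the strict and the inclusive variant and compares them with the direct two-sided comparison. The names center, half_width and dist are introduced here for readability and are not from the answer. With floating-point bounds, rounding in the midpoint arithmetic could in principle change the result exactly at the endpoints, so treat this as a sketch rather than a drop-in replacement.

```python
import numpy as np

lower_limit, upper_limit = 1, 5
a = np.array([1, 2, 3, 4, 5])

center = 0.5 * (upper_limit + lower_limit)      # midpoint of the range
half_width = 0.5 * (upper_limit - lower_limit)  # distance from midpoint to either bound
dist = np.abs(a - center)

strict = dist < half_width      # equivalent to 1 < x < 5
inclusive = dist <= half_width  # equivalent to 1 <= x <= 5

# Direct comparisons, used here only as a cross-check
strict_ref = (a > lower_limit) & (a < upper_limit)
inclusive_ref = (a >= lower_limit) & (a <= upper_limit)
```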


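For multidimensional arrays you can use the suggested np.any() option or comparison operators. Using the "and" keyword raises a ValueError, and so does & when the individual comparisons are not wrapped in parentheses.

Example (on a multidimensional array) using comparison operators:

import numpy as np

arr = np.array([[1, 5, 1],
                [0, 1, 0],
                [0, 0, 0],
                [2, 2, 2]])

Now use == to check whether the array values lie inside a range, i.e. A < arr < B, or != to check whether they lie outside a range, i.e. arr < A or arr > B. This works because, for A < B, the two comparisons can never both be false in the first case nor both be true in the second: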

(arr < 1) != (arr > 3)
# array([[False,  True, False],
#        [ True, False,  True],
#        [ True,  True,  True],
#        [False, False, False]])

(arr > 1) == (arr < 4)
# array([[False, False, False],
#        [False, False, False],
#        [False, False, False],
#        [ True,  True,  True]])
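As a cross-check, the sketch below compares the == and != trick with the element-wise & and | operators, with each comparison parenthesized, and shows that Python's "and" keyword fails on arrays. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 5, 1],
                [0, 1, 0],
                [0, 0, 0],
                [2, 2, 2]])

# Parentheses are required because & and | bind tighter than < and >
inside = (arr > 1) & (arr < 4)
outside = (arr < 1) | (arr > 3)

same_inside = np.array_equal(inside, (arr > 1) == (arr < 4))
same_outside = np.array_equal(outside, (arr < 1) != (arr > 3))

# `and` asks for the truth value of a whole array, which is ambiguous
try:
    (arr > 1) and (arr < 4)
    and_raised = False
except ValueError:
    and_raised = True
```

On this array both formulations agree, and only the "and" form raises.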


It is interesting to compare the NumPy-based approach against a Numba-accelerated loop:

import numpy as np
import numba as nb


def between(arr, a, b):
    return (arr > a) & (arr < b)


@nb.njit(fastmath=True)
def between_nb(arr, a, b):
    shape = arr.shape
    arr = arr.ravel()
    n = arr.size
    result = np.empty_like(arr, dtype=np.bool_)
    for i in range(n):
        # both conditions must hold, matching the & in between()
        result[i] = arr[i] > a and arr[i] < b
    return result.reshape(shape)
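Before timing anything, it helps to confirm that an explicit loop and the vectorized expression return the same mask. The sketch below does this with a plain-Python loop, so it runs without Numba installed; between_loop is a hypothetical reference helper, not the compiled function from the answer.

```python
import numpy as np


def between(arr, a, b):
    return (arr > a) & (arr < b)


def between_loop(arr, a, b):
    # Hypothetical plain-Python reference (not Numba-compiled)
    flat = arr.ravel()
    result = np.empty(flat.shape, dtype=np.bool_)
    for i in range(flat.size):
        # an element is "between" only if it passes both tests
        result[i] = flat[i] > a and flat[i] < b
    return result.reshape(arr.shape)


rng = np.random.default_rng(0)
arr = rng.random((4, 8))
agree = np.array_equal(between(arr, 0.25, 0.75), between_loop(arr, 0.25, 0.75))
```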

The benchmarks are computed and plotted with:

import pandas as pd
import matplotlib.pyplot as plt


def benchmark(
    funcs,
    args=None,
    kws=None,
    ii=range(4, 24),
    m=2 ** 15,
    is_equal=np.allclose,
    seed=0,
    unit="ms",
    verbose=True
):
    labels = [func.__name__ for func in funcs]
    units = {"s": 0, "ms": 3, "µs": 6, "ns": 9}
    args = tuple(args) if args else ()
    kws = dict(kws) if kws else {}
    assert unit in units
    np.random.seed(seed)
    timings = {}
    for i in ii:
        n = 2 ** i
        k = 1 + m // n
        if verbose:
            print(f"i={i}, n={n}, m={m}, k={k}")
        arrs = np.random.random((k, n))
        # the first function serves as the reference result
        base = np.array([funcs[0](arr, *args, **kws) for arr in arrs])
        timings[n] = []
        for func in funcs:
            res = np.array([func(arr, *args, **kws) for arr in arrs])
            is_good = is_equal(base, res)
            # %timeit is an IPython magic, so this only runs in IPython/Jupyter
            timed = %timeit -n 8 -r 8 -q -o [func(arr, *args, **kws) for arr in arrs]
            timing = timed.best / k
            timings[n].append(timing if is_good else None)
            if verbose:
                print(
                    f"{func.__name__:>24}"
                    f"  {is_good!s:5}"
                    f"  {timing * (10 ** units[unit]):10.3f} {unit}"
                    f"  {timings[n][0] / timing:5.1f}x")
    return timings, labels




def plot(timings, labels, title=None, xlabel="Input Size / #", unit="ms"):
    n_rows = 1
    n_cols = 3
    fig, axs = plt.subplots(n_rows, n_cols, figsize=(8 * n_cols, 6 * n_rows), squeeze=False)
    units = {"s": 0, "ms": 3, "µs": 6, "ns": 9}
    df = pd.DataFrame(data=timings, index=labels).transpose()

    # timings of the first function, used as the baseline
    base = df[[labels[0]]].to_numpy()
    (df * 10 ** units[unit]).plot(marker="o", xlabel=xlabel, ylabel=f"Best timing / {unit}", ax=axs[0, 0])
    (df / base * 100).plot(marker='o', xlabel=xlabel, ylabel='Relative speed / %', logx=True, ax=axs[0, 1])
    (base / df).plot(marker='o', xlabel=xlabel, ylabel='Speed Gain / x', ax=axs[0, 2])

    if title:
        fig.suptitle(title)
    fig.patch.set_facecolor('white')

funcs = between, between_nb
timings, labels = benchmark(funcs, args=(0.25, 0.75), unit="µs", verbose=False)
plot(timings, labels, unit="µs")
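The benchmark function relies on the %timeit magic and therefore needs IPython or Jupyter. For a plain Python script, a comparable measurement can be sketched with the standard-library timeit module. The helper best_time is hypothetical and only approximates what %timeit -n 8 -r 8 reports as its best time.

```python
import timeit

import numpy as np


def between(arr, a, b):
    return (arr > a) & (arr < b)


def best_time(func, arr, *args, number=8, repeat=8):
    # Hypothetical helper: best per-call time in seconds over `repeat` runs
    # of `number` calls each, similar in spirit to `%timeit -n 8 -r 8`
    runs = timeit.repeat(lambda: func(arr, *args), number=number, repeat=repeat)
    return min(runs) / number


rng = np.random.default_rng(0)
arr = rng.random(2 ** 12)
t = best_time(between, arr, 0.25, 0.75)
print(f"between: {t * 1e6:.3f} us per call")
```

Taking the minimum over repeats reduces the influence of background noise, which is the usual reason for reporting the best run.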

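The resulting plots indicate (under my testing conditions) that:

for the largest and the smallest inputs, the Numba approach can be up to 20% faster
for medium-size inputs, the NumPy approach is typically faster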