Parallelizing NumPy Vector Operations

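NumPy is a powerful Python library for storing and manipulating large multidimensional arrays. Although it is already faster and more efficient than general-purpose containers such as lists, its performance can be improved further through parallelization. Parallelization means splitting a task into parts that run concurrently, in separate processes or threads, toward a common goal. Two ways to parallelize NumPy vector operations in Python are the standard-library multiprocessing module and the third-party numexpr package.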

Python Programs to Parallelize NumPy Vector Operations

Let's look at the ways to parallelize operations on NumPy vectors:

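Using multiprocessing

A running Python program is a single process, and sometimes work needs to run in several processes at once. For this, Python provides the multiprocessing module, whose Pool class creates a pool of worker processes and distributes tasks among them so that they execute concurrently.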

Example

The following example shows how to use multiprocessing to parallelize squaring each element of a vector.

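Approach

First, import the numpy library under the alias np and the multiprocessing module under the alias mp.

Then, define a function that takes the vector as its parameter.

Inside this function, call cpu_count() to get the number of CPUs available on the machine. This value determines the size of the pool of worker processes used for the parallel computation.
Next, create the pool with Pool, passing num_processes as the number of worker processes to start.
Now use the pool's map() method to apply np.square to each element of the input vector. map() splits the input into chunks and hands each chunk to a worker process. It distributes the workload across the available processes automatically and returns the results in the same order as the input.
Once the mapping is complete, call close() so the pool accepts no further tasks, then call join() to wait for all worker processes to finish.
Finally, return the result, which is a list of the squared values computed in parallel.

Now create a NumPy vector, pass it to the function as an argument, and display the result.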

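# importing required packages
import numpy as np
import multiprocessing as mp

# user-defined function that returns the square of each vector element
def square_vector_parallel(vector):
   # one worker process per available CPU
   num_processes = mp.cpu_count()
   pool = mp.Pool(processes=num_processes)
   # convert NumPy scalars to plain ints so the list prints the same on any NumPy version
   result = [int(x) for x in pool.map(np.square, vector)]
   pool.close()
   pool.join()
   return result

# the guard keeps worker processes from re-running this block when they import the module
if __name__ == "__main__":
   # creating a numpy vector
   vec_tr = np.array([1, 2, 3, 4, 5])
   # calling the function
   result = square_vector_parallel(vec_tr)
   # printing the result
   print(result)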

Output

[1, 4, 9, 16, 25]
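
The example above maps over individual elements, so each task does very little work compared with the cost of sending data between processes. A common variation, sketched here as an illustration rather than as part of the original example, is to split the vector into a few large chunks and let each worker square a whole chunk with one vectorized call. The function names square_chunk and square_vector_chunked and the default of two workers are assumptions made for this sketch.

```python
import multiprocessing as mp

import numpy as np


def square_chunk(chunk):
    # Runs in a worker process: one vectorized NumPy call per chunk.
    return np.square(chunk)


def square_vector_chunked(vector, num_processes=2):
    # Split the vector into one contiguous chunk per worker (illustrative choice).
    chunks = np.array_split(vector, num_processes)
    # The context manager shuts the pool down once map() has returned.
    with mp.Pool(processes=num_processes) as pool:
        squared_chunks = pool.map(square_chunk, chunks)
    # map() preserves input order, so concatenating restores the original layout.
    return np.concatenate(squared_chunks)


if __name__ == "__main__":
    vec = np.arange(1, 11)
    print(square_vector_chunked(vec))
```

Whether this is faster than plain np.square depends on the array size, because the data still has to be copied to and from the worker processes.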

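Using numexpr

numexpr is a third-party package that evaluates array expressions written as strings. It can spread the work across multiple cores and use SIMD instructions, which makes element-wise operations on NumPy vectors fast and efficient.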

Example 1

In the following example, we use numexpr's evaluate() function to add two NumPy vectors element-wise.

# importing required packages
import numpy as np
import numexpr as nex
# creating two numpy vectors  
a1 = np.array([5, 2, 7, 4, 5])
a2 = np.array([4, 8, 3, 9, 5])
# printing the result
print(nex.evaluate('a1 + a2'))

Output

[ 9 10 10 13 10]
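
A single addition shows the syntax, but numexpr tends to help more with compound expressions, because it evaluates the whole expression in one pass instead of building a full-size temporary array for every intermediate step. The sketch below reuses the two vectors from the example and checks the result against plain NumPy. The expression itself is an illustrative choice, not part of the original example.

```python
import numpy as np
import numexpr as nex

a1 = np.array([5, 2, 7, 4, 5])
a2 = np.array([4, 8, 3, 9, 5])

# The whole expression is compiled and evaluated by numexpr in a single pass.
combined = nex.evaluate('2*a1 + 3*a2')

# The same computation in plain NumPy, used here only as a reference.
reference = 2 * a1 + 3 * a2

print(np.array_equal(combined, reference))
```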

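Example 2

Here is another example of using numexpr. We define a function that takes a vector as its argument. Inside the function, the expression expr is set to 'vector**2', which squares each element of the input vector. The string is then passed to numexpr's evaluate() function, which looks up the name vector in the calling scope and evaluates the expression.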

import numpy as np
import numexpr as nex
# user-defined function that returns the square of each vector element
def square_vector_parallel(vector):
   expr = 'vector**2'
   result = nex.evaluate(expr)
   return result
# creating a numpy vector
vec_tr = np.array([4, 8, 6, 9, 5])
# calling the method
result = square_vector_parallel(vec_tr)
# printing the result
print(result)

Output

[16 64 36 81 25]
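
Because evaluate() finds vector by inspecting the caller's variables, the expression string and the variable name have to stay in sync. When that implicit lookup feels too fragile, the operands can be passed explicitly through the local_dict argument. The following is a minimal sketch. The function name square_vector_explicit and the operand name v are assumptions made for illustration.

```python
import numpy as np
import numexpr as nex


def square_vector_explicit(vector):
    # Bind the name used in the expression to the array explicitly,
    # instead of relying on numexpr's lookup in the calling frame.
    return nex.evaluate('v**2', local_dict={'v': vector})


squared = square_vector_explicit(np.array([4, 8, 6, 9, 5]))
print(squared)
```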

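Example 3

This example extends the previous one by calling set_num_threads() to set the number of threads explicitly to 4, so that numexpr can use up to four threads when evaluating the expression.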

import numpy as np
import numexpr as nex
# user-defined function that returns the square of each vector element
def square_vector_parallel(vector):
   # set the number of threads to utilize
   nex.set_num_threads(4)
   expr = 'vector**2'
   result = nex.evaluate(expr)
   return result
# creating a numpy vector
vec_tr = np.array([4, 8, 6, 9, 5])
# calling the method
result = square_vector_parallel(vec_tr)
# printing the result
print(result)
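
A fixed thread count of 4 may not suit every machine. As a hedged variation on the example, the sketch below caps the requested count at the number of cores numexpr detects and restores the earlier setting afterwards, using the fact that set_num_threads() returns the previous value. The array size and variable names are illustrative assumptions, and a five-element vector like the one above is too small for extra threads to make a measurable difference.

```python
import numpy as np
import numexpr as nex

# Ask for at most 4 threads, but never more than the detected core count.
requested = min(4, nex.detect_number_of_cores())
previous = nex.set_num_threads(requested)  # returns the previous setting

# A larger vector, where splitting the work across threads can pay off.
vec = np.arange(1_000_000, dtype=np.float64)
result = nex.evaluate('vec**2')

# Restore the earlier setting so other numexpr calls are unaffected.
nex.set_num_threads(previous)

print(result[:5])
```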