我正在努力理解Python中的线程。我看过文档和示例,但坦率地说,许多示例过于复杂,我很难理解它们。

如何清楚地显示为多线程划分的任务?


当前回答

这里有一个简单的示例:您需要尝试一些替代URL,并返回第一个URL的内容以进行响应。

import Queue
import threading
import urllib2

# Called by each thread
def get_url(q, url):
    q.put(urllib2.urlopen(url).read())

theurls = ["http://google.com", "http://yahoo.com"]

q = Queue.Queue()

for u in theurls:
    t = threading.Thread(target=get_url, args = (q,u))
    t.daemon = True
    t.start()

s = q.get()
print s

在这种情况下,线程被用作一种简单的优化:每个子线程都在等待URL解析和响应,以将其内容放入队列;每个线程都是一个守护进程(如果主线程结束,则不会保持进程运行——这比不结束更常见);主线程启动所有子线程,在队列中执行get以等待其中一个线程完成put,然后发出结果并终止(这将删除所有可能仍在运行的子线程,因为它们是守护进程线程)。

Python中线程的正确使用总是与I/O操作相关(因为CPython无论如何都不使用多个内核来运行CPU绑定的任务,线程的唯一原因是在等待一些I/O时不会阻塞进程)。顺便说一句,队列几乎总是将工作分配给线程和/或收集工作结果的最佳方式,而且它们本质上是线程安全的,因此它们使您不用担心锁、条件、事件、信号量和其他线程间协调/通信概念。

其他回答

注意:对于Python中的实际并行化,您应该使用多处理模块来分叉并行执行的多个进程(由于全局解释器锁,Python线程提供了交织,但实际上它们是串行执行的,而不是并行执行的,并且仅在交织I/O操作时有用)。

然而,如果您只是在寻找交错(或者正在执行可以并行化的I/O操作,尽管存在全局解释器锁),那么线程模块就是开始的地方。作为一个非常简单的例子,让我们考虑通过并行对子范围求和来对大范围求和的问题:

import threading

class SummingThread(threading.Thread):
     def __init__(self,low,high):
         super(SummingThread, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         for i in range(self.low,self.high):
             self.total+=i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result

请注意,以上是一个非常愚蠢的示例,因为它绝对没有I/O,并且由于全局解释器锁,虽然在CPython中交错执行(增加了上下文切换的开销),但仍将串行执行。

对我来说,线程的最佳示例是监视异步事件。看看这个代码。

# thread_test.py
import threading
import time

class Monitor(threading.Thread):
    def __init__(self, mon):
        threading.Thread.__init__(self)
        self.mon = mon

    def run(self):
        while True:
            if self.mon[0] == 2:
                print "Mon = 2"
                self.mon[0] = 3;

您可以通过打开IPython会话并执行以下操作来使用此代码:

>>> from thread_test import Monitor
>>> a = [0]
>>> mon = Monitor(a)
>>> mon.start()
>>> a[0] = 2
Mon = 2
>>>a[0] = 2
Mon = 2

等几分钟

>>> a[0] = 2
Mon = 2

Python 3具有启动并行任务的功能。这使我们的工作更容易。

它有线程池和进程池。

以下内容提供了一个见解:

ThreadPoolExecutor示例(源代码)

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

ProcessPoolExecutor(源)

import concurrent.futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

借用本文,我们了解了如何在多线程、多处理和异步/异步之间进行选择及其用法。

Python 3有一个新的内置库,以实现并发和并行-concurrent.futures

因此,我将通过一个实验演示如何通过线程池运行四个任务(即.sleep()方法):

from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time

def concurrent(max_worker):
    futures = []
    tic = time()
    with ThreadPoolExecutor(max_workers=max_worker) as executor:
        futures.append(executor.submit(sleep, 2))  # Two seconds sleep
        futures.append(executor.submit(sleep, 1))
        futures.append(executor.submit(sleep, 7))
        futures.append(executor.submit(sleep, 3))
        for future in as_completed(futures):
            if future.result() is not None:
                print(future.result())
    print(f'Total elapsed time by {max_worker} workers:', time()-tic)

concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)

输出:

Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507

[注]:

正如您在上面的结果中看到的,最好的情况是这四项任务有3名员工。如果有进程任务而不是I/O绑定或阻塞(多处理而不是线程),则可以将ThreadPoolExecutor更改为ProcessPoolExecutoor。

我想提供一个简单的例子,以及我在自己解决这个问题时发现有用的解释。

在这个答案中,您将找到一些关于Python的GIL(全局解释器锁)的信息,以及一个使用multiprocessing.dummy编写的简单日常示例,以及一些简单的基准测试。

全局解释器锁(GIL)

Python不允许真正意义上的多线程。它有一个多线程包,但是如果你想多线程来加快你的代码,那么使用它通常不是一个好主意。

Python有一个称为全局解释器锁(GIL)的构造。GIL确保在任何时候只能执行一个“线程”。一个线程获取GIL,做一些工作,然后将GIL传递给下一个线程。

这种情况发生得很快,因此在人眼看来,您的线程似乎是并行执行的,但它们实际上只是轮流使用相同的CPU内核。

所有这些GIL传递都增加了执行开销。这意味着如果你想让你的代码运行得更快,那么使用线程打包通常不是个好主意。

使用Python的线程包是有原因的。如果你想同时运行一些事情,而效率不是一个问题,那就很好,也很方便。或者,如果您运行的代码需要等待一些东西(比如一些I/O),那么这可能很有意义。但是线程库不允许您使用额外的CPU内核。

多线程可以外包给操作系统(通过执行多线程处理),以及一些调用Python代码的外部应用程序(例如,Spark或Hadoop),或者Python代码调用的一些代码(例如:您可以让Python代码调用一个C函数来完成昂贵的多线程任务)。

为什么这很重要

因为很多人在了解GIL是什么之前,会花很多时间在他们的Python多线程代码中寻找瓶颈。

一旦这些信息清楚,下面是我的代码:

#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE,Popen
import time
import os

# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more Pool processes
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create a quite a few more Pool processes than cores, since the processes
# will probably spend most their time blocked (waiting for I/O to complete).
pool_size = 8

def do_ping(ip):
    if os.name == 'nt':
        print ("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print ("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]


os.system('cls' if os.name=='nt' else 'clear')
print ("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com","www.facebook.com","www.pinterest.com","www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Now we do the same without threading, just to compare time
print ("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print ("\nOutput aggregated in a Dictionary:")
print (output)
print ("\n")

print ("\nPretty printed output: ")
for key, value in output.items():
    print (key + "\n")
    print (value)