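I tried the sample provided in the documentation of the Python requests library.
With async.map(rs), I get the response codes, but I want to get the content of each page requested. This, for example, does not work: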
out = async.map(rs)
print out[0].content
Current answer
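Unfortunately, as far as I know, the requests library is not equipped for performing asynchronous requests. You can wrap async/await syntax around requests, but that makes the underlying requests no less synchronous. If you want truly asynchronous requests, you must use other tooling that provides them. One such solution is aiohttp (Python 3.5.3+). In my experience it works well with the Python 3.7 async/await syntax. Below are three implementations that perform n web requests:
1. Purely synchronous requests (sync_requests_get_all) using the Python requests library
2. Synchronous requests (async_requests_get_all) using the Python requests library wrapped in Python 3.7 async/await syntax and asyncio
3. A truly asynchronous implementation (async_aiohttp_get_all) using the Python aiohttp library wrapped in Python 3.7 async/await syntax and asyncio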
"""
Tested in Python 3.5.10
"""

import time
import asyncio
import requests
import aiohttp

from asgiref import sync


def timed(func):
    """
    records approximate durations of function calls
    """
    def wrapper(*args, **kwargs):
        start = time.time()
        print('{name:<30} started'.format(name=func.__name__))
        result = func(*args, **kwargs)
        duration = "{name:<30} finished in {elapsed:.2f} seconds".format(
            name=func.__name__, elapsed=time.time() - start
        )
        print(duration)
        timed.durations.append(duration)
        return result
    return wrapper


timed.durations = []


@timed
def sync_requests_get_all(urls):
    """
    performs synchronous get requests
    """
    # use session to reduce network overhead
    session = requests.Session()
    return [session.get(url).json() for url in urls]


@timed
def async_requests_get_all(urls):
    """
    asynchronous wrapper around synchronous requests
    """
    session = requests.Session()

    # wrap requests.get into an async function
    def get(url):
        return session.get(url).json()
    async_get = sync.sync_to_async(get)

    async def get_all(urls):
        return await asyncio.gather(*[
            async_get(url) for url in urls
        ])

    # call get_all as a sync function to be used in a sync context
    return sync.async_to_sync(get_all)(urls)


@timed
def async_aiohttp_get_all(urls):
    """
    performs asynchronous get requests
    """
    async def get_all(urls):
        async with aiohttp.ClientSession() as session:
            async def fetch(url):
                async with session.get(url) as response:
                    return await response.json()
            return await asyncio.gather(*[
                fetch(url) for url in urls
            ])

    # call get_all as a sync function to be used in a sync context
    return sync.async_to_sync(get_all)(urls)


if __name__ == '__main__':
    # this endpoint takes ~3 seconds to respond,
    # so a purely synchronous implementation should take
    # little more than 30 seconds and a purely asynchronous
    # implementation should take little more than 3 seconds.
    urls = ['https://postman-echo.com/delay/3']*10

    async_aiohttp_get_all(urls)
    async_requests_get_all(urls)
    sync_requests_get_all(urls)
    print('----------------------')
    for duration in timed.durations:
        print(duration)
On my machine, this is the output:
async_aiohttp_get_all started
async_aiohttp_get_all finished in 3.20 seconds
async_requests_get_all started
async_requests_get_all finished in 30.61 seconds
sync_requests_get_all started
sync_requests_get_all finished in 30.59 seconds
----------------------
async_aiohttp_get_all finished in 3.20 seconds
async_requests_get_all finished in 30.61 seconds
sync_requests_get_all finished in 30.59 seconds
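To see why the wrapped version gains nothing, here is a minimal stdlib-only sketch. It is not part of the benchmark above: `blocking_get` is a hypothetical stand-in that uses `time.sleep` in place of a real `requests.get`, and `naive_get_all`, `threaded_get_all` and `timed_run` are names made up for illustration. It contrasts calling a blocking function directly inside a coroutine with handing it to the default thread pool via `loop.run_in_executor`, which is one way (an assumption on my part, not what the answer's asgiref wrapper does) to let blocking calls overlap.

```python
import asyncio
import time


def blocking_get(url):
    """Stand-in for requests.get(url): blocks the calling thread for 0.2 s."""
    time.sleep(0.2)
    return "body of " + url


async def naive_get_all(urls):
    # A blocking call inside a coroutine never yields to the event loop,
    # so the calls still run one after another.
    async def get(url):
        return blocking_get(url)
    return await asyncio.gather(*[get(url) for url in urls])


async def threaded_get_all(urls):
    # run_in_executor hands each blocking call to a worker thread of the
    # default thread pool, so the waits overlap.
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(None, blocking_get, url) for url in urls]
    return await asyncio.gather(*futures)


def timed_run(coro_func, urls):
    """Run coro_func(urls) to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_func(urls))
    return results, time.perf_counter() - start
```

Calling `timed_run(naive_get_all, urls)` and `timed_run(threaded_get_all, urls)` on the same list returns the same bodies in the same order; only the elapsed time differs. Threads are still not the same thing as a non-blocking client such as aiohttp, which avoids the per-request thread entirely.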
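Other answers
None of the answers above helped me, because they assume that you have a predefined list of requests, while in my case I need to be able to listen for requests and respond asynchronously (similar to how it works in nodejs).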
import grequests


def handle_finished_request(r, **kwargs):
    print(r)


def main():
    while True:
        address = listen_to_new_msg()  # based on your server

        # schedule async requests and run 'handle_finished_request' on response
        req = grequests.get(address, timeout=1, hooks=dict(response=handle_finished_request))
        job = grequests.send(req)  # does not block! for more info see https://stackoverflow.com/a/16016635/10577976


main()
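The handle_finished_request callback will be called when a response is received. Note: for some reason, a timeout (or no response) does not trigger an error here.
This simple loop can trigger async requests, similar to how it would work in a nodejs server.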
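The same "fire the request, handle it in a callback" pattern can be sketched with only the standard library, which also makes failures visible in the callback. This is a swapped-in technique (a thread pool from `concurrent.futures`), not grequests; `fetch`, `dispatch` and the two-argument `handle_finished_request` below are hypothetical names, and `fetch` only fakes a response so the sketch runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(address):
    """Hypothetical blocking request; a real version might call requests.get."""
    return "response from " + address


def dispatch(executor, address, on_done):
    """Schedule fetch(address) and return immediately, without waiting for it."""
    future = executor.submit(fetch, address)
    # The callback receives the finished future; future.result() re-raises
    # any exception raised inside fetch, so failures can be handled there.
    future.add_done_callback(lambda fut: on_done(address, fut))
    return future


def handle_finished_request(address, future):
    try:
        print(future.result())
    except Exception as exc:
        print("request to", address, "failed:", exc)
```

A listening loop would create one `ThreadPoolExecutor` up front and call `dispatch(executor, address, handle_finished_request)` for each incoming address; the call returns as soon as the work is queued.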
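I had a lot of issues with most of the answers posted: they either use deprecated libraries that have been ported over with limited features, or provide a solution with too much magic in the execution of the request, which makes error handling difficult. If they do not fall into one of those categories, they are third-party libraries or deprecated.
Some of the solutions work fine purely for HTTP requests, but fall short for any other kind of request, which is ludicrous. A highly customized solution is not necessary here.
Simply using the Python built-in library asyncio is sufficient to perform asynchronous requests of any type, and it provides enough flexibility for complex and use-case-specific error handling.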
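import asyncio
import requests

loop = asyncio.new_event_loop()


def do_thing(params):
    async def get_rpc_info_and_do_chores(id):
        # do things
        # perform_grpc_call and do_chores are placeholders for your own code;
        # the blocking call runs in a worker thread so the loop stays free
        response = await loop.run_in_executor(None, perform_grpc_call, id)
        do_chores(response)

    async def get_httpapi_info_and_do_chores(id):
        # do things
        # URL is a placeholder; requests.get blocks, so it also runs in a thread
        response = await loop.run_in_executor(None, requests.get, URL)
        do_chores(response)

    async_tasks = []
    for element in list(params.list_of_things):
        async_tasks.append(loop.create_task(get_rpc_info_and_do_chores(element)))
        async_tasks.append(loop.create_task(get_httpapi_info_and_do_chores(element)))
    loop.run_until_complete(asyncio.gather(*async_tasks))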
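How it works is simple. You create a series of tasks that you would like to run asynchronously, then ask a loop to execute those tasks and exit upon completion. There are no extra libraries to maintain and no lack of required functionality.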
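On the error-handling point, one option asyncio offers is `asyncio.gather(..., return_exceptions=True)`, which collects each task's exception instead of stopping at the first one. A minimal sketch follows, assuming a hypothetical `do_chore` coroutine that fails for negative input; `run_all` is also a made-up name.

```python
import asyncio


async def do_chore(item):
    """Hypothetical task: waits briefly, then fails for negative input."""
    await asyncio.sleep(0.01)
    if item < 0:
        raise ValueError("bad item: %d" % item)
    return item * 2


async def run_all(items):
    # return_exceptions=True puts each raised exception into the result list
    # instead of propagating the first one, so every task can be inspected.
    results = await asyncio.gather(
        *[do_chore(item) for item in items], return_exceptions=True
    )
    succeeded = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return succeeded, failed
```

`asyncio.run(run_all(items))` returns the two lists, so the caller decides per failure whether to retry, log, or abort.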
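I have also tried some things using the asynchronous methods in Python; however, I have had much better luck using Twisted for asynchronous programming. It has fewer problems and is well documented. Here is a link to an example in Twisted that is similar to what you are trying to do.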
http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html
from threading import Thread

# 'requests' here is a list of URIs, not the requests library
threads = list()
for requestURI in requests:
    t = Thread(target=self.openURL, args=(requestURI,))
    t.start()
    threads.append(t)
for thread in threads:
    thread.join()
...
def openURL(self, requestURI):
    # urllib2 is Python 2 only; in Python 3 use urllib.request.urlopen
    o = urllib2.urlopen(requestURI, timeout = 600)
    o...
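For reference, a Python 3 sketch of the same thread-per-request idea that also keeps each response, which the snippet above leaves elided. The names `open_url` and `fetch_all` and the injectable `fetch` parameter are assumptions added for illustration; `urllib.request.urlopen` replaces the Python 2 `urllib2.urlopen`.

```python
import urllib.request
from threading import Thread


def open_url(uri, timeout=600):
    """Fetch one URI and return the raw body (Python 3 urllib.request)."""
    with urllib.request.urlopen(uri, timeout=timeout) as response:
        return response.read()


def fetch_all(uris, fetch=open_url):
    """Run fetch(uri) in one thread per URI; return {uri: body or exception}."""
    results = {}

    def worker(uri):
        try:
            results[uri] = fetch(uri)
        except Exception as exc:  # keep one failure from hiding the others
            results[uri] = exc

    threads = [Thread(target=worker, args=(uri,)) for uri in uris]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

One thread per URI is fine for a handful of requests; for long lists a bounded pool is usually the safer choice.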
If you want to use asyncio, requests-async provides async/await functionality for requests: https://github.com/encode/requests-async