Asynchronous requests with Python requests

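I tried the sample provided in the documentation of the Python requests library.

With async.map(rs), I get the response codes, but I want to get the content of each page requested. This, for example, does not work: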

out = async.map(rs)
print out[0].content

Current answer

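I have a lot of issues with most of the answers posted: they either use deprecated libraries that have been ported over with limited features, or they provide a solution with too much magic in the execution of the request, which makes error handling difficult. If they do not fall into one of those categories, they are third-party libraries or deprecated.

Some of the solutions work fine purely for HTTP requests but fall short for any other kind of request, which is ludicrous. A highly customized solution is not necessary here.

Simply using the Python built-in library asyncio is sufficient to perform asynchronous requests of any type, and it leaves enough flexibility for complex and use-case-specific error handling.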

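import asyncio

import requests

loop = asyncio.new_event_loop()

# perform_grpc_call, do_chores and URL are placeholders; supply your own.

def do_thing(params):
    async def get_rpc_info_and_do_chores(id):
        # perform_grpc_call is blocking, so run it in a worker thread;
        # awaiting the result lets the loop run other tasks in the meantime
        response = await loop.run_in_executor(None, perform_grpc_call, id)
        do_chores(response)

    async def get_httpapi_info_and_do_chores(id):
        # requests.get is blocking as well, so it also goes to a worker thread
        response = await loop.run_in_executor(None, requests.get, URL)
        do_chores(response)

    # schedule both kinds of task for every element
    async_tasks = []
    for element in params.list_of_things:
        async_tasks.append(loop.create_task(get_rpc_info_and_do_chores(element)))
        async_tasks.append(loop.create_task(get_httpapi_info_and_do_chores(element)))

    # run the loop until every task has finished
    loop.run_until_complete(asyncio.gather(*async_tasks))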

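How it works is simple. You create a series of tasks that you want to run asynchronously, then ask a loop to execute those tasks and exit on completion. There are no extra libraries to maintain and no required functionality is missing.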
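The pattern described above can be tried without any network access. What follows is only a minimal sketch, assuming the blocking work is handed to the default thread pool with run_in_executor; blocking_fetch is a hypothetical stand-in for a call such as requests.get, and the other function names are illustrative rather than taken from the answer.

```python
import asyncio
import time


def blocking_fetch(item):
    # Hypothetical stand-in for a blocking call such as requests.get(url).
    time.sleep(0.1)
    return item * 2


async def fetch_and_process(item):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so other tasks can proceed.
    result = await loop.run_in_executor(None, blocking_fetch, item)
    return result + 1


async def run_all(items):
    # return_exceptions=True puts exceptions into the result list
    # instead of raising the first one.
    return await asyncio.gather(
        *(fetch_and_process(i) for i in items), return_exceptions=True
    )


results = asyncio.run(run_all([1, 2, 3]))
print(results)  # [3, 5, 7]
```

Because gather returns results in the order the awaitables were passed in, each result can be matched back to its input even though the calls overlap in time.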

2019-12-08 21:41:27

Other answers

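from threading import Thread
import urllib2  # Python 2 module; on Python 3 use urllib.request instead

threads = []

# start one thread per URI (requests is the iterable of URIs to fetch)
for requestURI in requests:
    t = Thread(target=self.openURL, args=(requestURI,))
    t.start()
    threads.append(t)

# wait for every download to finish
for thread in threads:
    thread.join()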

...

def openURL(self, requestURI):
    o = urllib2.urlopen(requestURI, timeout = 600)
    o...

2013-01-16 23:32:54

I know this has been closed for a while, but I thought it might be useful to promote another async solution built on the requests library.

from simple_requests import Requests

list_of_requests = ['http://moop.com', 'http://doop.com', ...]

# swarm() yields each response; print its body
for response in Requests().swarm(list_of_requests):
    print(response.content)

The documentation is here: http://pythonhosted.org/simple-requests/

2013-10-21 14:57:30

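I second the suggestion above to use HTTPX, but I often use it in a different way, so I am adding my answer.

I personally use asyncio.run (introduced in Python 3.7) rather than asyncio.gather, and I also prefer the aiostream approach, which can be used in combination with asyncio and httpx.

As in the example I just posted, this style is helpful for processing a set of URLs asynchronously even when (common) errors occur. I particularly like how this style makes it clear where the response processing happens, and how it eases error handling (which I find async calls tend to need more of).

It is easier to post a simple example that just fires off a bunch of requests asynchronously, but often you also want to handle the response content (compute something with it, perhaps with reference to the original object that the requested URL relates to).

The core of that approach looks like this: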

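from functools import partial

import httpx
from aiostream import stream

# (this fragment sits inside the async function described below)
async with httpx.AsyncClient(timeout=timeout) as session:
    ws = stream.repeat(session)  # the shared client, repeated for every URL
    xs = stream.zip(ws, stream.iterate(urls))  # (session, url) pairs
    # call fetch(session, url) for each pair, at most 20 at a time, unordered
    ys = stream.starmap(xs, fetch, ordered=False, task_limit=20)
    process = partial(process_thing, things=things, pbar=pbar, verbose=verbose)
    zs = stream.map(ys, process)  # hand each response to process_thing
    return await zs  # awaiting the stream runs it to completion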

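Where:

process_thing is an async function that handles the response content
things is the input list (which the urls generator of URL strings came from), e.g. a list of objects or dictionaries
pbar is a progress bar (e.g. tqdm.tqdm) [optional but useful]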

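All of that goes in an async function, async_fetch_urlset, which is then run by calling a synchronous "top-level" function named fetch_things. That function runs the coroutine [this is what an async function returns] and manages the event loop: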

def fetch_things(urls, things, pbar=None, verbose=False):
    return asyncio.run(async_fetch_urlset(urls, things, pbar, verbose))

Since a list passed as input (here, things) can be modified in place, you effectively get output back (as we are used to from synchronous function calls).
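The in-place idea can be illustrated with the standard library alone. This is only a sketch under stated assumptions, not the aiostream code from the answer: it swaps in asyncio.gather plus a Semaphore for the stream pipeline, and fetch_stub, fetch_into, async_fetch_all and run_fetch are hypothetical names, with fetch_stub standing in for a real HTTP call.

```python
import asyncio


async def fetch_stub(url):
    # Hypothetical stand-in for an HTTP call; fails for one URL to show error handling.
    await asyncio.sleep(0.01)
    if "bad" in url:
        raise ValueError(f"could not fetch {url}")
    return f"content of {url}"


async def fetch_into(thing, limit):
    async with limit:  # cap the number of concurrent fetches
        try:
            thing["content"] = await fetch_stub(thing["url"])
        except ValueError as exc:
            thing["error"] = str(exc)  # record the failure on the object itself


async def async_fetch_all(things, task_limit=20):
    limit = asyncio.Semaphore(task_limit)
    await asyncio.gather(*(fetch_into(t, limit) for t in things))


def run_fetch(things):
    # Synchronous entry point: asyncio.run creates and closes the event loop.
    asyncio.run(async_fetch_all(things))


things = [{"url": "url1"}, {"url": "bad-url"}]
run_fetch(things)
print(things)
```

After run_fetch returns, each dictionary in things carries either a content or an error entry, so the caller reads results from the same list it passed in.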

2021-08-09 16:52:02

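I tested both requests-futures and grequests. grequests is faster but brings monkey patching and additional problems with dependencies. requests-futures is several times slower than grequests. I decided to write my own and simply wrapped requests in a ThreadPoolExecutor; it was almost as fast as grequests, but without external dependencies.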

import concurrent.futures

import requests

def get_urls():
    return ["url1", "url2"]

def load_url(url, timeout):
    return requests.get(url, timeout=timeout)

# counters for successful and failed requests
resp_ok = 0
resp_err = 0

with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    # submit every request up front, remembering which future belongs to which URL
    future_to_url = {executor.submit(load_url, url, 10): url for url in get_urls()}
    # handle each response as soon as it is ready
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            resp_err = resp_err + 1
        else:
            resp_ok = resp_ok + 1

2015-11-18 10:08:01

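Disclaimer: the following code creates a different thread for each function.

This can be useful in some cases because it is simpler to use. But be aware that it is not async: it gives the illusion of async by using multiple threads, even though the decorator's name suggests otherwise.

The following decorator can be used to deliver a callback once the function has finished executing; the callback must handle the data returned by the function.

Note that once the function is decorated, it returns a Future object.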

import asyncio

## Decorator implementation of async runner !!
def run_async(callback, loop=None):
    if loop is None:
        loop = asyncio.get_event_loop()

    def inner(func):
        def wrapper(*args, **kwargs):
            def __exec():
                out = func(*args, **kwargs)
                callback(out)
                return out

            return loop.run_in_executor(None, __exec)

        return wrapper

    return inner

Example implementation:

import requests

urls = ["https://google.com", "https://facebook.com", "https://apple.com", "https://netflix.com"]
loaded_urls = []  # OPTIONAL, used for showing realtime, which urls are loaded !!


def _callback(resp):
    print(resp.url)
    print(resp)
    loaded_urls.append((resp.url, resp))  # OPTIONAL, used for showing realtime, which urls are loaded !!


# Must provide a callback function, callback func will be executed after the func completes execution
# Callback function will accept the value returned by the function.
@run_async(_callback)
def get(url):
    return requests.get(url)


for url in urls:
    get(url)

If you want to see the URLs being loaded in real time, you can add the following code at the end:

while True:
    print(loaded_urls)
    if len(loaded_urls) == len(urls):
        break
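Because the decorated function returns a Future, an alternative to polling is to collect those futures and wait on them. The sketch below assumes a decorator of the same shape as the one above, but with an explicitly created loop passed in; slow_double is a hypothetical stand-in for a blocking call such as requests.get, and seen plays the role of loaded_urls.

```python
import asyncio
import time


def run_async(callback, loop):
    # Same shape as the decorator above, but the loop is supplied explicitly.
    def inner(func):
        def wrapper(*args, **kwargs):
            def _exec():
                out = func(*args, **kwargs)
                callback(out)
                return out

            return loop.run_in_executor(None, _exec)

        return wrapper

    return inner


loop = asyncio.new_event_loop()
seen = []  # filled by the callback as each call finishes


@run_async(seen.append, loop)
def slow_double(x):
    # Hypothetical stand-in for a blocking call such as requests.get(url).
    time.sleep(0.05)
    return x * 2


futures = [slow_double(n) for n in (1, 2, 3)]
# Running the loop until every future resolves replaces the polling loop.
results = loop.run_until_complete(asyncio.gather(*futures))
loop.close()
print(results)  # [2, 4, 6]
```

The results list follows the order of the calls, while the callback may fire in any order, since each call runs in its own worker thread.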

2020-12-30 15:29:59
