我有一个用例,将有数据流到来,我不能以相同的速度消费它,需要一个缓冲区。这可以使用SNS-SQS队列来解决。我后来才知道,Kinesis解决了同样的目的,所以有什么不同?为什么我应该喜欢(或不应该喜欢)运动?


当前回答

另一件事:Kinesis可以触发Lambda,而SQS不能。因此,对于SQS,您要么必须提供一个EC2实例来处理SQS消息(并在失败时处理它),要么必须有一个预定的Lambda(它不能扩大或缩小—每分钟只能得到一个)。

编辑:这个答案不再正确。自2018年6月起,SQS可以直接触发Lambda

https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html

其他回答

摘自AWS文档:

We recommend Amazon Kinesis Streams for use cases with requirements that are similar to the following: Routing related records to the same record processor (as in streaming MapReduce). For example, counting and aggregation are simpler when all records for a given key are routed to the same record processor. Ordering of records. For example, you want to transfer log data from the application host to the processing/archival host while maintaining the order of log statements. Ability for multiple applications to consume the same stream concurrently. For example, you have one application that updates a real-time dashboard and another that archives data to Amazon Redshift. You want both applications to consume data from the same stream concurrently and independently. Ability to consume records in the same order a few hours later. For example, you have a billing application and an audit application that runs a few hours behind the billing application. Because Amazon Kinesis Streams stores data for up to 7 days, you can run the audit application up to 7 days behind the billing application. We recommend Amazon SQS for use cases with requirements that are similar to the following: Messaging semantics (such as message-level ack/fail) and visibility timeout. For example, you have a queue of work items and want to track the successful completion of each item independently. Amazon SQS tracks the ack/fail, so the application does not have to maintain a persistent checkpoint/cursor. Amazon SQS will delete acked messages and redeliver failed messages after a configured visibility timeout. Individual message delay. For example, you have a job queue and need to schedule individual jobs with a delay. With Amazon SQS, you can configure individual messages to have a delay of up to 15 minutes. Dynamically increasing concurrency/throughput at read time. For example, you have a work queue and want to add more readers until the backlog is cleared. With Amazon Kinesis Streams, you can scale up to a sufficient number of shards (note, however, that you'll need to provision enough shards ahead of time). Leveraging Amazon SQS’s ability to scale transparently. For example, you buffer requests and the load changes as a result of occasional load spikes or the natural growth of your business. Because each buffered request can be processed independently, Amazon SQS can scale transparently to handle the load without any provisioning instructions from you.

对我来说,最大的优势是Kinesis是一个可重玩的队列,而SQS不是。因此,您可以有多个Kinesis的相同消息的消费者(或在不同时间的相同消费者),而在SQS中,一旦消息被ack,它就从队列中消失了。 因此,SQS更适合工作者队列。

Kinesis用例

日志和事件数据收集 实时分析 移动数据采集 “物联网”数据馈送

SQS用例

应用程序集成 解耦microservices 将任务分配给多个工作节点 将实时用户请求与密集的后台工作分离 批处理消息以供将来处理

这些技术的语义不同,因为它们是为支持不同的场景而设计的:

SNS/SQS:流中的项之间没有关联 运动:流中的项目相互关联

让我们通过例子来理解其中的区别。

Suppose we have a stream of orders, for each order we need to reserve some stock and schedule a delivery. Once this is complete, we can safely remove the item from the stream and start processing the next order. We are fully done with the previous order before we start the next one. Again, we have the same stream of orders, but now our goal is to group orders by destinations. Once we have, say, 10 orders to the same place, we want to deliver them together (delivery optimization). Now the story is different: when we get a new item from the stream, we cannot finish processing it; rather we "wait" for more items to come in order to meet our goal. Moreover, if the processor process crashes, we must "restore" the state (so no order will be lost).

一旦一个项目的处理不能与另一个项目的处理分离,我们就必须有运动语义,以便安全地处理所有的情况。

Kinesis支持多消费者功能,这意味着相同的数据记录可以在同一时间或24小时内不同的时间在不同的消费者处处理,SQS中的类似行为可以通过写入多个队列实现,消费者可以从多个队列中读取。然而,再次写入多个队列将在系统中增加几秒(几毫秒)的延迟。

其次,Kinesis提供了路由功能,可以使用分区键将数据记录选择性地路由到不同的分片,这些分区键可以由特定的EC2实例处理,并可以启用微批处理计算{计数和聚合}。

使用任何AWS软件都很容易,但使用SQS是最简单的。使用Kinesis,需要提前提供足够的碎片,动态增加碎片数量以管理尖峰负载,并减少以节省管理所需的成本。这是运动的疼痛,SQS不需要这样的事情。SQS是无限可伸缩的。