Event Streaming Configurations

This page describes in detail the behaviour of the parameters used for event streaming, and our recommendations for various streaming configurations.

batch_limit

Defines the maximum number of events that the server will send in each batch during streaming.

  • Low value (e.g., 1-10):
    • If the batch_limit is low, the application will receive small batches of events frequently. This will cause the application to make frequent calls to commit offsets, especially if commit_timeout is set low (see below).
    • With commit_timeout: If commit_timeout is low (e.g., 5 seconds), the client must commit each small batch quickly; if it misses the deadline, the session may be terminated and the partitions rebalanced.
    • Pros:
      • Lower latency: Faster processing of individual events since batches are smaller and processed more frequently.
      • Reduced memory usage: Less memory is required to hold smaller batches of events.
    • Cons:
      • Decreased throughput: More frequent HTTP requests or commit operations can lead to increased overhead and lower overall throughput.
      • Higher overhead: Increased number of network calls can strain resources, especially under high load.
    • Example:
      • With a low commit_timeout (5 seconds), the client must commit very frequently, increasing the risk of timeouts and frequent rebalancing.
      • The application fetches and processes up to 10 events in a single batch.
      • More frequent commits are made, increasing the number of HTTP calls.
      • Suitable for scenarios requiring low latency in event processing.
    • Use case: Real-time applications where immediate processing and responsiveness are crucial, such as fraud detection systems.
  • High value (e.g., 100-1000):
    • A higher batch_limit leads to fewer, larger batches, reducing the number of network requests. However, if commit_timeout is low, the client must commit larger batches within the timeout window, which could increase latency in processing.
    • If commit_timeout is high (e.g., 30-60 seconds), larger batches can be processed and committed without the pressure of frequent timeouts, resulting in smoother streaming and higher throughput.
    • Pros:
      • Increased throughput: Processing more events in a single batch reduces the number of HTTP requests or commit operations, enhancing overall throughput.
      • Efficiency: Less overhead from network calls and cursor commits, leading to more efficient resource utilization.
    • Cons:
      • Latency: Higher latency in processing individual events, as the application waits to accumulate a large number of events before processing.
      • Memory usage: Increased memory consumption as more events are held in memory before processing.
    • Example:
      • With a high commit_timeout (60 seconds), the application can process and commit larger batches with minimal rebalancing and smoother operation.
      • The application fetches and processes up to 400 events in a single batch.
      • Fewer commits are made, reducing the frequency of HTTP calls to commit offsets.
      • Suitable for scenarios where high throughput is prioritized over low latency.
    • Use case: Processing bulk data operations where individual event latency is less critical, such as nightly batch processing tasks.
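
The throughput/overhead trade-off above can be sketched with a back-of-the-envelope calculation. This is an illustration only, assuming one offset commit per delivered batch; the function name is hypothetical, not part of any client API.

```python
# Illustration only: estimate commit round-trips needed to drain a fixed
# stream, assuming one offset commit per delivered batch (hypothetical model).
import math

def commits_needed(total_events: int, batch_limit: int) -> int:
    # Number of batches required to drain the stream, which equals the
    # number of commit calls under the one-commit-per-batch assumption.
    return math.ceil(total_events / batch_limit)

# Draining 10,000 events:
print(commits_needed(10_000, 10))   # 1000 commits with small batches
print(commits_needed(10_000, 400))  # 25 commits with large batches
```

The 40x difference in commit calls is the overhead the "high value" configuration trades away in exchange for per-event latency.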

max_uncommitted_events

Defines the maximum number of uncommitted events allowed before the server pauses and waits for a commit.

  • Low value (e.g., 10):
    • The server frequently pauses the stream to wait for a commit after just a few events. This can cause delays in the stream, especially if the batch_limit is large or if the commit operation takes time.
    • With commit_timeout: If commit_timeout is low, the client will struggle to commit events fast enough to avoid pausing. Frequent rebalancing can occur if the client doesn’t meet the commit deadline.
    • Pros:
      • Lower risk of data loss: Fewer events are buffered, minimizing the potential loss in case of failures.
      • Controlled memory usage: Limits the number of events held in memory, preventing excessive memory consumption.
    • Cons:
      • Increased commit frequency: More frequent commits may be necessary, leading to higher overhead and potentially lower throughput.
      • Reduced flexibility: Less capacity to handle bursts of high event rates without committing more frequently.
    • Example:
      • With a low commit_timeout (5 seconds), the stream will frequently pause, and the application must be highly responsive in committing to avoid timeouts and session closures.
      • The application buffers up to 10 events across all partitions before committing.
      • Ensures that offsets are committed frequently, reducing the risk of significant data loss.
    • Use case: Critical real-time applications where data integrity is paramount, such as financial transaction processing systems.
  • High value (e.g., 1000):
    • The client has more flexibility to process and commit events without frequent pauses, leading to better throughput. The risk is that if the application crashes, a larger number of uncommitted events may be lost or reprocessed.
    • A high commit_timeout allows the client more time to handle the larger number of uncommitted events. However, if commit_timeout is too low, the client may not be able to commit large batches in time, risking session termination and rebalancing.
    • Pros:
      • Increased flexibility: Allows more events to be buffered, accommodating spikes in event rates without immediate commits.
      • Reduced commit frequency: Fewer commits are required, lowering overhead and improving throughput.
    • Cons:
      • Higher risk of data loss: In the event of a crash or failure, more uncommitted events may be lost or need to be reprocessed.
      • Increased memory usage: More events are held in memory, potentially leading to higher memory consumption.
    • Example:
      • With a high commit_timeout (60 seconds), the application can commit larger chunks at a slower pace without significant risk of stream pausing.
      • The application allows up to 1000 events to be buffered across all partitions before requiring a commit. This reduces the frequency of commit operations, enhancing throughput.
    • Use case: Applications that can tolerate some level of data loss in exchange for higher performance, such as logging systems where occasional duplicates are acceptable.
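
The pause-and-resume behaviour described above can be modelled in a few lines. This is a simplified sketch with made-up names, not a real client or server implementation: the server stops sending once the uncommitted count would exceed the limit, and resumes after a cursor commit.

```python
# Simplified sketch (hypothetical names): the server withholds the next
# batch whenever delivering it would push the uncommitted-event count
# past max_uncommitted_events, and resumes after the client commits.

class StreamState:
    def __init__(self, max_uncommitted_events: int):
        self.max_uncommitted = max_uncommitted_events
        self.uncommitted = 0

    def can_send(self, batch_size: int) -> bool:
        # Server-side back-pressure check before sending the next batch.
        return self.uncommitted + batch_size <= self.max_uncommitted

    def send(self, batch_size: int) -> None:
        if not self.can_send(batch_size):
            raise RuntimeError("stream paused: commit required")
        self.uncommitted += batch_size

    def commit(self) -> None:
        # Committing the cursor clears the backlog and resumes delivery.
        self.uncommitted = 0

s = StreamState(max_uncommitted_events=10)
s.send(10)
print(s.can_send(1))   # False: the stream pauses until a commit arrives
s.commit()
print(s.can_send(10))  # True: delivery resumes
```

A low limit trips the pause after every few events; a high limit lets the consumer absorb bursts at the cost of a larger replay window after a crash.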

batch_flush_timeout

Defines the maximum time the server will wait before flushing a batch of events, even if it hasn't reached the batch_limit.

  • Low value (e.g., 5 seconds):
    • The server flushes smaller batches more frequently. With a low commit_timeout, this can lead to more frequent commits as smaller batches will need to be committed quickly.
    • If commit_timeout is low, the client may struggle to keep up with frequent commits. However, if commit_timeout is high, the client has more time to process and commit the smaller batches, avoiding rebalancing or session termination.
    • Pros:
      • Lower risk of data loss: Fewer events are buffered, minimizing the potential loss in case of failures.
      • Controlled memory usage: Limits the number of events held in memory, preventing excessive memory consumption.
    • Cons:
      • Increased commit frequency: More frequent commits may be necessary, leading to higher overhead and potentially lower throughput.
      • Reduced flexibility: Less capacity to handle bursts of high event rates without committing more frequently.
    • Example:
      • With a low commit_timeout (5 seconds), the client is at risk of frequent rebalancing because it needs to commit small, frequent batches quickly.
      • The application flushes events as soon as 5 seconds have passed, regardless of whether the batch_limit is reached.
      • Smaller batches are processed more frequently, reducing latency.
    • Use case: Real-time applications where timely processing is essential, such as live analytics dashboards or real-time monitoring systems.
  • High value (e.g., 60 seconds):
    • The server waits longer before flushing, leading to larger batches but longer delays between deliveries. This results in fewer commits, but larger batches can make committing slower if commit_timeout is low.
    • If commit_timeout is low, large batches may overwhelm the client, leading to commit failures. With a higher commit_timeout, the client has adequate time to commit larger batches without risking session closure.
    • Pros:
      • Increased flexibility: Allows more events to be buffered, accommodating spikes in event rates without immediate commits.
      • Reduced commit frequency: Fewer commits are required, lowering overhead and improving throughput.
    • Cons:
      • Higher risk of data loss: In the event of a crash or failure, more uncommitted events may be lost or need to be reprocessed.
      • Increased memory usage: More events are held in memory, potentially leading to higher memory consumption.
    • Example:
      • With a high commit_timeout (60 seconds), the client can handle larger, less frequent batches more efficiently without worrying about session timeouts.
      • The application waits up to 60 seconds to accumulate events before flushing.
      • Larger batches are likely to be processed, enhancing throughput.
    • Use case: Applications where high throughput is critical and some latency is acceptable, such as bulk data ingestion systems.
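
The flush rule combines batch_limit and batch_flush_timeout: a batch is sent as soon as either threshold is hit. The following sketch assumes that "whichever comes first" semantics and uses simulated elapsed times; it is not real server code.

```python
# Hypothetical sketch of the server's flush decision: send the batch when
# it reaches batch_limit OR when batch_flush_timeout elapses, whichever
# comes first. Elapsed time is passed in directly; no real clock or I/O.

def should_flush(batch_size: int, batch_limit: int,
                 elapsed_s: float, batch_flush_timeout_s: float) -> bool:
    return batch_size >= batch_limit or elapsed_s >= batch_flush_timeout_s

# Quiet stream: only 3 events arrived, but 5 s passed, so flush anyway.
print(should_flush(3, 100, elapsed_s=5, batch_flush_timeout_s=5))    # True
# Busy stream: the limit is reached well before the timeout.
print(should_flush(100, 100, elapsed_s=1, batch_flush_timeout_s=60)) # True
# Neither condition met yet: keep accumulating.
print(should_flush(3, 100, elapsed_s=2, batch_flush_timeout_s=60))   # False
```

This is why a low batch_flush_timeout bounds per-event latency on quiet streams even when batch_limit is set high.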

commit_timeout

Defines the maximum amount of time the server will wait for the client to commit after sending a batch. If the client doesn’t commit in time, the stream is terminated, and the partitions may be reassigned.

  • Low value (e.g., 5-10 seconds):
    • The client must be highly responsive in committing events. If the client takes too long (due to network delays or heavy processing), it may frequently trigger session terminations and rebalancing.
    • When batch_limit or batch_flush_timeout are high, the client might not have enough time to commit large batches within the short timeout, risking session closures.
    • Pros:
      • Faster failure detection: Consumers that are slow or unable to commit within the allowed timeframe are detected sooner, leading to quicker rebalancing.
      • Improved responsiveness: The system quickly reassigns uncommitted events if a consumer fails to commit, improving overall responsiveness in failure scenarios.
    • Cons:
      • Higher risk of rebalancing: More frequent rebalancing, as consumers may not always be able to commit within the shorter time window, leading to potential interruptions.
      • Increased overhead: Constant rebalancing can increase the number of partition reassignments, adding to the system’s overhead and potentially impacting throughput.
    • Example:
      • With a low commit_timeout (5 seconds), the client must commit events very quickly; if batches are too large (due to a high batch_limit or max_uncommitted_events), it may fail to commit in time, leading to frequent rebalancing.
      • The application expects commit acknowledgments within 5 seconds after processing a batch of events.
      • Faster detection of unresponsive consumers, leading to quicker rebalancing.
    • Use case: Ideal for low-latency, real-time applications where event processing is fast, such as monitoring systems or real-time analytics platforms.
  • High value (e.g., 30-60 seconds):
    • The client has more time to process and commit events, reducing the risk of session closures. This is generally better for throughput and stability, especially in applications that need to process large batches or handle fluctuating traffic.
    • A high commit_timeout allows for larger batch_limit and max_uncommitted_events values, since the client has more time to process and commit events.
    • Pros:
      • Increased flexibility: Allows more time for slow consumers to process and commit events, reducing the risk of stream rebalancing.
      • Fewer rebalances: Less frequent rebalancing since the timeout gives more time for commit operations to complete.
    • Cons:
      • Increased latency: The application might wait longer for commit operations to complete, potentially delaying the detection of slow or failed consumers.
      • Risk of stale connections: Prolonged commit times may hold resources unnecessarily if a consumer has stalled or failed but hasn’t reached the timeout.
    • Example:
      • The client has more time to handle large batches or spikes in traffic, reducing the risk of stream termination due to uncommitted events.
      • The application allows up to 60 seconds for a commit acknowledgment after processing a batch of events.
      • Slow consumers are given enough time to process and commit, reducing the likelihood of rebalancing.
    • Use case: Suitable for applications where event processing may take longer, such as large data processing jobs or tasks that involve external dependencies like database writes.
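
When tuning commit_timeout together with batch_limit, a rough sanity check is whether the consumer's per-event processing cost, multiplied by the batch size, fits inside the timeout. The function below is a hypothetical back-of-the-envelope estimate; it ignores network latency and commit round-trip time, which would only tighten the deadline further.

```python
# Back-of-the-envelope check (hypothetical, ignores network overhead):
# if processing a full batch takes longer than commit_timeout, the
# session risks termination and partition rebalancing.

def commit_at_risk(batch_size: int, per_event_seconds: float,
                   commit_timeout_s: float) -> bool:
    processing_time = batch_size * per_event_seconds
    return processing_time > commit_timeout_s

# 400-event batches at 20 ms per event need roughly 8 s of processing:
print(commit_at_risk(400, 0.020, commit_timeout_s=5))   # True: too slow
print(commit_at_risk(400, 0.020, commit_timeout_s=60))  # False: safe
```

Estimates like this suggest why the recommendations above pair a high batch_limit with a high commit_timeout: the larger the batch, the longer the commit deadline must be.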