Webhook best practices

This page outlines practical recommendations to avoid common pitfalls and keep webhook integrations reliable at scale.
The guidance reflects how the Notifications service actually dispatches webhooks.

Template configuration

  • Prefer specific triggers over broad ones

    • Very broad events (for example, “account activity” style events) can generate high volumes of webhook calls.
    • Retries for transient failures may amplify traffic bursts.
    • For high-volume use cases, consider:
      • Narrowing conditions.
      • Using separate templates and/or endpoints to isolate load.
  • Use HTTPS endpoints only

    • Webhook destinations must use HTTPS. Non-HTTPS URLs are rejected before dispatch.
    • Avoid private IPs or internal-only hosts that cannot be reached from Mambu’s infrastructure.
  • Validate destinations and headers

    • Use valid, production URLs—not placeholder or test values.
    • Avoid putting secrets in URLs (for example, query parameters). Put credentials or tokens into headers instead.
    • Do not rely on redirects: 3xx responses are not treated as success. Use the final URL directly.
    • Custom headers defined on the template are forwarded to your endpoint (except for some reserved internal headers which are stripped for security).
  • Deactivate templates you no longer need

    • Deactivating a template stops new notifications from being produced upstream.
    • Messages already queued for delivery may still be dispatched by the Notifications service, so expect a short tail of in-flight requests.
  • Multiple templates to the same URL

    • If several templates post to the same endpoint, size and tune that endpoint for the combined throughput and potential retry bursts.
    • Consider separating high-volume and low-volume integrations by URL.
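
The destination rules above (HTTPS only, no private or internal hosts, no secrets in the URL) can be checked on your side before a URL is saved on a template. The following is a minimal sketch, not the check the Notifications service performs: the function name `check_destination` and the `SUSPICIOUS_PARAMS` list are assumptions made for illustration, and DNS names are not resolved, so an internal-only hostname would not be detected.

```python
import ipaddress
from typing import List
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of query parameter names that suggest a secret in the URL.
SUSPICIOUS_PARAMS = {"token", "api_key", "apikey", "secret", "password"}


def check_destination(url: str) -> List[str]:
    """Return a list of problems found with a webhook destination URL."""
    problems = []
    parts = urlsplit(url)

    if parts.scheme != "https":
        problems.append("scheme is not https")

    host = parts.hostname or ""
    if not host:
        problems.append("missing host")
    elif host == "localhost":
        problems.append("host is not publicly reachable")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name; this sketch does not resolve it
        if ip is not None and not ip.is_global:
            problems.append("host is not publicly reachable")

    if parts.username or parts.password:
        problems.append("credentials embedded in URL")

    for name, _ in parse_qsl(parts.query, keep_blank_values=True):
        if name.lower() in SUSPICIOUS_PARAMS:
            problems.append(f"possible secret in query parameter '{name}'")

    return problems
```

An empty list means the URL passed these checks; it does not prove that the endpoint is reachable or that it answers with a 2xx.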

Endpoint design

  • Respond quickly with a 2xx status

    • Only HTTP 2xx responses are treated as success.
    • 3xx, 4xx, 5xx, timeouts, and network/TLS errors are treated as failures.
    • Process the webhook asynchronously on your side (for example, enqueue to a job queue) and return 2xx as soon as you accept the payload.
    • Requests are sent with a timeout; heavy synchronous work increases the chance of timeouts and repeated delivery attempts.
  • Idempotency and deduplication

    • Webhook requests include x-notifications-idempotency-key.
    • Automatic retries of the same delivery use the same idempotency key.
    • Manual “resend” actions generate a new idempotency key.
    • Use this header to:
      • Detect duplicate deliveries.
      • Make handlers idempotent (for example, ignore a second request with a key that has already been processed successfully).
  • Design for at-least-once delivery

    • Because transient failures are retried, delivery is at-least-once: the same webhook can arrive more than once.
    • Your receiver must tolerate duplicates:
      • Use x-notifications-idempotency-key and/or identifiers in the payload to implement safe deduplication.
      • Avoid non-idempotent side effects (for example, double-charging or double-posting).
  • Request method and payload

    • The HTTP method and body are defined by the notification template/message.
    • Ensure your endpoint:
      • Supports the configured HTTP method.
      • Validates and safely processes the payload format you expect (for example, JSON).
  • Security and authentication

    • Require HTTPS.
    • Use header-based authentication (for example, API keys, HMAC signatures, or OAuth tokens) rather than credentials in URLs.
    • Rotate credentials regularly and update template headers accordingly.
    • Validate and sanitize incoming payloads before further processing.
  • Avoid redirects and heavy synchronous logic

    • Do not rely on redirect chains; 3xx responses are not treated as success.
    • Keep synchronous logic small and fail-fast; push heavy work to background jobs.
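
The "accept quickly, process later, ignore duplicates" pattern described in this section might look like the sketch below. It is framework-agnostic: `WebhookReceiver` and `handle` are hypothetical names, the in-memory set stands in for a shared store with expiry, and answering 400 when the idempotency header is missing is an assumption about how you want to treat unexpected callers.

```python
import queue

IDEMPOTENCY_HEADER = "x-notifications-idempotency-key"


class WebhookReceiver:
    """Accepts webhook requests and hands them to a background worker."""

    def __init__(self):
        # Drained by a background worker (not shown in this sketch).
        self.pending = queue.Queue()
        # Assumption: in production this would be a shared store with expiry.
        self.seen_keys = set()

    def handle(self, headers: dict, body: bytes) -> int:
        """Return the HTTP status code to send back for one request."""
        # HTTP header names are case-insensitive, so normalize them first.
        normalized = {name.lower(): value for name, value in headers.items()}
        key = normalized.get(IDEMPOTENCY_HEADER)
        if key is None:
            return 400  # assumption: reject requests without the key

        if key in self.seen_keys:
            # Automatic retry of a delivery that was already accepted:
            # acknowledge it again without repeating the work.
            return 200

        # No heavy work here; only enqueue and acknowledge.
        self.pending.put((key, body))
        self.seen_keys.add(key)
        return 202
```

Note that this sketch records a key as soon as the payload is enqueued. If the background job can fail, your own job queue needs to retry it, because a later retry carrying the same key will be acknowledged without being enqueued again.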

Monitoring and alerting

  • Use status codes as signals

    • 2xx – Success. No retry.
    • 4xx – Client errors (typically non-retryable).
      • Often indicates misconfiguration, missing/invalid credentials, or authorization issues.
      • Fix the problem quickly to restore delivery.
    • 5xx, timeouts, and network/TLS errors – Treated as transient.
      • These are retried with backoff and contribute to protection mechanisms (circuit breaker).
  • Track latency and timeouts

    • Monitor response times for your webhook endpoints.
    • Sustained increases in latency lead to timeouts and retries.
    • Alert on:
      • Rising average or P95/P99 latency.
      • Increases in timeout rates.
  • Instrument your receiver logs

    Log enough information to understand what happened without storing unnecessary sensitive data. For each request, consider logging:

    • Timestamp.
    • Endpoint/path and HTTP method.
    • HTTP status code.
    • The x-notifications-idempotency-key value.
    • High-level processing result (accepted / rejected / failed).
    • A correlation ID or your own trace ID if you add one via custom headers.

    Avoid logging full payloads if they may contain personal or sensitive data; consider structured, redacted logging.

  • Watch for failure streaks

    • Spikes in non-2xx responses (especially 5xx) and sustained failure streaks usually indicate:
      • Receiver outages or dependency failures.
      • Misconfigurations (for 4xx).
    • These patterns also influence retry behavior and may contribute to circuit-breaker protection, temporarily reducing or pausing deliveries to protect your systems.
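
As an illustration of the logging and failure-streak points above, the sketch below builds a redacted, structured log record and measures the longest run of non-2xx responses in a sequence of status codes. The function names and the `x-trace-id` header are hypothetical; the payload is deliberately left out of the record.

```python
import json
from datetime import datetime, timezone


def build_log_record(method, path, status, headers, result):
    """Build a structured log entry that omits the request payload."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "method": method,
        "path": path,
        "status": status,
        "idempotency_key": normalized.get("x-notifications-idempotency-key"),
        # Hypothetical custom header carrying your own trace ID.
        "trace_id": normalized.get("x-trace-id"),
        "result": result,  # "accepted", "rejected", or "failed"
    }


def longest_failure_streak(statuses):
    """Return the length of the longest run of consecutive non-2xx codes."""
    longest = current = 0
    for status in statuses:
        if 200 <= status < 300:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


record = build_log_record(
    "POST", "/hooks/mambu", 202,
    {"X-Notifications-Idempotency-Key": "abc-123"}, "accepted",
)
line = json.dumps(record)  # one JSON object per log line
```

A streak threshold for alerting is a choice for your own monitoring setup; the thresholds used by the service's circuit breaker are not stated here.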

Failure handling, retries, and protection

  • Retries

    • Certain failures (for example, 5xx, timeouts, and network errors) are retried according to internal policies with backoff.
    • This means you may see repeated attempts for the same webhook until either:
      • It succeeds with a 2xx, or
      • Retry limits are reached / protection mechanisms apply.
  • Circuit breaker

    • When repeated failures occur for a specific destination (for example, persistent 5xx or timeouts), a protection mechanism may temporarily pause or slow deliveries for that template and tenant.
    • This helps:
      • Prevent overwhelming an unstable endpoint.
      • Protect the overall system.
    • Once the destination recovers and failures cease, deliveries resume automatically according to the circuit breaker’s rules.
  • What this means for you

    • Keep your endpoint healthy and fast.
    • Return accurate 2xx/4xx/5xx responses that reflect the real state of your endpoint.
    • Fix persistent 4xx errors quickly.
    • Scale receiver capacity or apply graceful degradation during incidents.
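
One way to make the status code reflect the real state is to decide it from a few explicit checks, so that problems the sender should fix produce a 4xx and temporary overload produces a 5xx that is eligible for retry. This is a sketch under assumptions: `choose_status` and its three inputs are hypothetical, and the specific codes (401, 400, 503, 202) are one reasonable choice rather than a requirement.

```python
def choose_status(authenticated: bool, payload_valid: bool, has_capacity: bool) -> int:
    """Pick the response status for a webhook request from receiver state."""
    if not authenticated:
        return 401  # credentials missing or invalid: a configuration problem
    if not payload_valid:
        return 400  # payload is not in the expected format
    if not has_capacity:
        # Graceful degradation: shed load with a 5xx so the delivery
        # counts as a transient failure instead of being lost silently.
        return 503
    return 202  # accepted for asynchronous processing
```

Returning 2xx while silently dropping work would hide the failure from both retries and monitoring, which is why the overloaded case returns 503 here.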

Security recommendations

  • Use HTTPS everywhere for webhook endpoints.
  • Prefer short-lived tokens or signed headers rather than long-lived static secrets.
  • Restrict access to your endpoints using:
    • Network controls (IP allowlists, if appropriate).
    • Application-level authentication and authorization.
  • Validate payloads strictly and apply input validation to avoid injection or deserialization issues.
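
A header-based credential check that tolerates rotation could be sketched as follows. It assumes the template sends a static key in a custom header; the header name `x-api-key` and the function `is_authorized` are hypothetical. Accepting both the current and the previous key for a short overlap lets you update the template header without dropping deliveries.

```python
import hmac

API_KEY_HEADER = "x-api-key"  # hypothetical custom header set on the template


def is_authorized(headers: dict, accepted_keys: list) -> bool:
    """Check the presented key against every currently accepted key."""
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get(API_KEY_HEADER, "").encode()

    authorized = False
    for key in accepted_keys:
        # compare_digest runs in constant time for equal-length inputs,
        # which avoids leaking how many leading characters matched.
        if hmac.compare_digest(presented, key.encode()):
            authorized = True
    return authorized
```

For example, `is_authorized(request_headers, [current_key, previous_key])` keeps working during a rotation window; once the template has been updated, the previous key can be removed from the list.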

Troubleshooting tips

  • No events are received

    • Check that:
      • The relevant webhook template is Active.
      • The triggering business event actually occurs.
      • The destination URL is correct and reachable from the public internet.
    • Verify that your endpoint is returning 2xx and not silently failing with 4xx/5xx.
  • Lots of 4xx responses

    • Likely misconfiguration:
      • Wrong URL or path.
      • Missing or invalid authentication.
      • Authorization rules blocking the request.
    • Update the configuration and redeploy. If you accept the event and handle business-level errors asynchronously, consider returning 2xx.
  • Lots of 5xx, timeouts, or connection errors

    • Indicates that your endpoint or its dependencies are unhealthy or overloaded.
    • Scale horizontally, add caching, or temporarily reduce downstream work.
    • Expect the Notifications service to retry with backoff and, in persistent cases, to apply circuit-breaker protection.
  • Duplicates seen in logs or processing

    • Use x-notifications-idempotency-key and/or your own business identifiers to avoid processing the same event twice.
    • Confirm your handlers are idempotent (safe to call multiple times with the same key).

See also: