Webhook Retry Policy
In this guide we cover:
- Why we attempt to retry
- How the retry process works
- Available interventions in the retry process
Why do we retry?
Marketplacer considers a webhook event successful if it generates an HTTP 2xx response back to Marketplacer. A webhook event is considered unsuccessful if:
- The request times out (currently after 30 seconds)
- The request fails due to a network error
- The request returns a non-2xx response code
In the case of a failure we may enter a retry process. We do this to give the webhook event the best possible chance of being delivered successfully. Additionally, if we receive a 429 response, we will decrement the webhook configuration’s concurrent sending limit if it is greater than 1.
We consider a delivery attempt to be retriable if:
- The request failed due to a network error or timeout
- The request failed due to an HTTP 5xx or HTTP 429 error
We consider a delivery attempt to be not retriable if:
- The request failed due to too many HTTP 3xx redirects
- The request failed due to an HTTP 4xx (excluding 429) error
If a delivery attempt is not retriable, it will go straight to the ‘failed’ state and will not be retried. Additionally, the webhook will be disabled. You will see a notification of this on your operator dashboard the next time you log in.
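The rules above can be sketched in a few lines of Python. This is only an illustration of the classification as described in this guide, not Marketplacer's implementation: the names `Outcome`, `classify_attempt` and `adjust_concurrency_limit` are hypothetical, and treating any other non-2xx status (for example a 1xx or a single 3xx) as not retriable is an assumption, since this guide does not cover those cases.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    RETRIABLE = "retriable"
    NOT_RETRIABLE = "not_retriable"


def classify_attempt(
    status: Optional[int],
    network_error: bool = False,
    timed_out: bool = False,
    too_many_redirects: bool = False,
) -> Outcome:
    """Classify one delivery attempt using the rules in this guide."""
    if too_many_redirects:
        return Outcome.NOT_RETRIABLE
    # Assumption: no status code at all means the request never completed.
    if network_error or timed_out or status is None:
        return Outcome.RETRIABLE
    if 200 <= status <= 299:
        return Outcome.SUCCESS
    if status == 429 or 500 <= status <= 599:
        return Outcome.RETRIABLE
    # 4xx (excluding 429); other non-2xx codes are an assumption.
    return Outcome.NOT_RETRIABLE


def adjust_concurrency_limit(current_limit: int, status: Optional[int]) -> int:
    """Decrement the concurrent sending limit on a 429, never below 1."""
    if status == 429 and current_limit > 1:
        return current_limit - 1
    return current_limit
```

For example, `classify_attempt(503)` and `classify_attempt(429)` are both retriable, while `classify_attempt(404)` is not retriable and would send the event straight to the failed state.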
The rest of this document details how we retry.
Retry Process
The retry process is shown below:
Some points to note about the retry process:
- We use the “exponential backoff” pattern to retry:
  - The time interval between retries increases exponentially
  - Early retries are close together (e.g. within minutes of each other)
  - Later retries are far apart (e.g. within hours or days of each other)
- The minimum interval between retries is 1 minute.
- The maximum interval between retries is 4 days.
- The number of retries is configured in the webhook definition.
  - The default is 25 retries.
  - The permitted range is from 1 to 50 retries.
- Retries will stop 30 days after the payload was generated, regardless of how many attempts have been made.
- The complete retry cycle (25 attempts) is expected to span ~30 days
- The retry cycle can be cancelled
- The retry cycle can be requeued (restarted)
- If you change the webhook config (e.g. the endpoint) in between retries, the new config will take effect on the next retry
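The two limits on the retry cycle (the configured number of retries and the 30-day cutoff) can be combined into a single check. The sketch below is illustrative only: `may_retry` and its parameters are hypothetical names, and it assumes the 30-day window is measured from the time the payload was generated, as stated above.

```python
from datetime import datetime, timedelta, timezone

MAX_EVENT_AGE = timedelta(days=30)   # retries stop 30 days after payload generation
DEFAULT_MAX_RETRIES = 25             # default from the webhook definition


def may_retry(
    retries_made: int,
    generated_at: datetime,
    now: datetime,
    max_retries: int = DEFAULT_MAX_RETRIES,
) -> bool:
    """Return True if another retry is still allowed for this event."""
    if not 1 <= max_retries <= 50:
        raise ValueError("max_retries must be between 1 and 50")
    # The age cutoff applies regardless of how many attempts have been made.
    if now - generated_at >= MAX_EVENT_AGE:
        return False
    return retries_made < max_retries
```

With the default configuration, an event that has already been retried 25 times is not retried again, and neither is an event whose payload is 30 or more days old, even if it has retries remaining.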
Retry Formula
The retry formula can currently be expressed as:
(retry_count ** 4) seconds
(minimum 60 seconds, maximum 4 days)
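As a worked illustration, the formula and its bounds can be written as a small Python function. The name `retry_interval_seconds` is hypothetical, and the sketch assumes `retry_count` starts at 1 for the first retry, which this guide does not state explicitly.

```python
MIN_INTERVAL_SECONDS = 60                 # minimum interval: 1 minute
MAX_INTERVAL_SECONDS = 4 * 24 * 60 * 60   # maximum interval: 4 days (345600 s)


def retry_interval_seconds(retry_count: int) -> int:
    """Interval before the given retry: retry_count ** 4, clamped to the bounds."""
    raw = retry_count ** 4
    return max(MIN_INTERVAL_SECONDS, min(raw, MAX_INTERVAL_SECONDS))


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 25):
        print(n, retry_interval_seconds(n))
    # 1 60        (1 ** 4 = 1, raised to the 60 s minimum)
    # 5 625       (about 10 minutes)
    # 10 10000    (about 2.8 hours)
    # 20 160000   (about 1.9 days)
    # 25 345600   (25 ** 4 = 390625, capped at 4 days)
```

Under this reading, the first two retries fall back to the 1 minute minimum, and the 4 day maximum is only reached at the 25th retry.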
Retry Interventions
In some circumstances you may be able to intervene in the webhook event / retry process. Whether you can intervene depends on the “state” of the webhook event.
Note
Interventions are currently “manual”, i.e. they can only be performed via the UI (not the API). For more information on how you can “intervene”, refer to the Errors and Faults article.
Possible states are described below:
- Unsent: The webhook event has been triggered but we have not yet attempted to send it.
  - This state only occurs prior to the 1st send attempt (not subsequent retries - see Rejected state) and should only exist for a short period of time.
- Sent: The webhook has been sent successfully.
- Cancelled: The retry process has been stopped (by someone).
- Failed: The retry process has ended unsuccessfully.
- Rejected: The retry process is ongoing and so far we have only received error responses from the webhook endpoint.
  - You will see the current number of retry attempts along with this status.
- Skipped: [Not shown above] Deduplication is turned on and we have determined that this event should not be sent.
  - This happens outside the retry cycle, so it is not shown in the retry flow.
Examples of the interventions you can perform for each state are shown below: