Recovery And Retries

Recovery keeps an operation from getting stuck when a worker, provider, or container path does not finish cleanly.

How To Use This Page

  • Check whether an operation is still waiting before retrying manually.
  • Use terminal status and recent events to decide whether a retry is safe.
  • Expect recovery to ignore stale results once newer durable state exists.
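
The last point above, ignoring stale results once newer durable state exists, can be sketched with a version check. This is a minimal Python sketch and not the product's implementation: `DurableState`, `WorkerResult`, `based_on_version`, and `apply_result` are hypothetical names, and the integer version counter is an assumed way of detecting that newer durable state exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurableState:
    """Hypothetical durable record for one operation."""
    version: int
    status: str


@dataclass(frozen=True)
class WorkerResult:
    """Hypothetical result reported by a worker, provider, or container."""
    based_on_version: int  # the state version the worker started from (assumed field)
    status: str


def apply_result(state: DurableState, result: WorkerResult) -> DurableState:
    """Apply a result only if it was computed from the current durable state."""
    if result.based_on_version < state.version:
        # Newer durable state already exists, so the late result is stale: ignore it.
        return state
    # The result is current: record it and advance the version.
    return DurableState(version=state.version + 1, status=result.status)
```

Under these assumptions, a late callback from an earlier attempt leaves the state untouched, so a retry that has already advanced the operation is not overwritten by the attempt it replaced.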

Status And Event Map

Operation statuses: queued, running, waiting, blocked, completed, failed, canceled.

Status | What It Means | User Review
queued | Work has been accepted but has not started. | Wait for the timeline to move or check queue health.
running | A bot, tool, provider, or container is actively working. | Review progress events before retrying.
waiting | The operation is waiting on a callback, schedule, user action, or follow-up turn. | Check what dependency is named in the timeline.
blocked | Work intentionally stopped until a person or external condition changes. | Read the blocking reason before taking action.
completed | Work reached a successful terminal state. | Inspect produced artifacts, blobs, links, or messages.
failed | Work reached an error terminal state. | Capture the failing step, visible error, and affected refs.
canceled | Work was stopped before completion. | Confirm whether a replacement operation exists.
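
For tooling that reads these statuses, the map above could be modeled as an enumeration. The sketch below is illustrative only: `OperationStatus`, `TERMINAL_STATUSES`, `is_terminal`, and `parse_status` are hypothetical names, and grouping canceled with the terminal statuses is an assumption, since the map describes only completed and failed as terminal.

```python
from enum import Enum


class OperationStatus(Enum):
    """The seven operation statuses listed in the status map."""
    QUEUED = "queued"
    RUNNING = "running"
    WAITING = "waiting"
    BLOCKED = "blocked"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: canceled is treated as terminal because the work has stopped.
TERMINAL_STATUSES = frozenset(
    {OperationStatus.COMPLETED, OperationStatus.FAILED, OperationStatus.CANCELED}
)


def is_terminal(status: OperationStatus) -> bool:
    """Return True when no further work is expected for this status."""
    return status in TERMINAL_STATUSES


def parse_status(raw: str) -> OperationStatus:
    """Convert a status string into the enum; unknown values raise ValueError."""
    return OperationStatus(raw.strip().lower())
```

A check such as `is_terminal(parse_status("waiting"))` returns `False`, which matches the guidance to confirm that an operation is not still waiting before retrying manually.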

Review Checklist

  • Start from the operation or artifact visible in the app.
  • Follow events in timestamp order.
  • Open produced artifacts or blobs before sharing conclusions.
  • Capture status, route, artifact, operation, and visible error details when escalating.
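
The ordering and capture steps in this checklist could be scripted roughly as follows. This is a sketch under stated assumptions: `TimelineEvent`, `in_timestamp_order`, and `escalation_note` are hypothetical names, and the event fields are an assumed shape rather than the app's actual export format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimelineEvent:
    """Hypothetical timeline entry; the real event schema may differ."""
    timestamp: datetime
    kind: str


def in_timestamp_order(events):
    """Return the events sorted from oldest to newest."""
    return sorted(events, key=lambda event: event.timestamp)


def escalation_note(status, route, artifact, operation, visible_error, events):
    """Build a plain-text note with the details the checklist asks for."""
    lines = [
        f"status: {status}",
        f"route: {route}",
        f"artifact: {artifact}",
        f"operation: {operation}",
        f"visible error: {visible_error}",
        "events:",
    ]
    # List events oldest first so the reader can follow the timeline.
    for event in in_timestamp_order(events):
        lines.append(f"  {event.timestamp.isoformat()} {event.kind}")
    return "\n".join(lines)
```

Sorting before writing the note keeps the escalation readable even when events were copied out of order.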

Media To Add

  • Timeline: failed callback, recovery wake, retried work, and final status. This helps support teams distinguish a safe retry from duplicate work. Source: test operation recovery case.