again

package module
v1.2.6
Published: Feb 28, 2026 License: MPL-2.0 Imports: 10 Imported by: 0

README

go-again

go-again provides:

  • again: a thread-safe retry helper with exponential backoff, jitter, timeout, cancellation, hooks, and a temporary-error registry.
  • pkg/scheduler: a lightweight HTTP scheduler with pluggable state storage (in-memory by default, SQLite built-in) that reuses the retrier for retryable requests and optional callbacks.

Status

As of February 27, 2026, the core retrier hardening work and the scheduler extension described in PRD.md are implemented and covered by tests, including race checks.

Features

Retrier (github.com/hyp3rd/go-again)
  • Configurable MaxRetries, Interval, Jitter, BackoffFactor, and Timeout
  • Do and DoWithContext retry APIs
  • Temporary error filtering via explicit error list and/or Registry
  • Retry-all behavior when no temporary errors are supplied and the registry is empty
  • Cancellation via caller context and Retrier.Cancel() / Retrier.Stop()
  • Errors trace (Attempts, Last) plus Errors.Join()
  • DoWithResult[T] helper
  • Optional slog logger and retry hooks
Scheduler (github.com/hyp3rd/go-again/pkg/scheduler)
  • Interval scheduling with StartAt, EndAt, and MaxRuns
  • HTTP request execution (GET, POST, PUT)
  • Retry integration via RetryPolicy
  • Optional callback with bounded response-body capture
  • URL validation by default (via sectools) with override/disable support
  • Custom HTTP client, logger, concurrency limit, and scheduler-state storage backend

Installation

go get github.com/hyp3rd/go-again

Requires Go 1.26+ (see go.mod).

Retrier Quick Start

package main

import (
 "context"
 "errors"
 "fmt"
 "net/http"
 "time"

 again "github.com/hyp3rd/go-again"
)

func main() {
 retrier, err := again.NewRetrier(
  context.Background(),
  again.WithMaxRetries(3),                 // retries after the first attempt
  again.WithInterval(100*time.Millisecond),
  again.WithJitter(50*time.Millisecond),
  again.WithTimeout(2*time.Second),
 )
 if err != nil {
  panic(err)
 }

 retrier.Registry.LoadDefaults()
 retrier.Registry.RegisterTemporaryError(http.ErrAbortHandler)

 var attempts int
 errs := retrier.Do(context.Background(), func() error {
  attempts++
  if attempts < 3 {
   return http.ErrAbortHandler
  }

  return nil
 })
 defer retrier.PutErrors(errs)

 if errs.Last != nil {
  fmt.Println("failed:", errs.Last)
  return
 }

 fmt.Println("success after attempts:", attempts)
 _ = errors.Join(errs.Attempts...) // equivalent to errs.Join()
}

Retrier Notes
  • MaxRetries counts retries after the first attempt (total attempts = MaxRetries + 1).
  • If temporaryErrors is omitted and Registry has entries, the registry is used as the retry filter.
  • If temporaryErrors is omitted and the registry is empty, all errors are retried until success/timeout/cancel/max-retries.
  • Do checks cancellation between attempts. For long-running work, use DoWithContext.
  • Cancel() and Stop() cancel the retrier's internal lifecycle context; they are terminal for that retrier instance.

Context-Aware Retrying

Use DoWithContext when the operation itself accepts a context and should stop promptly on cancellation:

// assuming `retrier` was created as in the previous example
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

errs := retrier.DoWithContext(ctx, func(ctx context.Context) error {
 select {
 case <-time.After(250 * time.Millisecond):
  return nil
 case <-ctx.Done():
  return ctx.Err()
 }
})
defer retrier.PutErrors(errs)

The retryable function should observe ctx.Done(); if it ignores context cancellation, the work may continue running after the retrier returns.

Scheduler Quick Start

The scheduler runs jobs immediately when scheduled (or at StartAt if set), then continues every Schedule.Every until MaxRuns, EndAt, removal, or Stop().

Request and callback URLs are validated by default using sectools (HTTPS only, no userinfo, and no private/localhost hosts unless configured otherwise).

package main

import (
 "context"
 "net/http"
 "time"

 again "github.com/hyp3rd/go-again"
 "github.com/hyp3rd/go-again/pkg/scheduler"
)

func main() {
 retrier, _ := again.NewRetrier(
  context.Background(),
  again.WithMaxRetries(5),
  again.WithInterval(10*time.Millisecond),
  again.WithJitter(10*time.Millisecond),
  again.WithTimeout(5*time.Second),
 )

 s := scheduler.NewScheduler(
  scheduler.WithConcurrency(8),
 )
 defer s.Stop()

 _, _ = s.Schedule(scheduler.Job{
  Schedule: scheduler.Schedule{
   Every:   1 * time.Minute,
   MaxRuns: 1,
  },
  Request: scheduler.Request{
   Method: http.MethodPost,
   URL:    "https://example.com/endpoint",
   Body:   []byte(`{"ping":"pong"}`),
  },
  Callback: scheduler.Callback{
   URL: "https://example.com/callback",
  },
  RetryPolicy: scheduler.RetryPolicy{
   Retrier:          retrier,
   RetryStatusCodes: []int{http.StatusTooManyRequests, http.StatusInternalServerError},
  },
 })
}

Scheduler Examples

Example: Schedule Once + Callback

Runnable version:

go run ./__examples/scheduler

Source: __examples/scheduler/scheduler.go

s := scheduler.NewScheduler(
 scheduler.WithHTTPClient(server.Client()),
 scheduler.WithURLValidator(nil), // allow local endpoints for example usage
)
defer s.Stop()

jobID, err := s.Schedule(scheduler.Job{
 Schedule: scheduler.Schedule{Every: 10 * time.Millisecond, MaxRuns: 1},
 Request: scheduler.Request{Method: http.MethodGet, URL: server.URL + "/target"},
 Callback: scheduler.Callback{URL: server.URL + "/callback"},
})
if err != nil {
 panic(err)
}

payload := <-callbackCh
fmt.Println("job:", jobID, "success:", payload.Success, "status:", payload.StatusCode)

Example: Query Status and History

// after Schedule(...)
status, ok := s.JobStatus(jobID)
if ok {
 fmt.Println("state:", status.State, "runs:", status.Runs, "active:", status.ActiveRuns)
}

history, ok := s.JobHistory(jobID)
if ok {
 for _, run := range history {
  fmt.Println("run#", run.Sequence, "status:", run.Payload.StatusCode, "success:", run.Payload.Success)
 }
}

filtered := s.QueryJobStatuses(scheduler.JobStatusQuery{
 States: []scheduler.JobState{scheduler.JobStateRunning, scheduler.JobStateScheduled},
 Offset: 0,
 Limit:  50,
})
fmt.Println("filtered statuses:", len(filtered))

recentRuns, ok := s.QueryJobHistory(jobID, scheduler.JobHistoryQuery{
 FromSequence: 10,
 Limit:        5,
})
if ok {
 fmt.Println("recent retained runs:", len(recentRuns))
}
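
The FromSequence and Limit fields used in the history query above can be read as a filter followed by a tail cut that keeps ascending order. The stdlib-only sketch below models that reading. The `run` type and `queryHistory` function are hypothetical, and treating FromSequence as inclusive and a non-positive Limit as "no limit" are assumptions rather than documented behavior.

```go
package main

import "fmt"

// run is a hypothetical stand-in for one retained history entry.
type run struct {
	Sequence int
}

// queryHistory keeps runs whose Sequence is >= fromSequence, then keeps only
// the last `limit` of them. The input is assumed to be in ascending sequence
// order, and that order is preserved in the result.
func queryHistory(runs []run, fromSequence, limit int) []run {
	filtered := make([]run, 0, len(runs))
	for _, r := range runs {
		if r.Sequence >= fromSequence {
			filtered = append(filtered, r)
		}
	}
	if limit > 0 && len(filtered) > limit {
		filtered = filtered[len(filtered)-limit:] // tail cut: newest entries win
	}
	return filtered
}

func main() {
	var runs []run
	for seq := 1; seq <= 20; seq++ {
		runs = append(runs, run{Sequence: seq})
	}

	// Same parameters as the example above: FromSequence 10, Limit 5.
	for _, r := range queryHistory(runs, 10, 5) {
		fmt.Println("run#", r.Sequence) // sequences 16 through 20, ascending
	}
}
```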

Example: Durable Scheduler State with SQLite

Runnable version:

go run ./__examples/scheduler_sqlite

Source: __examples/scheduler_sqlite/scheduler_sqlite.go

dbPath := filepath.Join(os.TempDir(), "go-again-scheduler-example.db")
storage, err := scheduler.NewSQLiteJobsStorage(dbPath)
if err != nil {
 panic(err)
}
defer storage.Close()

s := scheduler.NewScheduler(
 scheduler.WithJobsStorage(storage),
 scheduler.WithURLValidator(nil),
)
defer s.Stop()

jobID, err := s.Schedule(scheduler.Job{
 Schedule: scheduler.Schedule{Every: 20 * time.Millisecond, MaxRuns: 1},
 Request: scheduler.Request{Method: http.MethodGet, URL: target.URL},
})
if err != nil {
 panic(err)
}
fmt.Println("scheduled job:", jobID)

Example: Fail-Closed Scheduler Construction

Use NewSchedulerWithError(...) when constructor-time URL validator initialization errors must fail startup.

s, err := scheduler.NewSchedulerWithError(
 scheduler.WithConcurrency(8),
)
if err != nil {
 // fail startup instead of warning + degraded mode
 return err
}
defer s.Stop()

Scheduler Options
  • WithHTTPClient(client) sets the HTTP client used for requests and callbacks.
  • WithLogger(logger) sets the scheduler logger.
  • WithConcurrency(n) limits concurrent executions when n > 0.
  • WithJobsStorage(storage) sets pluggable scheduler state storage (active jobs plus status/history; default: in-memory).
  • WithHistoryLimit(limit) sets retained per-job history length (default 20).
  • WithURLValidator(validator) overrides URL validation. Pass nil to disable validation.
  • NewSchedulerWithError(...) returns constructor errors (including startup state reconciliation failures and default URL validator initialization failure).

Scheduler Behavior Notes
  • Supported methods for requests and callbacks: GET, POST, PUT.
  • Callbacks are skipped when Callback.URL is empty.
  • Callback method defaults to POST.
  • Callback.MaxBodyBytes defaults to 4096.
  • Request.Timeout and Callback.Timeout apply per HTTP request/callback (not the schedule lifetime).
  • If RetryPolicy.Retrier is nil, the scheduler creates a default retrier and loads registry defaults.
  • Calling Schedule after Stop() returns scheduler.ErrSchedulerStopped.
  • Schedule returns scheduler.ErrStorageOperation when required scheduler-state persistence fails.
  • NewSchedulerWithError(...) should be preferred for fail-closed startup behavior in security-sensitive paths.
  • JobCount() and JobIDs() provide lightweight read-only scheduler introspection.
  • JobStatus(id), JobStatuses(), and JobHistory(id) provide status and retained run history snapshots.
  • QueryJobStatuses(JobStatusQuery) adds ID/state filters with pagination (Offset, Limit) over status snapshots.
  • QueryJobHistory(id, JobHistoryQuery) adds history filtering (FromSequence) and tail limiting (Limit) while preserving ascending sequence order.
  • Default InMemoryJobsStorage is process-local; use WithJobsStorage(...) for custom durable/backed storage.
  • NewSQLiteJobsStorage(path) provides a built-in durable storage implementation for WithJobsStorage(...); call Close() when finished.
  • On scheduler startup, recovered active-job registrations from storage are reconciled: scheduled/running states are marked canceled, then active-job IDs are cleared. Jobs are not auto-resumed.
  • Non-fatal storage write failures during runtime transitions are logged (warn) and execution continues.
  • Non-fatal request/callback response body read/close failures are logged (warn) and execution continues.
  • NewSchedulerWithError(...) fails on constructor-time reconciliation errors; NewScheduler() logs a warning and continues.
  • NewScheduler() logs a warning and continues if default URL validator initialization fails; use NewSchedulerWithError() to fail closed.

Custom URL Validation (Allow Local HTTPS)

validator, _ := validate.NewURLValidator(
 validate.WithURLAllowPrivateIP(true),
 validate.WithURLAllowLocalhost(true),
 validate.WithURLAllowIPLiteral(true),
)

s := scheduler.NewScheduler(
 scheduler.WithURLValidator(validator),
)

Disable URL Validation (Allow Non-HTTPS)

s := scheduler.NewScheduler(
 scheduler.WithURLValidator(nil),
)
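
For intuition about what the default rules reject before validation is relaxed or disabled, the stdlib-only sketch below applies the documented defaults (HTTPS only, no userinfo, no private or localhost hosts) to a few URLs. It is a simplified illustration, not the sectools validator: `checkURL` is a hypothetical function, it does no DNS resolution, and the real validator may apply further checks.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
	"strings"
)

// checkURL is a hypothetical, simplified model of the default rules.
// It only inspects the URL text; hostnames are not resolved.
func checkURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "https" {
		return errors.New("scheme must be https")
	}
	if u.User != nil {
		return errors.New("userinfo is not allowed")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("host is required")
	}
	if strings.EqualFold(host, "localhost") {
		return errors.New("localhost is not allowed")
	}
	if ip := net.ParseIP(host); ip != nil && (ip.IsLoopback() || ip.IsPrivate()) {
		return errors.New("private or loopback address is not allowed")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://example.com/endpoint",
		"http://example.com/endpoint",
		"https://user:pass@example.com/",
		"https://localhost:8443/",
		"https://10.0.0.5/hook",
	} {
		fmt.Println(raw, "->", checkURL(raw))
	}
}
```

Only the first URL passes in this sketch, which is why the local test servers in the examples above need a relaxed validator or WithURLValidator(nil).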

Examples

Run the example programs directly:

go run ./__examples/chan
go run ./__examples/context
go run ./__examples/scheduler
go run ./__examples/scheduler_sqlite
go run ./__examples/timeout
go run ./__examples/validate

Development Commands

make test
make test-race
make lint
make sec

Benchmark (direct Go command):

go test -bench=. -benchtime=3s -benchmem -run=^$ -memprofile=mem.out ./...

Known Limitations / Gaps

  • Scheduler.Stop() cancels the scheduler lifecycle; the same instance is not intended to be reused afterward.
  • Retrier.Cancel() / Retrier.Stop() are terminal for the retrier instance.
  • DoWithContext can only stop work promptly if the retryable function respects the provided context.
  • NewScheduler() (non-error constructor) intentionally degrades to warning-only behavior if default URL validator initialization fails; use NewSchedulerWithError() when you need constructor-time failure.

Performance

go-again adds retry orchestration overhead but is designed to keep allocations low. See the benchmark in tests/retrier_test.go and run the benchmark command above in your environment for current numbers.

License

The code and documentation in this project are released under Mozilla Public License 2.0.

Author

I'm a surfer, a crypto trader, and a software architect with 15 years of experience designing highly available distributed production environments and developing cloud-native apps in public and private clouds. Feel free to connect with me on LinkedIn.

Documentation

Constants

This section is empty.

Variables

var (
	// ErrInvalidRetrier is the error returned when the retrier is invalid.
	ErrInvalidRetrier = ewrap.New("invalid retrier")
	// ErrMaxRetriesReached is the error returned when the maximum number of retries is reached.
	ErrMaxRetriesReached = ewrap.New("maximum number of retries reached")
	// ErrTimeoutReached is the error returned when the timeout is reached.
	ErrTimeoutReached = ewrap.New("operation timeout reached")
	// ErrOperationStopped is the error returned when the retry is stopped.
	ErrOperationStopped = ewrap.New("operation stopped")
	// ErrNilRetryableFunc is the error returned when the retryable function is nil.
	ErrNilRetryableFunc = ewrap.New("failed to invoke the function. It appears to be nil")
)
var ErrOperationFailed = ewrap.New("failed")

ErrOperationFailed is the error returned when the operation fails.

Functions

This section is empty.

Types

type Errors added in v1.0.7

type Errors struct {
	// Attempts holds the trace of each attempt in order.
	Attempts []error
	// Last holds the last error returned by the retry function.
	Last error
}

Errors holds the error returned by the retry function along with the trace of each attempt.

func DoWithResult added in v1.1.2

func DoWithResult[T any](ctx context.Context, r *Retrier, fn func() (T, error), temporaryErrors ...error) (T, *Errors)

DoWithResult retries a function that returns a result and an error.

func (*Errors) Join added in v1.1.2

func (e *Errors) Join() error

Join aggregates all attempt errors into one.

func (*Errors) Reset added in v1.1.2

func (e *Errors) Reset()

Reset clears stored errors.

type Hooks added in v1.1.2

type Hooks struct {
	// OnRetry is called before waiting for the next retry interval.
	OnRetry func(attempt int, err error)
}

Hooks defines callback functions invoked by the retrier.

type Option added in v1.0.5

type Option func(*Retrier)

Option is a function type that can be used to configure the `Retrier` struct.

func WithBackoffFactor added in v1.0.9

func WithBackoffFactor(factor float64) Option

WithBackoffFactor returns an option that sets the backoff factor.

func WithHooks added in v1.1.2

func WithHooks(h Hooks) Option

WithHooks sets hooks executed during retries.

func WithInterval added in v1.0.5

func WithInterval(interval time.Duration) Option

WithInterval returns an option that sets the interval.

func WithJitter added in v1.0.5

func WithJitter(jitter time.Duration) Option

WithJitter returns an option that sets the jitter.

func WithLogger added in v1.1.2

func WithLogger(logger *slog.Logger) Option

WithLogger sets a slog logger.

func WithMaxRetries added in v1.0.5

func WithMaxRetries(num int) Option

WithMaxRetries returns an option that sets the maximum number of retries after the first attempt.

func WithTimeout added in v1.0.5

func WithTimeout(timeout time.Duration) Option

WithTimeout returns an option that sets the timeout.

type Registry added in v1.0.9

type Registry struct {
	// contains filtered or unexported fields
}

Registry holds a set of temporary errors.

func NewRegistry added in v1.0.1

func NewRegistry() *Registry

NewRegistry creates a new Registry.

func (*Registry) Clean added in v1.0.9

func (r *Registry) Clean()

Clean removes all temporary errors from the registry.

func (*Registry) IsTemporaryError added in v1.1.0

func (r *Registry) IsTemporaryError(err error, errs ...error) bool

IsTemporaryError reports whether err matches any of the temporary errors.

func (*Registry) Len added in v1.1.1

func (r *Registry) Len() int

Len returns the number of registered temporary errors.

func (*Registry) ListTemporaryErrors added in v1.0.9

func (r *Registry) ListTemporaryErrors() []error

ListTemporaryErrors returns all temporary errors in the registry.

func (*Registry) LoadDefaults added in v1.0.9

func (r *Registry) LoadDefaults() *Registry

LoadDefaults loads a set of default temporary errors.

func (*Registry) RegisterTemporaryError added in v1.0.9

func (r *Registry) RegisterTemporaryError(err error)

RegisterTemporaryError registers a temporary error.

func (*Registry) RegisterTemporaryErrors added in v1.0.9

func (r *Registry) RegisterTemporaryErrors(errs ...error)

RegisterTemporaryErrors registers multiple temporary errors.

func (*Registry) UnRegisterTemporaryError added in v1.0.9

func (r *Registry) UnRegisterTemporaryError(err error)

UnRegisterTemporaryError removes a temporary error.

func (*Registry) UnRegisterTemporaryErrors added in v1.0.9

func (r *Registry) UnRegisterTemporaryErrors(errs ...error)

UnRegisterTemporaryErrors removes multiple temporary errors.

type Retrier

type Retrier struct {
	// MaxRetries is the maximum number of retries after the first attempt.
	MaxRetries int
	// Jitter is the amount of jitter to apply to the retry interval.
	Jitter time.Duration
	// BackoffFactor is the factor to apply to the retry interval.
	BackoffFactor float64
	// Interval is the interval between retries.
	Interval time.Duration
	// Timeout is the timeout for the retry function.
	Timeout time.Duration
	// Registry is the registry for temporary errors.
	Registry *Registry

	// Logger used for logging attempts.
	Logger *slog.Logger
	// Hooks executed during retries.
	Hooks Hooks
	// contains filtered or unexported fields
}

Retrier is a type that retries a function until it returns a nil error or the maximum number of retries is reached.

func NewRetrier

func NewRetrier(ctx context.Context, opts ...Option) (retrier *Retrier, err error)

NewRetrier returns a new Retrier configured with the given options. If no options are provided, the default options are used. The default options are:

  • MaxRetries: 5 (retries after the first attempt)
  • Jitter: 1 * time.Second
  • Interval: 500 * time.Millisecond
  • Timeout: 20 * time.Second

func (*Retrier) Cancel added in v1.0.4

func (r *Retrier) Cancel()

Cancel cancels the retries, notifying the `Do` function to return.

func (*Retrier) Do added in v1.0.7

func (r *Retrier) Do(ctx context.Context, retryableFunc RetryableFunc, temporaryErrors ...error) (errs *Errors)

Do retries a `retryableFunc` until it returns a nil error or the maximum number of retries is reached.

  • If the maximum number of retries is reached, the function returns an `Errors` object.
  • If the `retryableFunc` returns a nil error, the function returns with `Errors.Last` set to nil.
  • If the `retryableFunc` returns a temporary error, the call is retried.
  • If the `retryableFunc` returns a non-temporary error, the function assigns the error to `Errors.Last` and returns.
  • If the `temporaryErrors` list is empty and the registry has entries, only those errors are retried.
  • If the `temporaryErrors` list is empty and the registry is empty, all errors are retried.
  • The context is checked between attempts; long-running functions should handle cancellation themselves.

func (*Retrier) DoWithContext added in v1.2.0

func (r *Retrier) DoWithContext(ctx context.Context, retryableFunc RetryableFuncWithContext, temporaryErrors ...error) (errs *Errors)

DoWithContext retries a context-aware function until it succeeds, a terminal error is encountered, the provided context is canceled, the retrier timeout elapses, or the maximum number of retries is reached. It is the context-aware counterpart of Do.

The behavior of DoWithContext mirrors Do:

  • The retryableFunc is invoked at most MaxRetries+1 times (the initial attempt plus up to MaxRetries retries) until it returns a nil error.
  • Any error returned by retryableFunc that matches one of the temporaryErrors is treated as transient. That error is appended to errs.Attempts and the call is retried according to the configured backoff, jitter, and interval settings.
  • Any error that does not match temporaryErrors is considered terminal. In that case, the retry loop stops immediately and errs.Last is set to that error without performing further retries.
  • If the Retrier is configured with a registry, attempt and error information are recorded in the registry in the same way as for Do.

When the maximum number of retries (MaxRetries) is reached without a successful (nil) result, the last attempt error is wrapped with ErrMaxRetriesReached and appended to errs.Attempts. In this case, errs.Last contains the last error returned by retryableFunc.

Context and timeout handling:

  • The provided ctx is observed on each attempt and during backoff delays. If ctx is canceled or its deadline is exceeded, DoWithContext stops retrying and returns immediately, with errs.Last set to the corresponding context error or any wrapped form produced by the internals of the retrier.
  • In addition to ctx, the Retrier's Timeout field enforces an overall timeout for the entire operation. If this timeout elapses first, DoWithContext stops retrying and returns with errs.Last set to ErrTimeoutReached (wrapped with the attempt number).

Use DoWithContext when the operation being retried accepts a context and must support cancellation. Use Do for retrying functions that do not take a context.

func (*Retrier) PutErrors added in v1.1.2

func (r *Retrier) PutErrors(errs *Errors)

PutErrors returns an Errors object to the pool after resetting it.

func (*Retrier) SetRegistry

func (r *Retrier) SetRegistry(reg *Registry) error

SetRegistry sets the registry for temporary errors. Use this function to set a custom registry if:

  • you want to add custom temporary errors.
  • you want to remove the default temporary errors.
  • you want to replace the default temporary errors with your own.
  • you have initialized the Retrier without using the constructor `NewRetrier`.

func (*Retrier) Stop added in v1.1.2

func (r *Retrier) Stop()

Stop stops the retries.

func (*Retrier) Validate added in v1.0.8

func (r *Retrier) Validate() error

Validate validates the Retrier. It returns an error if any of the following holds:

  • `MaxRetries` is less than zero
  • `Interval` is greater than or equal to `Timeout`
  • The total time consumed by all retries (`Interval` multiplied by `MaxRetries`) is greater than or equal to `Timeout`

type RetryableFunc added in v1.0.5

type RetryableFunc func() error

RetryableFunc is the signature of a retryable function.

type RetryableFuncWithContext added in v1.2.0

type RetryableFuncWithContext func(ctx context.Context) error

RetryableFuncWithContext is a retryable function that observes context cancellation.

type TimerPool added in v1.0.9

type TimerPool struct {
	// contains filtered or unexported fields
}

TimerPool is a pool of timers.

func NewTimerPool added in v1.0.4

func NewTimerPool(size int, timeout time.Duration) *TimerPool

NewTimerPool creates a new timer pool.

func (*TimerPool) Close added in v1.0.9

func (p *TimerPool) Close()

Close closes the pool.

func (*TimerPool) Drain added in v1.0.9

func (p *TimerPool) Drain()

Drain drains the pool.

func (*TimerPool) Get added in v1.0.9

func (p *TimerPool) Get() *time.Timer

Get retrieves a timer from the pool.

func (*TimerPool) Len added in v1.0.9

func (p *TimerPool) Len() int

Len returns the number of timers in the pool.

func (*TimerPool) Put added in v1.0.9

func (p *TimerPool) Put(timer *time.Timer)

Put returns a timer back into the pool.

Directories

Path          Synopsis
__examples
  chan        command
  context     command
  scheduler   command
  timeout     command
  validate    command
pkg
