Release 0.5.8
adamw committed Dec 31, 2024
1 parent daefc1e commit 5769869
Showing 3 changed files with 82 additions and 9 deletions.
2 changes: 1 addition & 1 deletion generated-doc/out/utils/repeat.md
@@ -27,7 +27,7 @@ Similarly to the `retry` API, the `operation` can be defined:
The `repeat` config requires a `Schedule`, which indicates how many times and at what interval the `operation`
should be repeated.

In addition, it is possible to define a custom `shouldContinueOnSuccess` strategy for deciding if the operation
In addition, it is possible to define a custom `shouldContinueOnResult` strategy for deciding if the operation
should continue to be repeated after a successful result returned by the previous operation (defaults to `_: T => true`).

If an operation returns an error, the repeat loop will always be stopped. If an error handling within the operation
81 changes: 80 additions & 1 deletion generated-doc/out/utils/retries.md
@@ -121,7 +121,86 @@ retry(RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_.getMessage !=
retryEither(RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")))(eitherOperation)

// custom error mode
retryWithErrorMode(UnionMode[String])(RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")))(unionOperation)
retryWithErrorMode(UnionMode[String])(
  RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")))(unionOperation)
```

See the tests in `ox.resilience.*` for more.

## Adaptive retries

A retry strategy backed by a token bucket. Every retry costs a certain number of tokens from the bucket, and every success causes some tokens to be added back to the bucket. If there are not enough tokens, the retry is not attempted.

This way retries don't overload a system that is down due to a systemic failure (such as a bug in the code, excessive load etc.): retries will be attempted only as long as there are enough tokens in the bucket, then the load on the downstream system will be reduced so that it can recover. In contrast, with a "normal" retry strategy, where every operation is retried up to 3 times, a failure causes the load on the system to increase fourfold (one initial attempt plus three retries).

For transient failures (component failure, infrastructure issues etc.), retries still work "normally", as the bucket has enough tokens to cover the cost of multiple retries.
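The mechanism described above can be modelled in a few lines of plain Scala. This is only a sketch under stated assumptions, not the library's implementation: `SketchBucket` and `sketchRetry` are hypothetical names introduced for illustration, the bucket is assumed to start full, and failures are modelled as `Left` values.

```scala
// Illustrative model only; SketchBucket and sketchRetry are hypothetical names.
final class SketchBucket(capacity: Int):
  private var tokens: Int = capacity // assumption: the bucket starts full

  // Take `cost` tokens if available; returns whether a retry may proceed.
  def tryAcquire(cost: Int): Boolean = synchronized {
    if (tokens >= cost) { tokens -= cost; true } else false
  }

  // Give back `reward` tokens, never exceeding the capacity.
  def release(reward: Int): Unit = synchronized {
    tokens = math.min(capacity, tokens + reward)
  }

  def available: Int = synchronized(tokens)

// Runs `op` at most `1 + maxRetries` times. Every retry must first be paid for
// from the bucket; a final success adds `successReward` tokens back.
// Returns the last result together with the number of attempts made.
def sketchRetry[T](bucket: SketchBucket, maxRetries: Int, failureCost: Int, successReward: Int)(
    op: () => Either[String, T]
): (Either[String, T], Int) =
  var attempts = 1
  var result = op()
  while result.isLeft && attempts <= maxRetries && bucket.tryAcquire(failureCost) do
    attempts += 1
    result = op()
  if result.isRight then bucket.release(successReward)
  (result, attempts)
```

With a bucket of 10 tokens and a cost of 5, an always-failing operation gets two retries on the first call and none afterwards, until successes refill the bucket. That is the load-shedding behaviour the paragraphs above describe.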

### Inspiration

* [`AdaptiveRetryStrategy`](https://github.com/aws/aws-sdk-java-v2/blob/master/core/retries/src/main/java/software/amazon/awssdk/retries/AdaptiveRetryStrategy.java) from `aws-sdk-java-v2`
* ["Try again: The tools and techniques behind resilient systems" from re:Invent 2024](https://www.youtube.com/watch?v=rvHd4Y76-fs)

### Configuration

To use adaptive retries, create an instance of `AdaptiveRetry`. These instances are thread-safe and are designed to be shared. Typically, a single instance should be used to proxy access to a single constrained resource.

`AdaptiveRetry` is parametrized with:

* `tokenBucket: TokenBucket`: instances of `TokenBucket` can be shared across multiple instances of `AdaptiveRetry`
* `failureCost: Int`: the number of tokens needed for a retry in case of failure
* `successReward: Int`: the number of tokens added back to the token bucket after a success

`RetryConfig` and `ResultPolicy` are defined the same way as for the "normal" retry mechanism; all the configuration described above also applies here.

An instance with the default configuration can be obtained with `AdaptiveRetry.default` (bucket size = 500, cost for failure = 5 and reward for success = 1).
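To get a feel for the default numbers, here is a small arithmetic sketch. The helper names are hypothetical, and it assumes the bucket starts full: it computes how many retries a full bucket can pay for, and how many successes are needed to earn back the cost of one retry.

```scala
// Values quoted above for AdaptiveRetry.default.
val bucketSize = 500
val failureCost = 5
val successReward = 1

// Hypothetical helpers, for illustration only.
// Number of retries that `tokens` tokens can pay for.
def retriesFunded(tokens: Int, cost: Int): Int = tokens / cost

// Number of successes needed to earn back the cost of a single retry.
def successesPerRetry(cost: Int, reward: Int): Int =
  math.ceil(cost.toDouble / reward).toInt
```

Under these assumptions, a full default bucket pays for 100 retries, and each retry takes 5 successes to earn back.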

### API

`AdaptiveRetry` exposes three variants of retrying, which correspond to the three variants discussed above: `retry`, `retryEither` and `retryWithErrorMode`.

`retry` will attempt to retry an operation if it throws an exception; `retryEither` will additionally retry if the result is a `Left`. Finally, `retryWithErrorMode` is the most flexible, and allows retrying operations using custom failure modes (such as union types).

The methods have an additional parameter, `shouldPayFailureCost`, which determines whether a result should be considered a failure for the purpose of paying the retry cost. The penalty is paid only when the operation is going to be retried; it is never paid for a successful operation.
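The rule for when the penalty is paid can be expressed as a small pure function. This is a hedged sketch of the rule as stated above, not the library's code: `tokensCharged` and `willRetry` are hypothetical names, and the predicate takes an `Either` in the same shape as in the examples on this page.

```scala
// Hypothetical model: how many tokens one attempt's outcome costs.
// `willRetry` stands for the decision (made by the retry config) to try again.
def tokensCharged[E, T](
    result: Either[E, T],
    willRetry: Boolean,
    failureCost: Int,
    shouldPay: Either[E, T] => Boolean
): Int =
  // The penalty applies only if a retry will happen and the predicate says
  // that this outcome counts as a failure.
  if willRetry && shouldPay(result) then failureCost else 0

// Predicate in the style of the throttling example: throttling is not penalised.
val exceptThrottling: Either[String, Int] => Boolean =
  _.fold(_ != "throttling error", _ => true)
```

A `Left("throttling error")` that is retried costs nothing in this model, while any other retried error costs `failureCost` tokens.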

### Examples

To use this mechanism, run the operation through an instance of `AdaptiveRetry`:

```scala
import ox.UnionMode
import ox.resilience.AdaptiveRetry
import ox.resilience.{ResultPolicy, RetryConfig}
import ox.scheduling.{Jitter, Schedule}
import scala.concurrent.duration.*

def directOperation: Int = ???
def eitherOperation: Either[String, Int] = ???
def unionOperation: String | Int = ???

val adaptive = AdaptiveRetry.default

// various configs with custom schedules and default ResultPolicy
adaptive.retry(RetryConfig.immediate(3))(directOperation)
adaptive.retry(RetryConfig.delay(3, 100.millis))(directOperation)
adaptive.retry(RetryConfig.backoff(3, 100.millis))(directOperation) // defaults: maxDelay = 1.minute, jitter = Jitter.None
adaptive.retry(RetryConfig.backoff(3, 100.millis, 5.minutes, Jitter.Equal))(directOperation)

// result policies
// custom success
adaptive.retry[Int](
  RetryConfig(Schedule.Immediate(3), ResultPolicy.successfulWhen(_ > 0)))(directOperation)
// fail fast on certain errors
adaptive.retry(
  RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_.getMessage != "fatal error")))(directOperation)
adaptive.retryEither(
  RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")))(eitherOperation)

// custom error mode
adaptive.retryWithErrorMode(UnionMode[String])(
  RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")))(unionOperation)

// consider "throttling error" not as a failure that should incur the retry penalty
adaptive.retryWithErrorMode(UnionMode[String])(
  RetryConfig(Schedule.Immediate(3), ResultPolicy.retryWhen(_ != "fatal error")),
  shouldPayFailureCost = _.fold(_ != "throttling error", _ => true))(unionOperation)
```
8 changes: 1 addition & 7 deletions generated-doc/out/utils/scheduled.md
@@ -20,13 +20,7 @@ The `scheduled` config consists of:
- `Interval` - default for `repeat` operations, where the sleep is calculated as the duration provided by schedule
minus the duration of the last operation (can be negative, in which case the next operation occurs immediately).
- `Delay` - default for `retry` operations, where the sleep is just the duration provided by schedule.
- `onOperationResult` - a callback function that is invoked after each operation. Used primarily for `onRetry` in `retry` API.

In addition, it is possible to define strategies for handling the results and errors returned by the `operation`:
- `shouldContinueOnError` - defaults to `_: E => false`, which allows to decide if the scheduler loop should continue
after an error returned by the previous operation.
- `shouldContinueOnSuccess` - defaults to `_: T => true`, which allows to decide if the scheduler loop should continue
after a successful result returned by the previous operation.
- `afterAttempt` - a callback function that is invoked after each operation and determines if the scheduler loop should continue. Used for `onRetry`, `shouldContinueOnError`, `shouldContinueOnResult` and adaptive retries in `retry` API. Defaults to always continuing.
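
The difference between the `Interval` and `Delay` sleep modes listed above comes down to one subtraction. The following is a minimal sketch of that calculation, assuming a negative result is clamped to zero (that is, "occurs immediately"). `SketchSleepMode` and `sleepBeforeNext` are hypothetical names and not part of the library.

```scala
import scala.concurrent.duration.*

// Hypothetical stand-in for the two sleep modes described above.
enum SketchSleepMode:
  case Interval, Delay

// How long to sleep before the next operation.
def sleepBeforeNext(
    mode: SketchSleepMode,
    scheduled: FiniteDuration,
    lastOpTook: FiniteDuration
): FiniteDuration =
  mode match
    // Interval: subtract the time the last operation took; never sleep a negative amount.
    case SketchSleepMode.Interval => (scheduled - lastOpTook).max(Duration.Zero)
    // Delay: always sleep for the full scheduled duration.
    case SketchSleepMode.Delay => scheduled
```

For a 100 ms schedule and an operation that took 30 ms, `Interval` sleeps for 70 ms while `Delay` sleeps for the full 100 ms.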

## Schedule
