When an existing implementation already encodes the behavior you want, make Claude prove it understood the reference — as a reviewable artifact — before a single line gets ported.
This Rust crate in vendor/rate-limiter implements the exact backoff behavior I want. Read it and reimplement the same semantics in our TypeScript API client — but first show me a semantics map so I can confirm you understood it.
What the crate does, how each behavior maps to TypeScript, and every place the port cannot be a literal translation. Nothing gets implemented until you sign off on this page.
- Token bucket (`bucket.rs`): every outgoing request withdraws 1 token from a bucket that refills at `rate_per_sec = 8` up to `burst_capacity = 40`. The bucket starts full, so a cold client can burst 40 requests instantly. `last_refill` only advances when at least one whole token is minted, so sub-token elapsed time carries forward.
- Backoff (`backoff.rs`): on retryable failure, the next delay is drawn uniformly from `[base, min(prev × 3, cap)]` with `base = 250ms`, `cap = 30s`. Each draw seeds the next, so delays wander rather than doubling in lockstep, which desynchronizes clients after an outage.
- Retry budget (`budget.rs`): a second, separate bucket. Every successful first-try request deposits 1; every retry attempt withdraws 10. When the balance can't cover a withdrawal, retries are refused entirely (first-try requests still go out). This caps retry amplification at ~10% during a sustained outage.
- API boundary: the crate's surface is `acquire()`, `next_delay()`, and `try_withdraw()`, and the caller decides what counts as retryable. The TS port keeps that boundary: Acme's `isRetryable()` in `src/api/errors.ts` stays the single source of truth.

In the code pairs below, bracketed markers such as `[1]` tie each Rust line to its TypeScript counterpart and to the numbered gotcha note that follows the pair.
Pair A, Rust:

```rust
fn refill(&mut self, now: Instant) {
    let elapsed = now
        .saturating_duration_since(self.last_refill); // [1]
    let new_tokens = elapsed.as_nanos() as u64
        * self.rate_per_sec as u64
        / 1_000_000_000; // [2]
    if new_tokens > 0 { // [3]
        self.tokens = (self.tokens + new_tokens)
            .min(self.burst_capacity);
        self.last_refill = now;
    }
}
```
Pair A, TypeScript:

```typescript
private refill(now: number): void {
  // now comes from performance.now(), not Date.now()
  const elapsedMs = Math.max(0, now - this.lastRefill); // [1]
  const newTokens = Math.floor(
    (elapsedMs * this.ratePerSec) / 1000
  ); // [2]
  if (newTokens > 0) { // [3]
    this.tokens = Math.min(
      this.tokens + newTokens,
      this.burstCapacity
    );
    this.lastRefill = now;
  }
}
```
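The slow-drip behavior behind the `newTokens > 0` guard can be checked with a small simulation. This is only a sketch under assumptions: `SketchBucket`, its `guarded` flag, and `tokensAfterPolling` are hypothetical names, timestamps are passed in as plain millisecond numbers instead of being read from `performance.now()`, and the bucket starts empty (unlike the crate's full bucket) so that minting is visible.

```typescript
// Hypothetical stand-in for the port's bucket: refill logic only.
class SketchBucket {
  tokens = 0; // assumption: start empty so every minted token is observable
  lastRefill = 0;
  private readonly ratePerSec: number;
  private readonly burstCapacity: number;
  private readonly guarded: boolean;

  constructor(ratePerSec: number, burstCapacity: number, guarded: boolean) {
    this.ratePerSec = ratePerSec;
    this.burstCapacity = burstCapacity;
    this.guarded = guarded; // false models the "simplified" variant without the guard
  }

  refill(now: number): void {
    const elapsedMs = Math.max(0, now - this.lastRefill);
    const newTokens = Math.floor((elapsedMs * this.ratePerSec) / 1000);
    // Guarded: lastRefill moves only when a whole token is minted.
    // Unguarded: lastRefill moves on every call, discarding sub-token time.
    if (newTokens > 0 || !this.guarded) {
      this.tokens = Math.min(this.tokens + newTokens, this.burstCapacity);
      this.lastRefill = now;
    }
  }
}

// Poll the bucket every pollMs for totalMs and report the tokens minted.
function tokensAfterPolling(guarded: boolean, pollMs: number, totalMs: number): number {
  const bucket = new SketchBucket(8, 40, guarded);
  for (let t = pollMs; t <= totalMs; t += pollMs) {
    bucket.refill(t);
  }
  return bucket.tokens;
}

const withGuard = tokensAfterPolling(true, 20, 1000);
const withoutGuard = tokensAfterPolling(false, 20, 1000);
```

At 20ms polling, the guarded bucket mints on the first poll at or after 125ms of accumulated time, which is every 140ms, so it holds 7 tokens after one second. The unguarded variant sees 0.16 tokens per poll, floors that to zero, and never mints.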
Notes for Pair A:

1. `saturating_duration_since` clamps negative elapsed time to zero. `Date.now()` can jump backwards under NTP correction; the port uses monotonic `performance.now()` and keeps `Math.max(0, …)` as a belt-and-suspenders match.
2. `u64` division truncates; JS division doesn't. `Math.floor` restores truncation. Millis instead of nanos is safe: at rate = 8, `elapsedMs * 8` stays far below 2⁵³, so no precision loss.
3. `last_refill` only advances when a whole token is minted. Dropping this guard (an easy "simplification") silently discards sub-token progress on every call; at low rates the bucket would never refill under frequent polling. Preserved exactly, plus a regression test.

Pair B, Rust:

```rust
fn next_delay(&mut self) -> Duration {
    let hi = (self.prev_delay_ms.saturating_mul(3)) // [4]
        .min(self.cap_ms);
    let lo = self.base_ms;
    let ms = self.rng
        .gen_range(lo..=hi.max(lo)); // [5]
    self.prev_delay_ms = ms; // [6]
    Duration::from_millis(ms)
}
```
Pair B, TypeScript:

```typescript
nextDelay(): number {
  const hi = Math.min(this.prevDelayMs * 3, this.capMs); // [4]
  const lo = this.baseMs;
  const span = Math.max(hi, lo) - lo;
  const ms = lo + Math.floor(
    this.random() * (span + 1)
  ); // [5]
  this.prevDelayMs = ms; // [6]
  return ms;
}
```
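The inclusive upper bound is easy to get wrong, so a stubbed random source is a cheap way to confirm it. The sketch below is illustrative only: `drawInclusive`, `drawNaive`, `almostOne`, and `zero` are hypothetical helpers, not names from the crate or the port, and the range `[250, 750]` is just one possible second draw (`base = 250`, `prev × 3 = 750`).

```typescript
// Mirrors the port's draw: floor(random * (span + 1)) can reach span itself.
function drawInclusive(lo: number, hi: number, random: () => number): number {
  const span = Math.max(hi, lo) - lo;
  return lo + Math.floor(random() * (span + 1));
}

// The naive variant without the + 1: the top of the range is unreachable.
function drawNaive(lo: number, hi: number, random: () => number): number {
  const span = Math.max(hi, lo) - lo;
  return lo + Math.floor(random() * span);
}

// Math.random() returns values in [0, 1); 0.9999999999999999 is the largest
// double below 1, so these two stubs cover both extremes of the generator.
const almostOne = (): number => 0.9999999999999999;
const zero = (): number => 0;

const inclusiveMax = drawInclusive(250, 750, almostOne);
const naiveMax = drawNaive(250, 750, almostOne);
const inclusiveMin = drawInclusive(250, 750, zero);
// When hi < lo, the range degenerates to the single value lo.
const degenerate = drawInclusive(250, 100, almostOne);
```

With these stubs the inclusive draw spans 250 through 750, while the naive draw tops out at 749. A seeded generator in place of the stubs would give the deterministic tests the port describes.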
Notes for Pair B:

4. `saturating_mul(3)` guards `u64` overflow. In JS this can't overflow: `prevDelayMs` never exceeds `capMs = 30_000`, so the product stays far below 2⁵³ and the guard is deliberately dropped (see §3, "dropped" column).
5. `lo..=hi` includes both endpoints. Naive `lo + random() * (hi - lo)` never yields `hi`. The `+ 1` inside `Math.floor` restores inclusivity. It is a one-character bug magnet, called out so you can veto or bless it.
6. Each draw becomes the next `prev`; this is what makes it decorrelated rather than plain expo-backoff-with-jitter. `reset()` restores `prevDelayMs = baseMs` on success, matching the crate's `Backoff::reset`. `this.random` is injectable for deterministic tests (the crate uses a seeded `SmallRng` in its tests).

Pair C, Rust:

```rust
pub fn try_withdraw(&self) -> bool {
    let mut b = self.inner.lock().unwrap(); // [7]
    b.deposit_drip(Instant::now());
    if b.balance >= WITHDRAW_COST {
        b.balance -= WITHDRAW_COST; // [8]
        true
    } else {
        false // [9] refuse retry, don't queue
    }
}
```
Pair C, TypeScript:

```typescript
tryWithdraw(): boolean {
  // [7] no lock: single-threaded event loop,
  // but NO await between check and debit.
  this.depositDrip(this.clock());
  if (this.balance >= WITHDRAW_COST) {
    this.balance -= WITHDRAW_COST; // [8]
    return true;
  }
  return false; // [9]
}
```
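The hazard behind the "no await between check and debit" rule can be reproduced in a few lines. This is a sketch under assumptions, not the port's code: `SketchBudget`, `tryWithdrawLeaky`, and `demo` are hypothetical, the drip deposit is omitted, and `await Promise.resolve()` stands in for any async call someone might add later (logging, for example).

```typescript
const WITHDRAW_COST = 10;

class SketchBudget {
  balance: number;

  constructor(initial: number) {
    this.balance = initial;
  }

  // Same shape as the port: check and debit happen in one synchronous step.
  tryWithdraw(): boolean {
    if (this.balance >= WITHDRAW_COST) {
      this.balance -= WITHDRAW_COST;
      return true;
    }
    return false;
  }

  // Broken variant: the await yields to the event loop after the check,
  // so a second caller can pass the same check before the first debit lands.
  async tryWithdrawLeaky(): Promise<boolean> {
    if (this.balance >= WITHDRAW_COST) {
      await Promise.resolve(); // placeholder for an async log call
      this.balance -= WITHDRAW_COST;
      return true;
    }
    return false;
  }
}

async function demo(): Promise<{ sync: boolean[]; leaky: boolean[]; leakyBalance: number }> {
  // A balance of 10 covers exactly one retry.
  const a = new SketchBudget(10);
  const sync = [a.tryWithdraw(), a.tryWithdraw()];

  const b = new SketchBudget(10);
  const leaky = await Promise.all([b.tryWithdrawLeaky(), b.tryWithdrawLeaky()]);
  return { sync, leaky, leakyBalance: b.balance };
}
```

In the synchronous version the second withdrawal is refused. In the leaky version both calls are admitted and the balance goes negative, which is the double-spend the Rust `Mutex` prevented.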
Notes for Pair C:

7. `Mutex` exists because Rust callers withdraw from worker threads. Acme's client runs on one event loop, so the lock disappears, but the atomicity it provided must be preserved by convention: `tryWithdraw` is synchronous end-to-end, and I'll add an eslint `no-await-in-budget` boundary comment plus a test that the method never returns a `Promise`.
8. Check and debit must stay adjacent. The failure mode is someone inserting an `await` (e.g., to log) between the check and the debit, letting two in-flight retries both pass the check.
9. Refusal returns `false` immediately: the request either goes out as a first try or fails fast. The port must not "helpfully" enqueue the retry for later; that would rebuild the retry storm the budget exists to prevent.

Preserved:

- `lastRefill` advances only on mint
- Draw range `[base, min(prev×3, cap)]`, inclusive both ends
- Bucket starts full (crate test `burst_at_t0`)

Adapted:

- `Instant` → `performance.now()`: both monotonic; ms resolution is sufficient at rate 8/s
- `u64` nanos → `number` ms: all products provably < 2⁵³; `Math.floor` replays integer division
- `Mutex<Budget>` → plain fields: atomicity by sync-only convention + test
- `SmallRng` → injected `random()`: defaults to `Math.random`, seeded stub in tests

Dropped:

- `saturating_mul` overflow guards: unreachable once cap is applied in ms range
- `Send + Sync` impls, `Arc` cloning: no threads to share across
- `tokio`/`async-std` feature flags: port is runtime-agnostic by construction
- … `telemetry.track()` instead; hook points kept

| Edge case | Rust crate | TypeScript port | Match |
|---|---|---|---|
| Clock skew: system clock steps back 5s mid-session | `Instant` is monotonic, so unaffected. `saturating_duration_since` is a second fence. | `performance.now()` is monotonic, so unaffected. `Math.max(0, …)` kept as the same second fence. | identical |
| Burst at t=0: fresh client fires 45 requests at once | First 40 admitted instantly (bucket starts full); 41–45 rejected until refill. Crate test: `burst_at_t0`. | Same: 40 admitted, 5 rejected with `RateLimited`. Port test replays the crate's fixture numbers verbatim. | identical |
| Budget exhaustion: sustained 100% failure for 60s | Balance drains to 0 in ~10 retries; further retries refused, first-tries continue. Retry rate settles at deposit÷cost = 10% of success rate (0 here). | Same economics, same settle point. Difference: refusal surfaces as a `RetryBudgetExhausted` error so Acme's upload queue can show "reconnecting" instead of failing silently. | equivalent* |
| Slow drip: rate 8/s but polled every 20ms (0.16 tokens/poll) | Guard carries sub-token time forward; a token mints on the first poll at or after 125ms of accumulated time (every 140ms at this cadence), instead of never. | Same, via the preserved `newTokens > 0` guard (Pair A, note 3). Regression test asserts mint cadence at 20ms polling. | identical |
| Delay at cap: 10th consecutive failure | Upper bound is clamped at the cap, so the draw range is at most [250ms, 30s]; the delay stays ≤ 30s but is still jittered, never a fixed 30s (avoids re-synchronizing clients). | Same bounds, same non-degenerate jitter, verified with a seeded RNG replaying the crate's `cap_still_jitters` test vector. | identical |

* “equivalent” = same decision, different surface. The crate returns bare false; the port wraps it in a typed error because Acme’s UI needs to distinguish “rate limited” from “offline”. Flag this row if you’d rather keep the bare boolean.
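The budget economics in the "Budget exhaustion" row can be sanity-checked with plain arithmetic. The sketch below is a hypothetical simulation, not the crate's API: `admittedRetries` is an invented name, the balance is assumed to start at 0 with no upper cap, and the time-based drip suggested by `deposit_drip` is left out so that only the deposit-1, withdraw-10 ratio is exercised.

```typescript
const DEPOSIT_PER_SUCCESS = 1;
const COST_PER_RETRY = 10;

// Deposits for every success, then counts how many retry attempts the
// accumulated balance can pay for. Refused attempts are dropped, not queued.
function admittedRetries(successes: number, retryAttempts: number): number {
  let balance = successes * DEPOSIT_PER_SUCCESS;
  let admitted = 0;
  for (let i = 0; i < retryAttempts; i++) {
    if (balance >= COST_PER_RETRY) {
      balance -= COST_PER_RETRY;
      admitted++;
    }
  }
  return admitted;
}

// 100 successes fund at most 10 retries, however many are attempted.
const afterHealthyTraffic = admittedRetries(100, 1000);
// With no successes at all, nothing is deposited and every retry is refused.
const duringTotalOutage = admittedRetries(0, 1000);
```

Under these assumptions the admitted retries never exceed 10% of the successes, and a total outage admits none once the balance is spent, which matches the settle point described in the table.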
Reply “semantics confirmed” and I’ll implement tokenBucket.ts, backoff.ts, and budget.ts with the crate’s 14 tests translated first. Or correct any row above — quote its number (e.g. “note 5”, “budget exhaustion row”) and I’ll revise the map before writing code.