Rate Limiting (Partial)

Token bucket (in-memory) per provider key concept for smoothing bursts.

// let limiter = RateLimit::per_minute(3000).burst(600);
// let client = AiClientBuilder::new(Provider::OpenAI).rate_limit(limiter).build()?;

Adaptive concurrency is implemented. Distributed state is planned for future releases.

Build: b635c6a · 2025-09-17T16:29:52.989Z · v0.21