Rate Limiting (Partial)

Token bucket (in-memory) per provider key concept for smoothing bursts.

// let limiter = RateLimit::per_minute(3000).burst(600);
// let client = AiClientBuilder::new(Provider::OpenAI).rate_limit(limiter).build()?;

Adaptive concurrency is implemented. Distributed state is planned for future releases.

Build: 3de64ed · 2025-09-09T12:50:59.651Z · v0.21