Rate Limiting Examples - Inspiration & Best Practices
Discover rate limiting examples and learn how platforms restrict API requests to prevent abuse, ensure stability, and guarantee fair usage.
Rate limiting restricts the number of requests a client can make within a given time period. It protects your API against abuse, prevents backend services from being overloaded, and keeps usage fair across customers. The right strategy, from a simple token bucket to a sliding window algorithm, depends on your use case, scaling requirements, and traffic patterns. The examples below show how different platforms implement rate limiting effectively.
Public API with tiered rate limits
A data provider implemented differentiated rate limits based on the subscription tier of API consumers. Free tier customers are allowed 100 requests per minute, Pro tier 1,000, and Enterprise tier 10,000. A sliding window log algorithm stores a timestamp for each request and counts the ones that fall inside a moving time window, which avoids the bursts at window boundaries that a fixed-window counter allows. Response headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) give clients visibility into their usage and remaining quota.
- Sliding window log algorithm for accurate request counting
- Differentiated limits per subscription tier
- Standard rate limit response headers for client information
- Over-limit requests rejected with 429 Too Many Requests and a Retry-After header
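The tiered sliding window log described above could look roughly like the following Python sketch. It is an illustration under assumptions, not the data provider's implementation: the class name SlidingWindowLog, the in-memory storage, and the choice to report X-RateLimit-Reset as seconds until the next slot frees up are all ours (some APIs report an epoch timestamp instead).

```python
import math
from collections import defaultdict, deque

# Per-minute limits taken from the example above.
TIER_LIMITS = {"free": 100, "pro": 1_000, "enterprise": 10_000}
WINDOW_SECONDS = 60.0


class SlidingWindowLog:
    """Stores one timestamp per accepted request and counts those still in the window."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.logs = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def check(self, client_id, tier, now):
        """Return (status, headers) for a request arriving at time `now` in seconds."""
        limit = TIER_LIMITS[tier]
        log = self.logs[client_id]
        # Drop timestamps that have slid out of the window.
        while log and log[0] <= now - self.window:
            log.popleft()
        allowed = len(log) < limit
        if allowed:
            log.append(now)
        # Assumption: "reset" means seconds until the oldest logged request expires.
        reset_in = math.ceil(log[0] + self.window - now) if log else 0
        headers = {
            "X-RateLimit-Limit": str(limit),
            "X-RateLimit-Remaining": str(limit - len(log)),
            "X-RateLimit-Reset": str(reset_in),
        }
        if not allowed:
            headers["Retry-After"] = str(reset_in)
        return (200 if allowed else 429), headers
```

In this sketch, a Free tier client that sends 100 requests within ten seconds receives a 429 on the next request, with Retry-After set to the time remaining until its oldest logged request leaves the window. A production version would typically keep the log in a shared store so that all API nodes see the same counts.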
Authentication endpoint with brute-force protection
A fintech platform implemented aggressive rate limiting on login endpoints to prevent brute-force attacks. After 5 failed attempts per IP address, the IP is blocked for 15 minutes. After 10 failed attempts per account, the account is temporarily locked and the owner is notified. A progressive delay exponentially increases the wait time after each failed attempt, making automated attacks impractical.
- IP-based throttling after repeated failed login attempts
- Account-level lockout with automatic owner notification
- Progressive delay with exponential backoff per failed attempt
- CAPTCHA challenge on suspicious patterns, before lockout
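The thresholds from this example can be sketched in a few lines of Python. The limits (5 failures per IP, 15 minute block, 10 failures per account) come from the text above; the base delay, the delay cap, the account lock duration, and the names LoginGuard and notify_owner are assumptions made for illustration. The CAPTCHA step is left out.

```python
from collections import defaultdict

IP_MAX_FAILURES = 5            # from the example: block the IP after 5 failures
IP_BLOCK_SECONDS = 15 * 60     # ... for 15 minutes
ACCOUNT_MAX_FAILURES = 10      # from the example: lock the account after 10 failures
ACCOUNT_LOCK_SECONDS = 30 * 60 # assumption: the source only says "temporarily"
BASE_DELAY_SECONDS = 1.0       # assumption: the source gives no base delay
MAX_DELAY_SECONDS = 60.0       # assumption: cap so the delay stays bounded


def progressive_delay(failures):
    """Delay that doubles with each failure (1s, 2s, 4s, ...) up to the cap."""
    if failures <= 0:
        return 0.0
    return min(BASE_DELAY_SECONDS * 2 ** (failures - 1), MAX_DELAY_SECONDS)


class LoginGuard:
    def __init__(self, notify_owner=print):
        self.ip_failures = defaultdict(int)
        self.account_failures = defaultdict(int)
        self.ip_blocked_until = {}
        self.account_locked_until = {}
        self.notify_owner = notify_owner  # hypothetical callback, e.g. sends an email

    def is_allowed(self, ip, account, now):
        """True if neither the IP block nor the account lock is active at `now`."""
        return (self.ip_blocked_until.get(ip, 0.0) <= now
                and self.account_locked_until.get(account, 0.0) <= now)

    def record_failure(self, ip, account, now):
        """Register a failed login and return the delay before the next attempt."""
        self.ip_failures[ip] += 1
        self.account_failures[account] += 1
        delay = progressive_delay(self.account_failures[account])
        if self.ip_failures[ip] >= IP_MAX_FAILURES:
            self.ip_blocked_until[ip] = now + IP_BLOCK_SECONDS
            self.ip_failures[ip] = 0
        if self.account_failures[account] >= ACCOUNT_MAX_FAILURES:
            self.account_locked_until[account] = now + ACCOUNT_LOCK_SECONDS
            self.account_failures[account] = 0
            self.notify_owner(account)
        return delay

    def record_success(self, ip, account):
        """A successful login clears the failure counters."""
        self.ip_failures.pop(ip, None)
        self.account_failures.pop(account, None)
```

Counting per IP and per account separately matters here: the IP counter stops a single source hammering many accounts, while the account counter catches an attack on one account that is spread across many addresses.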
GraphQL API with complexity-based limiting
A GraphQL API implemented rate limiting based on query complexity rather than request count. Each query is analysed and assigned a complexity score based on nested fields, list lengths, and joins. Users have a complexity budget per minute: a simple query costs 1 point, while a deeply nested query with multiple relations costs 50 points. This prevents a single complex query from overloading the server while simple queries flow freely.
- Complexity scoring per GraphQL query based on nesting and fields
- Complexity budget per minute instead of request counting
- Query depth limiting as additional abuse protection
- Persisted queries for pre-approved query patterns
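To make the scoring idea concrete, here is a minimal Python sketch of complexity scoring with a per-minute budget. It does not use a GraphQL library: a query is modelled as a small tree of QueryField objects, and the cost rule (each field costs 1, and a list multiplies the cost of everything inside it by the requested page size) is one plausible assumption, not the scoring used by the API in the example. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QueryField:
    name: str
    list_size: int = 1  # requested page size for list fields, 1 otherwise
    children: list = field(default_factory=list)


def complexity(f):
    """1 point for the field plus its children's cost, multiplied by the list size."""
    return f.list_size * (1 + sum(complexity(c) for c in f.children))


def depth(f):
    """Nesting depth of the query tree; a leaf field has depth 1."""
    return 1 + max((depth(c) for c in f.children), default=0)


class ComplexityBudget:
    """Per-user complexity points per calendar minute (a fixed window, for brevity)."""

    def __init__(self, points_per_minute, max_depth):
        self.points_per_minute = points_per_minute
        self.max_depth = max_depth
        self.windows = {}  # user -> (minute index, points spent in that minute)

    def try_spend(self, user, query, now):
        """Return True and charge the budget if the query may run at time `now`."""
        if depth(query) > self.max_depth:
            return False  # depth limit acts as an extra guard
        cost = complexity(query)
        minute = int(now // 60)
        start, spent = self.windows.get(user, (minute, 0))
        if start != minute:
            start, spent = minute, 0  # a new minute resets the budget
        if spent + cost > self.points_per_minute:
            return False
        self.windows[user] = (start, spent + cost)
        return True


simple = QueryField("viewer")
nested = QueryField("users", list_size=5, children=[
    QueryField("name"),
    QueryField("orders", list_size=4, children=[QueryField("total")]),
])
```

Under this cost rule the `simple` query scores 1 point and the `nested` query scores 50, which mirrors the figures in the example: with a budget of 100 points per minute, a user could run the nested query twice or the simple query a hundred times.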
Webhook delivery with retry-aware throttling
An integration platform sends webhooks to thousands of customer endpoints. Sender-side rate limiting prevents any single endpoint from being overwhelmed, with a maximum of 10 webhooks per second per endpoint. On a 429 response from the endpoint, the system switches to exponential backoff with jitter. A circuit breaker temporarily stops all deliveries to an endpoint that repeatedly fails, resuming automatically after a cooldown period.
- Per-endpoint rate limiting on the sender side
- Exponential backoff with jitter on 429 responses
- Circuit breaker for endpoints that repeatedly fail
- Priority queue for critical webhooks over routine events
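The retry behaviour described above combines two small pieces, sketched here in Python. The backoff uses the "full jitter" variant (a uniform random delay up to an exponentially growing ceiling), and the circuit breaker lets a trial delivery through once the cooldown has passed. The base delay, cap, failure threshold, and cooldown values are assumptions, since the example does not state them, and the per-endpoint send limit and priority queue are not shown.

```python
import random


def backoff_delay(attempt, base=1.0, cap=300.0, rng=random):
    """Full jitter: uniform delay in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0.0, min(cap, base * 2 ** attempt))


class CircuitBreaker:
    """Stops deliveries to one endpoint after repeated failures, then retries later."""

    def __init__(self, failure_threshold=5, cooldown=60.0):
        self.failure_threshold = failure_threshold  # assumption: not given in the text
        self.cooldown = cooldown                    # assumption: not given in the text
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None while closed

    def allow(self, now):
        """True if a delivery may be attempted at time `now`."""
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial delivery through (half-open state).
        return now - self.opened_at >= self.cooldown

    def record_success(self):
        """A successful delivery closes the breaker and clears the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now):
        """Count a failed delivery; open (or re-open) the breaker at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now
```

The jitter is what keeps many queued deliveries from retrying at the same instant once an endpoint recovers. If the trial delivery after the cooldown fails, the breaker opens again for another full cooldown period.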
CDN with DDoS protection via rate limiting
A content platform implemented multi-layer rate limiting as DDoS protection. At the network level, a connection-rate limiter restricts the number of new TCP connections per IP. At the application level, a token bucket limits the number of requests per second. Suspicious traffic is redirected to a challenge page. Allowlisted IP ranges from known partners bypass the limits, while known botnets are on a deny list.
- Multi-layer rate limiting at network and application level
- Token bucket algorithm for burst-tolerant request limiting
- Challenge page for suspicious traffic before hard blocking
- Allow/deny lists for known partners and botnets
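The application-level part of this setup can be illustrated with a short Python sketch: a token bucket per client IP, checked after the allow and deny lists. The refill rate, burst size, and the two networks (taken from the IP ranges reserved for documentation) are placeholder assumptions, and the names TokenBucket and EdgeLimiter are hypothetical. The network-level connection limiter is out of scope here.

```python
import ipaddress


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`; each request costs one."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now):
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class EdgeLimiter:
    def __init__(self, rate, burst, allow_nets=(), deny_nets=()):
        self.rate = rate
        self.burst = burst
        self.allow_nets = [ipaddress.ip_network(n) for n in allow_nets]
        self.deny_nets = [ipaddress.ip_network(n) for n in deny_nets]
        self.buckets = {}  # client IP -> TokenBucket

    def decide(self, ip, now):
        """Return "deny", "allow", or "challenge" for a request from `ip`."""
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.deny_nets):
            return "deny"
        if any(addr in net for net in self.allow_nets):
            return "allow"  # known partners bypass the bucket
        bucket = self.buckets.get(ip)
        if bucket is None:
            bucket = self.buckets[ip] = TokenBucket(self.rate, self.burst, now)
        return "allow" if bucket.allow(now) else "challenge"


limiter = EdgeLimiter(
    rate=5, burst=10,
    allow_nets=["203.0.113.0/24"],   # placeholder partner range
    deny_nets=["198.51.100.0/24"],   # placeholder botnet range
)
```

The bucket capacity sets how large a burst is tolerated, while the refill rate sets the sustained request rate. In this sketch an over-limit client is sent to the challenge page rather than blocked outright, matching the order described in the example.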
Key takeaways
- Choose the right rate limiting algorithm for your use case: token bucket for bursts, sliding window for accuracy.
- Differentiated limits per subscription tier encourage upgrades and protect your platform.
- Standard response headers give clients the information to handle rate limits correctly.
- Brute-force protection on authentication endpoints requires more aggressive limits than regular API endpoints.
- Multi-layer rate limiting combines network-level and application-level protection for stronger DDoS resistance.
How MG Software can help
MG Software implements rate limiting strategies that protect your API without hindering legitimate users. From designing tier-based limits to implementing complexity-based GraphQL throttling and DDoS protection, we help keep your platform stable and secure under load.