Monthly Quotas
Understand your monthly request allowance and how to manage your API usage.
Monthly Quota vs Rate Limiting: These are separate concepts. Monthly quotas control your total requests per billing period. Rate limiting controls how fast you can make requests (per second).
What is a Monthly Quota?
Your monthly quota is the total number of API requests you can make in a billing period. Unlike rate limiting (which controls request speed), quotas control your total usage volume.
Quotas by Plan
Your monthly quota is determined by your subscription plan. Upgrading your plan increases your allowance and unlocks overage pricing.
| Plan | Monthly Quota | Rate Limit | Overage |
|---|---|---|---|
| Free | 5,000/month | 10/sec | Blocked (upgrade required) |
| Starter ($29/mo) | 50,000/month | 25/sec | $1.50 per 1,000 requests |
| Growth ($99/mo) | 250,000/month | 50/sec | $1.00 per 1,000 requests |
| Business ($299/mo) | 1,000,000/month | 100/sec | $0.70 per 1,000 requests |
| Enterprise | Unlimited | Custom | N/A |
Paid plans include overage. When you exceed your monthly quota on paid plans, you can continue making requests and pay overage fees. Business offers the best rate at $0.70/1K. View all plans →
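To make the overage arithmetic concrete, the sketch below estimates a monthly bill from the plan table above. The function names are hypothetical, and it assumes overage is prorated per request rather than billed in whole blocks of 1,000, which the table does not specify.

```python
# Plan figures copied from the table above:
# (base price in USD, included requests, overage in USD per 1,000 requests)
PLANS = {
    "starter": (29.00, 50_000, 1.50),
    "growth": (99.00, 250_000, 1.00),
    "business": (299.00, 1_000_000, 0.70),
}


def estimate_monthly_cost(plan: str, requests_made: int) -> float:
    """Estimate one billing period's cost (assumes prorated overage)."""
    base_price, included, overage_per_1k = PLANS[plan]
    extra_requests = max(0, requests_made - included)
    return round(base_price + extra_requests / 1000 * overage_per_1k, 2)


def cheapest_plan(requests_made: int) -> str:
    """Return the paid plan with the lowest estimated cost for this volume."""
    return min(PLANS, key=lambda plan: estimate_monthly_cost(plan, requests_made))


print(estimate_monthly_cost("starter", 80_000))  # 29 + 30 * 1.50
print(cheapest_plan(120_000))
```

Under these assumptions, Starter with overage stays cheaper than Growth up to roughly 96,700 requests per month; beyond that, upgrading costs less than paying overage.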
Quota Headers
Every API response includes headers showing your quota status:
| Header | Description |
|---|---|
| X-Quota-Limit | Your daily request quota |
| X-Quota-Remaining | Requests remaining today |
| X-Quota-Reset | Unix timestamp when quota resets (midnight UTC) |
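As a rough illustration of how these three headers might be consumed, the sketch below converts them into a small typed object. `QuotaStatus` and `parse_quota_headers` are hypothetical names, not part of any official client, and the sample values are the ones from the example response on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class QuotaStatus:
    limit: int
    remaining: int
    resets_at: datetime

    @property
    def used_fraction(self) -> float:
        # Avoid dividing by zero when the limit header is missing
        if self.limit <= 0:
            return 0.0
        return (self.limit - self.remaining) / self.limit


def parse_quota_headers(headers: Mapping[str, str]) -> QuotaStatus:
    """Build a QuotaStatus from response headers; missing headers default to 0."""
    return QuotaStatus(
        limit=int(headers.get("X-Quota-Limit", 0)),
        remaining=int(headers.get("X-Quota-Remaining", 0)),
        # X-Quota-Reset is a Unix timestamp, interpreted here as UTC
        resets_at=datetime.fromtimestamp(
            int(headers.get("X-Quota-Reset", 0)), tz=timezone.utc
        ),
    )


status = parse_quota_headers(
    {"X-Quota-Limit": "100", "X-Quota-Remaining": "73", "X-Quota-Reset": "1704153600"}
)
print(status.resets_at.isoformat())  # 2024-01-02T00:00:00+00:00
```

A plain `dict` is case-sensitive, so the keys must match exactly; the `headers` attribute of a `requests` response is case-insensitive and can be passed in directly.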
Example Response Headers
HTTP/1.1 200 OK
X-Quota-Limit: 100
X-Quota-Remaining: 73
X-Quota-Reset: 1704153600
Content-Type: application/json

When Quota is Exceeded
When you exceed your monthly quota on a plan without overage (such as Free), the API returns a 429 Too Many Requests response with a Retry-After value:
HTTP/1.1 429 Too Many Requests
Retry-After: 86400
Content-Type: application/json
{
  "error": "quota_exceeded",
  "message": "Monthly quota exceeded.",
  "quota_limit": 5000,
  "quota_used": 5000,
  "retry_after": 86400
}

Monitoring Your Usage
Check Headers in Responses
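import requests

API_KEY = "your-api-key"  # placeholder; replace with your own key

response = requests.get(
    "https://api.wikirest.com/v1/search",
    headers={"X-API-Key": API_KEY},
    params={"q": "test"},
)

# Header values arrive as strings; default to 0 if a header is missing
quota_remaining = int(response.headers.get("X-Quota-Remaining", 0))
quota_limit = int(response.headers.get("X-Quota-Limit", 0))

# Guard against division by zero when the limit header is absent
usage_percent = ((quota_limit - quota_remaining) / quota_limit) * 100 if quota_limit else 0.0
print(f"Quota: {quota_remaining}/{quota_limit} remaining ({usage_percent:.1f}% used)")

Dashboard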
View your usage statistics in real-time on the API Dashboard, including:
- Current quota usage
- Usage history over time
- Breakdown by endpoint
- Per-API-key usage (if you have multiple keys)
Quota Warnings
When you approach your quota limit (80%+ used), the API includes a warning header:
X-Quota-Warning: Approaching daily limit (85% used)

Email Notifications
Pro users can configure email alerts when approaching quota limits:
- At 75% usage
- At 90% usage
- When quota is exceeded
Configure notifications in your account settings.
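Where email alerts are not available, a similar check could be approximated client-side from the `X-Quota-Limit` and `X-Quota-Remaining` headers. The sketch below is one possible approach, not an official feature: `QuotaAlerter` and its `notify` callback are hypothetical names, and the thresholds mirror the 75%, 90%, and exceeded levels listed above.

```python
from typing import Callable, Iterable


class QuotaAlerter:
    """Calls `notify` once per threshold the first time usage crosses it."""

    def __init__(
        self,
        notify: Callable[[str], None],
        thresholds: Iterable[float] = (0.75, 0.90, 1.00),
    ) -> None:
        self._notify = notify
        self._thresholds = sorted(thresholds)
        self._fired = set()

    def observe(self, limit: int, remaining: int) -> None:
        """Feed in the values of X-Quota-Limit and X-Quota-Remaining."""
        if limit <= 0:
            return  # headers missing or unparsable; nothing to report
        used = limit - remaining
        for threshold in self._thresholds:
            if used / limit >= threshold and threshold not in self._fired:
                self._fired.add(threshold)
                self._notify(
                    f"Quota usage reached {threshold:.0%} ({used}/{limit} requests)"
                )

    def reset(self) -> None:
        """Call when the quota period rolls over so alerts can fire again."""
        self._fired.clear()


alerter = QuotaAlerter(print)
alerter.observe(limit=1000, remaining=200)  # crosses the 75% threshold
alerter.observe(limit=1000, remaining=150)  # no new threshold, stays silent
```

Tracking which thresholds have already fired keeps a busy client from emitting one alert per request once usage passes a level.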
Optimizing Your Usage
1. Cache Responses
Wikipedia content is relatively static. Cache responses to avoid redundant API calls:
import hashlib
import json

import redis
import requests

redis_client = redis.Redis()
CACHE_TTL = 3600  # 1 hour

def cached_search(query, limit=10):
    # MD5 only builds a compact cache key here; it is not used for security
    cache_key = f"wiki:{hashlib.md5(f'{query}:{limit}'.encode()).hexdigest()}"

    # Try cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Fetch from API (API_KEY is assumed to hold your API key)
    response = requests.get(
        "https://api.wikirest.com/v1/search",
        headers={"X-API-Key": API_KEY},
        params={"q": query, "limit": limit},
    )
    response.raise_for_status()  # do not cache error responses such as 429
    data = response.json()

    # Cache the response
    redis_client.setex(cache_key, CACHE_TTL, json.dumps(data))
    return data

2. Fetch More Per Request
Reduce the number of API calls by fetching more data in each request:
# Instead of this (5 API calls = 5 quota):
for i in range(5):
    results = search(query, limit=10, offset=i * 10)

# Do this (1 API call = 1 quota):
results = search(query, limit=50)

3. Use the Lucky Endpoint
If you just need the best match, use /v1/lucky instead of searching and then fetching:
# Instead of this (2 API calls):
search_results = search("Albert Einstein")
page = get_page(search_results["hits"][0]["page_id"])
# Do this (1 API call):
page = lucky("Albert Einstein")

4. Use Format Options
Use format=concat to get all text in one field instead of processing chunks:
# Get page with concatenated text - no need to fetch chunks separately
response = requests.get(
    "https://api.wikirest.com/v1/page/736",
    headers={"X-API-Key": API_KEY},
    params={"format": "concat"},
)
full_text = response.json()["text"]

Quota vs Rate Limiting
| Aspect | Monthly Quota | Rate Limiting |
|---|---|---|
| What it controls | Total requests per month | Requests per second |
| Time window | Billing period | 1 second |
| When it resets | Billing cycle | Every second |
| If exceeded | Blocked (or overage on paid) | Blocked for a few seconds |
| How to resolve | Wait or upgrade plan | Slow down and retry |
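Because both conditions surface as a 429, a client may want to tell them apart before deciding whether to retry. The sketch below is one way to do that, assuming a rate-limit 429 does not carry the `quota_exceeded` error code shown earlier on this page; the body format of rate-limit responses is not documented here, and `classify_429` is a hypothetical helper name.

```python
from typing import Mapping, Optional, Tuple


def classify_429(
    status_code: int,
    body: Optional[Mapping[str, object]],
    retry_after: Optional[str],
) -> Tuple[str, float]:
    """Return (kind, wait_seconds), where kind is "ok", "quota", or "rate_limit"."""
    if status_code != 429:
        return ("ok", 0.0)

    # Retry-After is expected in whole seconds; fall back to the 1-second
    # rate-limit window when the header is missing or not numeric.
    wait = float(retry_after) if retry_after and retry_after.isdigit() else 1.0

    if body is not None and body.get("error") == "quota_exceeded":
        return ("quota", wait)
    # Assumption: any other 429 is treated as a rate limit
    return ("rate_limit", wait)


print(classify_429(429, {"error": "quota_exceeded"}, "86400"))
print(classify_429(429, {}, None))
```

A caller would typically sleep and retry on `"rate_limit"`, but stop and surface the problem on `"quota"`, since waiting out a quota reset inside a request loop is rarely useful.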
Learn more about rate limiting →
Need More Quota?
Your monthly quota is fully adjustable based on your plan. Choose the tier that fits your usage:
- Free: 5K req/mo. Great for testing and learning the API.
- Starter ($29/mo): 50K req/mo. For small projects and MVPs.
Enterprise Customization Options
- Custom monthly quotas - From 100K to unlimited requests per month
- Flexible reset times - Choose your own reset schedule
- Overage handling - Pay-as-you-go or soft limits instead of hard blocks
- Multiple API keys - Separate quotas per key or shared pool
- Dedicated infrastructure - Guaranteed capacity for your traffic