Monthly Quotas

Understand your monthly request allowance and how to manage your API usage.

Monthly Quota vs Rate Limiting: These are separate concepts. Monthly quotas control your total requests per billing period. Rate limiting controls how fast you can make requests (per second).

What is a Monthly Quota?

Your monthly quota is the total number of API requests you can make in a billing period. Unlike rate limiting (which controls request speed), quotas control your total usage volume.

Quotas by Plan

Your monthly quota is determined by your subscription plan. Upgrading your plan increases your allowance and unlocks overage pricing.

Plan	Monthly Quota	Rate Limit	Overage
Free	5,000/month	10/sec	Blocked (upgrade required)
Starter ($29/mo)	50,000/month	25/sec	$1.50 per 1,000 requests
Growth ($99/mo)	250,000/month	50/sec	$1.00 per 1,000 requests
Business ($299/mo)	1,000,000/month	100/sec	$0.70 per 1,000 requests
Enterprise	Unlimited	Custom	N/A

Paid plans include overage. When you exceed your monthly quota on paid plans, you can continue making requests and pay overage fees. Business offers the best rate at $0.70/1K. View all plans →
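
As a rough illustration of how overage adds up, the sketch below estimates a monthly bill from the plan table above. The `PLANS` dictionary and the `estimate_monthly_cost` function are hypothetical names, not part of the API. The sketch assumes overage is billed pro rata per request rather than in whole blocks of 1,000, which this page does not specify.

```python
from decimal import Decimal

# Plan data copied from the table above:
# (base price per month, monthly quota, overage price per 1,000 requests).
# An overage price of None means requests past the quota are blocked (Free plan).
PLANS = {
    "free": (Decimal("0"), 5_000, None),
    "starter": (Decimal("29"), 50_000, Decimal("1.50")),
    "growth": (Decimal("99"), 250_000, Decimal("1.00")),
    "business": (Decimal("299"), 1_000_000, Decimal("0.70")),
}


def estimate_monthly_cost(plan: str, requests_made: int) -> Decimal:
    """Estimate the bill for one billing period (assumes pro-rata overage)."""
    base_price, quota, overage_price = PLANS[plan]
    extra_requests = max(0, requests_made - quota)
    if extra_requests == 0:
        return base_price
    if overage_price is None:
        raise ValueError(f"The {plan} plan blocks requests beyond {quota} per month")
    return base_price + overage_price * extra_requests / 1000


print(estimate_monthly_cost("starter", 70_000))
```

Under these assumptions, 70,000 requests on Starter come to $29 plus 20 × $1.50 of overage, or $59 in total.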

Quota Headers

Every API response includes headers showing your quota status:

Header	Description
`X-Quota-Limit`	Your request quota for the current period
`X-Quota-Remaining`	Requests remaining in the current period
`X-Quota-Reset`	Unix timestamp when quota resets (midnight UTC)

Example Response Headers

HTTP/1.1 200 OK
X-Quota-Limit: 100
X-Quota-Remaining: 73
X-Quota-Reset: 1704153600
Content-Type: application/json
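
To make these values easier to work with, they can be converted into numbers and a timezone-aware datetime. The following is a minimal sketch, not an official client: `parse_quota_headers` is a hypothetical helper, and it assumes all three headers are present and hold integers, as in the example above.

```python
from datetime import datetime, timezone


def parse_quota_headers(headers):
    """Convert the three quota headers into typed values."""
    limit = int(headers["X-Quota-Limit"])
    remaining = int(headers["X-Quota-Remaining"])
    # X-Quota-Reset is a Unix timestamp, so interpret it as UTC
    reset_at = datetime.fromtimestamp(int(headers["X-Quota-Reset"]), tz=timezone.utc)
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "reset_at": reset_at,
    }


# Header values taken from the example response above
example_headers = {
    "X-Quota-Limit": "100",
    "X-Quota-Remaining": "73",
    "X-Quota-Reset": "1704153600",
}
status = parse_quota_headers(example_headers)
print(status["used"], status["reset_at"].isoformat())
```

A plain dictionary is case-sensitive, while the `headers` object on a `requests` response is not, so passing `response.headers` directly also works.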

When Quota is Exceeded

When you exceed your monthly quota, the API returns a 429 Too Many Requests response with a Retry-After value:

HTTP/1.1 429 Too Many Requests
Retry-After: 86400
Content-Type: application/json

{
  "error": "quota_exceeded",
  "message": "Monthly quota exceeded.",
  "quota_limit": 5000,
  "quota_used": 5000,
  "retry_after": 86400
}
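
One way to handle this response in client code is to turn it into an exception that carries the wait time. The sketch below is illustrative only: `QuotaExceeded` and `check_quota_error` are hypothetical names, and it assumes `Retry-After` is expressed in seconds, as in the example above, and that any other 429 (for instance from rate limiting) carries a different `error` value.

```python
import json


class QuotaExceeded(Exception):
    """Raised when the API reports that the quota is used up."""

    def __init__(self, message, retry_after):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait before retrying


def check_quota_error(status_code, headers, body_text):
    """Raise QuotaExceeded for a quota 429; return None for anything else."""
    if status_code != 429:
        return None
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}
    if not isinstance(body, dict) or body.get("error") != "quota_exceeded":
        return None  # a different kind of 429; handle it separately
    # Prefer the header, fall back to the JSON field shown in the example body
    retry_after = int(headers.get("Retry-After", body.get("retry_after", 0)))
    raise QuotaExceeded(body.get("message", "Monthly quota exceeded."), retry_after)
```

With `requests`, this could be called as `check_quota_error(response.status_code, response.headers, response.text)` before reading the result.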

Monitoring Your Usage

Check Headers in Responses

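import requests

API_KEY = "your-api-key"  # placeholder: replace with your own key

response = requests.get(
    "https://api.wikirest.com/v1/search",
    headers={"X-API-Key": API_KEY},
    params={"q": "test"}
)

# Header values arrive as strings; fall back to 0 if a header is missing
quota_remaining = int(response.headers.get("X-Quota-Remaining", 0))
quota_limit = int(response.headers.get("X-Quota-Limit", 0))

# Avoid dividing by zero when the limit header is absent
usage_percent = ((quota_limit - quota_remaining) / quota_limit) * 100 if quota_limit else 0.0

print(f"Quota: {quota_remaining}/{quota_limit} remaining ({usage_percent:.1f}% used)")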

Dashboard

View your usage statistics in real time on the API Dashboard, including:

Current quota usage
Usage history over time
Breakdown by endpoint
Per-API-key usage (if you have multiple keys)

Quota Warnings

When you approach your quota limit (80%+ used), the API includes a warning header:

X-Quota-Warning: Approaching daily limit (85% used)
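
A client can surface this warning in its own logs or alerts. The sketch below uses a hypothetical helper, `read_quota_warning`, and assumes the header text contains a percentage in the form shown above. That format is not documented as stable, so the parsed number is treated as optional.

```python
import re
from typing import Mapping, Optional, Tuple


def read_quota_warning(headers: Mapping[str, str]) -> Optional[Tuple[str, Optional[float]]]:
    """Return (warning text, percent used) if the warning header is present."""
    warning = headers.get("X-Quota-Warning")
    if warning is None:
        return None
    # Pull out the first number followed by a percent sign, if there is one
    match = re.search(r"(\d+(?:\.\d+)?)%", warning)
    percent_used = float(match.group(1)) if match else None
    return warning, percent_used
```

If the function returns a value, the caller might log the text and, when a percentage was parsed, slow down non-essential requests.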

Email Notifications

Users on paid plans can configure email alerts when approaching quota limits:

At 75% usage
At 90% usage
When quota is exceeded

Configure notifications in your account settings.

Optimizing Your Usage

1. Cache Responses

Wikipedia content is relatively static. Cache responses to avoid redundant API calls:

import hashlib
import json

import redis
import requests

API_KEY = "your-api-key"  # placeholder: replace with your own key
redis_client = redis.Redis()
CACHE_TTL = 3600  # 1 hour

def cached_search(query, limit=10):
    # md5 only shortens the cache key here; it is not used for security
    cache_key = f"wiki:{hashlib.md5(f'{query}:{limit}'.encode()).hexdigest()}"

    # Try cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Fetch from API
    response = requests.get(
        "https://api.wikirest.com/v1/search",
        headers={"X-API-Key": API_KEY},
        params={"q": query, "limit": limit}
    )
    response.raise_for_status()  # never cache error responses such as 429
    data = response.json()

    # Cache the response
    redis_client.setex(cache_key, CACHE_TTL, json.dumps(data))
    return data

2. Fetch More Per Request

Reduce the number of API calls by fetching more data in each request:

# search() stands for your own wrapper around /v1/search.
# Instead of this (5 API calls = 5 quota):
for i in range(5):
    results = search(query, limit=10, offset=i*10)

# Do this (1 API call = 1 quota):
results = search(query, limit=50)

3. Use the Lucky Endpoint

If you just need the best match, use /v1/lucky instead of searching and then fetching:

# search(), get_page(), and lucky() stand for your own wrappers around the endpoints.
# Instead of this (2 API calls):
search_results = search("Albert Einstein")
page = get_page(search_results["hits"][0]["page_id"])

# Do this (1 API call):
page = lucky("Albert Einstein")

4. Use Format Options

Use format=concat to get all text in one field instead of processing chunks:

# Get page with concatenated text - no need to fetch chunks separately
response = requests.get(
    "https://api.wikirest.com/v1/page/736",
    headers={"X-API-Key": API_KEY},
    params={"format": "concat"}
)
full_text = response.json()["text"]

Quota vs Rate Limiting

Aspect	Monthly Quota	Rate Limiting
What it controls	Total requests per month	Requests per second
Time window	Billing period	1 second
When it resets	Billing cycle	Every second
If exceeded	Blocked (or overage on paid)	Blocked until the next second
How to resolve	Wait or upgrade plan	Slow down and retry

Learn more about rate limiting →

Need More Quota?

Your monthly quota depends on your plan. Choose the tier that fits your usage:

Plan	Monthly Quota	Best for
Free	5K req/mo	Great for testing and learning the API.
Starter ($29/mo)	50K req/mo	For small projects and MVPs.
Growth ($99/mo, popular)	250K req/mo	For growing applications.
Business ($299/mo)	1M req/mo	For production workloads with 99.5% SLA.

View pricing →