WikiRest Docs

Query Syntax Guide

Learn how to write effective search queries to get the best results from the WikiRest API.

Basic Queries

The simplest query is just a word or phrase. The API performs full-text search across Wikipedia article titles, sections, and content.

Single Word

GET /v1/search?q=quantum

Searches for articles containing the word "quantum".

Multiple Words

GET /v1/search?q=quantum computing

Searches for articles containing both "quantum" and "computing". Results are ranked by relevance; documents in which the two words appear close together score higher. When sending the request, URL-encode spaces and quotes (for example, q=quantum%20computing).

Exact Phrases

GET /v1/search?q="quantum entanglement"

Use double quotes to search for an exact phrase. This returns only results where the words appear together in the specified order.
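
Queries that contain spaces or double quotes have to be percent-encoded before they go over HTTP. The following is a minimal sketch using Python's standard library. The host https://api.example.com and the helper name build_search_url are placeholders chosen for illustration, not part of the WikiRest API.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): substitute the real WikiRest base URL.
BASE_URL = "https://api.example.com"


def build_search_url(query: str, **params) -> str:
    """Return a /v1/search URL with the query and extra parameters encoded."""
    # urlencode percent-encodes double quotes and turns spaces into '+'.
    return f"{BASE_URL}/v1/search?{urlencode({'q': query, **params})}"


print(build_search_url("quantum computing"))
print(build_search_url('"quantum entanglement"'))
print(build_search_url("history", page_id=12345))
```

Extra keyword arguments such as limit or page_id are appended as ordinary query parameters.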

Filtering Results

Filter by Page ID

GET /v1/search?q=history&page_id=12345

Search within a specific Wikipedia page. Use this to find specific sections within a known article.

Pagination

Control the number of results and navigate through large result sets using limit and offset.

Limit Results

GET /v1/search?q=python&limit=5

Return only the top 5 results. Valid range: 1-50, default: 10.

Paginate Results

# Page 1 (first 10 results)
GET /v1/search?q=python&limit=10&offset=0

# Page 2 (results 11-20)
GET /v1/search?q=python&limit=10&offset=10

# Page 3 (results 21-30)
GET /v1/search?q=python&limit=10&offset=20
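
The offsets above follow the pattern offset = (page - 1) * limit. A small sketch of that arithmetic, where page_offset is an illustrative helper name rather than anything the API provides:

```python
def page_offset(page: int, limit: int = 10) -> int:
    """Return the offset for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * limit


# Offsets for the first three pages at limit=10.
for page in (1, 2, 3):
    print(page, page_offset(page))
```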

Highlighting

By default, matching text is highlighted in the results. You can control this behavior.

Disable Highlighting

GET /v1/search?q=einstein&highlight=false

Adjust Text Length

# Longer text snippets (up to 2000 characters)
GET /v1/search?q=einstein&crop_length=1000

# Shorter snippets (minimum 50 characters)
GET /v1/search?q=einstein&crop_length=100

The crop_length parameter controls how much surrounding context is included. Valid range: 50-2000, default: 400.

Understanding Ranking

Search results are ranked using Meilisearch's default ranking algorithm, which considers:

| Factor | Description |
| --- | --- |
| Typo tolerance | Small typos are accepted (e.g., "quantm" matches "quantum") |
| Word proximity | Documents with query words close together rank higher |
| Attribute importance | Matches in titles rank higher than matches in body text |
| Word position | Matches at the beginning of text rank higher |
| Exactness | Exact matches rank higher than partial matches |

Search Tips

Best Practices

  • Use specific, descriptive terms for better results
  • Quote exact phrases when you need precise matches
  • Start with broader queries and narrow down if needed
  • Use page_id filter when searching within a known article

Examples by Use Case

Finding Definitions

# Find definition-like content
GET /v1/search?q="is a" quantum mechanics

Finding Historical Events

# Search for events with dates
GET /v1/search?q=world war 1914

Technical Topics

# Search for technical explanations
GET /v1/search?q="machine learning" algorithm neural network

Getting More Context

# Longer text snippets for more context
GET /v1/search?q=photosynthesis&crop_length=800

Understanding Response Fields

Each search result contains the following fields:

| Field | Description |
| --- | --- |
| id | Unique chunk identifier (format: pageId_chunkIndex) |
| page_id | Wikipedia page ID |
| title | Article title |
| section | Section heading (if applicable) |
| text | The text content of this chunk |
| chunk_id | Position of this chunk within the page |
| _formatted | Highlighted version of fields (when highlighting enabled) |
| source | Attribution information with Wikipedia URLs |
| license | Content license (CC BY-SA 4.0) |
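
As a rough illustration of working with these fields, the sketch below splits an id into its two parts and picks the highlighted text when it is present. It assumes that both parts of the id are integers and that _formatted is an object mirroring the plain fields (so it has its own text key). Neither detail is confirmed by this guide, and the helper names are hypothetical.

```python
from typing import NamedTuple


class ChunkRef(NamedTuple):
    page_id: int
    chunk_index: int


def parse_chunk_id(chunk_id: str) -> ChunkRef:
    """Split an id of the form pageId_chunkIndex (assumed to be two integers)."""
    page, sep, index = chunk_id.rpartition("_")
    if not sep:
        raise ValueError(f"expected pageId_chunkIndex, got {chunk_id!r}")
    return ChunkRef(int(page), int(index))


def display_text(result: dict) -> str:
    """Prefer the highlighted text, falling back to the plain text field."""
    # Assumption: _formatted mirrors the result's fields, including "text".
    formatted = result.get("_formatted") or {}
    return formatted.get("text", result["text"])


print(parse_chunk_id("12345_3"))
```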

Query Limits

Keep these limits in mind when building your queries:

| Parameter | Minimum | Maximum | Default |
| --- | --- | --- | --- |
| q (query length) | 1 character | 500 characters | - |
| limit | 1 | 50 | 10 |
| offset | 0 | 1,000,000 | 0 |
| crop_length | 50 | 2000 | 400 |
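
Checking parameters on the client before sending a request can save a round trip. A minimal sketch that encodes the limits from the table above; validate_params is an illustrative helper, and how the API itself reports out-of-range values is not covered here.

```python
# Ranges copied from the Query Limits table.
LIMITS = {"limit": (1, 50), "offset": (0, 1_000_000), "crop_length": (50, 2000)}
MAX_QUERY_LENGTH = 500


def validate_params(q: str, **params: int) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not 1 <= len(q) <= MAX_QUERY_LENGTH:
        errors.append(f"q must be 1-{MAX_QUERY_LENGTH} characters, got {len(q)}")
    for name, value in params.items():
        if name in LIMITS:
            low, high = LIMITS[name]
            if not low <= value <= high:
                errors.append(f"{name} must be between {low} and {high}, got {value}")
    return errors


print(validate_params("python", limit=5, crop_length=400))
print(validate_params("python", limit=51))
```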

Performance Tips

Optimize your queries for the best performance when working with large result sets.

Performance Best Practices

  • Use specific queries: Broad single-word searches like "the" will be slower than specific multi-word queries.
  • Limit results appropriately: Request only the number of results you need. Use limit=10 instead of limit=50 if you only display 10.
  • Paginate efficiently: For deep pagination (high offset values), consider using keyset pagination with IDs instead of large offsets.
  • Cache responses: Cache API responses on your server to reduce redundant requests.
  • Space out requests: If you need multiple searches, send them with a small delay between requests rather than firing them all simultaneously.

Efficient Pagination

For large result sets, prefer fetching consecutive pages rather than random access:

# search() is a hypothetical client helper that wraps GET /v1/search

# Good: Sequential pagination
page1 = search("topic", limit=10, offset=0)
page2 = search("topic", limit=10, offset=10)
page3 = search("topic", limit=10, offset=20)

# Avoid: Jumping straight to very high offsets (slower performance)
page100 = search("topic", limit=10, offset=990)
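
Sequential pagination can be wrapped in a generator that stops early. In this sketch, fetch_page stands in for whatever function performs the HTTP request. The sketch assumes that a page shorter than limit marks the end of the result set, which this guide does not state explicitly.

```python
from typing import Callable, Iterator


def iter_results(
    fetch_page: Callable[[int, int], list],
    limit: int = 10,
    max_results: int = 100,
) -> Iterator:
    """Yield results page by page, in order, up to max_results."""
    offset = 0
    while offset < max_results:
        page = fetch_page(limit, offset)
        yield from page
        # Assumption: a short page means there is nothing left to fetch.
        if len(page) < limit:
            break
        offset += limit


# Stand-in for the API: 25 fake results served from a list.
fake_data = list(range(25))
fake_fetch = lambda limit, offset: fake_data[offset:offset + limit]
print(len(list(iter_results(fake_fetch))))
```

Capping max_results keeps the offsets low, in line with the advice above to avoid deep pagination.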

Minimize Response Size

# For quick results, use smaller crop_length
GET /v1/search?q=topic&crop_length=100

# Disable highlighting if not needed
GET /v1/search?q=topic&highlight=false

Connection Management

When making multiple API calls:

  • Use HTTP/2 or connection pooling to reuse connections
  • Implement exponential backoff for retries
  • Set reasonable timeouts (we recommend 10-30 seconds)
  • Handle rate limiting gracefully using the X-RateLimit-* headers
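
One common way to implement the exponential backoff mentioned above is to double the wait after each failed attempt, cap it, and add random jitter. The base delay, the cap, and the function name below are illustrative choices, not values recommended by the API.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  jitter: bool = True) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based)."""
    delay = min(cap, base * 2 ** attempt)
    # "Full jitter": pick a random wait up to the computed delay so that
    # many clients retrying at once do not all hit the server together.
    return random.uniform(0, delay) if jitter else delay


# Delays without jitter for the first five retries.
print([backoff_delay(n, jitter=False) for n in range(5)])
```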

Caching Strategies

| Cache Duration | Use Case |
| --- | --- |
| 1-5 minutes | High-traffic search queries, trending topics |
| 1-24 hours | Individual chunks, page content |
| 1-7 days | Reference content, rarely changing articles |
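
A time-based cache along these lines can be sketched in a few lines of Python. The TTLCache class below is a hypothetical in-memory example. A production setup would more likely use an existing cache such as Redis or an HTTP caching layer.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value


# Cache search responses for 5 minutes, keyed by the query string.
search_cache = TTLCache(ttl_seconds=300)
search_cache.set("q=python&limit=10", {"hits": []})
print(search_cache.get("q=python&limit=10"))
```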

Monitor the X-RateLimit-Remaining header to stay within your quota and avoid service interruptions.
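
A sketch of such a check, assuming the X-RateLimit-Remaining header carries a plain integer count. The threshold of 5 and both helper names are arbitrary illustrations.

```python
from typing import Mapping, Optional


def remaining_quota(headers: Mapping[str, str]) -> Optional[int]:
    """Read X-RateLimit-Remaining from response headers, case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-ratelimit-remaining":
            try:
                return int(value)  # assumption: the value is an integer count
            except ValueError:
                return None
    return None


def should_pause(headers: Mapping[str, str], threshold: int = 5) -> bool:
    """True when the remaining quota is known and at or below the threshold."""
    remaining = remaining_quota(headers)
    return remaining is not None and remaining <= threshold


print(should_pause({"X-RateLimit-Remaining": "3"}))
```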
