Query Syntax Guide

Learn how to write effective search queries to get the best results from the WikiRest API.

Basic Queries

The simplest query is just a word or phrase. The API performs full-text search across Wikipedia article titles, sections, and content.

Single Word

GET /v1/search?q=quantum

Searches for articles containing the word "quantum".

Multiple Words

GET /v1/search?q=quantum computing

Searches for articles containing both "quantum" and "computing". Results are ranked by relevance, and documents in which the two words appear close together score higher.

Exact Phrases

GET /v1/search?q="quantum entanglement"

Use double quotes to search for an exact phrase. This returns only results where the words appear together in the specified order.
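The example requests above show the query unencoded for readability. In a real request, spaces and double quotes generally need to be percent-encoded. The following is a minimal sketch of one way to do that in Python. The host `https://api.example.com` and the helper name `build_search_url` are placeholders chosen for illustration, not part of the WikiRest API.

```python
from urllib.parse import quote, urlencode

# Hypothetical host for illustration; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_search_url(query: str, **params) -> str:
    """Return a /v1/search URL with the query and extra parameters percent-encoded."""
    all_params = {"q": query, **params}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{BASE_URL}/v1/search?{urlencode(all_params, quote_via=quote)}"


print(build_search_url("quantum computing"))
print(build_search_url('"quantum entanglement"'))
```

Most HTTP client libraries perform this encoding automatically when parameters are passed as a dictionary, so a helper like this is mainly useful when building URLs by hand.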

Filtering Results

Filter by Page ID

GET /v1/search?q=history&page_id=12345

Restrict the search to a single Wikipedia page by passing its page ID. Use this to find particular sections within a known article.

Pagination

Control the number of results and navigate through large result sets using limit and offset.

Limit Results

GET /v1/search?q=python&limit=5

Return only the top 5 results. Valid range: 1-50, default: 10.

Paginate Results

# Page 1 (first 10 results)
GET /v1/search?q=python&limit=10&offset=0

# Page 2 (results 11-20)
GET /v1/search?q=python&limit=10&offset=10

# Page 3 (results 21-30)
GET /v1/search?q=python&limit=10&offset=20
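
The pattern in these requests is offset = (page - 1) * limit. A short sketch of that arithmetic follows; `page_params` is a hypothetical helper name, and the limit check mirrors the documented 1-50 range.

```python
def page_params(page: int, limit: int = 10) -> dict:
    """Translate a 1-based page number into limit/offset query parameters."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= limit <= 50:  # documented valid range for limit
        raise ValueError("limit must be between 1 and 50")
    return {"limit": limit, "offset": (page - 1) * limit}


print(page_params(3))  # {'limit': 10, 'offset': 20}
```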

Highlighting

By default, matching text is highlighted in the results. Use the highlight parameter to turn highlighting off and the crop_length parameter to change the snippet length.

Disable Highlighting

GET /v1/search?q=einstein&highlight=false

Adjust Text Length

# Longer text snippets (up to 2000 characters)
GET /v1/search?q=einstein&crop_length=1000

# Shorter snippets (minimum 50 characters)
GET /v1/search?q=einstein&crop_length=100

The crop_length parameter controls how much surrounding context is included. Valid range: 50-2000, default: 400.

Understanding Ranking

Search results are ranked using Meilisearch's default ranking algorithm, which considers:

Factor	Description
Typo tolerance	Small typos are accepted (e.g., "quantm" matches "quantum")
Word proximity	Documents with query words close together rank higher
Attribute importance	Matches in titles rank higher than matches in body text
Word position	Matches at the beginning of text rank higher
Exactness	Exact matches rank higher than partial matches

Search Tips

Best Practices

Use specific, descriptive terms for better results
Quote exact phrases when you need precise matches
Start with broader queries and narrow down if needed
Use page_id filter when searching within a known article

Examples by Use Case

Finding Definitions

# Find definition-like content
GET /v1/search?q="is a" quantum mechanics

Finding Historical Events

# Search for events with dates
GET /v1/search?q=world war 1914

Technical Topics

# Search for technical explanations
GET /v1/search?q="machine learning" algorithm neural network

Getting More Context

# Longer text snippets for more context
GET /v1/search?q=photosynthesis&crop_length=800

Understanding Response Fields

Each search result contains the following fields:

Field	Description
`id`	Unique chunk identifier (format: `pageId_chunkIndex`)
`page_id`	Wikipedia page ID
`title`	Article title
`section`	Section heading (if applicable)
`text`	The text content of this chunk
`chunk_id`	Position of this chunk within the page
`_formatted`	Highlighted version of fields (when highlighting enabled)
`source`	Attribution information with Wikipedia URLs
`license`	Content license (CC BY-SA 4.0)
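
If you need the page ID and chunk index separately, the documented `id` format can be split on its last underscore. This is a small sketch only: the sample result is made up for illustration, `parse_chunk_id` is a hypothetical helper, and it assumes both parts are integers, which the field table does not state explicitly.

```python
def parse_chunk_id(chunk_id: str) -> tuple[int, int]:
    """Split an id of the form pageId_chunkIndex into (page_id, chunk_index)."""
    page_part, sep, index_part = chunk_id.rpartition("_")
    if not sep:
        raise ValueError(f"expected 'pageId_chunkIndex', got {chunk_id!r}")
    return int(page_part), int(index_part)


# Made-up result for illustration only; real responses carry more fields.
sample_result = {"id": "12345_7", "page_id": 12345, "title": "Example"}
page_id, chunk_index = parse_chunk_id(sample_result["id"])
print(page_id, chunk_index)
```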

Query Limits

Keep these limits in mind when building your queries:

Parameter	Minimum	Maximum	Default
`q` (query length)	1 character	500 characters	-
`limit`	1	50	10
`offset`	0	1,000,000	0
`crop_length`	50	2000	400
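
Checking parameters on the client before sending a request can save a round trip. The sketch below encodes the ranges from the table above. `validate_params` is a hypothetical helper, and it assumes the query length limit is counted in characters as Python's `len` counts them, which may differ from how the API measures it.

```python
# Ranges copied from the table above: (minimum, maximum).
LIMITS = {
    "limit": (1, 50),
    "offset": (0, 1_000_000),
    "crop_length": (50, 2000),
}


def validate_params(q: str, **params: int) -> list[str]:
    """Return a list of problems; an empty list means the request is within limits."""
    problems = []
    if not 1 <= len(q) <= 500:
        problems.append("q must be 1 to 500 characters long")
    for name, value in params.items():
        if name in LIMITS:
            low, high = LIMITS[name]
            if not low <= value <= high:
                problems.append(f"{name} must be between {low} and {high}")
    return problems


print(validate_params("einstein", limit=10, crop_length=400))
print(validate_params("einstein", limit=100, crop_length=20))
```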

Performance Tips

Optimize your queries for the best performance when working with large result sets.

Performance Best Practices

Use specific queries: Broad single-word searches like "the" will be slower than specific multi-word queries.
Limit results appropriately: Request only the number of results you need. Use limit=10 instead of limit=50 if you only display 10.
Paginate efficiently: For deep pagination (high offset values), consider using keyset pagination with IDs instead of large offsets.
Cache responses: Cache API responses on your server to reduce redundant requests.
Space out requests: If you need multiple searches, send them one after another with a small delay rather than firing them all simultaneously.

Efficient Pagination

For large result sets, prefer fetching consecutive pages from the start rather than jumping to very high offsets:

# search() stands for your own client function wrapping GET /v1/search
# Good: Sequential pagination
page1 = search("topic", limit=10, offset=0)
page2 = search("topic", limit=10, offset=10)
page3 = search("topic", limit=10, offset=20)

# Avoid: Very high offsets (slower performance)
page100 = search("topic", limit=10, offset=990)
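
Sequential pagination can be wrapped in a generator that stops as soon as a page comes back short. This is a sketch under assumptions: `iter_results` is a hypothetical helper, the `search` callable is supplied by you and is assumed to return a plain list of results, and `fake_search` only stands in for a real client so the example runs offline.

```python
from typing import Callable, Iterator


def iter_results(search: Callable[..., list], query: str,
                 limit: int = 10, max_results: int = 100) -> Iterator[dict]:
    """Yield results page by page, stopping at a short page or max_results."""
    offset = 0
    while offset < max_results:
        page = search(query, limit=limit, offset=offset)
        yield from page
        if len(page) < limit:  # a short page means there is nothing left
            break
        offset += limit


# Stand-in for a real client call: serves 25 fake results.
def fake_search(query, limit, offset):
    data = [{"id": f"1_{i}"} for i in range(25)]
    return data[offset:offset + limit]


print(len(list(iter_results(fake_search, "topic"))))  # 25
```

The `max_results` cap keeps a broad query from walking into the high-offset range that the guidance above recommends avoiding.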

Minimize Response Size

# For quick results, use smaller crop_length
GET /v1/search?q=topic&crop_length=100

# Disable highlighting if not needed
GET /v1/search?q=topic&highlight=false

Connection Management

When making multiple API calls:

Use HTTP/2 or connection pooling to reuse connections
Implement exponential backoff for retries
Set reasonable timeouts (we recommend 10-30 seconds)
Handle rate limiting gracefully using the X-RateLimit-* headers
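
Exponential backoff means doubling the wait after each failed attempt, usually with a cap and some random jitter. The sketch below shows one way to structure it. The function names, the 1-second base, and the 30-second cap are illustrative choices rather than values prescribed by the API, and `request` is any zero-argument callable you supply.

```python
import random
import time


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays of base * 2**attempt seconds, capped at cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retries(request, retries: int = 4, sleep=time.sleep):
    """Call request(); on failure wait with jittered exponential backoff and retry."""
    for delay in backoff_delays(retries):
        try:
            return request()
        except OSError:  # network-level failures; adjust for your HTTP client
            sleep(delay + random.uniform(0, delay / 2))  # add jitter
    return request()  # final attempt; let any error propagate


print(backoff_delays(4))  # [1.0, 2.0, 4.0, 8.0]
```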

Caching Strategies

Cache Duration	Use Case
1-5 minutes	High-traffic search queries, trending topics
1-24 hours	Individual chunks, page content
1-7 days	Reference content, rarely changing articles
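
A cache with a time-to-live (TTL) is one simple way to apply these durations. The class below is a minimal in-memory sketch, not a production cache: `TTLCache` and `get_or_fetch` are hypothetical names, there is no size limit or eviction, and `fetch` is whatever function performs the real API call.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when missing or expired."""
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._store[key] = (now + self.ttl, value)
        return value


# Search queries: upper end of the 1-5 minute range from the table.
search_cache = TTLCache(ttl_seconds=5 * 60)
```

A call such as `search_cache.get_or_fetch(("python", 10, 0), fetch)` would key the cache on the query, limit, and offset, so each distinct request is cached separately.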

Monitor the X-RateLimit-Remaining header to stay within your quota and avoid service interruptions.
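
One way to act on that header is to slow down once the remaining quota gets low. The sketch below assumes the header value is an integer count of remaining requests, which is a common convention but is not confirmed here. `should_pause` and its threshold of 5 are hypothetical.

```python
def should_pause(headers: dict, threshold: int = 5) -> bool:
    """Return True when the remaining quota reported by the API is at or below threshold."""
    # Header names are case-insensitive in HTTP, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")
    if remaining is None:
        return False  # header absent: nothing to act on
    return int(remaining) <= threshold


print(should_pause({"X-RateLimit-Remaining": "3"}))
```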