Query Syntax Guide
Learn how to write effective search queries to get the best results from the WikiRest API.
Basic Queries
The simplest query is just a word or phrase. The API performs full-text search across Wikipedia article titles, sections, and content.
Single Word
GET /v1/search?q=quantum
Searches for articles containing the word "quantum".
Multiple Words
GET /v1/search?q=quantum computing
Searches for articles containing both "quantum" and "computing". Results are ranked by relevance, with documents containing both words near each other scoring higher.
Exact Phrases
GET /v1/search?q="quantum entanglement"
Use double quotes to search for an exact phrase. This returns only results where the words appear together in the specified order.
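Queries that contain spaces or double quotes generally need to be percent-encoded before they go on the wire. The following is a minimal sketch using Python's standard library; the host `https://api.example.com` is a placeholder and `build_search_url` is a hypothetical helper, neither of which comes from the WikiRest documentation.

```python
from urllib.parse import quote, urlencode

# Placeholder host (assumption): substitute the real WikiRest base URL.
BASE_URL = "https://api.example.com"


def build_search_url(query, **params):
    """Return a /v1/search URL with the query and extra parameters percent-encoded."""
    query_params = {"q": query, **params}
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{BASE_URL}/v1/search?{urlencode(query_params, quote_via=quote)}"


print(build_search_url("quantum computing"))
print(build_search_url('"quantum entanglement"'))
print(build_search_url("history", page_id=12345))
```

Most HTTP clients perform this encoding when parameters are passed as a dictionary, so building the string by hand is mainly useful for logging or debugging.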
Filtering Results
Filter by Page ID
GET /v1/search?q=history&page_id=12345
Search within a specific Wikipedia page. Use this to find specific sections within a known article.
Pagination
Control the number of results and navigate through large result sets using limit and offset.
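The relationship between a page number and these two parameters can be sketched as a small helper. `page_params` is a hypothetical name used for illustration, and it assumes 1-based page numbers and a page size within the documented 1-50 range for limit.

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into limit/offset query parameters."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= page_size <= 50:
        raise ValueError("page_size must be between 1 and 50")
    # Page 1 starts at offset 0, page 2 at offset page_size, and so on.
    return {"limit": page_size, "offset": (page - 1) * page_size}


for page in (1, 2, 3):
    print(page, page_params(page))
```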
Limit Results
GET /v1/search?q=python&limit=5
Return only the top 5 results. Valid range: 1-50, default: 10.
Paginate Results
# Page 1 (first 10 results)
GET /v1/search?q=python&limit=10&offset=0
# Page 2 (results 11-20)
GET /v1/search?q=python&limit=10&offset=10
# Page 3 (results 21-30)
GET /v1/search?q=python&limit=10&offset=20
Highlighting
By default, matching text is highlighted in the results. You can control this behavior.
Disable Highlighting
GET /v1/search?q=einstein&highlight=false
Adjust Text Length
# Longer text snippets (up to 2000 characters)
GET /v1/search?q=einstein&crop_length=1000
# Shorter snippets (minimum 50 characters)
GET /v1/search?q=einstein&crop_length=100
The crop_length parameter controls how much surrounding context is included. Valid range: 50-2000, default: 400.
Understanding Ranking
Search results are ranked using Meilisearch's default ranking algorithm, which considers:
| Factor | Description |
|---|---|
| Typo tolerance | Small typos are accepted (e.g., "quantm" matches "quantum") |
| Word proximity | Documents with query words close together rank higher |
| Attribute importance | Matches in titles rank higher than matches in body text |
| Word position | Matches at the beginning of text rank higher |
| Exactness | Exact matches rank higher than partial matches |
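To make the typo-tolerance row concrete, the distance between "quantm" and "quantum" can be measured as the number of single-character edits separating them. The sketch below uses plain Levenshtein distance purely as an illustration; it is not Meilisearch's implementation, and `edit_distance` is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


# One missing letter is a single edit.
print(edit_distance("quantm", "quantum"))
```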
Search Tips
Best Practices
- Use specific, descriptive terms for better results
- Quote exact phrases when you need precise matches
- Start with broader queries and narrow down if needed
- Use the page_id filter when searching within a known article
Examples by Use Case
Finding Definitions
# Find definition-like content
GET /v1/search?q="is a" quantum mechanics
Finding Historical Events
# Search for events with dates
GET /v1/search?q=world war 1914
Technical Topics
# Search for technical explanations
GET /v1/search?q="machine learning" algorithm neural network
Getting More Context
# Longer text snippets for more context
GET /v1/search?q=photosynthesis&crop_length=800
Understanding Response Fields
Each search result contains the following fields:
| Field | Description |
|---|---|
| id | Unique chunk identifier (format: pageId_chunkIndex) |
| page_id | Wikipedia page ID |
| title | Article title |
| section | Section heading (if applicable) |
| text | The text content of this chunk |
| chunk_id | Position of this chunk within the page |
| _formatted | Highlighted version of fields (when highlighting enabled) |
| source | Attribution information with Wikipedia URLs |
| license | Content license (CC BY-SA 4.0) |
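One way to work with these fields in client code is to map each result onto a small typed record. This is a sketch under assumptions: the field types (integers for page_id and chunk_id, an optional section) are guesses based on the descriptions above, the sample dictionary is made up for illustration, and `SearchResult`, `parse_result`, and `split_chunk_id` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    id: str
    page_id: int
    title: str
    section: Optional[str]
    text: str
    chunk_id: int


def parse_result(raw):
    """Build a SearchResult from one decoded JSON result object."""
    return SearchResult(
        id=raw["id"],
        page_id=int(raw["page_id"]),
        title=raw["title"],
        section=raw.get("section") or None,  # section may be absent or empty
        text=raw["text"],
        chunk_id=int(raw["chunk_id"]),
    )


def split_chunk_id(result_id):
    """Split an id of the form pageId_chunkIndex into its two integer parts."""
    page_id, _, chunk_index = result_id.rpartition("_")
    return int(page_id), int(chunk_index)


# Made-up sample for illustration only, not real API output.
sample = {
    "id": "12345_3",
    "page_id": 12345,
    "title": "Example article",
    "text": "Example chunk text.",
    "chunk_id": 3,
}
result = parse_result(sample)
print(result.title, split_chunk_id(result.id))
```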
Query Limits
Keep these limits in mind when building your queries:
| Parameter | Minimum | Maximum | Default |
|---|---|---|---|
| q (query length) | 1 character | 500 characters | - |
| limit | 1 | 50 | 10 |
| offset | 0 | 1,000,000 | 0 |
| crop_length | 50 | 2000 | 400 |
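Checking parameters against the table above on the client side can catch out-of-range values before a request is sent. A minimal sketch follows; `validate_params` is a hypothetical helper, and how the API itself responds to out-of-range values is not covered here.

```python
# Bounds copied from the limits table above.
LIMITS = {
    "limit": (1, 50),
    "offset": (0, 1_000_000),
    "crop_length": (50, 2000),
}
MAX_QUERY_LENGTH = 500


def validate_params(q, **params):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    errors = []
    if not 1 <= len(q) <= MAX_QUERY_LENGTH:
        errors.append(f"q must be 1 to {MAX_QUERY_LENGTH} characters long")
    for name, value in params.items():
        if name in LIMITS:
            low, high = LIMITS[name]
            if not low <= value <= high:
                errors.append(f"{name} must be between {low} and {high}")
    return errors


print(validate_params("python", limit=5, crop_length=400))
print(validate_params("", limit=100))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.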
Performance Tips
Optimize your queries for the best performance when working with large result sets.
Performance Best Practices
- Use specific queries: Broad single-word searches like "the" will be slower than specific multi-word queries.
- Limit results appropriately: Request only the number of results you need. Use limit=10 instead of limit=50 if you only display 10.
- Paginate efficiently: For deep pagination (high offset values), consider using keyset pagination with IDs instead of large offsets.
- Cache responses: Cache API responses on your server to reduce redundant requests.
- Batch requests: If you need multiple searches, batch them with a small delay rather than firing simultaneously.
Efficient Pagination
For large result sets, prefer fetching consecutive pages rather than random access:
# Good: Sequential pagination
page1 = search("topic", limit=10, offset=0)
page2 = search("topic", limit=10, offset=10)
page3 = search("topic", limit=10, offset=20)
# Avoid: Very high offsets (slower performance)
page100 = search("topic", limit=10, offset=990)
Minimize Response Size
# For quick results, use smaller crop_length
GET /v1/search?q=topic&crop_length=100
# Disable highlighting if not needed
GET /v1/search?q=topic&highlight=false
Connection Management
When making multiple API calls:
- Use HTTP/2 or connection pooling to reuse connections
- Implement exponential backoff for retries
- Set reasonable timeouts (we recommend 10-30 seconds)
- Handle rate limiting gracefully using the X-RateLimit-* headers
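The retry advice above can be sketched as a small wrapper that doubles its wait after each failure. This is one possible shape, not a prescribed client: `call_with_backoff` is a hypothetical helper, the delay values are illustrative, and the `request` argument stands for any zero-argument function that performs the HTTP call with its own timeout.

```python
import random
import time


def call_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call request() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except OSError:
            # OSError covers urllib's URLError and socket timeouts;
            # narrow this in real code so client errors are not retried.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


# Demonstration with a fake request that fails twice before succeeding.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise OSError("simulated network failure")
    return "ok"


print(call_with_backoff(flaky_request, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting.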
Caching Strategies
| Cache Duration | Use Case |
|---|---|
| 1-5 minutes | High-traffic search queries, trending topics |
| 1-24 hours | Individual chunks, page content |
| 1-7 days | Reference content, rarely changing articles |
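A time-to-live cache is one simple way to apply the durations in the table. The sketch below is an in-memory illustration only; `TTLCache` is a hypothetical class, and a production setup would more likely rely on a shared cache such as an HTTP cache or a key-value store.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)


# Cache search responses for five minutes, keyed by the request parameters.
search_cache = TTLCache(ttl_seconds=300)
key = ("python", 10, 0)  # (q, limit, offset)
if search_cache.get(key) is None:
    search_cache.set(key, {"placeholder": "response body"})
print(search_cache.get(key))
```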
Monitor the X-RateLimit-Remaining header to stay within your quota and avoid service interruptions.