Try it
Use operators AND, OR, NOT, and parentheses ( ). Terms are case-insensitive; operators are case-insensitive too.
Example queries:
information AND retrieval
(web OR internet) AND crawling
ranking AND NOT pagerank
"inverted index"
How the demo works
- Tokenize each document into terms (lowercased, stripped of punctuation).
- Build an inverted index:
term → {docIDs}. - Parse the query to a syntax tree (supports quotes for phrases).
- Evaluate the tree using set operations over posting lists.
- Highlight matched terms/phrases and list matching documents.
Study cheatsheet
Core concepts
- Tokenization & normalization
- Dictionary & postings lists
- Conjunctive vs disjunctive queries
- Skip pointers & efficient merging
Operator precedence
NOT > AND > OR (unless parentheses override)
Phrases in "double quotes" are treated as exact sequences.
Complexity intuition
Conjunctive queries cost ~ the sum of traversed posting lengths; ordering terms by increasing df can reduce work.