Full-Text Search in the Browser
How ctrodb implements client-side full-text search without external services
Full-text search is one of those features that feels simple until you implement it. This post explains how ctrodb's FTS plugin works under the hood.
The naive approach
Without an index, searching means iterating every record and running String.includes():
const results = allRecords.filter((r) =>
  r.title.toLowerCase().includes(query.toLowerCase())
)
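This works for small datasets. But every query is a full scan, so the cost grows linearly with the number of records and the length of their text, and it becomes noticeably slow as the dataset grows into the thousands.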
Token-based indexing
ctrodb's FTS plugin takes a different approach. When a record is created or updated, it extracts tokens from searchable fields and stores a token-to-document mapping:
Token "database" -> [doc1, doc3, doc7]
Token "schema" -> [doc1, doc4]
Token "react" -> [doc2, doc5]
A search for "database schema" finds docs that have both tokens (AND logic). In the mapping above, that is only doc1.
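To make the mapping concrete, here is a minimal sketch of building such a token-to-document index in plain JavaScript. The `buildIndex` helper and the sample records are hypothetical and not part of ctrodb's API; stop-word filtering is left out for brevity.

```javascript
// Hypothetical sketch: buildIndex is illustrative, not ctrodb's implementation.
function buildIndex(docs, field) {
  const index = new Map(); // token -> Set of doc ids
  for (const doc of docs) {
    const tokens = String(doc[field] ?? "")
      .toLowerCase()
      .split(/[^a-z0-9]+/) // split on non-alphanumeric characters (ASCII only)
      .filter(Boolean);    // drop empty strings left by the split
    for (const token of tokens) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(doc.id);
    }
  }
  return index;
}

const docs = [
  { id: 1, title: "Database schema design" },
  { id: 2, title: "React hooks" },
  { id: 3, title: "Choosing a database" },
];

const index = buildIndex(docs, "title");
console.log([...index.get("database")]); // [1, 3]
console.log([...index.get("schema")]);   // [1]
```

Using a Set per token keeps each document ID unique even when a word appears several times in the same field.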
Tokenizer behavior
The tokenizer:
- Lowercases the input
- Splits on non-alphanumeric characters
- Filters out stop words (a, an, the, and, ...)
- Removes duplicates, so each token is returned once
import { tokenize } from "ctrodb"
tokenize("Hello World! The database")
// ["hello", "world", "database"]
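The steps in the list can be approximated in a few lines. This is a sketch under assumptions, not ctrodb's actual tokenizer: the name `sketchTokenize` is made up, the stop-word list contains only the four words named above, and the splitting rule handles ASCII letters and digits only.

```javascript
// Assumed stop-word list; the real list is longer.
const STOP_WORDS = new Set(["a", "an", "the", "and"]);

function sketchTokenize(text) {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on runs of non-alphanumeric characters
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
  return [...new Set(words)]; // de-duplicate, keeping first-seen order
}

console.log(sketchTokenize("Hello World! The database"));
// ["hello", "world", "database"]
```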
Storage strategy
The index lives in a _ctrodb_fts collection alongside your data. Each entry is:
{
  id: "articles:database",
  token: "database",
  collection: "articles",
  docIds: [1, 3, 7]
}
This works with both MemoryAdapter and IndexedDBAdapter.
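As an illustration of how entries of this shape might be kept up to date, the sketch below uses a plain Map as a stand-in for the _ctrodb_fts collection. The helpers `addToIndex` and `removeFromIndex` are hypothetical; the plugin's real write path goes through the adapter and may differ.

```javascript
// Stand-in for the _ctrodb_fts collection: entry id -> entry.
const ftsStore = new Map();

// Add a document id to the entry for each token (hypothetical helper).
function addToIndex(collection, docId, tokens) {
  for (const token of tokens) {
    const id = `${collection}:${token}`;
    const entry = ftsStore.get(id) ?? { id, token, collection, docIds: [] };
    if (!entry.docIds.includes(docId)) entry.docIds.push(docId);
    ftsStore.set(id, entry);
  }
}

// Remove a document id from each token entry; drop entries that become empty.
function removeFromIndex(collection, docId, tokens) {
  for (const token of tokens) {
    const id = `${collection}:${token}`;
    const entry = ftsStore.get(id);
    if (!entry) continue;
    entry.docIds = entry.docIds.filter((d) => d !== docId);
    if (entry.docIds.length === 0) ftsStore.delete(id);
  }
}

addToIndex("articles", 1, ["database", "schema"]);
addToIndex("articles", 3, ["database"]);
removeFromIndex("articles", 1, ["schema"]);

console.log(ftsStore.get("articles:database").docIds); // [1, 3]
console.log(ftsStore.has("articles:schema"));          // false
```

In this sketch an update would be a removal with the record's old tokens followed by an add with its new ones.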
Query execution
When you call .search("title", "database schema"):
- The tokenizer splits the query into tokens: ["database", "schema"]
- For each token, the indexer looks up the corresponding entry
- Doc IDs that appear in ALL token entry sets are returned
- These IDs are passed to the adapter for record retrieval
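The lookup and intersection steps can be sketched as follows. The `entries` map and the `searchIds` function are hypothetical stand-ins for the stored index and the plugin's internals; record retrieval through the adapter is omitted.

```javascript
// Hypothetical index entries, keyed as "collection:token" -> doc ids.
const entries = new Map([
  ["articles:database", [1, 3, 7]],
  ["articles:schema", [1, 4]],
  ["articles:react", [2, 5]],
]);

function searchIds(collection, queryTokens) {
  if (queryTokens.length === 0) return [];
  const lists = [];
  for (const token of queryTokens) {
    const list = entries.get(`${collection}:${token}`);
    // With AND logic, one unknown token means nothing can match.
    if (!list) return [];
    lists.push(list);
  }
  // Start from the shortest list so intermediate results stay small.
  lists.sort((a, b) => a.length - b.length);
  let result = [...lists[0]];
  for (const list of lists.slice(1)) {
    const seen = new Set(list);
    result = result.filter((id) => seen.has(id));
  }
  return result;
}

console.log(searchIds("articles", ["database", "schema"])); // [1]
console.log(searchIds("articles", ["database", "react"]));  // []
```

Sorting by list length first is an optional design choice: the result can never be larger than the shortest list, so intersecting from there does the least work.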
Performance characteristics
- Indexing: O(tokens per record), paid on every create/update/delete
- Search: O(tokens in query) index lookups, plus the cost of intersecting the matching docId lists
- Storage: O(unique tokens x docs per token), so it scales with vocabulary size and with how many documents share each token
When to use it
The FTS plugin works best for:
- Searchable text fields (titles, descriptions, body content)
- Datasets up to tens of thousands of records
- Apps where exact whole-word matching is sufficient (tokens are matched exactly, so "data" does not match "database")
It does not support fuzzy search, stemming, or relevance scoring. For those, consider a dedicated search service.