AI education | Intermediate

Vector Search vs Keyword Search in Study Tools

Keyword search finds exact words; vector search finds nearby meaning. Strong study systems usually need both.

Vector search, Keyword search, RAG, Hybrid retrieval

Site connection

Lykke uses document embeddings, hybrid search, and reranking to retrieve course evidence before generating study artifacts.

Visual model

Two signals, one ranked evidence set

The ranked evidence demo shows why exact keyword matches and vector similarity should both influence retrieval.

Interactive

Hybrid retrieval turns a vague study question into ranked evidence

Lecture: embeddings | keyword 0.34 / vector 0.94 | 0.87
Canvas calendar export | keyword 0.76 / vector 0.42 | 0.79
Syllabus policies | keyword 0.38 / vector 0.62 | 0.72
Study guide draft | keyword 0.28 / vector 0.68 | 0.61

Two Kinds of Relevance

Keyword search is lexical. It cares whether the same tokens appear. Vector search is semantic. It cares whether two chunks mean similar things under an embedding model.

Keyword wins: Course codes, theorem names, exact acronyms, dates, formulas.
Vector wins: Paraphrases, conceptual questions, vague student language.
Hybrid wins: Most real study questions, because students mix exact terms and fuzzy intent.
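
To make the two signals concrete, here is a minimal sketch that scores one query against one chunk both ways. The lexical score is plain token overlap (Jaccard), a simpler stand-in for what a real full-text index computes. The three-dimensional vectors are invented for illustration; a real system would get them from an embedding model.

```python
import math
import re


def tokens(text: str) -> set:
    """Lowercase word tokens; a stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Lexical signal: share of tokens the two texts have in common."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cosine(u: list, v: list) -> float:
    """Semantic signal: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


query = "what should I study for the test"
chunk = "Midterm review: topics covered on exam day"

# Hypothetical embeddings, hand-written for this example only.
query_vec = [0.9, 0.1, 0.3]
chunk_vec = [0.8, 0.2, 0.4]

lexical = jaccard(query, chunk)          # no shared tokens
semantic = cosine(query_vec, chunk_vec)  # vectors point the same way

print(f"lexical={lexical:.2f} semantic={semantic:.2f}")
```

The query and the chunk share no tokens, so the lexical score is zero, while the toy vectors are nearly parallel. That gap is the case where vector search helps and keyword search alone would return nothing.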

Why Exact Matching Still Matters

A vector model may understand that 'midterm review' and 'exam prep' are related, but it can blur exact names. If a student asks for 'CS 111 Project 2 rubric,' lexical matching should strongly preserve those terms.

Full-text indexes such as SQLite FTS5 are designed to efficiently find documents containing specific terms. That exactness is still valuable: an identifier either appears in a chunk or it does not, and the index can answer that quickly.
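
As a rough sketch of exact-term lookup, the following uses Python's built-in sqlite3 module with an FTS5 virtual table. The table name, column names, and sample chunks are invented for illustration. FTS5 is a compile-time option of SQLite, so the function returns None if the local build lacks it.

```python
import sqlite3
from typing import Optional

# Hypothetical course chunks, invented for illustration.
CHUNKS = [
    ("CS 111 Project 2 rubric", "Grading criteria for the second project."),
    ("CS 111 Project 1 rubric", "Grading criteria for the first project."),
    ("CS 112 exam prep", "Review topics for the midterm."),
]


def search_titles(terms: list) -> Optional[list]:
    """Return titles of chunks that contain every term, best match first.

    Returns None when this SQLite build was compiled without FTS5.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(title, body)")
    except sqlite3.OperationalError:
        conn.close()
        return None
    conn.executemany("INSERT INTO chunks (title, body) VALUES (?, ?)", CHUNKS)
    # Quote each term so FTS5 reads it as a literal string, not an operator.
    # Strings separated by whitespace are combined with an implicit AND.
    match_expr = " ".join('"' + t.replace('"', '""') + '"' for t in terms)
    rows = conn.execute(
        "SELECT title FROM chunks WHERE chunks MATCH ? ORDER BY rank",
        (match_expr,),
    ).fetchall()
    conn.close()
    return [title for (title,) in rows]


results = search_titles(["CS", "111", "Project", "2", "rubric"])
print(results)
```

Only the Project 2 chunk contains every term, so the Project 1 rubric is excluded even though it is nearly identical in meaning. That is the behavior a student asking for a specific rubric needs.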

Why Semantic Search Matters

Students rarely ask in the same words used by a slide deck. A lecture may say 'gradient-based optimization' while the student asks 'how does the model learn from error?' Embeddings can connect those meanings.

The best pipeline retrieves both lexical and semantic candidates, merges them, reranks them, and keeps source labels attached.
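
One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below. This is an illustration of the merge step only, not a description of how any particular product combines its scores. The chunk IDs and list orderings are invented, k=60 is a conventional default rather than a tuned value, and the reranking step (typically a separate model) is left out.

```python
from collections import defaultdict

# Hypothetical ranked candidates (best first) as (chunk_id, source_label).
keyword_hits = [
    ("c2", "Canvas calendar export"),
    ("c3", "Syllabus policies"),
    ("c1", "Lecture: embeddings"),
]
vector_hits = [
    ("c1", "Lecture: embeddings"),
    ("c4", "Study guide draft"),
    ("c3", "Syllabus policies"),
]


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists; each chunk scores the sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    labels = {}
    for hits in ranked_lists:
        for rank, (chunk_id, label) in enumerate(hits, start=1):
            # Keying by chunk_id deduplicates chunks found by both signals.
            scores[chunk_id] += 1.0 / (k + rank)
            labels[chunk_id] = label  # keep the source label attached
    merged = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(cid, labels[cid], score) for cid, score in merged]


fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for chunk_id, label, score in fused:
    print(f"{chunk_id}  {label:<24} {score:.4f}")
```

Chunks that appear in both lists (c1 and c3) rise above chunks found by only one signal, and every result still carries its source label for citation downstream.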

Query type | Best first signal
Exact assignment name | Keyword
Vague conceptual question | Vector
Formula with notation | Keyword plus metadata
Study-plan request | Vector plus calendar metadata
Acronym or abbreviation | Keyword
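
A system could approximate part of this mapping with a cheap heuristic before retrieval runs. The sketch below covers only the identifier and acronym cases; the regular expressions and the function name are assumptions made for illustration, and real course codes vary by institution.

```python
import re

# Hypothetical patterns: "CS 111", "MATH201", and all-caps acronyms.
COURSE_CODE = re.compile(r"\b[A-Z]{2,4}\s?\d{2,4}\b")
ACRONYM = re.compile(r"\b[A-Z]{2,6}\b")


def first_signal(query: str) -> str:
    """Guess which retrieval signal to weight first for a query."""
    if COURSE_CODE.search(query) or ACRONYM.search(query):
        return "keyword"
    return "vector"


for q in [
    "CS 111 Project 2 rubric",
    "What is RAG?",
    "how does the model learn from error?",
]:
    print(f"{first_signal(q):<8} {q}")
```

A heuristic like this should only shift weights, not switch a signal off, since most queries still benefit from both.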

Common Pitfalls

  • Using only vector search and missing exact course identifiers.
  • Using only keyword search and missing paraphrases.
  • Merging rankings without deduplication.
  • Letting semantically similar but stale course content outrank the current assignment.
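
The last two pitfalls can be handled in a small post-processing step, sketched here under stated assumptions: each hit carries a hypothetical `term` metadata field, duplicates share a `chunk_id`, and stale content is down-weighted by an arbitrary factor of 0.5 rather than removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    chunk_id: str
    term: str     # hypothetical metadata field, e.g. "2024-fall"
    score: float


def clean_candidates(hits, current_term, stale_penalty=0.5):
    """Down-weight chunks from other terms, then keep the best-scoring
    copy of each chunk_id and sort by adjusted score."""
    best = {}
    for hit in hits:
        score = hit.score
        if hit.term != current_term:
            score *= stale_penalty  # assumed penalty, not a tuned value
        if hit.chunk_id not in best or score > best[hit.chunk_id].score:
            best[hit.chunk_id] = Hit(hit.chunk_id, hit.term, score)
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


hits = [
    Hit("rubric-old", "2023-fall", 0.90),
    Hit("rubric-new", "2024-fall", 0.80),
    Hit("rubric-new", "2024-fall", 0.70),  # same chunk from the other retriever
]
ranked = clean_candidates(hits, current_term="2024-fall")
for h in ranked:
    print(f"{h.chunk_id:<12} {h.term}  {h.score:.2f}")
```

Without the penalty, last year's rubric would rank first on raw similarity. With it, the current rubric leads and the duplicate is collapsed into a single entry.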

Quick check

Quiz

Why is hybrid search useful in course tools?
  1. It combines exact terms with semantic similarity
  2. It avoids all indexing
  3. It removes source metadata
  4. It only works for images

Answer: 1. Course questions often contain both exact identifiers and fuzzy intent, so retrieval needs both signals.
