Why I Built This: When “Good Search” Still Feels Wrong
Most enterprise search stacks do a decent job at matching words. OpenSearch (and similar engines) commonly rely on lexical ranking families like TF‑IDF/BM25. These are strong baselines, but they tend to miss something important: user-specific intent.
Three failure modes show up repeatedly in real systems:
- Ambiguity: the same term can mean different things to different users (or even the same user on different days).
- Intent drift: users’ interests change over time; yesterday’s context matters.
- One-size-fits-all ranking: a uniform ranking is applied across all users even though patterns differ by role, team, or experience level.
A typical reaction is: “Let’s use embeddings” (dense retrieval, sentence transformers, vector databases, LTR models). That can work—but it also brings cost, latency, new infra, and sometimes reduced explainability.
This case study explores a different approach:
A lightweight, explainable personalization reranker that learns from a user’s own query history using hash-based word correlation vectors—no pretraining, no embeddings, and no external vector store.
Design Goals & Constraints
This architecture was driven by very practical constraints:
- Low memory per user: must scale to many users without storing full query logs.
- Low operational complexity: avoid adding heavy ML services or vector DBs.
- Explainability: keep a clear “why did this rank higher?” story.
- Compatibility: should work as a reranking layer on top of standard OpenSearch results.
- Incremental learning: adapt in real time as users search, not through offline training.
High-Level Architecture
At a high level, the system wraps around OpenSearch like this:
- User submits a query
- The system updates a per-user word map (a compact model of word-to-word correlations)
- OpenSearch returns top‑N results using its standard relevance score
- We compute a similarity score between:
  - the user's current query context, and
  - each candidate result title (or snippet)
- We blend the similarity with the OpenSearch score and re-rank.
Visually:
- Learning path (write): Query → tokenize → hash → update sparse vectors
- Serving path (read): Query + Top‑N results → vectorize → cosine similarity → blend scores → re-rank
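As a sketch, the whole thing can sit behind a thin interface in front of your existing search client. The names below (PersonalizedReranker, SearchHit) are placeholders I use throughout this post, not part of any OpenSearch API:

```java
import java.util.List;

/** Minimal stand-in for whatever your OpenSearch client returns per hit. */
record SearchHit(String title, double openSearchScore) {}

/** Hypothetical seam around the existing search flow; names are illustrative. */
interface PersonalizedReranker {
    /** Learning path (write): tokenize, hash, and update the user's sparse word vectors. */
    void learn(String userId, String query);

    /** Serving path (read): blend OpenSearch scores with user-context similarity and re-sort. */
    List<SearchHit> rerank(String userId, String query, List<SearchHit> topN);
}
```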
Data Model: The Per-User Word Store
The core state is a per-user map:
```java
Map<String, float[]> userWordMap
```

- Key: word (or a hashed representation of the word)
- Value: a fixed-size float vector representing correlations to other words.
Each vector is intentionally:
- Fixed-size (bounded memory)
- Sparse (most entries stay 0)
- Incrementally updated on every query.
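Here is a minimal sketch of that store in Java. The two-level map (per user, then per word), the class name, and the fixed VECTOR_SIZE of 100 are illustrative choices for this post, not hard requirements:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class UserWordStore {
    static final int VECTOR_SIZE = 100;   // fixed, bounded dimension per word vector

    // userId -> (word -> correlation vector); most vector entries stay 0.0f
    private final Map<String, Map<String, float[]>> byUser = new ConcurrentHashMap<>();

    /** New word: start from a zero vector. Existing word: return it for in-place updates. */
    float[] vectorFor(String userId, String word) {
        return byUser
                .computeIfAbsent(userId, u -> new ConcurrentHashMap<>())
                .computeIfAbsent(word, w -> new float[VECTOR_SIZE]);
    }
}
```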
Vector Interpretation
Think of each word as having a “memory” of other words it has appeared near in the user’s searches.
- Vector index = hash slot of a co-occurring word
- Vector value = strength of correlation (based on proximity within queries)
This is not an embedding in the neural sense. It’s closer to a user-personalized, hashed co-occurrence graph stored as sparse arrays.
Step 1 — Tokenization and Hashing
When a query arrives, we tokenize it into words, then hash each token into a bounded range:
```java
int hash(String word) {
    // MurmurHash3 in the original; floorMod keeps the slot in [0, VECTOR_SIZE)
    return Math.floorMod(word.hashCode(), VECTOR_SIZE);
}
```
The disclosure's example uses a vector size of 100 and notes that the specific hash values it shows are illustrative.
Why hashing?
- Keeps memory bounded regardless of vocabulary growth
- Avoids maintaining a global dictionary
- Enables fast $O(1)$ updates into a fixed vector slot
- Trade-off: Hashing can cause collisions. In practice, you manage this by tuning vector size and observing impact.
Step 2 — Learning: Updating Word Correlation Vectors
Here’s the learning rule:
For each query, for each pair of words, update the vector of the source word at the target word’s hash slot.
The update weight is inverse distance within the query:
```java
float weight = 1.0f / distance;   // closer words get a larger update
vector[targetHash] += weight;
```
So, closer words get stronger correlation updates.
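Putting the rule together with the store and hash function above, one learning pass might look like this. Whitespace tokenization and the class name are assumptions of the sketch:

```java
import java.util.List;

class CorrelationLearner {
    private final UserWordStore store = new UserWordStore();

    /** Learning path: update this user's word vectors from a single query. */
    void learn(String userId, String query) {
        List<String> tokens = List.of(query.toLowerCase().trim().split("\\s+"));
        for (int i = 0; i < tokens.size(); i++) {
            // New word -> zero vector is created; existing word -> updated in place
            float[] source = store.vectorFor(userId, tokens.get(i));
            for (int j = 0; j < tokens.size(); j++) {
                if (i == j) continue;
                int distance = Math.abs(i - j);
                source[hash(tokens.get(j))] += 1.0f / distance;   // inverse-distance weight
            }
        }
    }

    // Same bounded hash as in Step 1
    private int hash(String word) {
        return Math.floorMod(word.hashCode(), UserWordStore.VECTOR_SIZE);
    }
}
```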
Concrete Example (Intuition)
Query: “create oci object storage bucket”
- If “create” is 1 position away from “oci”, distance = 1 implies weight = 1.0
- If “create” is 4 positions away from “bucket”, distance = 4 implies weight = 0.25
This builds a correlation profile that captures the structure of the query, not just the bag-of-words.
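Continuing the sketch above, here is the trace of updates applied to the vector for "create" in that query (positions 0 through 4):

```java
// Query: "create oci object storage bucket"   (positions 0..4)
float[] v = store.vectorFor(userId, "create");
v[hash("oci")]     += 1.0f;       // distance 1
v[hash("object")]  += 0.5f;       // distance 2
v[hash("storage")] += 1.0f / 3;   // distance 3
v[hash("bucket")]  += 0.25f;      // distance 4
```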
Why Inverse Distance?
Inverse distance is simple and explainable:
- It encodes that adjacency matters (“object storage” is a meaningful phrase).
- It is stable and fast.
- It avoids needing any ML model training.
Step 3 — Storage Behavior Over Time (Incremental Learning)
Each time a user searches:
- New word -> create a new zero vector and start learning.
- Existing word -> update its vector “in-place”.
Over time, the userWordMap becomes a compact summary of the user’s search behavior: which words tend to appear together (and how tightly). This is what enables personalization without storing full query history.
Serving: Re-Ranking Search Results Using User Context
Learning builds the user context. Serving uses it.
Serving Step 1 — Get Candidate Results
OpenSearch returns the top‑N results with its standard relevance score (BM25, or whatever scoring your stack is configured to use). We do not replace OpenSearch. We treat it as the retrieval stage and add personalization after.
Serving Step 2 — Vectorize the Current Query and Result Titles
We tokenize:
- The user’s current query
- Each result’s title (or another short text field)
Then for each word token, we look up its vector from the userWordMap. At this point, we have vector sets representing:
- Current query context
- Each candidate result’s title context
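The simplest way to collapse each set of word vectors into a single context vector is to sum them. That pooling choice is an assumption of this sketch (averaging would work too); the helper below reads from the per-user map built during learning:

```java
import java.util.Map;

final class ContextVectors {
    /** Sum-pool the per-word vectors of one text (query or title) into a single context vector. */
    static float[] of(Map<String, float[]> userWordMap, String text, int vectorSize) {
        float[] context = new float[vectorSize];
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            float[] v = userWordMap.get(token);
            if (v == null) continue;                  // word this user has never searched
            for (int i = 0; i < vectorSize; i++) {
                context[i] += v[i];
            }
        }
        return context;
    }
}
```

Since cosine similarity ignores magnitude, summing and averaging give the same direction, so the choice mostly affects readability rather than ranking.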
Serving Step 3 — Cosine Similarity
We compute similarity between the query context and candidate result context using cosine similarity.
Cosine similarity is a natural fit because:
- Vectors are sparse and high-dimensional.
- Direction (co-occurrence pattern) matters more than raw magnitude.
- It’s easy to implement and optimize.
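A straightforward implementation over the fixed-size float arrays, returning 0 when either context is empty (for example, a cold-start user):

```java
final class Cosine {
    /** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
    static double similarity(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot   += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0.0;     // cold start or all-unseen words
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```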
Serving Step 4 — Blend with OpenSearch Score
Final ranking score is a linear blend:
finalScore = α * openSearchScore + β * similarityScore
Example values:
- α = 0.6
- β = 0.4
Then we sort by finalScore and return the re-ranked results. This blending is important:
- OpenSearch score preserves global relevance (query/document match).
- Similarity score injects user-specific context.
- α and β let you tune how strong the personalization should be.
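Tying the serving steps together, a sketch of the blend-and-sort using the SearchHit record and the helpers above (α and β are hard-coded here only for illustration):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class BlendedReranker {
    static final double ALPHA = 0.6;   // weight on the global OpenSearch relevance score
    static final double BETA  = 0.4;   // weight on the user-specific similarity score

    record Scored(SearchHit hit, double finalScore) {}

    /** finalScore = ALPHA * openSearchScore + BETA * cosine(queryContext, titleContext) */
    static List<SearchHit> rerank(Map<String, float[]> userWordMap, String query,
                                  List<SearchHit> topN, int vectorSize) {
        float[] queryCtx = ContextVectors.of(userWordMap, query, vectorSize);
        return topN.stream()
                .map(hit -> {
                    float[] titleCtx = ContextVectors.of(userWordMap, hit.title(), vectorSize);
                    double sim = Cosine.similarity(queryCtx, titleCtx);
                    return new Scored(hit, ALPHA * hit.openSearchScore() + BETA * sim);
                })
                .sorted(Comparator.comparingDouble(Scored::finalScore).reversed())
                .map(Scored::hit)
                .toList();
    }
}
```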
Walkthrough Example: Why “oci object storage” Rises to the Top
The disclosure provides a scenario like:
- Query history:
  - “put data to oci object storage”
  - “get data from oci object storage bucket”
- Later query: “storage”
OpenSearch might return generic storage results:
- “File storage encryption”
- “persistent volume storage”
- “oci object storage”
After reranking, “oci object storage” rises because the user’s history strongly correlates oci -> object -> storage.
The “Aha” Moment: Even when the query is short and ambiguous (“storage”), the system can infer that this user often means “OCI object storage”.
Memory & Performance Characteristics
This design is intentionally lightweight.
Memory Optimization
Key properties from the implementation notes:
- Only distinct words stored per user.
- No redundant storage of duplicate queries.
- Vector size is bounded and fixed.
- Updates are sparse (most entries are zeros).
This is why the approach can be attractive where embeddings + external vector stores are too heavy.
Runtime Overhead
There is real overhead: a cosine similarity computation per candidate result. The disclosure notes this is mitigated by limiting reranking to the top‑N results.
In practice, you can also:
- Rerank only the top N results (e.g., 20–50).
- Cache vectorized titles.
- Prune low-activity users.
- Periodically decay old correlations.
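For that last item, one illustrative decay scheme (not something the disclosure prescribes) is to scale every stored correlation by a factor below 1 on a schedule, so stale interests fade unless they are reinforced:

```java
import java.util.Map;

final class CorrelationDecay {
    /** Multiply every correlation by a decay factor (e.g., nightly with factor = 0.95f). */
    static void apply(Map<String, float[]> userWordMap, float factor) {
        for (float[] vector : userWordMap.values()) {
            for (int i = 0; i < vector.length; i++) {
                vector[i] *= factor;
            }
        }
    }
}
```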
Strengths: Why This Works in Real Systems
- No training pipeline: The model updates in real time from queries—no offline training loop, no model registry, no feature store.
- No external vector store: Everything is stored as small per-user sparse vectors.
- Explainable personalization: If a result rises, you can often explain it: “This user has historically searched oci with storage frequently.”
- Works with existing OpenSearch: It’s a reranker, not a replacement. That reduces adoption friction.
Limitations & Practical Considerations
No architecture is free. The disclosure highlights several realistic constraints:
- Cold start: If the user has no history, the similarity score will be near zero. In that case, rely mostly on the OpenSearch score (α close to 1) and gradually increase β as history accumulates (a simple weighting scheme is sketched at the end of this section).
- Query quality dependency: If user queries are inconsistent or noisy, learned correlations may be noisy.
- Runtime cost: Cosine similarity for each candidate adds computation. Limiting to top‑N helps.
- Hash collisions: hashing keeps memory bounded, but different words can land in the same slot.
Mitigations:
- Increase vector size (within memory budget).
- Use multiple hashes (like a Bloom-style approach).
- Monitor impact with offline evaluation.
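On the cold-start point above, one illustrative way to phase personalization in is to ramp β with how much history a user has accumulated. The linear ramp and the 50-word threshold below are assumptions, not part of the disclosure:

```java
final class BlendWeights {
    /** Returns {alpha, beta}: beta ramps up as the user's learned vocabulary grows. */
    static double[] forUser(int distinctWordsLearned) {
        final double maxBeta = 0.4;   // target personalization weight once history is rich
        final int rampWords  = 50;    // illustrative threshold for "enough history"
        double beta  = maxBeta * Math.min(1.0, distinctWordsLearned / (double) rampWords);
        double alpha = 1.0 - beta;    // with no history: alpha = 1.0, beta = 0.0
        return new double[] {alpha, beta};
    }
}
```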