Context-Aware Personalized Search Re-Ranking with Lightweight Hash-Based Word Correlation Vectors
## Why I Built This: When "Good Search" Still Feels Wrong

Most enterprise search stacks do a decent job of matching words. OpenSearch (and similar engines) commonly rely on lexical ranking families like TF-IDF/BM25. These are strong baselines, but they tend to miss something important: user-specific intent.

Three failure modes show up repeatedly in real systems:

- **Ambiguity:** the same term can mean different things to different users (or even to the same user on different days).
- **Intent drift:** users' interests change over time; yesterday's context matters.
- **One-size-fits-all ranking:** a uniform ranking is applied across all users even though patterns differ by role, team, or experience level.

A typical reaction is: "Let's use embeddings" (dense retrieval, sentence transformers, vector databases, LTR models). That can work, but it also brings cost, latency, new infrastructure, and sometimes reduced explainability. ...
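To make the baseline concrete, here is a minimal sketch of the Okapi BM25 scoring formula that lexical engines such as OpenSearch use by default. Note that the document corpus, the function name, and the parameter defaults (`k1=1.2`, `b=0.75`) are illustrative assumptions, not taken from any specific engine's configuration; real engines add analyzers, field boosts, and many other refinements on top of this core formula.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against the query with classic Okapi BM25.

    docs: list of token lists; query_terms: list of tokens.
    k1 and b are illustrative defaults for term-frequency saturation
    and length normalization.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    # Document frequency: number of docs containing each term.
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for q in query_terms:
            if q not in tf:
                continue
            # Smoothed inverse document frequency.
            idf = math.log((N - df[q] + 0.5) / (df[q] + 0.5) + 1)
            # Term-frequency component with saturation and length normalization.
            s += idf * tf[q] * (k1 + 1) / (
                tf[q] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores

docs = [
    ["apple", "banana"],
    ["apple", "apple", "cherry"],
    ["banana", "cherry"],
]
print(bm25_scores(["apple"], docs))
```

The key point for this article: nothing in the formula knows *who* is searching, only how well terms match. That blind spot is exactly what the failure modes above describe.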