Vector space mismatch after embedding model upgrade breaks search relevance
critical
When an embedding model is upgraded without re-embedding the existing dataset, query vectors produced by the new model no longer lie in the vector space of the stored embeddings produced by the old model. Cosine similarity between a new-model query and an old-model stored embedding is then not a meaningful measure of semantic closeness, so search, recommendation, and RAG applications return irrelevant results instead of semantically relevant matches.
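The mismatch can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two "models" are hypothetical stand-ins (`embed_old` and `embed_new`), where the new one applies a fixed coordinate permutation to the same underlying vector. Real models differ in far more complex ways, but the effect is the same in kind: similarity is preserved within each space and lost across spaces.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

rng = random.Random(0)
DIM = 64

# Hypothetical stand-in for "a different model": a fixed shuffle of coordinates.
PERM = list(range(DIM))
rng.shuffle(PERM)

def embed_old(latent):
    """Assumed old model: returns the latent vector unchanged."""
    return list(latent)

def embed_new(latent):
    """Assumed new model: same geometry, different coordinate system."""
    return [latent[i] for i in PERM]

def top1(query_vec, stored):
    """Index of the stored vector most similar to the query."""
    return max(range(len(stored)), key=lambda i: cosine(query_vec, stored[i]))

# 200 synthetic documents; the query is a slightly noisy copy of document 17.
docs = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(200)]
target = 17
query = [x + rng.gauss(0, 0.1) for x in docs[target]]

old_index = [embed_old(d) for d in docs]   # stored before the upgrade
new_index = [embed_new(d) for d in docs]   # stored after re-embedding

# Same space on both sides: the intended document is retrieved.
matched = top1(embed_new(query), new_index)
matched_score = cosine(embed_new(query), new_index[target])

# New-model query against old-model storage: the score for the true match collapses.
mismatched_score = cosine(embed_new(query), old_index[target])

print("re-embedded index finds target:", matched == target)
print("score vs. re-embedded target:  ", round(matched_score, 3))
print("score vs. stale target:        ", round(mismatched_score, 3))
```

The stale index raises no error and still returns a ranked list, which is what makes this failure easy to miss in production.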
Re-embed the entire dataset with the new model so that query and stored embeddings share the same vector space. To minimize disruption, either (1) use collection aliases: keep the old collection serving traffic while a new one is built, then switch atomically by updating the alias pointer; or (2) store multiple embedding versions within a single collection so the new embeddings can be tested alongside the old ones with live queries before full cutover. With the alias approach, rollback is immediate: revert the alias pointer to the old collection.
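The alias-based cutover in option (1) can be sketched with an in-memory model. Everything here is hypothetical and for illustration only: `AliasRegistry`, its methods, and the collection names `docs_v1` and `docs_v2` do not correspond to any particular vector database's API. A real deployment would use the alias operations the chosen database provides.

```python
class AliasRegistry:
    """Toy model of collections plus alias pointers (not a real database client)."""

    def __init__(self):
        self._collections = {}  # collection name -> {doc_id: vector}
        self._aliases = {}      # alias name -> collection name

    def create_collection(self, name, vectors):
        self._collections[name] = dict(vectors)

    def point_alias(self, alias, collection):
        """Repoint an alias in one step and return the previous target."""
        if collection not in self._collections:
            raise KeyError(f"unknown collection: {collection}")
        previous = self._aliases.get(alias)
        self._aliases[alias] = collection
        return previous

    def resolve(self, alias):
        """Collection name that queries against this alias currently hit."""
        return self._aliases[alias]


registry = AliasRegistry()

# The old collection serves traffic through the stable alias "docs".
registry.create_collection("docs_v1", {"a": [0.1, 0.9], "b": [0.8, 0.2]})
registry.point_alias("docs", "docs_v1")

# Build the re-embedded collection in the background; the alias is untouched.
registry.create_collection("docs_v2", {"a": [0.9, 0.1], "b": [0.2, 0.8]})
during_build = registry.resolve("docs")

# Cutover: one pointer update moves all readers to the new collection.
previous = registry.point_alias("docs", "docs_v2")
after_cutover = registry.resolve("docs")

# Rollback: repoint the alias at the collection it referenced before.
registry.point_alias("docs", previous)
after_rollback = registry.resolve("docs")

print(during_build, after_cutover, after_rollback)
```

Because the application only ever queries the alias, neither the cutover nor the rollback requires a code or configuration change on the query path. The query side must switch embedding models at the same moment as the alias, otherwise the mismatch simply reappears in the other direction.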