

VoVAllen on Dec 5, 2024 | on: VectorChord: Store 400k Vectors for $1 in PostgreS...

As far as I know, there's no easy way to index ColBERT multi-vectors scalably. Vespa seems to rely heavily on binary quantization, which can cost a lot of recall. And for most cases, using ColBERT as a reranker is good enough, as in the pgvector example you posted.
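For readers unfamiliar with the reranker setup mentioned above, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring in plain Python. It assumes the candidate documents already came back from some first-stage search (for example a single-vector pgvector query) and that each document carries a list of token vectors. The names `maxsim` and `rerank` are made up for illustration and are not from any library.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product; assumes both vectors have the same length."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query token vector, take the best
    matching document token vector, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(query_vecs: List[Vector],
           candidates: Dict[str, List[Vector]]) -> List[str]:
    """Order first-stage candidates (doc_id -> token vectors) by MaxSim,
    best first. The candidate set is assumed to be small."""
    scored = [(maxsim(query_vecs, vecs), doc_id)
              for doc_id, vecs in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]
```

Because this only scores a short candidate list, it sidesteps the indexing problem entirely, which is presumably why the reranker route is "good enough" in most cases.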

tarasglek on Dec 5, 2024

Seems like doing a proper relational 1:N chunk:multiple-vectors foreign key, binarization, and a clever join or multistage CTE would get us pretty close to useful.

I am OK with it being less efficient, as the dev UX will be amazing. Vespa ops (even in their cloud) are a complete nightmare compared to Postgres.
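A rough sketch of the schema idea in the comment above, with loud caveats: it uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs anywhere, packs sign-binarized token vectors into integers (so it assumes at most 63 dimensions), and registers a hypothetical `hamming` SQL function from Python. All table and column names are invented for illustration. The CTE is a brute-force cross join, so it shows the shape of the query, not a scalable index.

```python
import sqlite3


def binarize(vec):
    """Sign binarization: bit i is set when component i is positive."""
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


SEARCH_SQL = """
WITH per_token AS (          -- stage 1: best doc token per query token
    SELECT v.chunk_id, q.token_idx,
           MIN(hamming(q.code, v.code)) AS best
    FROM query_vectors q CROSS JOIN chunk_vectors v
    GROUP BY v.chunk_id, q.token_idx
),
scored AS (                  -- stage 2: sum over query tokens
    SELECT chunk_id, SUM(best) AS total_dist
    FROM per_token GROUP BY chunk_id
)
SELECT c.id, c.body, s.total_dist
FROM scored s JOIN chunks c ON c.id = s.chunk_id
ORDER BY s.total_dist, c.id
LIMIT ?
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.create_function("hamming", 2, hamming)
    conn.executescript("""
        CREATE TABLE chunks (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE chunk_vectors (      -- 1:N chunk -> token vectors
            chunk_id  INTEGER NOT NULL REFERENCES chunks(id),
            token_idx INTEGER NOT NULL,
            code      INTEGER NOT NULL,   -- packed binary code
            PRIMARY KEY (chunk_id, token_idx)
        );
        CREATE TEMP TABLE query_vectors (
            token_idx INTEGER PRIMARY KEY, code INTEGER NOT NULL
        );
    """)
    return conn


def add_chunk(conn, chunk_id, body, token_vecs):
    conn.execute("INSERT INTO chunks VALUES (?, ?)", (chunk_id, body))
    conn.executemany(
        "INSERT INTO chunk_vectors VALUES (?, ?, ?)",
        [(chunk_id, i, binarize(v)) for i, v in enumerate(token_vecs)],
    )


def search(conn, query_vecs, k):
    """Return the k chunks with the smallest summed Hamming distance."""
    conn.execute("DELETE FROM query_vectors")
    conn.executemany(
        "INSERT INTO query_vectors VALUES (?, ?)",
        [(i, binarize(v)) for i, v in enumerate(query_vecs)],
    )
    return conn.execute(SEARCH_SQL, (k,)).fetchall()
```

In a real multistage setup, the rows returned here would presumably be treated as cheap candidates and rescored with full-precision vectors, which is where the recall lost to binarization could be partly recovered.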
