
There's no easy way that I know of to index ColBERT multi-vectors scalably. Vespa seems to rely heavily on binary quantization, which can cost a lot of recall. And for most cases, using ColBERT as a reranker is good enough, as in the pgvector example you posted.
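
For anyone following along, here is roughly what "ColBERT as a reranker" means: a first-stage search returns a handful of candidates, and each one is rescored with the late-interaction MaxSim sum over its token vectors. This is a minimal pure-Python sketch, not the code from the pgvector example; `maxsim`, `rerank`, and the dict-of-lists candidate layout are names I made up for illustration, and it assumes every candidate has at least one token vector.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token vector, take its best
    dot product against the document's token vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(query_vecs: Sequence[Vector],
           candidates: Dict[str, List[Vector]]) -> List[str]:
    """Order candidate ids by descending MaxSim score.
    `candidates` maps a (hypothetical) chunk id to its token vectors."""
    return sorted(candidates,
                  key=lambda cid: maxsim(query_vecs, candidates[cid]),
                  reverse=True)
```

The cost is one dot product per (query token, document token) pair, which is fine for a few dozen candidates and is exactly why it gets used as a second stage rather than as the index.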


Seems like doing a proper relational 1:N chunk-to-vectors foreign key, binarization, and a clever join or multistage CTE would get us pretty close to useful.
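
To make that concrete, here is a rough in-memory sketch of the first stage, assuming sign-bit binarization and a Hamming-based stand-in for MaxSim. The `rows` list plays the role of the N side of the 1:N table (one `(chunk_id, code)` row per token vector), and all function names are hypothetical. In Postgres this would be the first CTE, with a later stage rescoring the survivors against the full-precision vectors.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def binarize(vec: Sequence[float]) -> int:
    """Pack the sign bits of a vector into one integer (1 bit per dimension)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming_sim(a: int, b: int, dim: int) -> int:
    """Number of matching bits between two dim-bit codes."""
    return dim - bin(a ^ b).count("1")


def candidate_chunks(query_vecs: Sequence[Sequence[float]],
                     rows: Iterable[Tuple[int, int]],
                     dim: int,
                     top_k: int) -> List[int]:
    """Stage one: approximate MaxSim on binary codes.
    Groups token rows by chunk_id (the 'join'), then for each query token
    takes the best bit-agreement within the chunk and sums over query tokens."""
    q_codes = [binarize(q) for q in query_vecs]
    by_chunk: Dict[int, List[int]] = defaultdict(list)
    for chunk_id, code in rows:
        by_chunk[chunk_id].append(code)
    scores = {
        cid: sum(max(hamming_sim(q, c, dim) for c in codes) for q in q_codes)
        for cid, codes in by_chunk.items()
    }
    # Highest score first; ties broken by chunk id for a stable order.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))[:top_k]
```

The recall worry from upthread still applies: the binary stage only has to keep the right chunks inside `top_k`, so how large `top_k` needs to be is something to measure rather than assume.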

I'm OK with it being less efficient, since the dev UX will be amazing. Vespa ops (even in their cloud) are a complete nightmare compared to Postgres.



