Show HN: Semantic search over Hacker News, built on pgvector https://ift.tt/4n6xMr2
I built https://ift.tt/SkOoTq0, a semantic search engine over Hacker News posts. Instead of keyword matching, it finds results by meaning, so you can search for things like "best way to handle authentication in microservices" and get relevant threads even if they don't contain those exact words.

How it works:
- HN posts and comments are indexed into PostgreSQL with pgvector (HNSW index)
- Embeddings are generated with OpenAI's embedding model
- Queries run as nearest-neighbor vector searches, with typical responses under 50ms
- The whole thing runs on a single Postgres instance, with no separate vector DB

I built this partly because I wanted a better way to search HN, and partly to dogfood my own project: Rivestack ( https://rivestack.io ), a managed PostgreSQL service with pgvector baked in. I wanted to see how pgvector holds up with a real dataset at a reasonable scale.

A few things I learned along the way:
- HNSW vs IVFFlat matters a lot at this scale. HNSW gave me much better recall with acceptable index build times.
- Storing embeddings alongside relational data in the same DB simplifies things enormously: there is no syncing between a vector store and your main DB.
- pgvector has gotten surprisingly fast in recent versions. For most use cases, you really don't need a dedicated vector database.

The search is free to use. Rivestack has a free tier too if anyone wants to try something similar.

Happy to answer questions about the architecture, pgvector tuning, or anything else.

https://ift.tt/UjMQ3af
February 22, 2026 at 09:03PM
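For anyone who wants to picture the "How it works" setup, here is a minimal Python sketch of what a pgvector table, HNSW index, and nearest-neighbor query could look like. It is not the code behind the site: the table name `hn_items`, its columns, the 1536 dimension, and the psycopg-style placeholders are all assumptions for illustration, and the post does not say which OpenAI model or distance metric was used. The SQL is held in strings and not executed here; the small pure-Python functions show the exact cosine-distance ranking that an HNSW index approximates.

```python
import math

# Assumed schema: table name, column names, and the 1536 dimension are
# illustrative. The post does not name the embedding model or its size.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS hn_items (
    id        bigint PRIMARY KEY,
    title     text,
    body      text,
    embedding vector(1536)
);

-- HNSW index on cosine distance; m and ef_construction are pgvector's defaults.
CREATE INDEX IF NOT EXISTS hn_items_embedding_hnsw
    ON hn_items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
"""

# Nearest-neighbor query. <=> is pgvector's cosine distance operator.
# The %(name)s placeholders assume a psycopg-style driver.
SEARCH_SQL = """
SELECT id, title, embedding <=> %(query)s::vector AS distance
FROM hn_items
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal such as [1.0,2.5]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def cosine_distance(a, b):
    """Return 1 - cosine similarity, the quantity that <=> computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_top_k(query, rows, k):
    """Exact top-k by cosine distance over {id: embedding}; HNSW approximates this."""
    scored = [(item_id, cosine_distance(query, emb)) for item_id, emb in rows.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

With a live database, the query embedding would be passed as `to_vector_literal(embedding)` for the `query` parameter. Recall versus latency can then be tuned per session with pgvector's `hnsw.ef_search` setting; the values used by the actual site are not stated in the post.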