How to Set Up Graph RAG on Vercel and Neon
Vanilla vector RAG breaks down on questions whose answers span multiple documents - "which of our customers in healthcare use the same vendor as Acme?" Vector similarity finds the closest chunks, but it doesn't connect them.
Graph RAG fixes this by extracting entities and relations from your corpus into a knowledge graph and expanding retrieval through the graph. Here's how to do it on Neon Postgres + Vercel AI SDK, with no separate graph database.
If you haven't shipped vanilla RAG yet, start with How to Set Up RAG on Vercel and Neon first - this guide builds on that schema.
1. Schema: graph beside the vectors
Same Neon database: the documents table you already have from the vanilla RAG setup, plus three new tables for the graph:
```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(1536)
);

CREATE TABLE entities (
  id bigserial PRIMARY KEY,
  name text NOT NULL,
  type text NOT NULL,
  UNIQUE (name, type)
);

CREATE TABLE relations (
  subject_id bigint REFERENCES entities(id),
  predicate text NOT NULL,
  object_id bigint REFERENCES entities(id),
  PRIMARY KEY (subject_id, predicate, object_id)
);

CREATE TABLE mentions (
  doc_id bigint REFERENCES documents(id),
  entity_id bigint REFERENCES entities(id),
  PRIMARY KEY (doc_id, entity_id)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
```
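To see how one fact lands in these tables, consider the hypothetical sentence "Acme uses VendorX" appearing in document 1. It becomes two entity rows, one relation row, and two mention rows; a plain-TypeScript sketch of the same row shapes (the ids are illustrative, not real):

```typescript
// Row shapes mirroring the tables above; ids are made up for illustration.
type Entity = { id: number; name: string; type: string }
type Relation = { subjectId: number; predicate: string; objectId: number }
type Mention = { docId: number; entityId: number }

// "Acme uses VendorX" (found in document 1) as graph rows:
const entities: Entity[] = [
  { id: 1, name: 'Acme', type: 'company' },
  { id: 2, name: 'VendorX', type: 'vendor' },
]
const relations: Relation[] = [{ subjectId: 1, predicate: 'uses', objectId: 2 }]
const mentions: Mention[] = [
  { docId: 1, entityId: 1 },
  { docId: 1, entityId: 2 },
]
```

The mentions table is what lets retrieval hop from a chunk to its entities and back out to other chunks.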
2. Extract entities and relations
Use generateObject from the AI SDK to extract typed (subject, predicate, object) triples per chunk. Production-grade extraction also needs entity resolution so "Acme Corp" and "Acme" map to the same node:
```typescript
import { generateObject } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const Schema = z.object({
  triples: z.array(
    z.object({
      subject: z.string(),
      subjectType: z.string(),
      predicate: z.string(),
      object: z.string(),
      objectType: z.string(),
    }),
  ),
})

// Shared triple type, reused when persisting in step 3
export type Triple = z.infer<typeof Schema>['triples'][number]

export async function extract(text: string): Promise<Triple[]> {
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    schema: Schema,
    prompt: `Extract entities and relations as (subject, predicate, object) triples from:\n\n${text}`,
  })
  return object.triples
}
```
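The section above flags entity resolution as a requirement. A minimal normalization pass catches the easy surface-form mismatches like "Acme Corp" vs "Acme" before upserting; the helper name and suffix list here are my own sketch, not part of the AI SDK, and real resolution usually needs embedding similarity or an LLM judge on top:

```typescript
// Hypothetical helper: canonicalize entity names before upserting,
// so trivially different surface forms collapse onto one graph node.
const CORPORATE_SUFFIX = /\s+(inc|corp|llc|ltd|gmbh)\.?$/i

export function normalizeEntity(name: string): string {
  return name
    .trim()
    .replace(/\s+/g, ' ')       // collapse internal whitespace
    .replace(CORPORATE_SUFFIX, '') // strip common legal suffixes
}
```

Call it on both `subject` and `object` before `upsertEntity` so the UNIQUE (name, type) constraint does the rest.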
3. Persist the graph
Upsert entities, then relations, then mentions linking the doc to its entities:
```typescript
import { neon } from '@neondatabase/serverless'
// Triple comes from the extraction step; adjust the path to your project
import type { Triple } from './extract'

const sql = neon(process.env.DATABASE_URL!)

async function upsertEntity(name: string, type: string) {
  // The no-op DO UPDATE makes RETURNING work on conflict too
  const [{ id }] = await sql`
    INSERT INTO entities (name, type) VALUES (${name}, ${type})
    ON CONFLICT (name, type) DO UPDATE SET name = EXCLUDED.name
    RETURNING id`
  return id as number
}

export async function persist(docId: number, triples: Triple[]) {
  for (const t of triples) {
    const s = await upsertEntity(t.subject, t.subjectType)
    const o = await upsertEntity(t.object, t.objectType)
    await sql`INSERT INTO relations VALUES (${s}, ${t.predicate}, ${o})
      ON CONFLICT DO NOTHING`
    await sql`INSERT INTO mentions VALUES (${docId}, ${s})
      ON CONFLICT DO NOTHING`
    await sql`INSERT INTO mentions VALUES (${docId}, ${o})
      ON CONFLICT DO NOTHING`
  }
}
```
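Extraction often emits the same triple more than once per chunk; ON CONFLICT absorbs the duplicates, but filtering them in memory first saves database round trips. A small sketch (the helper name and inline Triple type are mine):

```typescript
type Triple = {
  subject: string
  subjectType: string
  predicate: string
  object: string
  objectType: string
}

// Drop exact-duplicate triples before hitting the database.
export function dedupeTriples(triples: Triple[]): Triple[] {
  const seen = new Set<string>()
  return triples.filter((t) => {
    const key = `${t.subject}|${t.predicate}|${t.object}`
    if (seen.has(key)) return false
    seen.add(key)
    return true
  })
}
```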
4. Hybrid retrieval: vectors + graph expansion
The trick: retrieve top-K by vector similarity, then pull in 1-hop neighbors of any entity mentioned in those chunks. This is what makes multi-hop reasoning work:
```typescript
import { embed } from 'ai'
import { openai } from '@ai-sdk/openai'
// `sql` is the Neon serverless client (see the vanilla RAG guide)

export async function graphRetrieve(query: string, k = 5) {
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  })
  const vec = JSON.stringify(embedding)

  // Seed: top-K chunks by cosine distance
  const seedDocs = await sql`
    SELECT id, content FROM documents
    ORDER BY embedding <=> ${vec} LIMIT ${k}`
  const ids = seedDocs.map((d) => d.id)

  // Expand: docs mentioning any 1-hop neighbor of the seed chunks' entities
  const expanded = await sql`
    SELECT DISTINCT d.content
    FROM mentions m1
    JOIN relations r ON r.subject_id = m1.entity_id OR r.object_id = m1.entity_id
    JOIN mentions m2 ON m2.entity_id IN (r.subject_id, r.object_id)
    JOIN documents d ON d.id = m2.doc_id
    WHERE m1.doc_id = ANY(${ids}) AND d.id <> ALL(${ids})
    LIMIT 5`

  return [...seedDocs, ...expanded]
}
```
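The three-way join is easier to audit against a tiny in-memory model. This pure-TypeScript sketch (all names hypothetical) replicates the same logic: collect the seed docs' entities, walk one hop along relations touching them, then return other docs that mention the neighborhood:

```typescript
type Rel = { subject: number; object: number }
type Mention = { docId: number; entityId: number }

// Pure re-implementation of the 1-hop expansion in graphRetrieve's SQL.
export function oneHopDocs(seedDocIds: number[], mentions: Mention[], relations: Rel[]): number[] {
  // Entities mentioned in the seed docs (mirrors m1)
  const seedEntities = new Set(
    mentions.filter((m) => seedDocIds.includes(m.docId)).map((m) => m.entityId),
  )
  // Both endpoints of any relation touching a seed entity (mirrors the relations join)
  const hood = new Set<number>()
  for (const r of relations) {
    if (seedEntities.has(r.subject) || seedEntities.has(r.object)) {
      hood.add(r.subject)
      hood.add(r.object)
    }
  }
  // Other docs mentioning the neighborhood (mirrors m2 + the <> ALL filter)
  return [...new Set(
    mentions
      .filter((m) => hood.has(m.entityId) && !seedDocIds.includes(m.docId))
      .map((m) => m.docId),
  )]
}
```

Note that entities with no relations contribute nothing, matching the SQL: the relations join filters them out.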
5. Wire it into the chat route
Same as vanilla RAG, just swap retrieve for graphRetrieve:
```typescript
const docs = await graphRetrieve(lastMessage)
const context = docs.map((d) => d.content).join('\n---\n')

const result = streamText({
  model: openai('gpt-4o'),
  system: `Answer using ONLY the context.\n\n${context}`,
  messages: convertToModelMessages(messages),
})
```
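Graph expansion can balloon the context, since every seed chunk can drag in neighbors. A simple character budget keeps the prompt bounded; the helper name and the default limit are my own assumptions, not tuned values:

```typescript
// Hypothetical guard: cap total context size before it reaches the prompt.
export function buildContext(chunks: string[], maxChars = 8000): string {
  const out: string[] = []
  let used = 0
  for (const c of chunks) {
    if (used + c.length > maxChars) break
    out.push(c)
    used += c.length + 5 // account for the '\n---\n' separator
  }
  return out.join('\n---\n')
}
```

Swap it in for the plain `join` above; a token-based budget via your model's tokenizer is the more precise version of the same idea.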
This is the minimum viable version
Production Graph RAG also needs entity resolution, schema constraints on predicates, contradiction handling for facts that change over time, and good evals. Each of those is a project on its own.
If you'd rather not own all of that, TypeGraph ships entity resolution, contradiction detection, and graph-augmented retrieval as a TypeScript SDK that runs on the same Neon Postgres - and it's MIT licensed.
Comparing options? See the 5 best open source Graph RAG tools.