Build Ask AI with Sourcey and LangChain

Implement your own Ask AI feature over a Sourcey docs site.

If you want an Ask AI feature over your docs, this is the thin layer.

Sourcey already ships the files a retriever needs:

  • search-index.json
  • llms-full.txt
  • stable page URLs

That means you do not need a hosted index just to make your docs usable from LangChain. Publish the docs site. Point the retriever at it. Done.

Install

Python:

pip install -U langchain-sourcey

JavaScript:

npm install langchain-sourcey @langchain/core

What it reads

Both langchain-sourcey packages work against a published Sourcey docs root.

They use:

  • search-index.json to find candidate pages
  • llms-full.txt to hydrate full page content
  • the page URL as the LangChain citation source

If llms-full.txt is missing, the retriever falls back to the matched page HTML.
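
The lookup order above can be sketched in plain Python. This is only an illustration of the flow, not the package's implementation: the entry shape (url, title, text), the keyword-overlap scoring, and the full_text and fetch_html inputs are all assumptions made for the example, since the actual file formats are not described here.

```python
from typing import Callable, Dict, List

# Hypothetical shape: the real search-index.json format is not described
# here, so this stand-in is for illustration only.
Entry = Dict[str, str]  # assumed keys: "url", "title", "text"


def find_candidates(index: List[Entry], query: str, top_k: int = 3) -> List[Entry]:
    """Rank index entries by naive keyword overlap with the query."""
    terms = set(query.lower().split())

    def score(entry: Entry) -> int:
        words = set((entry["title"] + " " + entry["text"]).lower().split())
        return len(terms & words)

    scored = [(score(entry), entry) for entry in index]
    # Keep only entries that match at least one term, best match first.
    hits = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [entry for _, entry in hits[:top_k]]


def hydrate(
    candidates: List[Entry],
    full_text: Dict[str, str],
    fetch_html: Callable[[str], str],
) -> List[dict]:
    """Attach full page content, falling back to page HTML when it is missing."""
    docs = []
    for entry in candidates:
        url = entry["url"]
        content = full_text.get(url)
        if content is None:
            # No full-text entry for this page: fall back to the page itself.
            content = fetch_html(url)
        docs.append(
            {
                "page_content": content,
                "metadata": {"title": entry["title"], "source": url},
            }
        )
    return docs
```

Passing an empty full_text mapping simulates a site without llms-full.txt, so every candidate is hydrated through fetch_html instead.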

Quickstart

Python:

from langchain_sourcey import SourceyRetriever

retriever = SourceyRetriever(
    site_url="https://sourcey.com/docs",
    top_k=3,
)

docs = retriever.invoke("mcp integration")

for doc in docs:
    print(doc.metadata["title"])
    print(doc.metadata["source"])
    print(doc.page_content[:160])
    print()

site_url should be the root of a published Sourcey build:

  • https://sourcey.com/docs
  • https://sourcey.com/cheesestore
  • https://cheesestore.github.io
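
To see why the root matters, here is a small sketch of how the two retrieval files could be located from site_url. It assumes both files sit directly under the published root; the helper name retrieval_urls is hypothetical and not part of langchain-sourcey.

```python
from urllib.parse import urljoin


def retrieval_urls(site_url: str) -> dict:
    """Return where the two retrieval files would live under a docs root.

    Assumption: both files sit directly under the published root.
    """
    # A trailing slash makes urljoin treat the last path segment as a directory.
    root = site_url.rstrip("/") + "/"
    return {
        "search_index": urljoin(root, "search-index.json"),
        "llms_full": urljoin(root, "llms-full.txt"),
    }


for site in ("https://sourcey.com/docs", "https://cheesestore.github.io"):
    print(retrieval_urls(site))
```

If site_url pointed at an individual page rather than the root, the derived file URLs would land in the wrong directory.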

JavaScript:

import { SourceyRetriever } from "langchain-sourcey";

const retriever = new SourceyRetriever({
  siteUrl: "https://sourcey.com/docs",
  topK: 3,
});

const docs = await retriever.invoke("mcp integration");

for (const doc of docs) {
  console.log(doc.metadata.title);
  console.log(doc.metadata.source);
  console.log(doc.pageContent.slice(0, 160));
  console.log();
}

Implement Ask AI

Install a chat model package. This example uses OpenAI.

Python:

pip install -U langchain-openai

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

from langchain_sourcey import SourceyRetriever

retriever = SourceyRetriever(site_url="https://sourcey.com/docs", top_k=3)

prompt = ChatPromptTemplate.from_template(
    """Answer the question using the documentation context below.

{context}

Question: {question}"""
)

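def format_docs(docs):
    # Join page text so the prompt gets readable context, not Document reprs.
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    RunnablePassthrough.assign(context=(lambda x: x["question"]) | retriever | format_docs)
    | prompt
    | ChatOpenAI(model="gpt-4.1-mini")
    | StrOutputParser()
)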

print(chain.invoke({"question": "How does Sourcey document MCP servers?"}))
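
Each retrieved document carries title and source metadata, as the quickstart prints, so the context handed to the prompt can also carry citations. The sketch below shows one possible formatter. The Doc class is a stand-in for a LangChain Document so the example runs without extra packages, and format_docs_with_sources is a hypothetical helper, not part of langchain-sourcey.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for illustration)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_sources(docs) -> str:
    """Number each page and label it with its title and source URL."""
    blocks = []
    for number, doc in enumerate(docs, start=1):
        title = doc.metadata.get("title", "Untitled")
        source = doc.metadata.get("source", "unknown")
        blocks.append(f"[{number}] {title} ({source})\n{doc.page_content}")
    return "\n\n".join(blocks)


sample = [
    Doc("MCP servers are documented from a snapshot.", {"title": "MCP", "source": "https://example.com/docs/mcp"}),
    Doc("Install the CLI first.", {"title": "Install", "source": "https://example.com/docs/install"}),
]
print(format_docs_with_sources(sample))
```

In a chain, a function like this could be piped after the retriever (retriever | format_docs_with_sources), with the prompt asking the model to cite sources by their bracketed number.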

JavaScript:

npm install @langchain/openai

import type { Document } from "@langchain/core/documents";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { RunnablePassthrough, RunnableSequence } from "@langchain/core/runnables";
import { ChatOpenAI } from "@langchain/openai";
import { SourceyRetriever } from "langchain-sourcey";

const retriever = new SourceyRetriever({
  siteUrl: "https://sourcey.com/docs",
  topK: 3,
});

const prompt = ChatPromptTemplate.fromTemplate(
  `Answer the question using the documentation context below.

{context}

Question: {question}`
);

const formatDocs = (docs: Document[]) =>
  docs.map((doc) => doc.pageContent).join("\n\n");

const chain = RunnableSequence.from([
  {
    context: retriever.pipe(formatDocs),
    question: new RunnablePassthrough(),
  },
  prompt,
  new ChatOpenAI({ model: "gpt-4.1-mini" }),
  new StringOutputParser(),
]);

console.log(await chain.invoke("How does Sourcey document MCP servers?"));

The contract

If you want this to work cleanly on your own docs site, keep these stable:

  • publish search-index.json
  • publish llms-full.txt
  • set siteUrl so page URLs are canonical

That is the whole trick. Sourcey emits the retrieval surface as part of the normal docs build, so the LangChain integration stays thin.
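
A quick way to confirm a deployment honours this contract is to probe for the two files after publishing. The following is a minimal sketch using only the standard library. It assumes both files are served directly under the docs root and that the host answers HEAD requests; check_contract and http_probe are hypothetical names, not part of any Sourcey tooling.

```python
from typing import Callable, Dict
from urllib.request import Request, urlopen

# The two files the retriever reads, per the list above.
REQUIRED_FILES = ("search-index.json", "llms-full.txt")


def http_probe(url: str) -> bool:
    """Return True if the URL answers a HEAD request with a 2xx status."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # Covers HTTP errors, network failures, timeouts, and malformed URLs.
        return False


def check_contract(
    site_url: str, probe: Callable[[str], bool] = http_probe
) -> Dict[str, bool]:
    """Report which required files are reachable under the docs root.

    Assumption: the files sit directly under site_url.
    """
    root = site_url.rstrip("/")
    return {name: probe(f"{root}/{name}") for name in REQUIRED_FILES}
```

Calling check_contract("https://sourcey.com/docs") performs real network requests. The probe parameter exists so the check can be exercised offline, for example in CI with a stubbed probe.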

Package