Overview
The `GraphRetriever` from the `langchain-graph-retriever` package provides a LangChain retriever that combines unstructured similarity search on vectors with structured traversal of metadata properties. This enables graph-based retrieval over an existing vector store.
Integration details
Retriever | Source | PyPI Package | Latest | Project Page |
---|---|---|---|---|
GraphRetriever | github.com/datastax/graph-rag | langchain-graph-retriever | | Graph RAG |
Benefits
- Link based on existing metadata: Use existing metadata fields without additional processing. Retrieve more from an existing vector store!
- Change links on demand: Edges can be specified on-the-fly, allowing different relationships to be traversed based on the question.
- Pluggable Traversal Strategies: Use built-in traversal strategies like Eager or MMR, or define custom logic to select which nodes to explore.
- Broad compatibility: Adapters are available for a variety of vector stores, and support for additional stores is easy to add.
Setup
Installation
This retriever lives in the `langchain-graph-retriever` package.
Instantiation
The following examples show how to perform graph traversal over some sample documents about animals.
Prerequisites
Populating the Vector store
This section shows how to populate a variety of vector stores with the sample data. For help on choosing one of the vector stores below, or to add support for your vector store, consult the documentation about Adapters and Supported Stores.
Install the `langchain-graph-retriever` package with the `astra` extra. Then create a vector store and load the test documents.
For the `ASTRA_DB_API_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN` credentials, consult the AstraDB Vector Store Guide.
:::note
For faster initial testing, consider using the InMemory Vector Store.
:::
Graph Traversal
This graph retriever starts with a single animal that best matches the query, then traverses to other animals sharing the same `habitat` and/or `origin`.
The retriever starts with the nearest animal (`start_k=1`), retrieves 5 documents (`k=5`), and limits the search to documents that are at most 2 steps away from the first animal (`max_depth=2`).
The `edges` define how metadata values can be used for traversal. In this case, every animal is connected to other animals with the same `habitat` and/or `origin`.
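The idea of linking documents through shared metadata values can be illustrated with a small standalone sketch. It does not use the `langchain-graph-retriever` API: the `ANIMALS` data, the `EDGES` list of field-name pairs, and the `neighbors` helper are all assumptions made up for illustration, and the metadata values are not the real sample data.

```python
# Toy documents keyed by id. The metadata values are illustrative only.
ANIMALS = {
    "capybara": {"habitat": "wetlands", "origin": "south america"},
    "heron": {"habitat": "wetlands", "origin": "north america"},
    "jaguar": {"habitat": "rainforest", "origin": "south america"},
    "camel": {"habitat": "desert", "origin": "africa"},
}

# Hypothetical edge definitions: each pair names a metadata field on the
# source document and the field it must match on the target document.
EDGES = [("habitat", "habitat"), ("origin", "origin")]


def neighbors(doc_id, docs, edges):
    """Return the ids of documents linked to doc_id by at least one edge."""
    source = docs[doc_id]
    linked = set()
    for source_field, target_field in edges:
        value = source.get(source_field)
        if value is None:
            continue  # nothing to match on for this edge
        for other_id, other in docs.items():
            if other_id != doc_id and other.get(target_field) == value:
                linked.add(other_id)
    return sorted(linked)


print(neighbors("capybara", ANIMALS, EDGES))
```

Because the links are computed from metadata at query time, swapping in a different list of field pairs changes which documents count as connected without touching the stored data.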
In the sample data, `capybara`, `heron`, `frog`, `crocodile`, and `newt` all share the same `habitat=wetlands`, as defined by their metadata. This should increase document relevance and the quality of the answer from the LLM.
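How `start_k`, `k`, and `max_depth` interact can be sketched as a plain breadth-first traversal. This is a conceptual illustration only, not the library's traversal strategy: the `DOCS` metadata, the `RANKED` list (a stand-in for the vector store's similarity ranking), and the `linked` and `traverse` helpers are all hypothetical.

```python
from collections import deque

# Toy metadata. The values are illustrative, not the real sample data.
DOCS = {
    "capybara": {"habitat": "wetlands", "origin": "south america"},
    "heron": {"habitat": "wetlands", "origin": "north america"},
    "frog": {"habitat": "wetlands", "origin": "europe"},
    "jaguar": {"habitat": "rainforest", "origin": "south america"},
    "camel": {"habitat": "desert", "origin": "africa"},
}

# Hypothetical similarity ranking for a query, best match first.
RANKED = ["capybara", "camel", "heron", "jaguar", "frog"]


def linked(a, b, fields):
    """True if metadata dicts a and b share a value for any of the fields."""
    return any(f in a and a[f] == b.get(f) for f in fields)


def traverse(ranked, docs, fields, k, start_k, max_depth):
    """Breadth-first traversal starting from the start_k best matches."""
    start = ranked[:start_k]
    queue = deque((doc_id, 0) for doc_id in start)
    seen = set(start)
    selected = []
    while queue and len(selected) < k:
        doc_id, depth = queue.popleft()
        selected.append(doc_id)
        if depth == max_depth:
            continue  # do not expand beyond max_depth
        for other_id in ranked:  # visit candidates in similarity order
            if other_id not in seen and linked(docs[doc_id], docs[other_id], fields):
                seen.add(other_id)
                queue.append((other_id, depth + 1))
    return selected


result = traverse(RANKED, DOCS, ["habitat", "origin"], k=5, start_k=1, max_depth=2)
print(result)
```

In this toy data, `camel` ranks second by similarity but shares no `habitat` or `origin` value with any animal reached from `capybara`, so the traversal never returns it.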
Comparison to Standard Retrieval
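When `max_depth=0`, the graph traversing retriever behaves like a standard retriever. It starts with the nearest 5 animals (`start_k=5`) and returns them without any traversal (`max_depth=0`). The edge definitions are ignored in this case.
This is essentially the same as a standard similarity-search retriever over the vector store that returns the top 5 matches.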
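A short standalone sketch can show why a depth of zero reduces to plain top-k retrieval. It is not the library's implementation: `RANKED` (a stand-in for a similarity ranking), `LINKS` (links assumed to come from shared metadata), and `retrieve` are hypothetical names used only for illustration.

```python
# Hypothetical similarity ranking for a query, best match first.
RANKED = ["capybara", "camel", "heron", "jaguar", "frog", "newt"]

# Hypothetical links between documents, assumed to come from shared metadata.
LINKS = {
    "capybara": ["heron", "frog", "newt", "jaguar"],
    "heron": ["capybara", "frog", "newt"],
    "frog": ["capybara", "heron", "newt"],
    "newt": ["capybara", "heron", "frog"],
    "jaguar": ["capybara"],
    "camel": [],
}


def retrieve(ranked, links, k, start_k, max_depth):
    """Take the start_k best matches, then follow links up to max_depth steps."""
    frontier = ranked[:start_k]
    results = list(frontier)
    for _ in range(max_depth):  # runs zero times when max_depth=0
        next_frontier = []
        for doc_id in frontier:
            for linked_id in links.get(doc_id, []):
                if linked_id not in results:
                    results.append(linked_id)
                    next_frontier.append(linked_id)
        frontier = next_frontier
    return results[:k]


no_traversal = retrieve(RANKED, LINKS, k=5, start_k=5, max_depth=0)
with_traversal = retrieve(RANKED, LINKS, k=5, start_k=1, max_depth=2)

# With max_depth=0 the links are never consulted.
assert no_traversal == RANKED[:5]
```

With `max_depth=0` the expansion loop never runs, so the result is just the first five entries of the ranking. With traversal enabled, the result is driven by the links from the single best match instead.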
Usage
Following the examples above, `.invoke` is used to initiate retrieval on a query.
Use within a chain
Like other retrievers, `GraphRetriever` can be incorporated into LLM applications via chains.