DataStax Astra DB is a serverless, AI-ready database built on Apache Cassandra® and made conveniently available through an easy-to-use JSON API.
In this walkthrough, we’ll demo the SelfQueryRetriever with an Astra DB vector store.
Creating an Astra DB vector store
First, create an Astra DB vector store and seed it with some data. We’ve created a small demo set of documents containing movie summaries.
NOTE: The self-query retriever requires the lark package to be installed (pip install lark).
This walkthrough uses OpenAIEmbeddings, which requires an OpenAI API key.
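One way to supply the key without hard-coding it is to read it from the environment and fall back to a hidden prompt. The sketch below uses only the standard library; the helper name `require_secret` is hypothetical, and `OPENAI_API_KEY` is the environment variable the OpenAI client libraries conventionally read.

```python
import getpass
import os


def require_secret(name: str, prompt: str) -> str:
    """Return the environment variable `name`, prompting without echo if unset."""
    value = os.environ.get(name)
    if not value:
        # getpass hides the typed characters so the key is not echoed on screen.
        value = getpass.getpass(prompt)
        # Cache the value so later library calls can pick it up from the environment.
        os.environ[name] = value
    return value
```

Calling `require_secret("OPENAI_API_KEY", "OpenAI API Key: ")` once at the top of a session is enough; the same helper can be reused for the Astra DB token.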
Connecting to Astra DB requires an API Endpoint and a Token:
- the API Endpoint looks like https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
- the Token looks like AstraCS:aBcD0123...
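With the endpoint and token in hand, creating and seeding the store might look like the sketch below. It assumes the `langchain-astradb`, `langchain-openai`, and `langchain-core` packages are installed. The collection name, the helper names, and the three movie entries are illustrative stand-ins rather than the original demo set. The third-party imports are deferred into the function so the sample data can be inspected without those packages.

```python
# Illustrative (text, metadata) pairs standing in for the demo set of movie
# summaries; the field names and values are assumptions made for this sketch.
MOVIES = [
    ("A bunch of scientists bring back dinosaurs and mayhem breaks loose",
     {"year": 1993, "rating": 7.7, "genre": "science fiction"}),
    ("Leo DiCaprio gets lost in a dream within a dream within a dream",
     {"year": 2010, "director": "Christopher Nolan", "rating": 8.2}),
    ("Toys come alive and have a blast doing so",
     {"year": 1995, "genre": "animated"}),
]


def looks_like_astra_credentials(api_endpoint: str, token: str) -> bool:
    """Heuristic check against the shapes shown above; not a real validation."""
    return api_endpoint.startswith("https://") and token.startswith("AstraCS:")


def build_vector_store(api_endpoint, token, collection_name="astra_self_query_demo"):
    """Embed the sample movies and store them in an Astra DB collection."""
    # Deferred imports: these packages are only needed when the store is built.
    from langchain_astradb import AstraDBVectorStore
    from langchain_core.documents import Document
    from langchain_openai import OpenAIEmbeddings

    docs = [Document(page_content=text, metadata=meta) for text, meta in MOVIES]
    return AstraDBVectorStore.from_documents(
        docs,
        OpenAIEmbeddings(),  # reads OPENAI_API_KEY from the environment
        collection_name=collection_name,  # hypothetical collection name
        api_endpoint=api_endpoint,
        token=token,
    )
```

The metadata keys chosen here (genre, year, director, rating) matter later: they are the fields the self-query retriever will be told it can filter on.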
Creating a self-querying retriever
Now you can instantiate the retriever. To do this, you need to provide some information upfront about the metadata fields that the documents support, along with a short description of the documents’ contents.
Testing it out
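The instantiation step described above, with its metadata field descriptions and content description, could be sketched as follows. The import paths assume a pre-1.0 LangChain layout (`langchain.chains.query_constructor.schema` and `langchain.retrievers.self_query.base`) and may differ in other versions. The field list and the helper name `build_retriever` are assumptions for a movie-summary collection, not the original notebook's code.

```python
DOCUMENT_CONTENT_DESCRIPTION = "Brief summary of a movie"

# Plain descriptions of the metadata fields the documents are assumed to carry.
METADATA_FIELDS = [
    {"name": "genre", "description": "The genre of the movie", "type": "string"},
    {"name": "year", "description": "The year the movie was released", "type": "integer"},
    {"name": "director", "description": "The name of the movie director", "type": "string"},
    {"name": "rating", "description": "A 1-10 rating for the movie", "type": "float"},
]


def build_retriever(vectorstore):
    """Create a SelfQueryRetriever over `vectorstore` using the fields above."""
    # Deferred imports: only needed when the retriever is actually built.
    from langchain.chains.query_constructor.schema import AttributeInfo
    from langchain.retrievers.self_query.base import SelfQueryRetriever
    from langchain_openai import OpenAI

    field_info = [AttributeInfo(**field) for field in METADATA_FIELDS]
    llm = OpenAI(temperature=0)  # deterministic output helps query construction
    return SelfQueryRetriever.from_llm(
        llm,
        vectorstore,
        DOCUMENT_CONTENT_DESCRIPTION,
        field_info,
        verbose=True,
    )
```

Given a retriever built this way, a call such as `retriever.invoke("What are some movies about dinosaurs?")` asks the LLM to turn the question into a search string plus an optional metadata filter before querying the vector store.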
Now you can try actually using the retriever.
Set a limit (‘k’)
You can also use the self-query retriever to specify k, the number of documents to fetch. You achieve this by passing enable_limit=True to the constructor.
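As a rough sketch, and assuming the retriever is built through `SelfQueryRetriever.from_llm` (which accepts an `enable_limit` keyword), enabling the limit might look like this. The helper name `build_limited_retriever` is hypothetical, and the import path again assumes a pre-1.0 LangChain layout.

```python
def build_limited_retriever(llm, vectorstore, document_contents, metadata_field_info):
    """Build a self-query retriever that lets the LLM extract a result limit."""
    # Deferred import: only needed when the retriever is actually built.
    from langchain.retrievers.self_query.base import SelfQueryRetriever

    return SelfQueryRetriever.from_llm(
        llm,
        vectorstore,
        document_contents,
        metadata_field_info,
        enable_limit=True,  # allow phrases like "two movies" to set k
        verbose=True,
    )
```

With the limit enabled, a query such as `retriever.invoke("What are two movies about dinosaurs?")` should have the number in the question interpreted as k, though how reliably that happens depends on the LLM.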