Setup
To access Databricks models you’ll need to create a Databricks account, set up credentials (only if you are outside a Databricks workspace), and install the required packages.
Credentials (only if you are outside Databricks)
If you are running your LangChain app inside Databricks, you can skip this step. Otherwise, you need to manually set the Databricks workspace hostname and personal access token in the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, respectively. See the Authentication Documentation for how to get an access token.
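As a minimal sketch of the step above, the two variables can be set from Python before any Databricks client is created. The helper name set_databricks_credentials and both placeholder values are assumptions made for illustration; they are not part of the integration.

```python
import os


def set_databricks_credentials(host: str, token: str) -> None:
    """Expose the workspace URL and personal access token to this process."""
    os.environ["DATABRICKS_HOST"] = host
    os.environ["DATABRICKS_TOKEN"] = token


# Placeholder values (assumptions): substitute your own workspace URL and token.
set_databricks_credentials(
    host="https://your-workspace.cloud.databricks.com",
    token="your-personal-access-token",
)
```

In an interactive session you may prefer to read the token with getpass.getpass() rather than writing it into source code.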
Installation
The LangChain Databricks integration lives in the databricks-langchain package.
Create a Vector Search Endpoint and Index (if you haven’t already)
In this section, we will create a Databricks Vector Search endpoint and an index using the client SDK. If you already have an endpoint and an index, you can skip this section and go straight to the “Instantiation” section.
First, instantiate the Databricks VectorSearch client, then create an endpoint and an index on it. Databricks offers two types of vector search index, and the DatabricksVectorSearch class supports both use cases.
- Delta Sync Index automatically syncs with a source Delta Table, incrementally updating the index as the underlying data in the Delta Table changes.
- Direct Vector Access Index supports direct read and write of vectors and metadata. The user is responsible for updating this table using the REST API or the Python SDK.
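The endpoint and index creation described in this section might look roughly like the sketch below, which builds a direct-access index. It assumes the databricks-vectorsearch client SDK is installed. The column names (id, text, text_vector, source), the helper names, and the embedding dimension you pass in are illustrative assumptions; the dimension must match the embedding model you actually use.

```python
def direct_access_index_spec(endpoint_name, index_name, embedding_dimension):
    """Build keyword arguments for VectorSearchClient.create_direct_access_index."""
    return {
        "endpoint_name": endpoint_name,
        "index_name": index_name,  # format: "<catalog>.<schema>.<index-name>"
        "primary_key": "id",
        "embedding_dimension": embedding_dimension,
        # Column that stores the embedding vectors for the text data
        "embedding_vector_column": "text_vector",
        "schema": {
            "id": "string",
            "text": "string",
            "text_vector": "array<float>",
            "source": "string",  # optional metadata column (assumed example)
        },
    }


def create_endpoint_and_index(client, endpoint_name, index_name, embedding_dimension):
    """Create a STANDARD endpoint, then a direct-access index on it."""
    client.create_endpoint(name=endpoint_name, endpoint_type="STANDARD")
    spec = direct_access_index_spec(endpoint_name, index_name, embedding_dimension)
    return client.create_direct_access_index(**spec)


def make_client():
    """Instantiate the SDK client; imported lazily so the helpers above load without it."""
    from databricks.vector_search.client import VectorSearchClient

    return VectorSearchClient()
```

Calling create_endpoint_and_index(make_client(), "<your-endpoint-name>", "<catalog>.<schema>.<index-name>", <dimension>) would issue both requests. Endpoint provisioning can take some time, so the index call may need to wait until the endpoint is ready.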
Instantiation
The instantiation of DatabricksVectorSearch differs slightly depending on whether your index uses Databricks-managed embeddings or self-managed embeddings, i.e. a LangChain Embeddings object of your choice.
If you are using a delta-sync index with Databricks-managed embeddings:
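A minimal sketch of that case, with the self-managed variant shown for comparison. It assumes the databricks-langchain package is installed. The helper names are hypothetical, and the text_column check reflects the assumption that self-managed embeddings need to be told which column holds the text.

```python
def vector_store_kwargs(endpoint, index_name, embedding=None, text_column=None):
    """Choose constructor arguments based on who manages the embeddings."""
    kwargs = {"endpoint": endpoint, "index_name": index_name}
    if embedding is not None:
        # Self-managed embeddings: pass the Embeddings object and the text column.
        if text_column is None:
            raise ValueError("text_column is required with self-managed embeddings")
        kwargs.update(embedding=embedding, text_column=text_column)
    # Databricks-managed embeddings: endpoint and index name are enough.
    return kwargs


def build_vector_store(endpoint, index_name, embedding=None, text_column=None):
    """Instantiate DatabricksVectorSearch; imported lazily so the helper above loads without it."""
    from databricks_langchain import DatabricksVectorSearch

    return DatabricksVectorSearch(
        **vector_store_kwargs(endpoint, index_name, embedding, text_column)
    )
```

With Databricks-managed embeddings, build_vector_store("<your-endpoint-name>", "<your-index-name>") is the whole call. With self-managed embeddings, pass your Embeddings object and the name of the text column as well.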
Manage vector store
Add items to vector store
Note: Adding items to the vector store via the add_documents method is only supported for a direct-access index.
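As a rough sketch, adding documents with explicit IDs could look like this. The helper add_with_generated_ids is hypothetical, and vector_store is assumed to be a DatabricksVectorSearch instance backed by a direct-access index.

```python
from uuid import uuid4


def add_with_generated_ids(vector_store, documents):
    """Add documents under freshly generated UUIDs and return those IDs."""
    ids = [str(uuid4()) for _ in documents]
    vector_store.add_documents(documents=documents, ids=ids)
    return ids


# Example usage (assumes langchain_core is installed and vector_store exists):
# from langchain_core.documents import Document
# docs = [Document(page_content="foo", metadata={"source": "https://example.com"})]
# ids = add_with_generated_ids(vector_store, docs)
```

Keeping the returned IDs makes it possible to delete the same items later.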
Delete items from vector store
Note: Deleting items from the vector store via the delete method is only supported for a direct-access index.
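A small sketch of deletion by ID, again assuming vector_store is a DatabricksVectorSearch instance on a direct-access index. The helper delete_by_ids is hypothetical, and skipping the call for an empty list is a design choice of this sketch, not behavior of the library.

```python
def delete_by_ids(vector_store, ids):
    """Delete the given IDs from the vector store, skipping the call if there are none."""
    ids = list(ids)
    if not ids:
        return None  # nothing to delete
    return vector_store.delete(ids=ids)


# Example usage, where ids are the IDs the documents were added under:
# delete_by_ids(vector_store, ids[-1:])
```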
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it while running your chain or agent.
Query directly
A simple similarity search can be run directly against the vector store. To return additional columns with the results, specify them with the columns parameter when initializing the vector store.
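A direct query might be sketched as below. The helper search and the source metadata column are assumptions for illustration; vector_store is assumed to be a DatabricksVectorSearch instance, and the filter only makes sense if the index actually has a source column.

```python
def search(vector_store, query, k=1, source=None):
    """Run a similarity search, optionally restricted to one value of the source column."""
    filter_ = {"source": source} if source is not None else None
    results = vector_store.similarity_search(query=query, k=k, filter=filter_)
    # Each result is a LangChain Document; return its text and metadata.
    return [(doc.page_content, doc.metadata) for doc in results]


# Example usage:
# for text, metadata in search(vector_store, "thud", k=1, source="https://example.com"):
#     print(f"* {text} [{metadata}]")
```

For a column such as source to appear in the returned metadata, it would need to be listed in the columns parameter when the vector store is initialized.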