Python agent under the hood.
Bodo DataFrames is a high-performance DataFrame library that can automatically accelerate and scale
Pandas code with a simple import change (see examples below). Because of its strong Pandas compatibility, Bodo DataFrames
enables LLMs, which are typically good at generating Pandas code, to answer questions about larger
datasets more efficiently and to scale the generated code beyond the limitations of Pandas.
NOTE: The Python agent executes LLM-generated Python code, which can be dangerous if the generated code is harmful. Use it cautiously.
Setup
Before running the examples, copy the Titanic dataset and save it locally as titanic.csv.
Installing langchain-bodo with pip (pip install langchain-bodo) also installs its dependencies, Bodo and Pandas.
Credentials
Bodo DataFrames is free and does not require additional credentials. The examples use OpenAI models; if it is not already configured, set your OPENAI_API_KEY.
Creating and invoking agents
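Before creating an agent, the OPENAI_API_KEY mentioned under Credentials can be set from Python. The following is a minimal sketch using only the standard library; the helper name ensure_openai_key is hypothetical and not part of langchain-bodo.

```python
import getpass
import os


def ensure_openai_key(prompt=getpass.getpass):
    """Prompt for OPENAI_API_KEY only if it is not already configured."""
    if not os.environ.get("OPENAI_API_KEY"):
        # getpass hides the key while it is typed.
        os.environ["OPENAI_API_KEY"] = prompt("OpenAI API key: ")
```

Calling ensure_openai_key() once at the top of a notebook or script leaves an existing key untouched.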
The following examples are borrowed from the Pandas DataFrames agent notebook with some modifications to highlight key differences. This first example shows how you can directly pass a Bodo DataFrame to create_bodo_dataframes_agent and ask a simple question.
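A minimal sketch of this first example follows. It assumes that create_bodo_dataframes_agent is importable from langchain_bodo, that it mirrors the Pandas DataFrame agent's keyword arguments (verbose, allow_dangerous_code), and that langchain-openai is installed. The helper name build_titanic_agent is hypothetical.

```python
def build_titanic_agent(csv_path="titanic.csv"):
    """Create a Bodo DataFrames agent over the Titanic CSV (sketch)."""
    # Third-party imports live inside the function so that defining the
    # helper does not require the packages or an API key.
    import bodo.pandas as pd  # Bodo DataFrames in place of `import pandas as pd`
    from langchain_bodo import create_bodo_dataframes_agent  # assumed import path
    from langchain_openai import OpenAI

    df = pd.read_csv(csv_path)  # lazily evaluated Bodo DataFrame
    return create_bodo_dataframes_agent(
        OpenAI(temperature=0),
        df,
        verbose=True,
        allow_dangerous_code=True,  # assumed opt-in flag, as in the Pandas agent
    )


# Usage (needs OPENAI_API_KEY and a local titanic.csv):
# agent = build_titanic_agent()
# agent.invoke("how many rows are there?")
```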
Using ZERO_SHOT_REACT_DESCRIPTION
This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type.
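Selecting an agent type might look like the sketch below. It assumes the agent accepts an agent_type keyword taking LangChain's AgentType enum, as the Pandas DataFrame agent does, and that AgentType is importable from langchain.agents.agent_types (the path may differ between LangChain versions). The helper build_agent and the KNOWN_AGENT_TYPES tuple are hypothetical.

```python
KNOWN_AGENT_TYPES = ("ZERO_SHOT_REACT_DESCRIPTION", "OPENAI_FUNCTIONS")


def build_agent(df, llm, agent_type_name="ZERO_SHOT_REACT_DESCRIPTION"):
    """Create an agent for one of the agent types named in this guide (sketch)."""
    if agent_type_name not in KNOWN_AGENT_TYPES:
        raise ValueError(f"unsupported agent type: {agent_type_name!r}")

    # Imported lazily; both import paths are assumptions.
    from langchain.agents.agent_types import AgentType
    from langchain_bodo import create_bodo_dataframes_agent

    return create_bodo_dataframes_agent(
        llm,
        df,
        agent_type=AgentType[agent_type_name],  # look up the enum member by name
        verbose=True,
        allow_dangerous_code=True,  # assumed opt-in flag
    )


# Usage (df is a Bodo DataFrame, e.g. from bodo.pandas.read_csv):
# from langchain_openai import OpenAI
# agent = build_agent(df, OpenAI(temperature=0))
# agent.invoke("how many rows are there?")
```

The OPENAI_FUNCTIONS type would typically be paired with a chat model such as ChatOpenAI rather than a completion model.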
Using OpenAI Functions
This shows how to initialize the agent using the OPENAI_FUNCTIONS agent type. Note that this is an alternative to the above.
Creating and invoking agents with Bodo DataFrames and preprocessing
This example shows a slightly more complex use case of passing a Bodo DataFrame to create_bodo_dataframes_agent
with some additional preprocessing.
Since Bodo DataFrames are lazily evaluated, you can potentially save on computation if not all columns
are needed to answer the question. Note that the DataFrame(s) passed to the agent can also be
larger than the available memory.
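The preprocessing described above can be sketched as follows. The column names come from the Titanic dataset; the helper name, the QUESTION_COLUMNS list, and the keyword arguments passed to the agent are assumptions for illustration.

```python
QUESTION_COLUMNS = ["Age", "Pclass", "Survived", "Fare"]


def build_preprocessed_agent(llm, csv_path="titanic.csv"):
    """Narrow and filter a Bodo DataFrame before handing it to the agent (sketch)."""
    import bodo.pandas as pd
    from langchain_bodo import create_bodo_dataframes_agent  # assumed import path

    df = pd.read_csv(csv_path)
    # Keep only the columns the question needs; with lazy evaluation this can
    # potentially save computation on the unused columns.
    subset = df[QUESTION_COLUMNS]
    # Drop rows with a missing age.
    subset = subset[subset["Age"].notna()]
    return create_bodo_dataframes_agent(
        llm,
        subset,
        verbose=True,
        allow_dangerous_code=True,  # assumed opt-in flag
    )


# Usage:
# from langchain_openai import OpenAI
# agent = build_preprocessed_agent(OpenAI(temperature=0))
# agent.invoke("What is the average fare of the passengers who survived?")
```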
Multi DataFrame Example
You can also pass multiple DataFrames to the agent. Note that while Bodo DataFrames supports most common compute-intensive operations in Pandas, if the agent generates code that is not currently supported (see warnings below), the DataFrames will be converted back to Pandas to prevent errors. Refer to the Bodo DataFrames API documentation for more details about the currently supported features.
Optimizing agent invocation with number_of_head_rows
By default, the head of each DataFrame is embedded into the prompt as a markdown table.
Since Bodo DataFrames are lazily evaluated, this head operation can be optimized, but it can
still be slow in some cases. As an optimization, you can set number_of_head_rows to 0
so that no evaluation occurs during prompting.
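A minimal sketch of this optimization, assuming number_of_head_rows is passed as a keyword argument to create_bodo_dataframes_agent. The helpers agent_options and build_agent_without_head are hypothetical.

```python
def agent_options(head_rows=0):
    """Keyword arguments for the agent; head_rows=0 embeds no rows in the prompt."""
    return {
        "number_of_head_rows": head_rows,
        "verbose": True,
        "allow_dangerous_code": True,  # assumed opt-in flag
    }


def build_agent_without_head(df, llm):
    """Create an agent whose prompt omits the DataFrame head (sketch)."""
    from langchain_bodo import create_bodo_dataframes_agent  # assumed import path

    return create_bodo_dataframes_agent(llm, df, **agent_options(0))
```

The trade-off is that the LLM then sees no sample rows, so answer quality may depend more on what the rest of the prompt conveys about the data.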
Passing Pandas DataFrames
You can also pass one or more Pandas DataFrames to create_bodo_dataframes_agent. The DataFrame(s) will
be converted to Bodo before being passed to the agent.
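Passing plain Pandas DataFrames might look like the sketch below, which also covers the multi-DataFrame case by passing a list. It assumes the agent accepts either a single DataFrame or a list of them, as the Pandas DataFrame agent does. The helper name is hypothetical.

```python
def build_agent_from_pandas(csv_paths, llm):
    """Load CSVs with plain Pandas and hand them to the Bodo agent (sketch)."""
    if not csv_paths:
        raise ValueError("at least one CSV path is required")

    import pandas as pd  # plain Pandas, not bodo.pandas
    from langchain_bodo import create_bodo_dataframes_agent  # assumed import path

    frames = [pd.read_csv(path) for path in csv_paths]
    # Pass a single DataFrame directly, or a list when there are several.
    target = frames[0] if len(frames) == 1 else frames
    return create_bodo_dataframes_agent(
        llm,
        target,
        verbose=True,
        allow_dangerous_code=True,  # assumed opt-in flag
    )


# Usage:
# from langchain_openai import OpenAI
# agent = build_agent_from_pandas(["titanic.csv"], OpenAI(temperature=0))
# agent.invoke("how many rows are there?")
```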