Pandas Dataframe Agent

This document shows how to use an agent to interact with a pandas dataframe. The agent is intended mainly for question answering.

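Note: this agent calls the Python agent under the hood, which executes Python code generated by the language model. This can be dangerous if the generated code is harmful. Use it with caution.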
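
One way to reduce (not eliminate) this risk is to inspect generated code before it runs. The following is a minimal sketch using only the standard-library `ast` module. The deny-lists and the `find_risky_nodes` helper are hypothetical names chosen for illustration; they are not part of LangChain, and a deny-list like this is easy to bypass, so it should be treated as a screen rather than a sandbox.

```python
import ast

# Hypothetical deny-lists for illustration only; not part of LangChain.
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}


def find_risky_nodes(source):
    """Return short descriptions of obviously risky constructs in `source`."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os` or `import os.path`
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # `from subprocess import run`
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                findings.append(f"import from {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls such as `eval(...)` or `open(...)`
            if node.func.id in BLOCKED_CALLS:
                findings.append(f"call to {node.func.id}()")
    return findings


# A typical query the agent might generate passes the screen...
safe = find_risky_nodes("df[df['SibSp'] > 3].shape[0]")
# ...while code that imports a blocked module is flagged.
risky = find_risky_nodes("import os\nos.remove('titanic.csv')")
print(safe, risky)
```

Running the agent in an isolated environment with no credentials and no write access to important files remains the more robust precaution.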

from langchain.agents import create_pandas_dataframe_agent
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_types import AgentType
from langchain.llms import OpenAI
import pandas as pd

df = pd.read_csv('titanic.csv')

Using ZERO_SHOT_REACT_DESCRIPTION

This shows how to initialize the agent with the ZERO_SHOT_REACT_DESCRIPTION agent type. It is the default, so no agent_type argument needs to be passed.

agent = create_pandas_dataframe_agent(OpenAI(temperature=0), df, verbose=True)

Using OpenAI Functions

This shows how to initialize the agent with the OPENAI_FUNCTIONS agent type. Note that this is an alternative to the approach above.

agent = create_pandas_dataframe_agent(
    ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613"), 
    df, 
    verbose=True, 
    agent_type=AgentType.OPENAI_FUNCTIONS
)
agent.run("how many rows are there?")
Entering new  chain...

Invoking: `python_repl_ast` with `df.shape[0]`


891
There are 891 rows in the dataframe.

Finished chain.

'There are 891 rows in the dataframe.'
agent.run("how many people have more than 3 siblings")
Entering new AgentExecutor chain...
Thought: I need to count the number of people with more than 3 siblings
Action: python_repl_ast
Action Input: df[df['SibSp'] > 3].shape[0]
Observation: 30
Thought: I now know the final answer
Final Answer: 30 people have more than 3 siblings.

Finished chain.

'30 people have more than 3 siblings.'
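
The agent's answers so far can be cross-checked with plain pandas. The sketch below uses a small made-up frame (the `sample` name and its values are assumptions for illustration, not rows from titanic.csv) and mirrors the two expressions the agent generated. Note that in the Titanic dataset the SibSp column counts siblings and spouses aboard, so "more than 3 siblings" is only an approximate reading of `SibSp > 3`.

```python
import pandas as pd

# Tiny stand-in for titanic.csv; the values are made up for illustration.
sample = pd.DataFrame(
    {
        "Name": ["A", "B", "C", "D", "E"],
        "SibSp": [0, 1, 4, 5, 3],
    }
)

# Row count: len(sample) gives the same value as sample.shape[0].
row_count = len(sample)

# Rows where SibSp > 3: summing a boolean mask counts its True entries,
# which matches sample[sample["SibSp"] > 3].shape[0].
more_than_three = int((sample["SibSp"] > 3).sum())

print(row_count, more_than_three)
```

Replacing `sample` with the real `df` should reproduce the figures reported in the traces.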
agent.run("whats the square root of the average age?")
Entering new AgentExecutor chain...
Thought: I need to calculate the average age first
Action: python_repl_ast
Action Input: df['Age'].mean()
Observation: 29.69911764705882
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: NameError("name 'math' is not defined")
Thought: I need to import the math library
Action: python_repl_ast
Action Input: import math
Observation: 
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: 5.449689683556195
Thought: I now know the final answer
Final Answer: The square root of the average age is 5.449689683556195.

Finished chain.

'The square root of the average age is 5.449689683556195.'
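
The trace above shows the agent spending an extra step because `math` was not imported in its REPL. As a small sketch of the same calculation done by hand, the snippet below starts from the mean age reported in the agent's first observation and computes the root two ways; the variable names are illustrative.

```python
import math

# Mean age as reported in the agent's first observation above.
mean_age = 29.69911764705882

# Same call the agent eventually ran, after importing math.
root_via_math = math.sqrt(mean_age)

# The power operator needs no import, so it avoids the NameError entirely.
root_via_power = mean_age ** 0.5

print(root_via_math, root_via_power)
```

With the real dataframe, `df['Age'].mean() ** 0.5` would give the same result in a single expression.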

Multi DataFrame Example

The next part shows how the agent can interact with multiple dataframes passed in as a list.

df1 = df.copy()
df1["Age"] = df1["Age"].fillna(df1["Age"].mean())
agent = create_pandas_dataframe_agent(OpenAI(temperature=0), [df, df1], verbose=True)
agent.run("how many rows in the age column are different?")
Entering new AgentExecutor chain...
Thought: I need to compare the age columns in both dataframes
Action: python_repl_ast
Action Input: len(df1[df1['Age'] != df2['Age']])
Observation: 177
Thought: I now know the final answer
Final Answer: 177 rows in the age column are different.

Finished chain.

'177 rows in the age column are different.'
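
Two details of this trace are worth spelling out. First, the agent appears to refer to the list items as df1 and df2 in order, so its df1 is the original frame and its df2 is the filled copy. Second, the count follows from how missing values compare: NaN is not equal to any number, so every row whose Age was missing in the original and filled in the copy registers as different, while all other rows are unchanged. A minimal sketch on made-up data (the `original` and `filled` names are illustrative):

```python
import pandas as pd

# Made-up ages with two missing values (None becomes NaN in a float column).
original = pd.DataFrame({"Age": [22.0, None, 38.0, None]})

# Same transformation as in the example: fill missing ages with the mean.
filled = original.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())

# NaN != 30.0 evaluates to True, so only the filled rows count as different.
differing = int((original["Age"] != filled["Age"]).sum())

# The number of differing rows equals the number of missing ages.
missing = int(original["Age"].isna().sum())

print(differing, missing)
```

On the real data, the same reasoning implies that 177 is the number of missing Age values in the original dataframe.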
Contributors: 刘强