This article is a translation of Orchestrating Agents: Routines and Handoffs from the OpenAI Cookbook.
Ilan Bigio
Oct 10, 2024
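When working with language models, a good prompt and the right tools are often all you need for solid results. However, when you have to handle many distinct flows, things can get complicated. This cookbook walks through one way to tackle that complexity.
We'll introduce the notions of routines and handoffs, then walk through their implementation and show how they can be used to orchestrate multiple AI agents in a simple, powerful, and controllable way.
Finally, we provide a sample repo, Swarm, that implements these ideas along with examples.
Let's start by setting up our imports: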
from openai import OpenAI
from pydantic import BaseModel
from typing import Optional
import json
client = OpenAI()
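Routines
The notion of a "routine" is not strictly defined; it is meant to capture the idea of a set of steps. Concretely, we can define a routine as a list of instructions in natural language (which we represent with a system prompt), along with the tools needed to complete them.
Let's look at an example. The code below defines a routine for a customer service agent, instructing it to triage the user's issue and then either suggest a fix or offer a refund. We also define two helper functions, execute_refund and look_up_item. You could call this a customer service routine, agent, or assistant, but the core idea is the same: a set of steps and the tools to execute them.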
# Customer Service Routine
system_message = (
    # adjacent string literals are concatenated, so each piece ends
    # with a space or newline to keep the sentences separated
    "You are a customer support agent for ACME Inc. "
    "Always answer in a sentence or less. "
    "Follow the following routine with the user:\n"
    "1. First, ask probing questions and understand the user's problem deeper.\n"
    " - unless the user has already provided a reason.\n"
    "2. Propose a fix (make one up).\n"
    "3. ONLY if not satisfied, offer a refund.\n"
    "4. If accepted, search for the ID and then execute refund."
)

def look_up_item(search_query):
    """Use to find item ID.
    Search query can be a description or keywords."""
    # return hard-coded item ID - in reality would be a lookup
    return "item_132612938"

def execute_refund(item_id, reason="not provided"):
    print("Summary:", item_id, reason)  # lazy summary
    return "success"
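The main strength of routines is their simplicity and robustness. Notice that these instructions contain conditionals, much like a state machine or branching in code. LLMs can actually handle small and medium-sized routines quite effectively, and they can steer the conversation flexibly without getting stuck in dead ends.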
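To make the comparison with a state machine concrete, here is a rough sketch of the same routine written as explicit branches. The function `next_step` and its boolean flags are hypothetical names introduced only for illustration; in the actual routine the model infers all of this from the conversation rather than from flags.

```python
def next_step(has_reason: bool, fix_proposed: bool,
              satisfied: bool, refund_accepted: bool) -> str:
    """Return the routine step that applies to the given conversation state.

    Hypothetical helper: it mirrors steps 1-4 of the system prompt as code.
    """
    if not has_reason:
        return "ask probing questions"        # step 1
    if not fix_proposed:
        return "propose a fix"                # step 2
    if satisfied:
        return "done"                         # fix worked, no refund needed
    if not refund_accepted:
        return "offer a refund"               # step 3
    return "look up item and execute refund"  # step 4


print(next_step(has_reason=True, fix_proposed=True,
                satisfied=False, refund_accepted=False))
```

Written this way, every path has to be anticipated in advance, which is the rigidity the natural-language routine avoids.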
Executing Routines
To execute a routine, we can implement a simple loop:
- Get user input.
- Append the user message to messages.
- Call the model.
- Append the model response to messages.
def run_full_turn(system_message, messages):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": system_message}] + messages,
    )
    message = response.choices[0].message
    messages.append(message)

    if message.content:
        print("Assistant:", message.content)

    return message

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    run_full_turn(system_message, messages)
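As you can see, this implementation currently ignores function calls, so let's add that.
Models require functions to be passed as a function schema. For convenience, we can define a helper function that turns Python functions into the corresponding function schema.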
import inspect

def function_to_schema(func) -> dict:
    # map Python annotations to JSON schema type names
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
        list: "array",
        dict: "object",
        type(None): "null",
    }

    try:
        signature = inspect.signature(func)
    except ValueError as e:
        raise ValueError(
            f"Failed to get signature for function {func.__name__}: {str(e)}"
        )

    parameters = {}
    for param in signature.parameters.values():
        # unannotated or unmapped annotations fall back to "string"
        param_type = type_map.get(param.annotation, "string")
        parameters[param.name] = {"type": param_type}

    # parameters without a default value are required
    required = [
        param.name
        for param in signature.parameters.values()
        if param.default is inspect.Parameter.empty
    ]

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": (func.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": required,
            },
        },
    }
For example:
def sample_function(param_1, param_2, the_third_one: int, some_optional="John Doe"):
    """
    This is my docstring. Call this function when you want.
    """
    print("Hello, world")

schema = function_to_schema(sample_function)
print(json.dumps(schema, indent=2))
{
  "type": "function",
  "function": {
    "name": "sample_function",
    "description": "This is my docstring. Call this function when you want.",
    "parameters": {
      "type": "object",
      "properties": {
        "param_1": {
          "type": "string"
        },
        "param_2": {
          "type": "string"
        },
        "the_third_one": {
          "type": "integer"
        },
        "some_optional": {
          "type": "string"
        }
      },
      "required": [
        "param_1",
        "param_2",
        "the_third_one"
      ]
    }
  }
}
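One consequence of the mapping is worth seeing in isolation: any annotation that is not a key in the type map, including a missing annotation, falls back to "string". The sketch below reproduces only the parameter loop so that it runs on its own. `TYPE_MAP`, `param_types`, and `book_table` are hypothetical names used for illustration.

```python
import inspect
from typing import Optional

# same mapping as in function_to_schema, copied here so the sketch is standalone
TYPE_MAP = {
    str: "string", int: "integer", float: "number", bool: "boolean",
    list: "array", dict: "object", type(None): "null",
}


def param_types(func) -> dict:
    """Map each parameter name to a JSON schema type, defaulting to "string"."""
    return {
        p.name: TYPE_MAP.get(p.annotation, "string")
        for p in inspect.signature(func).parameters.values()
    }


def book_table(name, guests: int, outdoor: bool, budget: Optional[float] = None):
    """Hypothetical tool, defined only to show the mapping."""


print(param_types(book_table))
```

Here `name` (no annotation) and `budget` (`Optional[float]`) both come out as "string", so a tool that needs richer types would require extending the helper.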
Now we can use this function to pass the tools to the model when we call it.
messages = []
tools = [execute_refund, look_up_item]
tool_schemas = [function_to_schema(tool) for tool in tools]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Look up the black boot."}],
    tools=tool_schemas,
)
message = response.choices[0].message
message.tool_calls[0].function
Function(arguments='{"search_query":"black boot"}', name='look_up_item')
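Finally, when the model calls a tool, we need to execute the corresponding function and pass the result back to the model.
We can do this by mapping tool names to Python functions in a tools_map, then looking up and calling the function in execute_tool_call. Finally, we add the result to the conversation.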
tools_map = {tool.__name__: tool for tool in tools}

def execute_tool_call(tool_call, tools_map):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)

for tool_call in message.tool_calls:
    result = execute_tool_call(tool_call, tools_map)

    # add result back to conversation
    result_message = {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": result,
    }
    messages.append(result_message)
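The code above assumes the model always names a known tool, sends valid JSON, and that the tool returns a string. A more defensive variant might look like the sketch below. `safe_execute_tool_call` is a hypothetical helper that is not part of the original walkthrough, and `SimpleNamespace` stands in for the SDK's tool-call object so the example runs without an API call.

```python
import json
from types import SimpleNamespace


def safe_execute_tool_call(tool_call, tools_map) -> str:
    """Run a tool call and always return a string the model can read."""
    name = tool_call.function.name
    if name not in tools_map:
        return f"Error: unknown tool '{name}'"
    try:
        args = json.loads(tool_call.function.arguments or "{}")
        result = tools_map[name](**args)
    except (json.JSONDecodeError, TypeError) as e:
        # invalid JSON, or a TypeError such as arguments not matching the signature
        return f"Error calling {name}: {e}"
    # tool message content is sent as text, so coerce other return types
    return result if isinstance(result, str) else str(result)


def look_up_item(search_query):
    return "item_132612938"


# stand-in for an entry of message.tool_calls (an assumption for this sketch)
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="look_up_item", arguments='{"search_query": "black boot"}'
    ),
)
print(safe_execute_tool_call(fake_call, {"look_up_item": look_up_item}))
```

Returning the error as the tool result, rather than raising, gives the model a chance to correct itself on the next iteration.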
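In practice, we also want the model to use the result to produce another response. That response may itself contain tool calls, so we run this in a loop until there are no more tool calls.
Putting everything together, it looks something like this: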
tools = [execute_refund, look_up_item]

def run_full_turn(system_message, tools, messages):
    num_init_messages = len(messages)
    messages = messages.copy()

    while True:
        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in tools]
        tools_map = {tool.__name__: tool for tool in tools}

        # === 1. get openai completion ===
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": system_message}] + messages,
            tools=tool_schemas or None,
        )
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print assistant response
            print("Assistant:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===
        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools_map)

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            }
            messages.append(result_message)

    # ==== 3. return new messages =====
    return messages[num_init_messages:]

def execute_tool_call(tool_call, tools_map):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    new_messages = run_full_turn(system_message, tools, messages)
    messages.extend(new_messages)
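Because run_full_turn calls the API directly, it is awkward to exercise without a key. One possible approach, sketched below, is to pass the completion call in as a parameter so that a scripted stand-in can replace it. The `complete` parameter, `scripted_model`, and the `SimpleNamespace` message objects are all assumptions made for this illustration; they only mimic the `.content` and `.tool_calls` attributes that the loop reads.

```python
import json
from types import SimpleNamespace


def look_up_item(search_query):
    return "item_132612938"


def run_full_turn(tools, messages, complete):
    """Same tool loop as above, but `complete(messages)` replaces the API call."""
    num_init_messages = len(messages)
    messages = messages.copy()
    tools_map = {tool.__name__: tool for tool in tools}
    while True:
        message = complete(messages)
        messages.append(message)
        if not message.tool_calls:  # no tool calls left, the turn is over
            break
        for tool_call in message.tool_calls:
            args = json.loads(tool_call.function.arguments)
            result = tools_map[tool_call.function.name](**args)
            messages.append(
                {"role": "tool", "tool_call_id": tool_call.id, "content": result}
            )
    return messages[num_init_messages:]


def scripted_model(messages):
    """Hypothetical stand-in: request one tool call, then answer in text."""
    last = messages[-1]
    if isinstance(last, dict) and last["role"] == "tool":
        return SimpleNamespace(content=f"Found {last['content']}.", tool_calls=None)
    call = SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(
            name="look_up_item", arguments='{"search_query": "black boot"}'
        ),
    )
    return SimpleNamespace(content=None, tool_calls=[call])


history = [{"role": "user", "content": "Look up the black boot."}]
new_messages = run_full_turn([look_up_item], history, scripted_model)
print(len(new_messages), new_messages[-1].content)
```

The turn produces three new messages: the tool request, the tool result, and the final text reply, which is the same shape the real loop returns.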
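Now that we have a routine, suppose we want to add more steps and more tools. We can, up to a point, but eventually a single routine that tries to cover too many different tasks starts to struggle. This is where multiple routines come in: given a user request, we load the right routine, with the appropriate steps and tools, to address it.
Dynamically swapping system instructions and tools may seem daunting. However, if we view routines as agents, then the notion of a handoff lets us represent these swaps simply, as one agent handing a conversation off to another.
Handoffs
A handoff is an agent (or routine) passing the active conversation to another agent, much like being transferred to someone else on a phone call. Except in this case, the agents have complete knowledge of the prior conversation!
To see handoffs in action, let's start by defining a basic class for an Agent: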
class Agent(BaseModel):
    name: str = "Agent"
    model: str = "gpt-4o-mini"
    instructions: str = "You are a helpful Agent"
    tools: list = []
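If you would rather not depend on pydantic, a dataclass can play a similar role. This is an alternative sketch, not what the walkthrough uses; `DataclassAgent` is a hypothetical name. Note that a dataclass requires `field(default_factory=list)` for the mutable `tools` default.

```python
from dataclasses import dataclass, field


@dataclass
class DataclassAgent:
    """Dependency-free stand-in for the Agent class (illustrative only)."""
    name: str = "Agent"
    model: str = "gpt-4o-mini"
    instructions: str = "You are a helpful Agent"
    tools: list = field(default_factory=list)  # fresh list per instance


a = DataclassAgent()
b = DataclassAgent(name="Refund Agent")
a.tools.append(print)  # mutating one agent's tools leaves the other untouched
print(b.name, b.tools)
```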
Now, to support it, we need to modify run_full_turn so that it takes an Agent object instead of a separate system_message and tools:
def run_full_turn(agent, messages):
    num_init_messages = len(messages)
    messages = messages.copy()

    while True:
        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in agent.tools]
        tools_map = {tool.__name__: tool for tool in agent.tools}

        # === 1. get openai completion ===
        response = client.chat.completions.create(
            model=agent.model,
            messages=[{"role": "system", "content": agent.instructions}] + messages,
            tools=tool_schemas or None,
        )
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print assistant response
            print("Assistant:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===
        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools_map)

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            }
            messages.append(result_message)

    # ==== 3. return new messages =====
    return messages[num_init_messages:]

def execute_tool_call(tool_call, tools_map):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)
We can now run multiple agents easily:
def execute_refund(item_name):
    return "success"

refund_agent = Agent(
    name="Refund Agent",
    instructions="You are a refund agent. Help the user with refunds.",
    tools=[execute_refund],
)

def place_order(item_name):
    return "success"

sales_assistant = Agent(
    name="Sales Assistant",
    instructions="You are a sales assistant. Sell the user a product.",
    tools=[place_order],
)

messages = []
user_query = "Place an order for a black boot."
print("User:", user_query)
messages.append({"role": "user", "content": user_query})

response = run_full_turn(sales_assistant, messages)  # sales assistant
messages.extend(response)

user_query = "Actually, I want a refund."  # implicitly refers to the last item
print("User:", user_query)
messages.append({"role": "user", "content": user_query})
response = run_full_turn(refund_agent, messages)  # refund agent
User: Place an order for a black boot.
Assistant: place_order({'item_name': 'black boot'})
Assistant: Your order for a black boot has been successfully placed! If you need anything else, feel free to ask!
User: Actually, I want a refund.
Assistant: execute_refund({'item_name': 'black boot'})
Assistant: Your refund for the black boot has been successfully processed. If you need further assistance, just let me know!
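Great! But we did the handoff manually here. We want the agents themselves to decide when to hand off. A simple but surprisingly effective way to do this is to give them a transfer_to_XXX function, where XXX is some other agent. The model is smart enough to call this function when a handoff makes sense!
Handoff Functions
Now that an agent can express the intent to hand off, we need to make the handoff actually happen. There are many ways to do this, but one is particularly clean.
The agent functions we have defined so far, such as execute_refund or place_order, return a string that is passed back to the model. What if we instead return an Agent object to indicate which agent we want to transfer to? For example: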
refund_agent = Agent(
    name="Refund Agent",
    instructions="You are a refund agent. Help the user with refunds.",
    tools=[execute_refund],
)

def transfer_to_refunds():
    return refund_agent

sales_assistant = Agent(
    name="Sales Assistant",
    instructions="You are a sales assistant. Sell the user a product.",
    tools=[place_order],
)
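With many agents, writing one transfer function per agent by hand gets repetitive. One possible shortcut, sketched below, is a small factory that builds the function and sets the name and docstring that a schema helper would read. `make_transfer` is a hypothetical helper, and the dataclass `Agent` here is a minimal stand-in for the pydantic class so that the sketch runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:  # minimal stand-in for the Agent class defined earlier
    name: str = "Agent"
    instructions: str = "You are a helpful Agent"
    tools: list = field(default_factory=list)


def make_transfer(agent: Agent, description: str):
    """Build a zero-argument handoff function named transfer_to_<agent name>."""
    def transfer():
        return agent

    # the schema helper reads __name__ and __doc__, so set both explicitly
    transfer.__name__ = "transfer_to_" + agent.name.lower().replace(" ", "_")
    transfer.__doc__ = description
    return transfer


refund_agent = Agent(name="Refund Agent")
transfer_to_refund_agent = make_transfer(refund_agent, "Use for refunds.")

sales_assistant = Agent(name="Sales Assistant", tools=[transfer_to_refund_agent])
print(transfer_to_refund_agent.__name__)
```

Calling the generated function returns the target agent, which is exactly the signal the handoff mechanism in this section relies on.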
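We can then update our code to check the return type of a function response and, if it is an Agent, update the agent in use. In addition, run_full_turn now needs to return the latest agent in use, in case a handoff occurred. (We can do this with a Response class to keep things neat.)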
class Response(BaseModel):
    agent: Optional[Agent]
    messages: list
The updated run_full_turn looks like this:
def run_full_turn(agent, messages):
    current_agent = agent
    num_init_messages = len(messages)
    messages = messages.copy()

    while True:
        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in current_agent.tools]
        tools = {tool.__name__: tool for tool in current_agent.tools}

        # === 1. get openai completion ===
        response = client.chat.completions.create(
            model=current_agent.model,  # follow the active agent after a handoff
            messages=[{"role": "system", "content": current_agent.instructions}]
            + messages,
            tools=tool_schemas or None,
        )
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print agent response
            print(f"{current_agent.name}:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===
        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools, current_agent.name)

            if type(result) is Agent:  # if agent transfer, update current agent
                current_agent = result
                result = (
                    f"Transferred to {current_agent.name}. Adopt persona immediately."
                )

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            }
            messages.append(result_message)

    # ==== 3. return last agent used and new messages =====
    return Response(agent=current_agent, messages=messages[num_init_messages:])

def execute_tool_call(tool_call, tools, agent_name):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"{agent_name}:", f"{name}({args})")

    return tools[name](**args)  # call corresponding function with provided arguments
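The key step in the loop is the branch that inspects the tool result. Pulled out on its own, it might look like the sketch below. `resolve_result` is a hypothetical helper, and the dataclass `Agent` is a minimal stand-in so that the example runs without pydantic or an API call.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:  # minimal stand-in for the Agent class defined earlier
    name: str = "Agent"
    tools: list = field(default_factory=list)


def resolve_result(result, current_agent):
    """Return (agent to use next, string content for the tool message)."""
    if isinstance(result, Agent):
        # a returned Agent signals a handoff; the model only sees a short notice
        return result, f"Transferred to {result.name}. Adopt persona immediately."
    return current_agent, str(result)


sales = Agent(name="Sales Agent")
refunds = Agent(name="Refund Agent")

agent, content = resolve_result("success", sales)  # ordinary tool result
print(agent.name, "|", content)

agent, content = resolve_result(refunds, agent)    # handoff
print(agent.name, "|", content)
```

Keeping the Agent object out of the message list matters because the tool message content has to be text; only the loop's local state tracks which agent is active.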
Let's look at an example with more agents:
def escalate_to_human(summary):
    """Only call this if explicitly asked to."""
    print("Escalating to human agent...")
    print("\n=== Escalation Report ===")
    print(f"Summary: {summary}")
    print("=========================\n")
    exit()

def transfer_to_sales_agent():
    """Use for anything sales or buying related."""
    return sales_agent

def transfer_to_issues_and_repairs():
    """Use for issues, repairs, or refunds."""
    return issues_and_repairs_agent

def transfer_back_to_triage():
    """Call this if the user brings up a topic outside of your purview,
    including escalating to human."""
    return triage_agent

triage_agent = Agent(
    name="Triage Agent",
    instructions=(
        "You are a customer service bot for ACME Inc. "
        "Introduce yourself. Always be very brief. "
        "Gather information to direct the customer to the right department. "
        "But make your questions subtle and natural."
    ),
    tools=[transfer_to_sales_agent, transfer_to_issues_and_repairs, escalate_to_human],
)
def execute_order(product, price: int):
    """Price should be in USD."""
    print("\n\n=== Order Summary ===")
    print(f"Product: {product}")
    print(f"Price: ${price}")
    print("=================\n")
    confirm = input("Confirm order? y/n: ").strip().lower()
    if confirm == "y":
        print("Order execution successful!")
        return "Success"
    else:
        print("Order cancelled!")
        return "User cancelled order."

sales_agent = Agent(
    name="Sales Agent",
    instructions=(
        # each piece ends with a space or newline so the sentences stay separated
        "You are a sales agent for ACME Inc. "
        "Always answer in a sentence or less. "
        "Follow the following routine with the user:\n"
        "1. Ask them about any problems in their life related to catching roadrunners.\n"
        "2. Casually mention one of ACME's crazy made-up products can help.\n"
        " - Don't mention price.\n"
        "3. Once the user is bought in, drop a ridiculous price.\n"
        "4. Only after everything, and if the user says yes, "
        "tell them a crazy caveat and execute their order.\n"
    ),
    tools=[execute_order, transfer_back_to_triage],
)
def look_up_item(search_query):
    """Use to find item ID.
    Search query can be a description or keywords."""
    item_id = "item_132612938"
    print("Found item:", item_id)
    return item_id

def execute_refund(item_id, reason="not provided"):
    print("\n\n=== Refund Summary ===")
    print(f"Item ID: {item_id}")
    print(f"Reason: {reason}")
    print("=================\n")
    print("Refund execution successful!")
    return "success"

issues_and_repairs_agent = Agent(
    name="Issues and Repairs Agent",
    instructions=(
        # each piece ends with a space or newline so the sentences stay separated
        "You are a customer support agent for ACME Inc. "
        "Always answer in a sentence or less. "
        "Follow the following routine with the user:\n"
        "1. First, ask probing questions and understand the user's problem deeper.\n"
        " - unless the user has already provided a reason.\n"
        "2. Propose a fix (make one up).\n"
        "3. ONLY if not satisfied, offer a refund.\n"
        "4. If accepted, search for the ID and then execute refund."
    ),
    tools=[execute_refund, look_up_item, transfer_back_to_triage],
)
Finally, we can run this in a loop (this will not run in Python notebooks, so try it in a separate Python file):
agent = triage_agent
messages = []

while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    response = run_full_turn(agent, messages)
    agent = response.agent
    messages.extend(response.messages)
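The loop above runs until the process is interrupted. A variant that ends cleanly, and that can be driven by scripted input, might look like the sketch below. `chat_loop`, its `read` parameter, the "quit" keyword, and `echo_turn` are all hypothetical names chosen for this illustration; `echo_turn` merely imitates the shape of the Response object (an `agent` and a list of new `messages`).

```python
from types import SimpleNamespace


def chat_loop(agent, run_turn, read=input):
    """Run turns until the user types 'quit' or the input stream ends."""
    messages = []
    while True:
        try:
            user = read("User: ")
        except EOFError:
            break
        if user.strip().lower() == "quit":
            break
        messages.append({"role": "user", "content": user})
        response = run_turn(agent, messages)
        agent = response.agent  # keep whichever agent ended the turn
        messages.extend(response.messages)
    return agent, messages


def echo_turn(agent, messages):
    """Stand-in for run_full_turn: echoes the last user message."""
    reply = {"role": "assistant", "content": "echo: " + messages[-1]["content"]}
    return SimpleNamespace(agent=agent, messages=[reply])


inputs = iter(["hello", "quit"])
final_agent, history = chat_loop("triage", echo_turn, read=lambda prompt: next(inputs))
print(final_agent, len(history))
```

In real use you would pass `triage_agent` and `run_full_turn` in place of the stand-ins and leave `read` at its default.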
Swarm
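As a proof of concept, we've packaged these ideas into a sample library called Swarm. It is meant as an example only and should not be used directly in production. However, feel free to take the ideas and code to build your own system!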