Self-debugging in CodeExecutionAgent
#6207
Comments
Hi @ekzhu, below is a proposed approach, along with a few questions I had while thinking about it:
model_result: CreateResult | None = None
execution_result: CodeExecutionEvent | None = None
for attempt in range(max_code_retries):
    # Ask the model to (re)generate code, streaming chunks through to the caller.
    async for inference_output in self._call_llm(
        model_client=model_client,
        model_client_stream=model_client_stream,
        system_messages=system_messages,
        model_context=model_context,
        agent_name=agent_name,
        cancellation_token=cancellation_token,
    ):
        if isinstance(inference_output, CreateResult):
            model_result = inference_output
        else:
            # Streaming chunk event
            yield inference_output
    assert model_result is not None, "No model result was produced."
    ...
    # Wrap the generated response and any extracted code blocks in an event.
    inferred_text_message: CodeGenerationEvent = CodeGenerationEvent(
        content=str(model_result.content),
        code_blocks=self._extract_markdown_code_blocks(model_result.content),
        source=agent_name,
    )
    # Execute the generated code, if present; stop retrying once it succeeds.
    execution_result = await self.execute_code_block([inferred_text_message], cancellation_token)
    if execution_result.result.exit_code == 0:
        break
For simplicity, let's store all events in the model context, including both unsuccessful and successful ones.
Yield the full code and result for transparency. Users don't trust models at this point for most scenarios.
User intervention should happen outside of the agent; it should be part of a team. So after a certain number of unsuccessful attempts, the agent should just stop, reflect, and pass control to the next agent, which could be the user if the team is orchestrated that way. A sketch of that handoff is below.
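A minimal sketch of the handoff, under the assumption that once the retry budget is exhausted the agent yields a final Response summarizing the last failure; the wording and exact placement are illustrative, not the actual CodeExecutionAgent behavior:

from autogen_agentchat.base import Response
from autogen_agentchat.messages import TextMessage

# Illustrative only: runs after the retry loop above exits without success.
if execution_result is not None and execution_result.result.exit_code != 0:
    reflection = (
        f"Code execution still failed after {max_code_retries} attempts.\n"
        f"Last error:\n{execution_result.result.output}"
    )
    # Hand control back to the team (or the user) with a plain-text summary.
    yield Response(chat_message=TextMessage(content=reflection, source=agent_name))
    return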
I think a simple loop with a counter may not address all cases. At the end of each iteration, it should use the model to determine whether the code error can be fixed; if not, exit the loop and return a final response to the caller. We can use structured output for this so the model's verdict can drive the loop condition. If the error looks fixable, it attempts another try. If the code execution succeeded, then apply a final reflection. It still needs a maximum retry count to keep the model from going off the rails. Let's experiment with this using a few examples, especially data science and data analytics scenarios.
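A hedged sketch of that structured-output decision step, assuming a Pydantic model for the verdict; the RetryDecision type, its fields, and the parsing helper are illustrative, not part of the existing codebase:

from pydantic import BaseModel, ValidationError


class RetryDecision(BaseModel):
    """Verdict the model produces after a failed code execution."""

    can_be_fixed: bool  # True if regenerating the code is likely to fix the error
    reason: str  # short explanation, reusable in the final reflection


def parse_retry_decision(raw: str) -> RetryDecision:
    """Parse the model's JSON output into a RetryDecision.

    Falls back to "do not retry" when the output is not valid JSON, so the
    loop condition always has a well-defined value to work with.
    """
    try:
        return RetryDecision.model_validate_json(raw)
    except ValidationError:
        return RetryDecision(can_be_fixed=False, reason=f"Unparseable decision: {raw!r}")

The retry loop would then exit when execution succeeds, when can_be_fixed is false, or when the maximum retry count is reached, followed by one final reflection call. If the model client supports structured output (for example, passing a Pydantic model via json_output in recent autogen-ext OpenAI clients), the verdict can be requested directly in this shape.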
Issue body
Follow up to #6098: add an auto-debugging loop to CodeExecutionAgent so it automatically tries to regenerate code when there is an error.