A behavior tree is a hierarchical control structure that determines which actions an agent should take based on conditions and priorities. Unlike static task lists, behavior trees provide reactive decision-making with built-in fallback strategies.
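The fallback behavior described above comes from two composite node types. What follows is a minimal sketch, not a reference implementation: the class names (`Status`, `Action`, `Sequence`, `Selector`), the `tick()` method, and the `step` helper are all assumptions chosen for illustration.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"


class Action:
    """Leaf node wrapping a function that returns True on success."""

    def __init__(self, name: str, fn: Callable[[], bool]):
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return Status.SUCCESS if self.fn() else Status.FAILURE


class Sequence:
    """Runs children in order; fails as soon as one child fails."""

    def __init__(self, name: str, children: List):
        self.name = name
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Selector:
    """Tries children in priority order; succeeds as soon as one succeeds."""

    def __init__(self, name: str, children: List):
        self.name = name
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


attempted: List[str] = []


def step(label: str, ok: bool) -> Callable[[], bool]:
    """Hypothetical stand-in for real work: records the attempt, returns `ok`."""
    def run() -> bool:
        attempted.append(label)
        return ok
    return run


# The quick fix fails, so the selector falls back to the next child.
tree = Selector("Handle Test Failure", [
    Action("Try quick fix", step("quick fix", False)),
    Action("Try deeper analysis", step("deeper analysis", True)),
    Action("Escalate to human", step("escalate", True)),
])
result = tree.tick()
```

Because the selector stops at the first child that succeeds, the escalation step is never attempted in this run.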
Symbol: ? or →

Selector: Handle Test Failure
→ Try quick fix (syntax error, typo)
→ Try deeper analysis (logic error)
→ Add debug logging and report
→ Escalate to human

Symbol: → or ⇒

Sequence: Implement Feature
→ Understand requirements
→ Check existing codebase
→ Write implementation
→ Run tests
→ Verify tests pass

Selector: Choose Task Type
→ Sequence: Bug Fix Task
    • Is this a bug fix request?
    • Execute bug fix subtree
→ Sequence: New Feature Task
    • Is this a feature request?
    • Execute feature subtree
→ Sequence: Refactoring Task
    • Is this refactoring?
    • Execute refactor subtree
→ Sequence: Analysis Task
    • Is this analysis/explanation?
    • Execute analysis subtree
→ Action: Request clarification

Sequence: Fix Bug
→ Action: Read error message/description
→ Action: Locate relevant code
→ Selector: Identify Bug Type
    • Sequence: Syntax Error
        - Is syntax error?
        - Fix syntax
        - RETURN SUCCESS
    • Sequence: Import Error
        - Is import/dependency issue?
        - Install/fix import
        - RETURN SUCCESS
    • Sequence: Logic Error
        - Analyze logic
        - Identify fix
        - RETURN SUCCESS
    • Action: Complex - needs investigation subtree
→ Action: Apply fix
→ Action: Run tests
→ Selector: Handle Test Results
    • Condition: Tests pass? → SUCCESS
    • Execute debug subtree → Continue

Sequence: Implement Feature
→ Condition: Requirements clear?
    • If false → Action: Ask questions, then restart
→ Action: Read related existing code
→ Action: Plan implementation approach
→ Selector: Choose Implementation Strategy
    • Sequence: Extend Existing
        - Can extend existing code?
        - Extend and modify
    • Sequence: New Module
        - Create new file/module
        - Implement feature
    • Sequence: Hybrid
        - Create new + modify existing
→ Action: Write code
→ Action: Run tests
→ Selector: Verify Quality
    • Sequence: Tests Pass
        - Tests passing?
        - Meets requirements?
        - SUCCESS
    • Execute debug/fix subtree
→ Action: Format and return

Selector: Debug Failed Tests
→ Sequence: Quick Fixes
    • Identify obvious errors (syntax, typos)
    • Apply fixes
    • Re-run tests
    • Tests pass? → SUCCESS
→ Sequence: Systematic Debug
    • Add logging/print statements
    • Run tests again
    • Analyze output
    • Identify root cause
    • Apply fix
    • Re-run tests
    • Tests pass? → SUCCESS
→ Sequence: Deeper Investigation
    • Check test expectations vs actual
    • Trace execution flow
    • Identify edge cases
    • Implement fix
    • Tests pass? → SUCCESS
→ Action: Report unable to fix automatically (with analysis)

Tick-based (game AI style):
Event-based (better for coding agents):
Nodes can be:
Example: A "Write 500 lines of code" action might be stateful: it returns RUNNING while the LLM is generating, then SUCCESS when generation is complete.
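A stateful action like this can be sketched as a node that keeps its progress between ticks. The example below is illustrative only: `StatefulGenerateAction` and its `chunks` iterator are hypothetical stand-ins for a streaming LLM response, not a real client API.

```python
from enum import Enum
from typing import Iterator, List


class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    RUNNING = "RUNNING"


class StatefulGenerateAction:
    """Action that remembers partial output across ticks.

    `chunks` is a hypothetical placeholder for streamed LLM output.
    """

    def __init__(self, chunks: Iterator[str]):
        self._chunks = chunks
        self.output: List[str] = []

    def tick(self) -> Status:
        try:
            # Consume one chunk per tick and keep it as node state.
            self.output.append(next(self._chunks))
        except StopIteration:
            # Nothing left to generate: the action is complete.
            return Status.SUCCESS
        return Status.RUNNING


action = StatefulGenerateAction(iter(["def add(a, b):", "    return a + b"]))
history = []
while True:
    status = action.tick()
    history.append(status)
    if status is not Status.RUNNING:
        break
```

The caller keeps ticking while the node reports RUNNING, which lets the rest of the tree stay responsive during long generations.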
Each node returns:
Log every node evaluation with:
{
  "timestamp": "2025-11-25T14:32:01Z",
  "node_type": "Selector",
  "node_name": "Debug Failed Tests",
  "children_tried": [
    {
      "name": "Quick Fixes",
      "result": "FAILURE",
      "reason": "No obvious syntax errors found"
    },
    {
      "name": "Systematic Debug",
      "result": "SUCCESS",
      "details": "Added logging revealed null pointer in line 42"
    }
  ],
  "final_result": "SUCCESS",
  "path_taken": ["Debug Failed Tests", "Systematic Debug", "Add logging", "Analyze output"]
}

When something goes wrong, trace backwards:
Question: "Why did the agent delete my config file?"
Trace:
Root cause: Reference checker didn't scan configuration directories.
Because every log record names the node, its result, and the reason, this is far easier than reconstructing a decision from free-form LLM chain-of-thought output.
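Tracing backwards over such logs can be automated with a few lines of code. The sketch below assumes records shaped like the JSON log entry shown earlier. The first record is copied from that entry, the second is a hypothetical extra record, and the function names `trace_back` and `explain` are made up for this example.

```python
from typing import Dict, List

RECORDS: List[Dict] = [
    {   # Copied from the log format shown earlier.
        "timestamp": "2025-11-25T14:32:01Z",
        "node_type": "Selector",
        "node_name": "Debug Failed Tests",
        "children_tried": [
            {"name": "Quick Fixes", "result": "FAILURE",
             "reason": "No obvious syntax errors found"},
            {"name": "Systematic Debug", "result": "SUCCESS",
             "details": "Added logging revealed null pointer in line 42"},
        ],
        "final_result": "SUCCESS",
        "path_taken": ["Debug Failed Tests", "Systematic Debug",
                       "Add logging", "Analyze output"],
    },
    {   # Hypothetical earlier record, present so the filter has something to skip.
        "timestamp": "2025-11-25T14:30:00Z",
        "node_type": "Sequence",
        "node_name": "Fix Bug",
        "children_tried": [],
        "final_result": "FAILURE",
        "path_taken": ["Fix Bug", "Run tests"],
    },
]


def trace_back(records: List[Dict], node_name: str) -> List[Dict]:
    """Return the records whose path touched `node_name`, newest first."""
    hits = [r for r in records if node_name in r["path_taken"]]
    # ISO 8601 timestamps in one format and zone sort correctly as strings.
    return sorted(hits, key=lambda r: r["timestamp"], reverse=True)


def explain(record: Dict) -> List[str]:
    """Summarise one node evaluation, child by child."""
    lines = [f'{record["node_type"]} "{record["node_name"]}" -> {record["final_result"]}']
    for child in record["children_tried"]:
        # Failed children carry "reason"; successful ones carry "details".
        note = child.get("reason") or child.get("details") or "no note recorded"
        lines.append(f'  {child["name"]}: {child["result"]} ({note})')
    return lines


hits = trace_back(RECORDS, "Systematic Debug")
report = explain(hits[0])
print("\n".join(report))
```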
Create visual tree diagrams showing:
Tools like Graphviz can generate these automatically from logs.
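As a rough illustration of generating a diagram from logs, the sketch below turns one log record into Graphviz DOT text using only string formatting. The record layout follows the log example earlier in this article; the function name `to_dot` and the color choices are assumptions.

```python
from typing import Dict

# Assumed color scheme for node results.
COLORS = {"SUCCESS": "green", "FAILURE": "red", "RUNNING": "orange"}

# Record trimmed from the log example earlier in the article.
RECORD: Dict = {
    "node_name": "Debug Failed Tests",
    "children_tried": [
        {"name": "Quick Fixes", "result": "FAILURE"},
        {"name": "Systematic Debug", "result": "SUCCESS"},
    ],
    "final_result": "SUCCESS",
}


def to_dot(record: Dict) -> str:
    """Render one evaluated node and the children it tried as a DOT digraph."""
    parent = record["node_name"]
    parent_color = COLORS.get(record["final_result"], "gray")
    lines = ["digraph behavior_tree {"]
    lines.append(f'  "{parent}" [shape=box, color={parent_color}];')
    for child in record["children_tried"]:
        name = child["name"]
        color = COLORS.get(child["result"], "gray")
        lines.append(f'  "{name}" [color={color}];')
        # Label each edge with the result that child returned.
        lines.append(f'  "{parent}" -> "{name}" [label="{child["result"]}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(RECORD)
print(dot_text)
```

Saving the output to a file such as `tree.dot` and running `dot -Tpng tree.dot -o tree.png` should render it, assuming Graphviz is installed.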
The LLM should be called at specific nodes:
The tree structure itself should be deterministic and not require LLM calls to navigate.
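One way to keep navigation deterministic is to confine LLM calls to dedicated leaf nodes while conditions and composites stay plain code. The sketch below assumes this split. `LLMAction`, the `LLM` callable type, and `fake_llm` are hypothetical and do not correspond to any real client library; in practice some conditions might also need a model call.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


class Condition:
    """Deterministic check on the shared context; never calls the LLM."""

    def __init__(self, check: Callable[[Dict], bool]):
        self.check = check

    def tick(self, ctx: Dict) -> bool:
        return self.check(ctx)


class LLMAction:
    """The only node type in this sketch that is allowed to call the LLM."""

    def __init__(self, llm: LLM, prompt_template: str, output_key: str):
        self.llm = llm
        self.prompt_template = prompt_template
        self.output_key = output_key

    def tick(self, ctx: Dict) -> bool:
        ctx[self.output_key] = self.llm(self.prompt_template.format(**ctx))
        return True


class Sequence:
    def __init__(self, children: List):
        self.children = children

    def tick(self, ctx: Dict) -> bool:
        # all() short-circuits, so later children are skipped after a failure.
        return all(child.tick(ctx) for child in self.children)


class Selector:
    def __init__(self, children: List):
        self.children = children

    def tick(self, ctx: Dict) -> bool:
        # any() short-circuits, so lower-priority children are skipped after a success.
        return any(child.tick(ctx) for child in self.children)


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call; records each prompt it receives."""
    calls.append(prompt)
    return "patched code"


tree = Selector([
    Sequence([Condition(lambda c: c["kind"] == "bug"),
              LLMAction(fake_llm, "Fix this bug: {request}", "result")]),
    Sequence([Condition(lambda c: c["kind"] == "feature"),
              LLMAction(fake_llm, "Implement: {request}", "result")]),
])
ctx = {"kind": "feature", "request": "add a --verbose flag"}
ok = tree.tick(ctx)
```

Routing the request through two branches costs exactly one LLM call here, because the branch selection itself is ordinary code.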
Use behavior trees when:
Stick with task lists when:
Consider a hybrid: a behavior tree for control flow and an LLM-generated task list for planning.
Sequence: Execute Task
→ Action: LLM generates task list
→ Selector: Execute Each Task
    • For each task in list:
        Sequence:
        - Action: Execute task
        - Selector: Handle outcome
            • Success → Continue
            • Failure → Execute retry subtree

This combines the best of both: LLM for high-level planning, behavior tree for execution and error handling.
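The hybrid arrangement can be sketched in plain code, with the planner and the task runner passed in as functions. Everything named here (`execute_plan`, `execute_with_retry`, `fake_planner`, `flaky_runner`, and the retry limit of two attempts) is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple


def execute_with_retry(task: str, run: Callable[[str], bool],
                       max_attempts: int = 2) -> bool:
    """Retry subtree: succeed on the first attempt that works."""
    for _ in range(max_attempts):
        if run(task):
            return True
    return False


def execute_plan(goal: str,
                 plan: Callable[[str], List[str]],
                 run: Callable[[str], bool]) -> Tuple[bool, List[str]]:
    """Sequence: get a task list from the planner, then execute each task."""
    completed: List[str] = []
    for task in plan(goal):
        if not execute_with_retry(task, run):
            # A task that still fails after retries fails the whole sequence.
            return False, completed
        completed.append(task)
    return True, completed


def fake_planner(goal: str) -> List[str]:
    """Stub standing in for an LLM call that returns a task list."""
    return ["write function", "write tests", "run tests"]


attempts: Dict[str, int] = {}


def flaky_runner(task: str) -> bool:
    """Stub executor: 'write tests' fails once, then succeeds on retry."""
    attempts[task] = attempts.get(task, 0) + 1
    return not (task == "write tests" and attempts[task] == 1)


ok, done = execute_plan("add slugify helper", fake_planner, flaky_runner)
```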
Selector: Attempt Task with Retries
→ Sequence: Try Once
    • Execute action
    • Success? → DONE
→ Sequence: Try with modifications
    • Modify parameters
    • Execute action
    • Success? → DONE
→ Sequence: Try alternative approach
    • Use different method
    • Execute action
    • Success? → DONE
→ Action: Report failure with context

Sequence: Safe File Operation
→ Condition: File exists?
→ Condition: Have write permissions?
→ Condition: Not a system file?
→ Action: Backup file
→ Action: Modify file
→ Selector: Verify
    • Tests pass? → SUCCESS
    • Action: Restore backup → FAILURE

Selector: Implement with Quality Levels
→ Sequence: Production Quality
    • Implement with tests
    • Add error handling
    • Add documentation
    • All checks pass? → SUCCESS
→ Sequence: Functional Quality
    • Basic implementation
    • Basic tests
    • Works? → SUCCESS
→ Sequence: Prototype Quality
    • Minimal implementation
    • Manual verification
    • SUCCESS

Similar: Both try alternatives until one succeeds
Different:
Behavior tree advantages:
State machine advantages:
Behavior trees: Priority-based (try in order)
Utility AI: Score-based (pick highest scoring action)
Behavior trees are simpler and more debuggable. Utility AI is better for nuanced decision-making with many factors.
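The difference between the two selection rules can be shown side by side. This is a sketch with made-up options and scoring functions: the option names, the context keys, and the weights are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, float]
# (name, applicability check, scoring function)
Option = Tuple[str, Callable[[Context], bool], Callable[[Context], float]]


def pick_by_priority(options: List[Option], ctx: Context) -> Optional[str]:
    """Behavior-tree style: the first applicable option in list order wins."""
    for name, applicable, _score in options:
        if applicable(ctx):
            return name
    return None


def pick_by_utility(options: List[Option], ctx: Context) -> Optional[str]:
    """Utility-AI style: the applicable option with the highest score wins."""
    candidates = [(score(ctx), name)
                  for name, applicable, score in options if applicable(ctx)]
    return max(candidates)[1] if candidates else None


# Hypothetical options and scoring weights.
OPTIONS: List[Option] = [
    ("quick fix", lambda c: True, lambda c: 1.0 / c["files_touched"]),
    ("systematic debug", lambda c: True, lambda c: 0.2 * c["failing_tests"]),
]

ctx: Context = {"failing_tests": 3, "files_touched": 12}
by_priority = pick_by_priority(OPTIONS, ctx)
by_utility = pick_by_utility(OPTIONS, ctx)
```

With this context the priority rule picks "quick fix" because it is listed first, while the utility rule picks "systematic debug" because it scores higher. The priority result can be read straight off the list order, which is what makes it easier to debug.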
The key is to start simple and only add complexity when the pain of ad-hoc if-statements becomes greater than the pain of maintaining a tree structure.