Code-Then-Execute Pattern
Problem
A plan expressed as a flat list of steps is opaque to analysis; we want full data-flow analysis and taint tracking over what the agent intends to do.
Solution
Have the LLM output a sandboxed program or DSL script:
- LLM writes code that calls tools and untrusted-data processors.
- Static checker/taint engine verifies flows (e.g., no tainted variable reaches send_email.recipient).
- Interpreter runs the code in a locked sandbox.
x = calendar.read(today)
y = QuarantineLLM.format(x)
email.write(to="john@acme.com", body=y)
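The static check described above can be sketched with Python's standard `ast` module. This is a minimal illustration, not the CaMeL implementation: it assumes the DSL happens to parse as Python, handles straight-line programs only (no branches or loops), and uses a hypothetical policy in which `calendar.read` is the only taint source and the `to` argument of `email.write` is the only protected sink.

```python
import ast

# Hypothetical policy (an assumption for this sketch, not from the source).
TAINT_SOURCES = {"calendar.read"}          # calls that return untrusted data
PROTECTED_SINKS = {("email.write", "to")}  # (call, keyword) pairs that must stay clean


def call_name(node: ast.Call) -> str:
    """Return the dotted name of a call, e.g. 'email.write'."""
    return ast.unparse(node.func)


def names_in(node: ast.AST) -> set:
    """Collect every variable name that appears inside an expression."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def check(program: str) -> list:
    """Return a list of policy violations for a straight-line program."""
    tainted = set()
    violations = []
    for stmt in ast.parse(program).body:
        # 1. Reject any tainted variable used in a protected sink argument.
        for call in (n for n in ast.walk(stmt) if isinstance(n, ast.Call)):
            for kw in call.keywords:
                if (call_name(call), kw.arg) in PROTECTED_SINKS:
                    bad = names_in(kw.value) & tainted
                    if bad:
                        violations.append(
                            f"line {call.lineno}: tainted {sorted(bad)} "
                            f"flows into {call_name(call)}.{kw.arg}"
                        )
        # 2. Propagate taint through assignments.
        if isinstance(stmt, ast.Assign):
            from_source = any(
                isinstance(n, ast.Call) and call_name(n) in TAINT_SOURCES
                for n in ast.walk(stmt.value)
            )
            is_tainted = from_source or bool(names_in(stmt.value) & tainted)
            for target in stmt.targets:
                for name in names_in(target):
                    if is_tainted:
                        tainted.add(name)
                    else:
                        tainted.discard(name)
    return violations


SAFE = '''
x = calendar.read(today)
y = QuarantineLLM.format(x)
email.write(to="john@acme.com", body=y)
'''

UNSAFE = '''
x = calendar.read(today)
email.write(to=x, body="hi")
'''

print(check(SAFE))
print(check(UNSAFE))
```

In the safe program the tainted value `y` reaches only the message body, so the check passes; in the unsafe one the calendar content would choose the recipient, so the check reports a violation. A real checker would also need to handle control flow and positional arguments.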
How to use it
Use it for complex multi-step agents such as SQL copilots and software-engineering bots.
Trade-offs
- Pros: Formal verifiability; replay logs.
- Cons: Requires DSL design and static-analysis infra.
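The replay-log benefit can be illustrated with a tiny interpreter that permits only calls to registered tools and records each one. This is a sketch under stated assumptions: the `tools` registry, the `run_with_log` function, and the log record shape are all hypothetical, and restricting the program to whitelisted calls is not a substitute for real process-level sandboxing.

```python
import ast


def evaluate(node, env):
    """Evaluate an argument: only literals and known variables are allowed."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    raise ValueError(f"unsupported expression: {ast.unparse(node)}")


def run_with_log(program, tools, env=None):
    """Run straight-line tool calls and return a log of every call made."""
    env = dict(env or {})
    log = []
    for stmt in ast.parse(program).body:
        # Only `tool(...)` and `name = tool(...)` statements are accepted.
        if not isinstance(stmt, (ast.Assign, ast.Expr)) or not isinstance(
            stmt.value, ast.Call
        ):
            raise ValueError(f"line {stmt.lineno}: only tool calls are allowed")
        call = stmt.value
        name = ast.unparse(call.func)
        if name not in tools:
            raise ValueError(f"line {stmt.lineno}: unknown tool {name}")
        if any(kw.arg is None for kw in call.keywords):
            raise ValueError(f"line {stmt.lineno}: **kwargs is not allowed")
        args = [evaluate(a, env) for a in call.args]
        kwargs = {kw.arg: evaluate(kw.value, env) for kw in call.keywords}
        result = tools[name](*args, **kwargs)
        log.append({"tool": name, "args": args, "kwargs": kwargs, "result": result})
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if not isinstance(target, ast.Name):
                    raise ValueError(f"line {stmt.lineno}: unsupported target")
                env[target.id] = result
    return log


# Stand-in tools for illustration only; real tools would have side effects.
TOOLS = {
    "calendar.read": lambda day: f"events for {day}",
    "QuarantineLLM.format": lambda text: text.upper(),
    "email.write": lambda to, body: "sent",
}

PROGRAM = '''
x = calendar.read(today)
y = QuarantineLLM.format(x)
email.write(to="john@acme.com", body=y)
'''

log = run_with_log(PROGRAM, TOOLS, env={"today": "2025-01-01"})
for entry in log:
    print(entry["tool"], entry["kwargs"] or entry["args"])
```

Because every call, its arguments, and its result are recorded in order, the log can be audited afterwards or fed back to stub tools to replay a run deterministically.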
References
- Debenedetti et al., CaMeL (2025); Beurer-Kellner et al., §3.1 (5).