The Dual LLM Pattern for Building AI Assistants That Can Resist Prompt Injection

  • I really want an AI assistant: a chatbot powered by a Large Language Model that can answer questions and perform actions for me, using access to my private data and tools.
  • So, if it turns out we can't solve prompt injection, a class of vulnerabilities rooted in the design of existing Large Language Models, what is a safe subset of the AI assistant that we can responsibly build today?