
Sandboxing LLM Generated Code with Extism


In our last post we showed how to call ChatGPT from inside an Extism plug-in. Although that’s a useful capability for plug-in developers, there isn’t anything about it that is uniquely suited to Wasm. So we’ve been thinking about what role Wasm might play in this LLM-enabled future. In this post we explore one idea around code generation.

Code Generation

One key touted feature of LLMs is their ability to generate code. Many programmers have interacted with this through tools like Copilot, or simply by asking ChatGPT programming questions. Programmers generally copy-paste these code snippets, or at least review them, before integrating them into their apps. Directly running code generated by an LLM carries some of the same risks that Extism deals with: the code should be considered untrusted. Not only could it be wrong; your use case might also expose you to malicious actors and exploits through prompt injection.

Unix Utility Maker

Suppose you want to create a bot that generates Unix text-processing utilities. You ask ChatGPT through the API to generate some bash, then you pipe some standard input to that bash. The interface might look like this:

util-bot "Make a tool that counts the number of vowels in the input text" \ 
> count_vowels.sh
echo "Hello World" | sh count_vowels.sh

# => 3

The bash it generates could do anything on your computer, so this feels like a scary idea. Here the worst-case scenario is probably that the script is wrong and destroys some state on your machine, but you can imagine that if your use case allowed untrusted input into this code generator, someone could easily craft some malicious actions.

Prompt injection is an ongoing research topic and some believe it will be impossible to prevent. This is perhaps where Wasm can provide some relief. Wasm can create a sandbox for the generated code and you can give it only the capabilities it needs to get the job done.
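
To make that concrete, here is a minimal sketch of what granting capabilities through an Extism manifest might look like. The host and path values here are hypothetical; by default a plug-in gets no network or filesystem access at all:

config = {
    "wasm": [{"path": "sandbox.wasm"}],
    "memory": {"max": 5},  # cap the sandbox's linear memory (Wasm pages)
    # Nothing outside the sandbox is reachable unless explicitly allowed:
    "allowed_hosts": ["api.example.com"],        # hypothetical HTTP allowlist
    "allowed_paths": {"/tmp/sandbox": "/data"},  # hypothetical host-to-guest mount
}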

info

A Wasm sandbox cannot completely save you from all prompt injection attacks if you need to trust the output. But it can prevent the code from accessing capabilities or data that it should not have access to.

Utility Maker in Extism

Let’s build this using Extism. We’ll use LangChain to generate the code based on a description, and we’ll use the Extism Python SDK to orchestrate the Wasm module for executing the code. We are going to ask it to generate JavaScript code and run it in a sandbox we create with the JavaScript PDK.

info

If you want to jump straight to runnable code, we packaged it up here: https://github.com/extism/func-gen

And a demo video can be seen here: https://www.loom.com/share/d956147a1a7d449391ec0778ebe12918

Let’s start with our plug-in. We’ll create a simple JavaScript module that has two functions: one to store the code for a function in a plug-in variable, and one to read it back and invoke it.

// sandbox.js

// Store the generated code in a plug-in variable so it can be
// invoked later.
function register_function() {
  let code = Host.inputString();
  Var.set("code", code);
}

// Read the stored code, eval it into a function, and call it
// with the plug-in input.
function invoke() {
  let input = Host.inputString();
  let code = Var.getString("code");
  let func = eval(code);
  Host.outputString(func(input).toString());
}

module.exports = { register_function, invoke };

info

This simple plug-in can be re-used in many situations where you want to sandbox some JavaScript code, and it works from any host that Extism supports, including the browser!

Now let’s write the host side. First we need some imports:

# host.py

import re
import sys
import json
import pathlib

from extism import Context
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage,
)

In order to generate runnable code, we need to coerce ChatGPT into producing a CJS module with a single function as its export. As an example, take this prompt:

Write a function that takes a string of comma separated integers and
returns the sum as an integer

This prompt should generate the following code:

module.exports = function sumCommaSeparatedIntegers(str) {
  return str.split(',').reduce((acc, curr) => acc + parseInt(curr), 0);
};

This allows us to easily eval it into a function and call it in the plug-in code (eval returns the value of the last expression it evaluates, and an assignment expression evaluates to the assigned function):

let func = eval(code)
let result = func(inputStr)

This process of coercion is colloquially called prompt engineering. LangChain gives us some tools to do this. Let’s write the function that generates the code from a description:

MARKDOWN_RE = re.compile(r"```javascript([^`]*)```")
chat = ChatOpenAI(temperature=0)

def generate_code(description):
    messages = [
        SystemMessage(content="Your goal is to write a Javascript function."),
        SystemMessage(content="You must export the function as the default export, using the CommonJS module syntax"),
        SystemMessage(content="You must not include any comments, explanations, or markdown. The response should be JavaScript only."),
        HumanMessage(content=description),
    ]

    response = chat(messages)
    code = response.content
    # sometimes the LLM wraps the code in markdown fences; strip them if so
    m = MARKDOWN_RE.match(code)
    if m and m.group(1):
        code = m.group(1)
    return code

Here we use a series of SystemMessages to pre-prompt the model with its goals. We put the user’s description at the end, in the HumanMessage.
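
If the model does ignore the no-markdown instruction and wraps its answer in a fence, the regex pulls the code back out. A quick illustration (the fenced string here is made up):

fenced = "```javascript\nmodule.exports = (s) => s.length\n```"
m = MARKDOWN_RE.match(fenced)
print(m.group(1))  # => "\nmodule.exports = (s) => s.length\n"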

info

There are likely more rules you’d want to add to this system; we are not prompt engineers.
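
For instance, one hypothetical rule you might append to the list of SystemMessages:

SystemMessage(content="The function must be pure: it must not use require(), the network, or the filesystem."),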

Now we’ll write a function to execute this code in the Wasm module.

def execute_code(code, input):
    wasm_file_path = pathlib.Path(__file__).parent / "sandbox.wasm"
    # limit the sandbox's memory to 5 Wasm pages
    config = {"wasm": [{"path": str(wasm_file_path)}], "memory": {"max": 5}}

    with Context() as context:
        plugin = context.plugin(config, wasi=True)
        plugin.call("register_function", code)
        return plugin.call("invoke", input)

info

You can turn your JS plug-in code into Wasm with the command: extism-js sandbox.js -o sandbox.wasm
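
Before wiring up a CLI, we can sanity-check execute_code with a hand-written snippet instead of LLM output (an illustrative check, not from the original repo):

print(execute_code("module.exports = (s) => s.length", "hello"))
# => b'5'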

And now a main to orchestrate it:

if __name__ == "__main__":
    code = generate_code(sys.argv[1])
    print(execute_code(code, sys.stdin.read()))

Let’s try it out by asking it to generate the canonical Extism count-vowels example:

$ pip3 install extism langchain
$ export OPENAI_API_KEY=sk-mykeyhere # needed to query openai
$ echo "hello world" | python3 host.py "write a function that counts the number of vowels in a string"
b'3'
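
And, bringing back the comma-separated-integers prompt from earlier (LLM output is nondeterministic, so the generated code and exact result may vary from run to run):

$ echo "1,2,3" | python3 host.py "write a function that takes a string of comma separated integers and returns the sum as an integer"
b'6'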