Zed's code extraction covers three areas: pulling code out of the buffer, pulling code out of code blocks, and extracting metadata from LLM output. Here's how it works, illustrated with TypeScript examples:
Concept: Before sending to the LLM, Zed extracts relevant code snippets around the user's cursor, using the Buffer's API to get text and position information.
TypeScript Analogy:
```typescript
interface TextRange {
  start: number;
  end: number;
}

function extractCodeSnippet(
  bufferText: string,
  selectionRange: TextRange
): string {
  const codeBefore = bufferText.slice(0, selectionRange.start);
  const codeAfter = bufferText.slice(selectionRange.end);
  // Replace the selected text with a marker the LLM can target.
  return `${codeBefore}\n<selection>\n${codeAfter}`;
}

const bufferText = `
function add(a: number, b: number) {
  console.log("start");
  const sum = a + b;
  return sum;
}
console.log("end")
`;
// The range covers the `const sum = a + b;` line, including its surrounding newlines.
const selectionRange = { start: 61, end: 83 };
const codeSnippet = extractCodeSnippet(bufferText, selectionRange);
// `codeSnippet` will be:
// `
// function add(a: number, b: number) {
//   console.log("start");
// <selection>
//   return sum;
// }
// console.log("end")
// `
```
Zed Implementation: This is reflected in BufferCodegen's logic, which extracts the text around the selection range and inserts it into the LanguageModelRequest.
Concept: Zed parses the buffer using tree-sitter to understand the structure of code. This is particularly important when the user has a code block selected, because it lets Zed extract the text inside the code block rather than sending the surrounding fence (e.g., ```) along with it.
TypeScript Analogy:
```typescript
interface CodeBlockNode {
  start: number;
  end: number;
  kind: string;
  children?: CodeBlockNode[];
}

function extractCodeFromBlock(
  bufferText: string,
  offset: number,
  rootNode: CodeBlockNode
): string | null {
  if (rootNode.kind === "fenced_code_block") {
    for (const child of rootNode.children ?? []) {
      if (
        child.kind === "code_fence_content" &&
        child.start <= offset &&
        offset <= child.end
      ) {
        return bufferText.slice(child.start, child.end);
      }
    }
  }
  return null;
}

const bufferText = `
\`\`\`typescript
function add(a: number, b: number): number {
  return a + b;
}
\`\`\`
`;
const rootNode: CodeBlockNode = {
  kind: "fenced_code_block",
  start: 0,
  end: 82,
  children: [
    {
      kind: "code_fence_content",
      start: 15,
      end: 78,
    },
  ],
};
const code = extractCodeFromBlock(bufferText, 40, rootNode);
// `code` will be:
// `function add(a: number, b: number): number {
//   return a + b;
// }
// `
```
Zed Implementation: This logic is used in find_surrounding_code_block in crates/assistant/src/assistant_panel.rs and in crates/assistant2/src/buffer_codegen.rs.
Concept: When the LLM indicates a tool should be used, Zed extracts the tool name and input parameters from the completion.
TypeScript Analogy:
```typescript
interface ToolUseEvent {
  toolName: string;
  input: string;
}

function extractToolUse(llmOutput: string): ToolUseEvent | null {
  // Accept either quote style around attribute values.
  const toolUseRegex = /<tool name=["'](.*?)["'].*?input=["'](.*?)["']/;
  const match = toolUseRegex.exec(llmOutput);
  if (match) {
    return {
      toolName: match[1],
      input: match[2],
    };
  }
  return null;
}

const llmOutput = "This is a text with <tool name='search' input='hello world'/>";
const toolUse = extractToolUse(llmOutput);
// toolUse will be: { toolName: 'search', input: 'hello world' }
```
Zed Implementation: This process happens in crates/assistant2/src/thread.rs and crates/language_models/src/provider/anthropic.rs.
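In practice, tool inputs are usually structured JSON rather than a flat string. Here is a minimal sketch of decoding such an input; the `SearchToolInput` shape and the helper name are hypothetical illustrations, not Zed's actual types:

```typescript
interface SearchToolInput {
  query: string;
  maxResults?: number;
}

// Hypothetical helper: decode a tool's JSON input, returning null
// when the output is malformed or missing required fields.
function parseSearchInput(rawInput: string): SearchToolInput | null {
  try {
    const parsed = JSON.parse(rawInput);
    if (
      typeof parsed !== "object" ||
      parsed === null ||
      typeof parsed.query !== "string"
    ) {
      return null;
    }
    return parsed as SearchToolInput;
  } catch {
    return null;
  }
}
```

Returning null instead of throwing lets the caller fall back gracefully, for example by asking the model to retry with corrected input.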
Concept: When performing a code edit, Zed expects a JSON-like output from the LLM, which it then deserializes using serde.
TypeScript Analogy:
```typescript
interface EditInstruction {
  path: string;
  oldText: string;
  newText: string;
  operation: "insert" | "replace" | "delete";
}

function parseEditInstructions(llmOutput: string): EditInstruction[] {
  try {
    return JSON.parse(llmOutput);
  } catch (e) {
    console.error("Failed to parse edit instructions:", llmOutput);
    return [];
  }
}

const llmOutput = `
[
  {
    "path": "src/main.rs",
    "oldText": "console.log",
    "newText": "console.info",
    "operation": "replace"
  }
]
`;
const editInstructions = parseEditInstructions(llmOutput);
// `editInstructions` will be:
// [
//   {
//     path: 'src/main.rs',
//     oldText: 'console.log',
//     newText: 'console.info',
//     operation: 'replace'
//   }
// ]
```
Zed Implementation: This logic can be found in crates/assistant2/src/buffer_codegen.rs.
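Note that `JSON.parse` alone does not guarantee the result actually matches the `EditInstruction` shape. Here is a hedged sketch of runtime validation, mirroring what serde's typed deserialization gives Rust for free; the helper names are illustrative:

```typescript
interface EditInstruction {
  path: string;
  oldText: string;
  newText: string;
  operation: "insert" | "replace" | "delete";
}

// Runtime shape check: serde rejects mismatched JSON at deserialization
// time, but in TypeScript the shape must be validated by hand.
function isEditInstruction(value: unknown): value is EditInstruction {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.path === "string" &&
    typeof v.oldText === "string" &&
    typeof v.newText === "string" &&
    (v.operation === "insert" ||
      v.operation === "replace" ||
      v.operation === "delete")
  );
}

// Parse and keep only well-formed instructions.
function parseEditInstructionsStrict(llmOutput: string): EditInstruction[] {
  try {
    const parsed = JSON.parse(llmOutput);
    return Array.isArray(parsed) ? parsed.filter(isEditInstruction) : [];
  } catch {
    return [];
  }
}
```

Filtering malformed entries rather than failing the whole batch means one bad instruction from the LLM does not discard the valid ones.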
Concept: Zed also parses XML-like tags in LLM output to recover structured information about it, such as whether the output contains a patch or a title.
TypeScript Analogy:
```typescript
interface XmlTag {
  kind: string;
  range: { start: number; end: number };
  isOpenTag: boolean;
}

function parseXmlTags(text: string): XmlTag[] {
  const xmlTagRegex = /<(\/)?(\w+)(.*?)>/g;
  const tags: XmlTag[] = [];
  let match: RegExpExecArray | null;
  while ((match = xmlTagRegex.exec(text)) !== null) {
    // match[1] is "/" for closing tags and undefined for opening tags.
    const isOpenTag = match[1] !== "/";
    tags.push({
      kind: match[2],
      range: { start: match.index, end: match.index + match[0].length },
      isOpenTag,
    });
  }
  return tags;
}

const text = `
<patch>
<title>Refactor foo</title>
<edit>
<path>src/main.rs</path>
<oldText>console.log</oldText>
<newText>console.info</newText>
<operation>replace</operation>
</edit>
</patch>
`;
const tags = parseXmlTags(text);
// `tags` will contain one entry per tag in the XML-like text.
```
Zed Implementation: The parsing of these XML tags is found in crates/assistant2/src/context.rs.
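Once the tags are parsed, pairing an opening tag with its closing tag yields the text between them (e.g., a patch title). A minimal sketch of that step; the pairing logic here is illustrative, not Zed's actual implementation:

```typescript
interface XmlTag {
  kind: string;
  range: { start: number; end: number };
  isOpenTag: boolean;
}

// Return the text between the first <kind> ... </kind> pair,
// or null if either tag is missing or out of order.
function extractTagContent(
  text: string,
  tags: XmlTag[],
  kind: string
): string | null {
  const open = tags.find((t) => t.kind === kind && t.isOpenTag);
  const close = tags.find((t) => t.kind === kind && !t.isOpenTag);
  if (!open || !close || close.range.start < open.range.end) {
    return null;
  }
  return text.slice(open.range.end, close.range.start);
}
```

This is enough for flat, non-repeating tags; handling nested or repeated tags would require a stack-based matcher.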
Zed uses a PromptBuilder to dynamically create prompts. Here's how the prompt engineering happens:

- Code context is extracted from the MultiBuffer and a Range<Offset> and converted into text.
- The user_prompt is included in the generated prompt, allowing the user to guide the LLM's behavior. The user prompt can be just the key binding that is pressed or text written in the editor.
- Conversation summarization happens in crates/assistant2/src/thread.rs using the summarize function.

TypeScript Analogy:

```typescript
function buildEditPrompt(
  codeBefore: string,
  userPrompt: string,
  codeAfter: string,
  language?: string
): string {
  // Open each snippet with a language-tagged fence; close with a bare fence.
  const openFence = language ? `\`\`\`${language}` : `\`\`\``;
  const closeFence = `\`\`\``;
  return `
Given the following code snippet, perform the requested change.
${openFence}
${codeBefore}
${closeFence}
User: ${userPrompt}
${openFence}
${codeAfter}
${closeFence}
`;
}
```
```typescript
const prompt = buildEditPrompt(
  `function add(a: number, b: number) {
  console.log("start");`,
  "change console.log to console.info",
  `  return sum;
}`,
  "typescript"
);
// `prompt` combines the code context with the user's instruction.
```
Zed Implementation: Conversation summarization uses the summarize function in crates/assistant2/src/thread.rs, which adds a system prompt that instructs the LLM to summarize the conversation in a few words.
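As a rough sketch, such a summarization request can be modeled as appending a system message to the conversation; the message shape and wording here are assumptions, not Zed's actual prompt:

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: build a request that asks the model for a short
// title summarizing the conversation so far. The original conversation
// array is left untouched.
function buildSummarizeRequest(conversation: ChatMessage[]): ChatMessage[] {
  return [
    ...conversation,
    {
      role: "system",
      content:
        "Summarize the conversation above in a few words, suitable as a thread title.",
    },
  ];
}
```

Keeping the summarization instruction as a trailing system message means the same conversation history can be reused unchanged for both normal completions and title generation.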