Architecture: The Engine Room

graph TB
    subgraph "The Heart: tt Control Loop"
        Start([User Input]) --> Init[Initialize Turn]
        Init --> Compact{Need Compaction?}
        Compact -->|Yes| CompactLLM[LLM Summarize]
        CompactLLM --> Assembly
        Compact -->|No| Assembly[Assemble Context]

        Assembly --> Stream[Stream to LLM]
        Stream --> Process[Process Events]
        Process --> Tools{Tool Requests?}

        Tools -->|Yes| Execute[Execute Tools]
        Execute --> Recurse[Recurse tt]
        Recurse --> Init

        Tools -->|No| End([Complete])
    end

    style Init fill:#e1f5fe
    style Stream fill:#fff3e0
    style Execute fill:#e8f5e9
    style Recurse fill:#fce4ec
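
The flow in the diagram can be sketched as a recursive async generator. This is a minimal illustration, not the actual implementation: `Msg`, `fakeLlm`, `runTool`, `controlLoop`, and `collect` are hypothetical names, the model is a stub that requests one tool call and then answers, and the compaction branch is omitted.

```typescript
// Hypothetical message shape; the real CliMessage type is richer.
type Msg =
  | { role: "user" | "assistant"; text: string }
  | { role: "tool_request"; tool: string; input: string }
  | { role: "tool_result"; tool: string; output: string };

// Stub standing in for the LLM: asks for a tool once, then answers.
async function fakeLlm(history: Msg[]): Promise<Msg> {
  const last = history[history.length - 1];
  if (last?.role === "tool_result") {
    return { role: "assistant", text: `Done: ${last.output}` };
  }
  return { role: "tool_request", tool: "echo", input: "hello" };
}

// Stub tool executor (assumption: tools take and return strings).
async function runTool(tool: string, input: string): Promise<string> {
  return `${tool}(${input})`;
}

async function* controlLoop(history: Msg[]): AsyncGenerator<Msg, void, void> {
  const reply = await fakeLlm(history);
  yield reply;
  if (reply.role !== "tool_request") return; // No tool requests: complete

  const output = await runTool(reply.tool, reply.input);
  const result: Msg = { role: "tool_result", tool: reply.tool, output };
  yield result;

  // Recurse with the extended history; yield* forwards the inner turn's messages
  yield* controlLoop([...history, reply, result]);
}

// Drain the generator into an array, as a UI layer might do incrementally.
async function collect(history: Msg[]): Promise<Msg[]> {
  const out: Msg[] = [];
  for await (const m of controlLoop(history)) out.push(m);
  return out;
}
```

Because the recursion goes through `yield*`, the caller sees a single flat stream of messages regardless of how many tool rounds happen underneath.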

The `tt` Control Loop: The Beating Heart

The entire Claude Code system revolves around a single async generator function called `tt`. It orchestrates every interaction, from user input to LLM communication to tool execution. Its signature shows what each turn carries:

// The actual tt function signature from the codebase
async function* tt(
  currentMessages: CliMessage[],         // Full conversation history
  baseSystemPromptString: string,        // Static system instructions
  currentGitContext: GitContext,         // Real-time git state
  currentClaudeMdContents: ClaudeMdContent[], // Project context
  permissionGranterFn: PermissionGranter, // Permission callback
  toolUseContext: ToolUseContext,         // Shared execution context
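  activeStreamingToolUse: ToolUseBlock | undefined, // Resume streaming state; typed as a union because TypeScript rejects an optional parameter before a required one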
  loopState: {
    turnId: string,        // UUID for this turn
    turnCounter: number,   // Recursion depth tracker
    compacted?: boolean,   // History compression flag
    isResuming?: boolean   // Resume from saved state
  }
): AsyncGenerator<CliMessage, void, void>

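The signature shows how much state is threaded through each turn: the conversation history, the prompt inputs, the permission callback, and the loop bookkeeping in `loopState`. The function yields `CliMessage` objects that drive UI updates while maintaining conversation flow. Let's examine each phase: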

Phase 1: Turn Initialization & Context Preparation

{
  // Signal UI that processing has begun
  yield {
    type: "ui_state_update",
    uuid: `uistate-${loopState.turnId}-${Date.now()}`,
    timestamp: new Date().toISOString(),
    data: { status: "thinking", turnId: loopState.turnId }
  };

  // Check context window pressure
  let messagesForLlm = currentMessages;
  let wasCompactedThisIteration = false;

  if (await shouldAutoCompact(currentMessages)) {
    yield {
      type: "ui_notification",
      data: { message: "Context is large, attempting to compact..." }
    };

    try {
      const compactionResult = await compactAndStoreConversation(
        currentMessages,
        toolUseContext,
        true
      );
      messagesForLlm = compactionResult.messagesAfterCompacting;
      wasCompactedThisIteration = true;
      loopState.compacted = true;

      yield createSystemNotificationMessage(
        `Conversation history automatically compacted. Summary: ${
          compactionResult.summaryMessage.message.content[0].text
        }`
      );
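    } catch (compactionError) {
      // The caught value is `unknown` under strict TypeScript, so narrow it first
      const reason = compactionError instanceof Error
        ? compactionError.message
        : String(compactionError);
      // On failure the turn continues with the uncompacted history
      yield createSystemErrorMessage(
        `Failed to compact conversation: ${reason}`
      );
    }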
  }
}

Performance Profile of Phase 1:

Operation	Typical Duration	Complexity
Token counting	10-50ms	O(n) messages
Compaction decision	<1ms	O(1)
LLM summarization	2000-3000ms	One LLM call
Message reconstruction	5-10ms	O(n) messages
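
The table separates token counting (linear in the number of messages) from the compaction decision (constant time once the count is known). A minimal sketch of that split follows. The four-characters-per-token estimate and the 0.9 threshold are illustrative assumptions, and `SimpleMessage`, `estimateTokens`, and `shouldCompact` are hypothetical names, not the actual `shouldAutoCompact` implementation.

```typescript
// Hypothetical, simplified message shape for illustration only.
interface SimpleMessage {
  role: "user" | "assistant";
  text: string;
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(messages: SimpleMessage[]): number {
  let chars = 0;
  for (const m of messages) chars += m.text.length; // O(n) over messages
  return Math.ceil(chars / 4);
}

// O(1) comparison once the token count is known.
function shouldCompact(
  tokenCount: number,
  contextWindow: number,
  threshold = 0.9 // illustrative fraction of the window, not a documented value
): boolean {
  return tokenCount >= contextWindow * threshold;
}

const conversation: SimpleMessage[] = [
  { role: "user", text: "a".repeat(4000) },
  { role: "assistant", text: "b".repeat(4000) },
];
const tokenEstimate = estimateTokens(conversation);
const needsCompaction = shouldCompact(tokenEstimate, 2100);
```

Keeping the count and the decision separate means the expensive part can be cached or updated incrementally without touching the threshold logic.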

Phase 2: Dynamic System Prompt Assembly

The system prompt isn't static. It is assembled fresh for each turn from the current tool list, directory snapshot, git state, and project context:

{
  // Parallel fetch of all context sources
  const [toolSpecs, dirStructure] = await Promise.all([
    // Convert tool definitions to LLM-compatible specs
    Promise.all(
      toolUseContext.options.tools
        .filter(tool => tool.isEnabled ? tool.isEnabled() : true)
        .map(async (tool) => convertToolDefinitionToToolSpecification(tool, toolUseContext))
    ),
    // Get current directory structure
    getDirectoryStructureSnapshot(toolUseContext)
  ]);

  // Assemble the complete system prompt
  const systemPromptForLlm = assembleSystemPrompt(
    baseSystemPromptString,      // Core instructions
    currentClaudeMdContents,     // Project-specific context
    currentGitContext,           // Git status/branch/commits
    dirStructure,                // File tree
    toolSpecs                    // Available tools
  );

  // Prepare messages with cache control
  const apiMessages = prepareMessagesForApi(
    messagesForLlm,
    true // applyEphemeralCacheControl
  );
}

The assembly process follows a strict priority order:

Priority 1: Base Instructions (~2KB)
    ↓
Priority 2: Model-Specific Adaptations (~500B)
    ↓
Priority 3: CLAUDE.md Content (Variable, typically 5-50KB)
    ↓
Priority 4: Git Context (~1-5KB)
    ↓
Priority 5: Directory Structure (Truncated to fit)
    ↓
Priority 6: Tool Specifications (~10-20KB)
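
One way to read the priority list: fixed sections are concatenated in order, and the directory structure absorbs whatever budget is left. The sketch below illustrates only that idea. `PromptSection`, `assemblePrompt`, the character-based budget, the section headers, and the truncation marker are all assumptions; the real `assembleSystemPrompt` may work differently.

```typescript
// Hypothetical section descriptor; only truncatable sections may be shortened.
interface PromptSection {
  title: string;
  content: string;
  truncatable?: boolean;
}

function assemblePrompt(sections: PromptSection[], maxChars: number): string {
  // Budget consumed by sections that must be included in full.
  // Section headers are not counted, to keep the sketch short.
  const fixedChars = sections
    .filter((s) => !s.truncatable)
    .reduce((sum, s) => sum + s.content.length, 0);
  let remaining = Math.max(0, maxChars - fixedChars);

  const parts: string[] = [];
  for (const s of sections) {
    // Array order is the priority order
    let body = s.content;
    if (s.truncatable) {
      if (body.length > remaining) {
        body = body.slice(0, remaining) + "\n[truncated]";
      }
      remaining -= Math.min(s.content.length, remaining);
    }
    parts.push(`## ${s.title}\n${body}`);
  }
  return parts.join("\n\n");
}

const assembled = assemblePrompt(
  [
    { title: "base", content: "AAAA" },
    { title: "dir", content: "0123456789", truncatable: true },
    { title: "tools", content: "TT" },
  ],
  10
);
```

Here the two fixed sections use 6 of the 10 available characters, so the directory listing is cut to its first 4 characters and marked as truncated.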

Phase 3: LLM Stream Initialization

{
  // Initialize streaming call
  const llmStream = callLlmStreamApi(
    apiMessages,
    systemPromptForLlm,
    toolSpecs, // Tool specifications built in Phase 2
    toolUseContext.options.mainLoopModel,
    toolUseContext.abortController.signal
  );

  // Initialize accumulators for streaming response
  let accumulatedAssistantMessage: Partial<CliMessage> & {
    message: Partial<ApiMessage> & { content: ContentBlock[] }
  } = {
    type: "assistant",
    uuid: `assistant-${loopState.turnId}-${loopState.turnCounter}-${Date.now()}`,
    timestamp: new Date().toISOString(),
    message: { role: "assistant", content: [] }
  };

  let currentToolUsesFromLlm: ToolUseBlock[] = [];
  let currentThinkingContent: string = "";
  let currentToolInputJsonBuffer: string = "";
}

Phase 4: Stream Event Processing State Machine

This phase consumes the stream event by event, accumulating content blocks into the assistant message and yielding UI updates as each delta arrives:

{
  for await (const streamEvent of llmStream) {
    // Abort check
    if (toolUseContext.abortController.signal.aborted) {
      yield createSystemNotificationMessage("LLM stream processing aborted by user.");
      return;
    }

    switch (streamEvent.type) {
      case "message_start":
        accumulatedAssistantMessage.message.id = streamEvent.message.id;
        accumulatedAssistantMessage.message.model = streamEvent.message.model;
        accumulatedAssistantMessage.message.usage = streamEvent.message.usage;
        yield {
          type: "ui_state_update",
          data: {
            status: "assistant_responding",
            model: streamEvent.message.model
          }
        };
        break;

      case "content_block_start":
        const newBlockPlaceholder = { ...streamEvent.content_block };

        // Initialize empty content based on block type
        if (streamEvent.content_block.type === "thinking") {
          currentThinkingContent = "";
          newBlockPlaceholder.thinking = "";
        } else if (streamEvent.content_block.type === "tool_use") {
          currentToolInputJsonBuffer = "";
          newBlockPlaceholder.input = "";
        } else if (streamEvent.content_block.type === "text") {
          newBlockPlaceholder.text = "";
        }

        accumulatedAssistantMessage.message.content.push(newBlockPlaceholder);
        break;

      case "content_block_delta":
        const lastBlockIndex = accumulatedAssistantMessage.message.content.length - 1;
        if (lastBlockIndex < 0) continue;

        const currentBlock = accumulatedAssistantMessage.message.content[lastBlockIndex];

        if (streamEvent.delta.type === "text_delta" && currentBlock.type === "text") {
          currentBlock.text += streamEvent.delta.text;
          yield {
            type: "ui_text_delta",
            data: {
              textDelta: streamEvent.delta.text,
              blockIndex: lastBlockIndex
            }
          };
        } else if (streamEvent.delta.type === "input_json_delta" && currentBlock.type === "tool_use") {
          currentToolInputJsonBuffer += streamEvent.delta.partial_json;
          currentBlock.input = currentToolInputJsonBuffer;

          // Try parsing incomplete JSON for preview
          const parseAttempt = tryParsePartialJson(currentToolInputJsonBuffer);
          if (parseAttempt.complete) {
            yield {
              type: "ui_tool_preview",
              data: {
                toolId: currentBlock.id,
                preview: parseAttempt.value
              }
            };
          }
        }
        break;

      case "content_block_stop":
        const completedBlock = accumulatedAssistantMessage.message.content[streamEvent.index];

        if (completedBlock.type === "tool_use") {
          try {
            const parsedInput = JSON.parse(currentToolInputJsonBuffer);
            completedBlock.input = parsedInput;
            currentToolUsesFromLlm.push({
              type: "tool_use",
              id: completedBlock.id,
              name: completedBlock.name,
              input: parsedInput
            });
          } catch (e) {
            // Handle malformed JSON from LLM
            completedBlock.input = {
              error: "Invalid JSON input from LLM",
              raw_json_string: currentToolInputJsonBuffer,
              parse_error: e instanceof Error ? e.message : String(e)
            };
          }
          currentToolInputJsonBuffer = "";
        }

        yield {
          type: "ui_content_block_complete",
          data: { block: completedBlock, blockIndex: streamEvent.index }
        };
        break;

      case "message_stop":
        // LLM generation complete
        const finalAssistantMessage = finalizeCliMessage(
          accumulatedAssistantMessage,
          loopState.turnId,
          loopState.turnCounter
        );
        yield finalAssistantMessage;

        // Move to Phase 5 or 6...
        break;
    }
  }
}
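
The `input_json_delta` branch relies on `tryParsePartialJson`, whose body is not shown. Here is a minimal sketch consistent with how it is called above, assuming it simply reports whether the buffer already parses as complete JSON. The real helper may do more, such as repairing incomplete input. The `ParseAttempt` type and the sample fragments are hypothetical.

```typescript
// Result shape inferred from the call site: a `complete` flag plus a value.
type ParseAttempt =
  | { complete: true; value: unknown }
  | { complete: false };

function tryParsePartialJson(buffer: string): ParseAttempt {
  try {
    return { complete: true, value: JSON.parse(buffer) };
  } catch {
    // Incomplete or malformed so far; the caller keeps buffering
    return { complete: false };
  }
}

// Simulate a tool input arriving as streamed fragments
const fragments = ['{"path": "src/', 'index.ts", ', '"limit": 10}'];
let jsonBuffer = "";
const previews: unknown[] = [];
for (const fragment of fragments) {
  jsonBuffer += fragment;
  const attempt = tryParsePartialJson(jsonBuffer);
  if (attempt.complete) previews.push(attempt.value);
}
```

With this version a preview is produced only once the final fragment closes the object, which matches the `if (parseAttempt.complete)` guard in the loop above.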

Stream Processing Performance:
