graph TB
subgraph "The Heart: tt Control Loop"
Start([User Input]) --> Init[Initialize Turn]
Init --> Compact{Need Compaction?}
Compact -->|Yes| CompactLLM[LLM Summarize]
CompactLLM --> Assembly
Compact -->|No| Assembly[Assemble Context]
Assembly --> Stream[Stream to LLM]
Stream --> Process[Process Events]
Process --> Tools{Tool Requests?}
Tools -->|Yes| Execute[Execute Tools]
Execute --> Recurse[Recurse tt]
Recurse --> Init
Tools -->|No| End([Complete])
end
style Init fill:#e1f5fe
style Stream fill:#fff3e0
style Execute fill:#e8f5e9
style Recurse fill:#fce4ec
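The loop in the diagram can be pictured as a recursive async generator. The following is a minimal sketch, not the actual tt implementation: `Msg`, `fakeLlm`, `runTool`, `controlLoop`, and `collect` are hypothetical stand-ins, and forwarding the recursive turn with `yield*` is an assumption about how the "Recurse tt" step might be wired.

```typescript
// Hypothetical, simplified message type; the real CliMessage carries more fields.
type Msg = { type: "user" | "assistant" | "tool_result"; text: string };

// Stand-in for the LLM call: asks for a tool on the first turn, then finishes.
async function fakeLlm(history: Msg[]): Promise<{ text: string; toolRequest?: string }> {
  const alreadyUsedTool = history.some((m) => m.type === "tool_result");
  return alreadyUsedTool
    ? { text: "Done." }
    : { text: "Let me check.", toolRequest: "list_files" };
}

// Stand-in for tool execution.
async function runTool(name: string): Promise<Msg> {
  return { type: "tool_result", text: `${name}: ok` };
}

// One turn: call the model, yield its message, run any tool, then recurse.
async function* controlLoop(history: Msg[], turnCounter = 0): AsyncGenerator<Msg, void, void> {
  const reply = await fakeLlm(history);
  const assistant: Msg = { type: "assistant", text: reply.text };
  yield assistant;

  // No tool request: the turn is complete.
  if (reply.toolRequest === undefined) return;

  const result = await runTool(reply.toolRequest);
  yield result;

  // Recurse with the extended history; yield* forwards the inner turn's messages.
  yield* controlLoop([...history, assistant, result], turnCounter + 1);
}

// Consumer side: a UI would render each message as it arrives.
async function collect(): Promise<string[]> {
  const seen: string[] = [];
  for await (const msg of controlLoop([{ type: "user", text: "What files exist?" }])) {
    seen.push(`${msg.type}: ${msg.text}`);
  }
  return seen;
}
```

Because the consumer only iterates the outermost generator, it never needs to know how many recursive turns happened underneath.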
tt Control Loop: The Beating Heart
The entire Claude Code system revolves around a single async generator function called tt. This function orchestrates every interaction, from user input to LLM communication to tool execution. Let's dissect this remarkable piece of engineering:
// The actual tt function signature from the codebase
async function* tt(
  currentMessages: CliMessage[],              // Full conversation history
  baseSystemPromptString: string,             // Static system instructions
  currentGitContext: GitContext,              // Real-time git state
  currentClaudeMdContents: ClaudeMdContent[], // Project context
  permissionGranterFn: PermissionGranter,     // Permission callback
  toolUseContext: ToolUseContext,             // Shared execution context
  // Resume streaming state. Typed as a union rather than with `?` because
  // TypeScript rejects a required parameter after an optional one.
  activeStreamingToolUse: ToolUseBlock | undefined,
  loopState: {
    turnId: string,       // UUID for this turn
    turnCounter: number,  // Recursion depth tracker
    compacted?: boolean,  // History compression flag
    isResuming?: boolean  // Resume from saved state
  }
): AsyncGenerator<CliMessage, void, void>
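This signature shows how much state the loop carries. The first six parameters hold conversation and environment context, while activeStreamingToolUse and loopState track progress across recursive turns. The function yields CliMessage objects that drive UI updates while the conversation continues. Let's examine each phase, beginning with Phase 1, where the turn is initialized and the history is compacted if needed: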
{
  // Signal UI that processing has begun
  yield {
    type: "ui_state_update",
    uuid: `uistate-${loopState.turnId}-${Date.now()}`,
    timestamp: new Date().toISOString(),
    data: { status: "thinking", turnId: loopState.turnId }
  };
  // Check context window pressure
  let messagesForLlm = currentMessages;
  let wasCompactedThisIteration = false;
  if (await shouldAutoCompact(currentMessages)) {
    yield {
      type: "ui_notification",
      data: { message: "Context is large, attempting to compact..." }
    };
    try {
      const compactionResult = await compactAndStoreConversation(
        currentMessages,
        toolUseContext,
        true
      );
      messagesForLlm = compactionResult.messagesAfterCompacting;
      wasCompactedThisIteration = true;
      loopState.compacted = true;
      yield createSystemNotificationMessage(
        `Conversation history automatically compacted. Summary: ${
          compactionResult.summaryMessage.message.content[0].text
        }`
      );
    } catch (compactionError) {
      // A caught value is `unknown` under strict TypeScript, so narrow it before reading .message
      const reason = compactionError instanceof Error
        ? compactionError.message
        : String(compactionError);
      // Compaction failure is not fatal: messagesForLlm still holds the uncompacted history
      yield createSystemErrorMessage(`Failed to compact conversation: ${reason}`);
    }
  }
}
Performance Profile of Phase 1:
| Operation | Typical Duration | Complexity |
|---|---|---|
| Token counting | 10-50ms | O(n) messages |
| Compaction decision | <1ms | O(1) |
| LLM summarization | 2000-3000ms | One LLM call |
| Message reconstruction | 5-10ms | O(n) messages |
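The first two rows of the table can be illustrated with a small sketch: counting tokens walks every message, while the decision itself is a single comparison. Everything here is an assumption for illustration. The message shape, the four-characters-per-token estimate, the 200000-token window, and the 0.9 threshold are hypothetical values, not taken from the codebase.

```typescript
// Hypothetical message shape; the real CliMessage is richer.
interface SimpleMessage {
  role: "user" | "assistant";
  text: string;
}

// Assumed values for illustration only.
const CONTEXT_WINDOW_TOKENS = 200000;
const COMPACT_THRESHOLD = 0.9;

// O(n) in the number of messages: a rough estimate of about 4 characters per token.
function estimateTokens(messages: SimpleMessage[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.text.length / 4), 0);
}

// O(1) once the token count is known: compare against a fraction of the window.
function shouldCompact(
  tokenCount: number,
  windowTokens: number = CONTEXT_WINDOW_TOKENS,
  threshold: number = COMPACT_THRESHOLD
): boolean {
  return tokenCount >= windowTokens * threshold;
}

// Usage: estimate first, then decide.
const history: SimpleMessage[] = [
  { role: "user", text: "abcdefgh" },
  { role: "assistant", text: "abcde" },
];
const needsCompaction = shouldCompact(estimateTokens(history));
```

Separating the count from the decision keeps the expensive part measurable on its own, which matches how the table reports them as separate operations.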
Phase 2 builds the prompt. The system prompt isn't static; it's assembled fresh for each turn:
{
  // Parallel fetch of all context sources
  const [toolSpecs, dirStructure] = await Promise.all([
    // Convert tool definitions to LLM-compatible specs
    Promise.all(
      toolUseContext.options.tools
        .filter(tool => tool.isEnabled ? tool.isEnabled() : true)
        .map(async (tool) => convertToolDefinitionToToolSpecification(tool, toolUseContext))
    ),
    // Get current directory structure
    getDirectoryStructureSnapshot(toolUseContext)
  ]);
  // Assemble the complete system prompt
  const systemPromptForLlm = assembleSystemPrompt(
    baseSystemPromptString,   // Core instructions
    currentClaudeMdContents,  // Project-specific context
    currentGitContext,        // Git status/branch/commits
    dirStructure,             // File tree
    toolSpecs                 // Available tools
  );
  // Prepare messages with cache control
  const apiMessages = prepareMessagesForApi(
    messagesForLlm,
    true // applyEphemeralCacheControl
  );
}
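The cache-control step at the end of that block is worth a brief illustration. This is a guess at the general shape, not the real prepareMessagesForApi: `TextBlock`, `ApiMsg`, and `withEphemeralCache` are hypothetical names, and marking only the last content block of the last message is an assumption about where the cache breakpoint goes.

```typescript
// Hypothetical, text-only block and message shapes.
type TextBlock = { type: "text"; text: string; cache_control?: { type: "ephemeral" } };
type ApiMsg = { role: "user" | "assistant"; content: TextBlock[] };

// Returns a copy of the messages in which the final content block of the final
// message carries an ephemeral cache marker. The input is not mutated.
function withEphemeralCache(messages: ApiMsg[]): ApiMsg[] {
  return messages.map((msg, i) => {
    const isLastMessage = i === messages.length - 1;
    if (!isLastMessage || msg.content.length === 0) return msg;

    const content = msg.content.map((block, j) =>
      j === msg.content.length - 1
        ? { ...block, cache_control: { type: "ephemeral" as const } }
        : block
    );
    return { ...msg, content };
  });
}

// Usage: only the most recent block is marked.
const prepared = withEphemeralCache([
  { role: "user", content: [{ type: "text", text: "Read the config file." }] },
  { role: "assistant", content: [{ type: "text", text: "Reading it now." }] },
]);
```

Copying rather than mutating keeps the stored conversation history free of request-specific markers.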
The assembly process follows a strict priority order:
Priority 1: Base Instructions (~2KB)
↓
Priority 2: Model-Specific Adaptations (~500B)
↓
Priority 3: CLAUDE.md Content (Variable, typically 5-50KB)
↓
Priority 4: Git Context (~1-5KB)
↓
Priority 5: Directory Structure (Truncated to fit)
↓
Priority 6: Tool Specifications (~10-20KB)
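One way to read this ordering is that fixed sections are always included whole, while a truncatable section (here, the directory structure) absorbs whatever budget is left. The sketch below illustrates that idea under stated assumptions: `PromptSection`, `assembleInPriorityOrder`, and the character-based budget are hypothetical and are not the real assembleSystemPrompt.

```typescript
// Hypothetical section descriptor; `truncatable` marks sections that may be cut to fit.
interface PromptSection {
  name: string;
  content: string;
  truncatable?: boolean;
}

// Joins sections in the order given. Fixed sections are kept whole; truncatable
// sections share whatever part of the budget the fixed sections leave over.
// The budget counts section content only, not the separators between sections.
function assembleInPriorityOrder(sections: PromptSection[], maxChars: number): string {
  const fixedChars = sections
    .filter((s) => !s.truncatable)
    .reduce((total, s) => total + s.content.length, 0);
  let remaining = Math.max(0, maxChars - fixedChars);

  const parts: string[] = [];
  for (const section of sections) {
    if (!section.truncatable) {
      parts.push(section.content);
      continue;
    }
    const kept = section.content.slice(0, remaining);
    remaining -= kept.length;
    if (kept.length > 0) parts.push(kept);
  }
  return parts.join("\n\n");
}

// Usage: the directory listing is cut to the 3 characters left in a 12-character budget.
const prompt = assembleInPriorityOrder(
  [
    { name: "base", content: "BASE" },
    { name: "dirStructure", content: "abcdefghij", truncatable: true },
    { name: "tools", content: "TOOLS" },
  ],
  12
);
```

Computing the fixed total first means a late, high-value section such as the tool specifications is never squeezed out by an earlier truncatable one.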
With the prompt assembled, Phase 3 opens the streaming call to the LLM and sets up the accumulators that will collect its response:
{
  // Initialize streaming call
  const llmStream = callLlmStreamApi(
    apiMessages,
    systemPromptForLlm,
    toolSpecs, // Tool specifications built during prompt assembly
    toolUseContext.options.mainLoopModel,
    toolUseContext.abortController.signal
  );
  // Initialize accumulators for streaming response
  let accumulatedAssistantMessage: Partial<CliMessage> & {
    message: Partial<ApiMessage> & { content: ContentBlock[] }
  } = {
    type: "assistant",
    uuid: `assistant-${loopState.turnId}-${loopState.turnCounter}-${Date.now()}`,
    timestamp: new Date().toISOString(),
    message: { role: "assistant", content: [] }
  };
  let currentToolUsesFromLlm: ToolUseBlock[] = [];
  let currentThinkingContent: string = "";
  let currentToolInputJsonBuffer: string = "";
}
Phase 4 is where the response takes shape: each streaming event is processed as it arrives, updating both the accumulated message and the UI:
{
  for await (const streamEvent of llmStream) {
    // Abort check
    if (toolUseContext.abortController.signal.aborted) {
      yield createSystemNotificationMessage("LLM stream processing aborted by user.");
      return;
    }
    // Each case body is wrapped in braces so its const declarations stay scoped to that case
    switch (streamEvent.type) {
      case "message_start": {
        accumulatedAssistantMessage.message.id = streamEvent.message.id;
        accumulatedAssistantMessage.message.model = streamEvent.message.model;
        accumulatedAssistantMessage.message.usage = streamEvent.message.usage;
        yield {
          type: "ui_state_update",
          data: {
            status: "assistant_responding",
            model: streamEvent.message.model
          }
        };
        break;
      }
      case "content_block_start": {
        const newBlockPlaceholder = { ...streamEvent.content_block };
        // Initialize empty content based on block type
        if (streamEvent.content_block.type === "thinking") {
          currentThinkingContent = "";
          newBlockPlaceholder.thinking = "";
        } else if (streamEvent.content_block.type === "tool_use") {
          currentToolInputJsonBuffer = "";
          newBlockPlaceholder.input = "";
        } else if (streamEvent.content_block.type === "text") {
          newBlockPlaceholder.text = "";
        }
        accumulatedAssistantMessage.message.content.push(newBlockPlaceholder);
        break;
      }
      case "content_block_delta": {
        const lastBlockIndex = accumulatedAssistantMessage.message.content.length - 1;
        if (lastBlockIndex < 0) continue; // No open block yet: skip to the next stream event
        const currentBlock = accumulatedAssistantMessage.message.content[lastBlockIndex];
        if (streamEvent.delta.type === "text_delta" && currentBlock.type === "text") {
          currentBlock.text += streamEvent.delta.text;
          yield {
            type: "ui_text_delta",
            data: {
              textDelta: streamEvent.delta.text,
              blockIndex: lastBlockIndex
            }
          };
        } else if (streamEvent.delta.type === "input_json_delta" && currentBlock.type === "tool_use") {
          currentToolInputJsonBuffer += streamEvent.delta.partial_json;
          currentBlock.input = currentToolInputJsonBuffer;
          // Try parsing incomplete JSON for preview
          const parseAttempt = tryParsePartialJson(currentToolInputJsonBuffer);
          if (parseAttempt.complete) {
            yield {
              type: "ui_tool_preview",
              data: {
                toolId: currentBlock.id,
                preview: parseAttempt.value
              }
            };
          }
        }
        // Deltas for thinking blocks are not handled in this excerpt
        break;
      }
      case "content_block_stop": {
        const completedBlock = accumulatedAssistantMessage.message.content[streamEvent.index];
        if (completedBlock.type === "tool_use") {
          try {
            const parsedInput = JSON.parse(currentToolInputJsonBuffer);
            completedBlock.input = parsedInput;
            currentToolUsesFromLlm.push({
              type: "tool_use",
              id: completedBlock.id,
              name: completedBlock.name,
              input: parsedInput
            });
          } catch (e) {
            // Handle malformed JSON from LLM: keep the raw string for diagnosis
            // and do not queue this tool use for execution
            completedBlock.input = {
              error: "Invalid JSON input from LLM",
              raw_json_string: currentToolInputJsonBuffer,
              parse_error: e instanceof Error ? e.message : String(e)
            };
          }
          currentToolInputJsonBuffer = "";
        }
        yield {
          type: "ui_content_block_complete",
          data: { block: completedBlock, blockIndex: streamEvent.index }
        };
        break;
      }
      case "message_stop": {
        // LLM generation complete
        const finalAssistantMessage = finalizeCliMessage(
          accumulatedAssistantMessage,
          loopState.turnId,
          loopState.turnCounter
        );
        yield finalAssistantMessage;
        // Move to Phase 5 or 6...
        break;
      }
    }
  }
}
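The tool-preview branch above depends on a helper that reports whether the buffered JSON is complete yet. The source does not show that helper, so the version below is a hypothetical stand-in with the same name and the same `complete`/`value` result shape: it simply attempts a full parse and treats any failure as "not complete yet".

```typescript
// Result shape assumed from how the loop reads parseAttempt.complete and parseAttempt.value.
type PartialParse = { complete: true; value: unknown } | { complete: false };

// Hypothetical stand-in: succeeds only once the buffer holds a whole JSON document.
function tryParsePartialJson(buffer: string): PartialParse {
  try {
    return { complete: true, value: JSON.parse(buffer) };
  } catch {
    return { complete: false };
  }
}

// Usage: feed chunks as they would arrive in input_json_delta events and
// record after which chunk the buffer first parses.
function firstCompleteChunk(chunks: string[]): number {
  let buffer = "";
  for (let i = 0; i < chunks.length; i++) {
    buffer += chunks[i];
    if (tryParsePartialJson(buffer).complete) return i;
  }
  return -1; // Never became valid JSON
}

const completedAt = firstCompleteChunk(['{"path":', '"src/', 'a.ts"}']);
```

A parse-or-nothing check like this means a preview appears only when the input is whole. A more elaborate helper could repair unterminated strings and brackets to preview earlier, at the cost of showing values that may still change.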
Stream Processing Performance: