🚧 Upcoming Feature

This design document describes a feature currently in planning. CLI Local LLM Processing will enable checks without sending diffs to the Threadline web application, with optional syncing of results for analytics and collaboration.

CLI Local LLM Processing

Summary

The CLI calls the LLM directly, so code diffs no longer have to be sent to Threadline. You can still optionally send diffs and check results to Threadline to keep its convenience and analysis capabilities. When syncing, the full request (diffs + threadlines + results) is sent to the web app, which stores everything but skips LLM processing since it's already done.

Problem It Solves

Local processing gives you greater flexibility over which LLM provider and models to use, makes you directly responsible for LLM costs, and lets you opt out of sending code diffs to Threadline's servers.

How It Works

Current Flow

CLI (check.ts)
  ↓
Collects: git context, diffs, threadlines
  ↓
POST /api/threadline-check
  ↓
Web App (route.ts)
  ↓
processThreadlines() → calls OpenAI API
  ↓
storeCheck() → saves to database
  ↓
Return results → CLI
  ↓
displayResults()

Currently, the CLI sends all threadline data, diffs, and context files to the web app. The web app processes everything using OpenAI, stores results, and returns them to the CLI.

New Flow

CLI (check.ts)
  ↓
Collects: git context, diffs, threadlines
  ↓
processThreadlines() → calls OpenAI API locally
  ↓
displayResults()
  ↓
[STOP HERE - Local only]
  OR
  ↓
[Optional] POST /api/threadline-check-results
  ↓
Send: diffs + threadlines + results
  ↓
Web App (new route.ts)
  ↓
storeCheck() → saves to database
(Skips LLM processing - already done)

With local processing, the CLI handles all LLM calls directly. Results are displayed immediately. The process can stop here for local-only usage. Optionally, the full request (diffs + threadlines + results) can be synced to the web app for storage and analytics. The web app stores everything but skips LLM processing since it's already complete.

Detailed Analysis of Current Implementation

1. Request Object Structure

The CLI sends a ReviewRequest object via POST /api/threadline-check:

ReviewRequest {
  threadlines: Array<{
    id: string;              // Threadline identifier
    version: string;         // Version string from file
    patterns: string[];      // File patterns (e.g., ["**/*.ts"])
    content: string;         // Threadline guidelines text
    filePath: string;        // Path to threadline file
    contextFiles?: string[]; // Optional: paths to context files
    contextContent?: {       // Optional: full content of context files
      [filePath: string]: string;
    };
  }>;
  diff: string;              // Full git diff (unified format)
  files: string[];           // List of changed file paths
  apiKey: string;            // Threadline API key (for auth)
  account: string;           // Account identifier (email)
  repoName?: string;         // Git remote URL
  branchName?: string;       // Branch name
  commitSha?: string;        // Commit SHA
  commitMessage?: string;    // Commit message
  commitAuthorName?: string; // Author name
  commitAuthorEmail?: string;// Author email
  prTitle?: string;          // PR/MR title
  environment?: string;      // 'github' | 'gitlab' | 'vercel' | 'local'
  cliVersion?: string;       // CLI version
  reviewContext: 'local' | 'commit' | 'pr' | 'file' | 'folder' | 'files';
}

Key Points:

  • CLI reads threadline files and context files from disk, includes full content
  • CLI generates git diff using git commands (varies by context: commit, PR, file, etc.)
  • All metadata (repo, branch, commit, author) is collected by CLI before sending
  • Full diff is sent (can be large with -U200 context lines)
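As a rough illustration of the diff-generation step, here is a hypothetical mapping from review context to `git diff` arguments. The specific refs (`HEAD~1`, `origin/main`) are assumptions for illustration only; the one detail taken from this document is the `-U200` context width.

```typescript
// Hypothetical sketch: assembling `git diff` arguments per review context.
// The refs used per context are assumptions, not the actual implementation.
type ReviewContext = "local" | "commit" | "pr" | "file" | "folder" | "files";

function buildDiffArgs(ctx: ReviewContext, paths: string[] = []): string[] {
  const base = ["diff", "-U200"]; // wide context, as noted above
  switch (ctx) {
    case "commit":
      return [...base, "HEAD~1..HEAD"]; // last commit (assumption)
    case "pr":
      return [...base, "origin/main...HEAD"]; // merge-base vs default branch (assumption)
    case "file":
    case "files":
    case "folder":
      return [...base, "--", ...paths]; // limit diff to the given paths
    default:
      return base; // "local": working tree changes (assumption)
  }
}
```

The CLI would then spawn `git` with these arguments and pass the captured stdout as `diff` in the ReviewRequest.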

2. Endpoint Processing Flow

POST /api/threadline-check (route.ts)
│
ā”œā”€ Step 1: Parse & Validate Request
│  ā”œā”€ Validate threadlines array (required, non-empty)
│  ā”œā”€ Validate filePath on each threadline (required)
│  ā”œā”€ Validate diff (must be string, empty allowed)
│  ā”œā”€ Validate reviewContext (must be valid enum)
│  └─ Validate apiKey & account (required)
│
ā”œā”€ Step 2: Calculate Statistics (for audit)
│  ā”œā”€ countLinesInDiff(diff) → {added, removed, total}
│  ā”œā”€ calculateContextStats(threadlines) → {fileCount, totalLines, files}
│  └─ Log audit statistics
│
ā”œā”€ Step 3: Early Return for Zero Diffs
│  └─ If diff.trim() === '':
│     └─ Return all threadlines as 'not_relevant' (no LLM calls)
│
ā”œā”€ Step 4: Authentication
│  ā”œā”€ Look up account in database by identifier
│  ā”œā”€ Compare apiKey (plaintext comparison)
│  ā”œā”€ Get accountId and userId
│  └─ Fall back to env vars (backward compatibility)
│
ā”œā”€ Step 5: Get OpenAI API Key
│  └─ Read OPENAI_API_KEY from server environment
│
ā”œā”€ Step 6: Process Threadlines (LLM Calls)
│  └─ processThreadlines({...request, apiKey: openaiApiKey})
│     │
│     ā”œā”€ For each threadline (parallel):
│     │  └─ processThreadline(threadline, diff, files, apiKey)
│     │     │
│     │     ā”œā”€ Filter files matching patterns
│     │     │  └─ If no matches → return 'not_relevant'
│     │     │
│     │     ā”œā”€ Filter diff to relevant files
│     │     │  └─ filterDiffByFiles(diff, relevantFiles)
│     │     │
│     │     ā”œā”€ Extract files from filtered diff
│     │     │  └─ extractFilesFromDiff(filteredDiff)
│     │     │
│     │     ā”œā”€ Trim diff for LLM (reduce tokens)
│     │     │  └─ createSlimDiff(filteredDiff, contextLines)
│     │     │     (Default: 10 context lines, configurable)
│     │     │
│     │     ā”œā”€ Build prompt
│     │     │  └─ buildPrompt(threadline, trimmedDiff, filesInDiff)
│     │     │     ā”œā”€ Includes threadline content
│     │     │     ā”œā”€ Includes context files content
│     │     │     ā”œā”€ Includes trimmed diff
│     │     │     └─ Includes changed files list
│     │     │
│     │     ā”œā”€ Call OpenAI API
│     │     │  └─ openai.chat.completions.create({
│     │     │       model: 'gpt-5.2',
│     │     │       messages: [system, user],
│     │     │       response_format: {type: 'json_object'},
│     │     │       temperature: 0.1
│     │     │     })
│     │     │
│     │     └─ Return ProcessThreadlineResult
│     │        ā”œā”€ status: 'compliant' | 'attention' | 'not_relevant' | 'error'
│     │        ā”œā”€ reasoning: string
│     │        ā”œā”€ fileReferences: string[]
│     │        ā”œā”€ relevantFiles: string[]
│     │        ā”œā”€ filteredDiff: string (full filtered diff, not trimmed)
│     │        ā”œā”€ filesInFilteredDiff: string[]
│     │        └─ llmCallMetrics: {...}
│     │
│     └─ Return ProcessThreadlinesResponse
│        ā”œā”€ results: ProcessThreadlineResult[]
│        └─ metadata: {totalThreadlines, completed, timedOut, errors, llmModel}
│
ā”œā”€ Step 7: Store Check in Database
│  └─ storeCheck({request, result, diffStats, contextStats, ...})
│     │
│     ā”œā”€ Begin Transaction
│     │
│     ā”œā”€ Insert check record
│     │  └─ INSERT INTO checks (repo_name, branch_name, commit_sha, ...)
│     │
│     ā”œā”€ Insert diff content
│     │  └─ INSERT INTO check_diffs (check_id, diff_content, diff_format)
│     │
│     ā”œā”€ For each threadline:
│     │  ā”œā”€ Generate hashes
│     │  │  ā”œā”€ versionHash = generateVersionHash({
│     │  │  │     threadlineId, filePath, patterns, content,
│     │  │  │     version, repoName, accountId
│     │  │  │   })
│     │  │  └─ identityHash = generateIdentityHash({
│     │  │        threadlineId, filePath, repoName, accountId
│     │  │      })
│     │  │
│     │  ā”œā”€ Check if version_hash exists
│     │  │  ā”œā”€ If yes → reuse threadline_definition_id
│     │  │  └─ If no → check identity_hash for predecessor
│     │  │     └─ Insert new threadline_definition
│     │  │
│     │  ā”œā”€ Process context files
│     │  │  ā”œā”€ For each context file:
│     │  │  │  ā”œā”€ contextHash = generateContextHash({
│     │  │  │  │     accountId, repoName, filePath, content
│     │  │  │  │   })
│     │  │  │  └─ Check if content_hash exists
│     │  │  │     └─ Reuse or create context_file_snapshot
│     │  │  └─ Collect snapshot IDs
│     │  │
│     │  └─ Insert check_threadlines
│     │     └─ INSERT INTO check_threadlines (
│     │          threadline_definition_id,
│     │          context_snapshot_ids,
│     │          relevant_files,
│     │          filtered_diff,
│     │          files_in_filtered_diff
│     │        )
│     │
│     ā”œā”€ Insert check_results
│     │  └─ INSERT INTO check_results (
│     │       status, reasoning, file_references
│     │     )
│     │
│     └─ Commit Transaction
│
ā”œā”€ Step 8: Log Metrics (non-blocking)
│  ā”œā”€ Log LLM call metrics for each threadline
│  └─ Log check summary metrics
│
└─ Step 9: Return Response
   └─ NextResponse.json(result)
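To make Step 6's diff plumbing concrete, here is a simplified sketch of what filterDiffByFiles() and extractFilesFromDiff() do. This illustrates the technique only; it is not the actual implementation and handles only the common `diff --git` header shape.

```typescript
// Simplified sketch (not the actual implementation) of the two diff helpers
// used in Step 6: extract file paths from a unified diff, and keep only the
// per-file chunks matching a threadline's relevant files.

function extractFilesFromDiff(diff: string): string[] {
  const files: string[] = [];
  for (const line of diff.split("\n")) {
    const m = line.match(/^diff --git a\/(\S+) b\/(\S+)$/);
    if (m) files.push(m[2]); // use the post-change path
  }
  return files;
}

function filterDiffByFiles(diff: string, relevantFiles: string[]): string {
  const keep = new Set(relevantFiles);
  // Split into one chunk per "diff --git" header (zero-width lookahead keeps
  // each header at the start of its chunk).
  return diff
    .split(/^(?=diff --git )/m)
    .filter((chunk) => {
      const m = chunk.match(/^diff --git a\/\S+ b\/(\S+)/);
      return m !== null && keep.has(m[1]);
    })
    .join("");
}
```

createSlimDiff() would then walk the filtered chunks and drop context lines beyond its configured window before the prompt is built.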

3. What Needs to Move vs. Stay

āœ… Move to CLI (Required for Local Processing)

  • processThreadlines() - Main orchestration with parallel execution
  • processThreadline() - Single threadline processing with OpenAI API
  • buildPrompt() - Prompt construction
  • filterDiffByFiles() - Filter diff by threadline patterns
  • extractFilesFromDiff() - Extract file list from diff
  • createSlimDiff() - Trim diff for LLM (token reduction)
  • OpenAI SDK integration
  • Timeout handling (40s per threadline)
  • Parallel processing logic
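The timeout and parallel-processing items above could be ported along these lines; a sketch, assuming a 40s budget per threadline and an 'error' result as the fallback (the result shape is reduced for illustration):

```typescript
// Sketch of per-threadline timeout plus parallel execution to port to the CLI.
interface ThreadlineResult {
  status: "compliant" | "attention" | "not_relevant" | "error";
  reasoning: string;
}

// Resolve with the work's value, or with `fallback` once `ms` elapses.
// Rejections also resolve to the fallback so one failure can't sink the batch.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      () => { clearTimeout(timer); resolve(fallback); }
    );
  });
}

// Run all threadlines in parallel, each with its own timeout budget.
async function processAll(
  tasks: Array<() => Promise<ThreadlineResult>>,
  timeoutMs = 40_000
): Promise<ThreadlineResult[]> {
  const fallback: ThreadlineResult = { status: "error", reasoning: "timed out or failed" };
  return Promise.all(tasks.map((task) => withTimeout(task(), timeoutMs, fallback)));
}
```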

āŒ Stay in Web App (Storage & Analytics Only)

  • storeCheck() - Database storage logic
  • Hash calculations (generateVersionHash, generateIdentityHash, generateContextHash) - Only done on server for analysis purposes, after LLM results are received
  • Authentication logic (apiKey validation, account lookup)
  • Metrics logging (LLM call metrics, check summary metrics)
  • Audit statistics calculation (for logging, not needed for processing)

4. Data Flow Summary

CLI → Web App (Current):
  ā”œā”€ Sends: threadlines[], diff, files[], metadata
  ā”œā”€ Web App: Processes with LLM
  ā”œā”€ Web App: Stores in DB (with hashes)
  └─ Returns: results[], metadata

CLI → Web App (New - Sync):
  ā”œā”€ Sends: threadlines[], diff, files[], request metadata, results[], processing metadata
  ā”œā”€ Web App: Skips LLM (already processed)
  ā”œā”€ Web App: Stores in DB (with hashes)
  └─ Returns: success/error

Key Insight:
  • Web app still needs full diff for UI diff viewer
  • Web app still needs results for storage
  • Web app still generates hashes for deduplication
  • Only LLM processing moves to CLI

Implementation Plan

Prerequisite: Storage function storeCheckAndMetrics() exists at app/lib/audit/store-check-and-metrics.ts. Both paths (web app LLM, CLI sync) use this function with the same ProcessThreadlinesResponse interface.

Next: Add Sync Endpoint

Create POST /api/threadline-check-results — same as current endpoint minus LLM processing.

// CLI sends: ReviewRequest + results + metadata
{
  ...reviewRequest,                                         // Same as today
  results: ProcessThreadlineResult[],                       // Already processed
  metadata: { totalThreadlines, completed, timedOut, errors, llmModel }
}

// Endpoint: validate → auth → storeCheckAndMetrics() → return success
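The endpoint's validation step could mirror the existing endpoint's checks plus the new results fields. A minimal sketch, where the function name and error strings are assumptions:

```typescript
// Minimal validation sketch for the sync payload (names and messages assumed).
// Returns an error string, or null when the payload looks well-formed.
function validateSyncRequest(body: any): string | null {
  if (!Array.isArray(body?.threadlines) || body.threadlines.length === 0)
    return "threadlines must be a non-empty array";
  if (body.threadlines.some((t: any) => typeof t.filePath !== "string"))
    return "each threadline requires a filePath";
  if (typeof body.diff !== "string") return "diff must be a string (empty allowed)";
  if (!body.apiKey || !body.account) return "apiKey and account are required";
  // New for this endpoint: results and processing metadata must already be present.
  if (!Array.isArray(body.results)) return "results must be an array";
  if (typeof body.metadata?.llmModel !== "string") return "metadata.llmModel is required";
  return null;
}
```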

Then: Port Processing to CLI

Port processThreadlines(), buildPrompt(), diff filtering to CLI. Returns same ProcessThreadlinesResponse.

// CLI: process locally, optionally sync
const result = await processThreadlines({...});
displayResults(result);
if (shouldSync) await client.syncResults({...reviewRequest, ...result});

Changes

CLI Changes

New Files:

  • src/api/openai.ts - OpenAI client wrapper
  • src/processors/expert.ts - Port processThreadlines from web app
  • src/processors/single-expert.ts - Port processThreadline from web app
  • src/llm/prompt-builder.ts - Port buildPrompt from web app

Modified Files:

  • src/commands/check.ts - Replace ReviewAPIClient.review() call with local processing
  • src/api/client.ts - Add new syncResults() method for optional web app sync

Key Implementation Details:

  • CLI currently calls ReviewAPIClient.review() at line 283 in check.ts, sending full request including diffs
  • Replace with local processThreadlines() that calls OpenAI directly
  • Port timeout logic (40s per threadline) and parallel processing from web app
  • Port prompt building logic - includes threadline content, context files, diff, and changed files
  • After local processing, optionally call new sync endpoint with full request (diffs + threadlines + results) - web app stores everything but skips LLM processing
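The sync payload itself could be assembled by merging the original request with the local processing output. A sketch, with shapes reduced for illustration:

```typescript
// Sketch: building the sync payload from the original request plus local
// results. Shapes are reduced for illustration.
interface ProcessThreadlinesResponse {
  results: unknown[];
  metadata: {
    totalThreadlines: number;
    completed: number;
    timedOut: number;
    errors: number;
    llmModel: string;
  };
}

function buildSyncPayload<R extends object>(
  reviewRequest: R,
  processed: ProcessThreadlinesResponse
): R & ProcessThreadlinesResponse {
  // Spread the request first so results/metadata are not clobbered by it.
  return { ...reviewRequest, results: processed.results, metadata: processed.metadata };
}
```

syncResults() would then POST this object to /api/threadline-check-results.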

Web App Changes

New Endpoint:

POST /api/threadline-check-results - Accepts the full request (diffs + threadlines + results). Stores it in the database using the existing storeCheckAndMetrics() function, but skips LLM processing since results are already provided.

Request Format:

{
  // Full request (same as current /api/threadline-check)
  threadlines: [...];                 // Threadline definitions
  diff: string;                       // Full git diff
  files: string[];                    // Changed files
  results: ProcessThreadlineResult[]; // Already processed results
  metadata: {
    totalThreadlines: number;
    completed: number;
    timedOut: number;
    errors: number;
    llmModel: string;
  };
  // Same metadata as current: repoName, branchName, commitSha, etc.
  apiKey: string;
  account: string;
}

The web app needs the full diff for the UI diff viewer, analytics, and fix detection. It just skips calling the LLM since results are already provided.

Existing Endpoint:

Keep POST /api/threadline-check for backward compatibility. Can be deprecated later.

Configuration

Environment Variables:

  • OPENAI_API_KEY - Required for local processing
  • THREADLINE_SYNC - true | false - Control whether to sync results to web app

CLI Flags:

  • --no-sync - Skip syncing results to web app
  • --sync - Explicitly enable syncing (default: enabled for backward compatibility)
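One way to resolve the effective setting, assuming the flags take precedence over THREADLINE_SYNC and sync defaults to on:

```typescript
// Sketch: resolving whether to sync results. Precedence (assumed):
// --no-sync / --sync flags, then THREADLINE_SYNC, then the default (enabled).
function resolveSync(
  flags: { sync?: boolean; noSync?: boolean },
  env: Record<string, string | undefined>
): boolean {
  if (flags.noSync) return false;
  if (flags.sync) return true;
  if (env.THREADLINE_SYNC !== undefined) return env.THREADLINE_SYNC === "true";
  return true; // default: sync enabled for backward compatibility
}
```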

Code to Port

From Web App to CLI:

  • app/lib/processors/expert.ts → src/processors/expert.ts - Main processing logic with parallel execution and timeout handling
  • app/lib/processors/single-expert.ts → src/processors/single-expert.ts - Single threadline processing with OpenAI API calls
  • app/lib/llm/prompt-builder.ts → src/llm/prompt-builder.ts - Prompt construction logic
  • app/lib/utils/diff-filter.ts → src/utils/diff-filter.ts - Filter diffs by threadline patterns

Dependencies to Add:

  • openai - OpenAI SDK for Node.js