Best Boilerplates for Building Developer Tools 2026
TL;DR
There's no single "developer tools boilerplate" — you pick the right starter for your tool type. CLI tools: use create-typescript-app or oclif. VS Code extensions: the official Yeoman generator (generator-code). npm packages: tsup-starter or a pkgroll setup. Language servers: vscode-languageserver-node. Each category has established patterns in 2026, with TypeScript as the non-negotiable baseline.
Key Takeaways
- CLI tools: oclif (Heroku-style, complex CLIs) or commander/meow (lightweight)
- VS Code extensions: official yo code generator, then customize
- npm packages: tsup + changesets + GitHub Actions release workflow
- Language servers: LSP protocol via vscode-languageserver-node or @volar/language-server
- Browser DevTools: Chrome DevTools extension template
- TypeScript: Non-negotiable baseline for all categories in 2026
CLI Tools: Two Approaches
Lightweight CLI (commander/meow pattern)
npm create typescript-app@latest my-cli
# Or from scratch:
mkdir my-cli && cd my-cli
npm init -y
npm install typescript commander chalk ora
npm install -D @types/node tsup
// src/index.ts — production CLI pattern:
#!/usr/bin/env node
import { program } from 'commander';
import chalk from 'chalk';
import ora from 'ora';
import { readFile } from 'node:fs/promises';
import { resolve } from 'node:path';

const packageJson = JSON.parse(
  await readFile(new URL('../package.json', import.meta.url), 'utf-8')
);

program
  .name('my-cli')
  .description('My developer tool CLI')
  .version(packageJson.version);

program
  .command('analyze <path>')
  .description('Analyze files in a directory')
  .option('-r, --recursive', 'Analyze recursively', false)
  .option('--format <type>', 'Output format (json|table)', 'table')
  .action(async (path, options) => {
    const spinner = ora('Analyzing...').start();
    try {
      // analyze() and printTable() are your tool's own logic
      const result = await analyze(resolve(path), options);
      spinner.succeed(chalk.green('Analysis complete!'));
      if (options.format === 'json') {
        console.log(JSON.stringify(result, null, 2));
      } else {
        printTable(result);
      }
    } catch (err) {
      // `err` is `unknown` in a TypeScript catch clause; narrow before reading .message
      const message = err instanceof Error ? err.message : String(err);
      spinner.fail(chalk.red(`Failed: ${message}`));
      process.exit(1);
    }
  });

program.parse();
// package.json for CLI:
{
  "name": "my-cli",
  "version": "1.0.0",
  "bin": {
    "my-cli": "./dist/index.js"
  },
  "type": "module",
  "main": "./dist/index.js",
  "scripts": {
    "build": "tsup src/index.ts --format esm --target node18",
    "dev": "tsx src/index.ts",
    "test": "vitest"
  },
  "publishConfig": {
    "access": "public"
  }
}
oclif: Full-Featured CLI Framework
npx oclif generate my-cli
cd my-cli
// src/commands/analyze.ts — oclif command:
import { Command, Flags, Args } from '@oclif/core';
import chalk from 'chalk';

export default class Analyze extends Command {
  static override description = 'Analyze files in a directory';

  static override args = {
    path: Args.string({ description: 'Path to analyze', required: true }),
  };

  static override flags = {
    recursive: Flags.boolean({ char: 'r', description: 'Analyze recursively' }),
    format: Flags.string({ char: 'f', default: 'table', options: ['json', 'table'] }),
    output: Flags.file({ char: 'o', description: 'Write output to file' }),
  };

  async run(): Promise<void> {
    const { args, flags } = await this.parse(Analyze);
    this.log(chalk.blue(`Analyzing ${args.path}...`));
    const result = await analyze(args.path, flags);
    if (flags.format === 'json') {
      this.log(JSON.stringify(result, null, 2));
    } else {
      // Command has no built-in table helper in recent @oclif/core versions;
      // console.table keeps the example dependency-free
      console.table(
        result.files.map((f) => ({ File: f.name, Size: f.size, Issues: f.issues }))
      );
    }
  }
}
oclif gives you:
→ Auto-generated help (--help on every command)
→ Plugin system (users can extend your CLI)
→ Auto-update mechanism
→ Multi-command CLIs with subcommands
→ Tab completion
Used by: Heroku CLI, Salesforce CLI, Twilio CLI
VS Code Extension Boilerplate
# Official generator:
npm install -g yo generator-code
yo code
# Choose:
# - Extension (TypeScript/JavaScript)
# - Color Theme
# - Language Support
# - Snippet pack
Generated structure:
my-extension/
  src/
    extension.ts     ← Main entry (activate/deactivate)
  test/
    suite/
  package.json       ← Contributes section = extension manifest
  .vscode/
    launch.json      ← Debug configuration
  tsconfig.json
// src/extension.ts — VS Code extension pattern:
import * as vscode from 'vscode';

let statusBarItem: vscode.StatusBarItem;

export function activate(context: vscode.ExtensionContext) {
  // Register a command:
  const analyzeCommand = vscode.commands.registerCommand(
    'myExtension.analyze',
    async () => {
      const editor = vscode.window.activeTextEditor;
      if (!editor) {
        vscode.window.showWarningMessage('No active editor');
        return;
      }
      const document = editor.document;
      const text = document.getText();
      await vscode.window.withProgress(
        {
          location: vscode.ProgressLocation.Notification,
          title: 'Analyzing...',
          cancellable: false,
        },
        async (progress) => {
          progress.report({ increment: 0 });
          const result = await analyze(text);
          progress.report({ increment: 100 });
          vscode.window.showInformationMessage(
            `Analysis complete: ${result.issues} issues found`
          );
        }
      );
    }
  );

  // Status bar item:
  statusBarItem = vscode.window.createStatusBarItem(
    vscode.StatusBarAlignment.Right,
    100
  );
  statusBarItem.command = 'myExtension.analyze';
  statusBarItem.text = '$(search) Analyze';
  statusBarItem.show();

  // Register a code action provider:
  const codeActionProvider = vscode.languages.registerCodeActionsProvider(
    ['typescript', 'javascript'],
    new MyCodeActionProvider(),
    { providedCodeActionKinds: [vscode.CodeActionKind.QuickFix] }
  );

  context.subscriptions.push(analyzeCommand, statusBarItem, codeActionProvider);
}

export function deactivate() {
  statusBarItem?.dispose();
}
class MyCodeActionProvider implements vscode.CodeActionProvider {
  provideCodeActions(
    document: vscode.TextDocument,
    range: vscode.Range,
    context: vscode.CodeActionContext
  ): vscode.CodeAction[] {
    return context.diagnostics
      .filter(d => d.code === 'MY_RULE')
      .map(diagnostic => {
        const fix = new vscode.CodeAction('Fix: auto-fix', vscode.CodeActionKind.QuickFix);
        fix.edit = new vscode.WorkspaceEdit();
        fix.edit.replace(document.uri, diagnostic.range, 'corrected-value');
        fix.diagnostics = [diagnostic];
        fix.isPreferred = true;
        return fix;
      });
  }
}
// package.json contributes section:
{
  "contributes": {
    "commands": [
      {
        "command": "myExtension.analyze",
        "title": "Analyze Current File",
        "category": "My Extension"
      }
    ],
    "menus": {
      "editor/context": [
        {
          "command": "myExtension.analyze",
          "when": "editorHasSelection",
          "group": "myExtension"
        }
      ]
    },
    "configuration": {
      "title": "My Extension",
      "properties": {
        "myExtension.autoAnalyze": {
          "type": "boolean",
          "default": false,
          "description": "Auto-analyze on save"
        }
      }
    },
    "keybindings": [
      {
        "command": "myExtension.analyze",
        "key": "ctrl+shift+a",
        "mac": "cmd+shift+a",
        "when": "editorTextFocus"
      }
    ]
  }
}
npm Package: Release Automation
# Modern npm package starter:
npm create typescript-app@latest my-package
# Or manual setup:
mkdir my-package && cd my-package
npm init -y
npm install -D typescript tsup vitest @changesets/cli
// package.json — dual CJS/ESM exports:
{
  "name": "my-package",
  "version": "1.0.0",
  "type": "module",
  "main": "./dist/index.cjs",
  "module": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js",
      "require": "./dist/index.cjs"
    }
  },
  "files": ["dist"],
  "scripts": {
    "build": "tsup",
    "test": "vitest run",
    "release": "changeset publish"
  }
}
// tsup.config.ts:
import { defineConfig } from 'tsup';

export default defineConfig({
  entry: ['src/index.ts'],
  format: ['cjs', 'esm'],
  dts: true,
  splitting: false,
  sourcemap: true,
  clean: true,
  treeshake: true,
});
# .github/workflows/release.yml — automated releases:
name: Release
on:
  push:
    branches: [main]
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
      - uses: actions/setup-node@v4
        with:
          node-version: 24
          registry-url: 'https://registry.npmjs.org'
      - run: pnpm install --frozen-lockfile
      - run: pnpm test
      - run: pnpm build
      - name: Create release PR or publish
        uses: changesets/action@v1
        with:
          publish: pnpm release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
Decision Guide
Building a CLI tool:
→ Simple single-command: commander + tsup
→ Multi-command with plugins: oclif
→ Maximum speed (Bun-based): oclif on Bun or commander + Bun
Building a VS Code extension:
→ Start with yo code (official generator)
→ Add a language server for diagnostics, completions, and hover info (syntax highlighting itself is handled by TextMate grammars, not LSP)
→ Use vscode-test for integration testing
Building an npm package:
→ tsup + changesets is the 2026 standard
→ Dual CJS/ESM exports via exports field
→ Automated releases via changesets/action GitHub Action
Building a language server:
→ vscode-languageserver-node for VS Code integration
→ Volar (@volar/language-server) for Vue-ecosystem tools
→ Both implement the Language Server Protocol (LSP)
Find developer tool boilerplates at StarterPick.
Decision Framework: Shipping in Production
Most teams fail to ship a new capability because they optimize for perfect architecture before they validate user behavior. The practical path is to define one narrow success metric, ship the smallest production-safe version, and then expand scope only after observing real usage. For developer-tool teams, this means designing around operational clarity first: clear ownership, clear rollback paths, and clear instrumentation.
Start by deciding what must be true on day one and what can wait until day thirty. Day-one requirements should focus on reliability, observability, and user comprehension. If users cannot understand what happened, trust erodes quickly. If your team cannot debug problems in minutes, support cost explodes. This is why logs, idempotency, and explicit state transitions matter more than visual polish in the first release.
A useful rubric is to split requirements into three buckets: required for correctness, required for supportability, and required for growth. Correctness is the narrow technical contract the feature must satisfy. Supportability includes alerts, admin visibility, and failure messaging. Growth includes advanced automation, richer UX, and integrations that extend the core feature. Teams that mix these buckets early usually ship late and maintain fragile systems.
For architecture choices, prefer the smallest set of moving parts your current team can operate confidently. A "simpler" stack with strong team familiarity will beat a theoretically better stack that requires weeks of new operational learning. In 2026, speed-to-feedback is still a major competitive advantage for early-stage products, so choose patterns that reduce coordination overhead between product, engineering, and support.
30-Day Rollout Plan
Week 1 should focus on contracts and guardrails. Define your domain events, persistence model, permission boundaries, and failure semantics. Write down what happens when dependencies are slow, unavailable, or return unexpected payloads. Add request IDs and structured logs before launch so you can trace flows across services. Build basic dashboards for throughput, error rate, and latency.
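The request-ID and structured-log plumbing can be small. This is a sketch that assumes a JSON-lines log format; the field names and logger shape are illustrative, not a specific library's API:

```typescript
import { randomUUID } from 'node:crypto';

// Every log entry carries the same requestId so one flow can be
// traced end to end across services and queue hops.
type LogEntry = {
  ts: string;
  level: 'info' | 'warn' | 'error';
  requestId: string;
  msg: string;
  [key: string]: unknown;
};

function createLogger(requestId: string = randomUUID()) {
  const log = (
    level: LogEntry['level'],
    msg: string,
    fields: Record<string, unknown> = {}
  ): LogEntry => {
    const entry: LogEntry = { ts: new Date().toISOString(), level, requestId, msg, ...fields };
    console.log(JSON.stringify(entry)); // one JSON object per line: easy to index and grep
    return entry;
  };
  return {
    info: (msg: string, fields?: Record<string, unknown>) => log('info', msg, fields),
    error: (msg: string, fields?: Record<string, unknown>) => log('error', msg, fields),
    requestId,
  };
}

const logger = createLogger();
logger.info('analysis.started', { path: '/tmp/project' });
logger.error('analysis.failed', { reason: 'timeout' });
```

Passing the same `requestId` into downstream calls (or reading it from an incoming header) is what makes the dashboards in this step useful.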
Week 2 should focus on end-to-end completion paths. Implement the happy path with one representative user journey, then cover at least three common failure scenarios. Add deterministic retries where appropriate and idempotency keys wherever duplicate delivery or duplicate clicks are likely. Keep feature flags on every risky branch so you can disable behavior without emergency deploys.
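Deterministic retry with exponential backoff is only a few lines. This sketch assumes the wrapped operation is safe to repeat; pair it with idempotency keys on the write side:

```typescript
// Retry an async operation with deterministic exponential backoff.
// attempts and baseMs are tunable; delays are baseMs, 2*baseMs, 4*baseMs, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseMs = 100 }: { attempts?: number; baseMs?: number } = {}
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      const delay = baseMs * 2 ** i; // deterministic: easy to reason about in tests and logs
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw lastErr; // surface the final failure to the caller
}

// usage (hypothetical endpoint): retried up to 3 times before failing
// const data = await withRetry(() => fetch('https://example.com/api').then(r => r.json()));
```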
Week 3 should focus on user-facing trust. Improve states and messages for pending, success, and failure outcomes. Add meaningful timeline entries or activity logs in your product where users can verify what happened. Build internal admin views for support teammates so incident response does not depend on engineering digging through raw logs.
Week 4 should focus on hardening and scale. Validate high-volume scenarios using replay traffic or synthetic load. Tune queue concurrency, query indexes, and backoff strategies from measured bottlenecks rather than assumptions. Define an incident runbook with clear ownership and escalation paths. If a capability has legal or compliance implications, add explicit evidence trails and retention policies.
Common Failure Modes
A frequent failure mode is hidden coupling between UI actions and backend execution. When UI state assumes immediate backend completion, users see stale or contradictory states during retries and delayed processing. Prevent this by modeling asynchronous states explicitly and rendering those states directly in the interface.
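Modeling asynchronous states explicitly can be as simple as a discriminated union, so the UI can only render states that actually exist; the state and field names below are illustrative:

```typescript
// Explicit async states instead of a single boolean "loading" flag.
type AnalysisState =
  | { status: 'idle' }
  | { status: 'pending'; startedAt: number }
  | { status: 'succeeded'; issues: number }
  | { status: 'failed'; error: string };

// Exhaustive switch: adding a new state forces every renderer to handle it.
function label(state: AnalysisState): string {
  switch (state.status) {
    case 'idle': return 'Ready';
    case 'pending': return 'Analyzing...';
    case 'succeeded': return `Done: ${state.issues} issues`;
    case 'failed': return `Failed: ${state.error}`;
  }
}
```

Because the union is discriminated on `status`, "succeeded with no issue count" or "failed without an error" are unrepresentable, which is exactly the class of contradictory UI state the paragraph above warns about.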
Another failure mode is missing idempotency around write operations. Duplicate submits, webhook retries, and browser refreshes are common in production. If you do not enforce idempotency keys and uniqueness constraints, you will eventually create duplicate records, duplicate charges, or duplicate side effects. Fixing this after launch is painful because data cleanup requires case-by-case handling.
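A minimal sketch of idempotency-key deduplication. In production the store would be a database table with a UNIQUE constraint on the key column; here an in-memory Map stands in, and the record shape is hypothetical:

```typescript
// Replays with the same key return the original result instead of
// creating a duplicate record or duplicate side effect.
const processed = new Map<string, { id: number }>();
let nextId = 1;

function createOrder(idempotencyKey: string): { id: number; duplicate: boolean } {
  const existing = processed.get(idempotencyKey);
  if (existing) {
    return { id: existing.id, duplicate: true }; // safe replay
  }
  const record = { id: nextId++ };
  processed.set(idempotencyKey, record); // DB equivalent: INSERT guarded by UNIQUE(key)
  return { id: record.id, duplicate: false };
}
```

The client generates the key once per logical action (for example, per form submission), so refreshes and webhook redeliveries converge on one record.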
Teams also underestimate support workflows. A feature that technically works can still overwhelm support if there is no internal visibility into user-level events. Add searchable logs, event timelines, and actor metadata so support can answer "what happened" without escalating every ticket to engineering.
A final failure mode is over-broad initial scope. Shipping one reliable workflow is better than shipping five partial workflows with inconsistent behavior. Product quality is judged by reliability under edge conditions, not by checklist length. Keep the first release narrow, then expand from observed demand.
Integration Guidance for Starter Kits
Starter kits accelerate initial setup, but each kit encodes assumptions about auth, data access, background processing, and deployment. Before adding major features, map those assumptions so your implementation aligns with existing conventions. If the starter uses server actions and route handlers, keep your new logic in that model rather than introducing parallel patterns unless there is a clear benefit.
For data schema changes, prefer additive migrations with backward compatibility. Add nullable fields first, deploy readers that can handle old and new shapes, then backfill, and finally enforce constraints. This staged approach reduces downtime risk and avoids blocking deploys when data volume grows.
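The reader-compatibility stage of such a migration can be sketched like this; the `displayName` column is a hypothetical example of a newly added nullable field:

```typescript
// Rows written before the migration lack displayName; rows written
// during the backfill may have it set to null. The reader tolerates both.
type OldUserRow = { id: number; name: string };
type NewUserRow = { id: number; name: string; displayName: string | null };

function readDisplayName(row: OldUserRow | NewUserRow): string {
  // Prefer the new field once populated, fall back to the legacy field
  // until the backfill completes and the NOT NULL constraint is enforced.
  if ('displayName' in row && row.displayName != null) {
    return row.displayName;
  }
  return row.name;
}
```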
For API boundaries, keep internal contracts typed and versioned. Even if you run a monorepo, treat cross-module calls as contracts. This prevents accidental breaking changes and makes future extraction easier if you split services later. Add minimal contract tests at module boundaries where regressions would be expensive.
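A typed, versioned contract plus a small runtime guard can double as the contract test at the boundary; the `AnalyzeRequestV1` shape below is an illustrative assumption, not an API from the article's starters:

```typescript
// Versioned internal contract for a cross-module call.
interface AnalyzeRequestV1 {
  version: 1;
  path: string;
  recursive: boolean;
}

// Runtime type guard: the contract test asserts both acceptance of
// valid payloads and rejection of shape drift.
function isAnalyzeRequestV1(value: unknown): value is AnalyzeRequestV1 {
  const v = value as Partial<AnalyzeRequestV1> | null;
  return (
    v?.version === 1 &&
    typeof v.path === 'string' &&
    typeof v.recursive === 'boolean'
  );
}
```

Bumping `version` to 2 for a breaking change lets both shapes coexist during a rollout, which is what makes later service extraction cheap.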
For deployment, define environment parity rules. At minimum, local and staging should use the same auth modes, queue semantics, and feature flags. Divergent environments cause "works on staging" failures that are hard to diagnose and expensive to fix under pressure.
Internal Links and Next Reads
To keep implementation decisions consistent across the site taxonomy, use these references while planning follow-up work:
- Best SaaS starter kits ranked 2026
- SaaS boilerplate vs custom build
- Best Next.js SaaS boilerplates 2026
- Prisma vs Drizzle for boilerplates
- Authentication setup in Next.js boilerplates
Methodology
This article was expanded using a repeatable editorial workflow: identify thin sections, preserve existing technical examples, add production-focused implementation guidance, and include operational failure analysis so the content is useful beyond copy-paste snippets. Claims about platform behavior and integration patterns were aligned with primary documentation.

Full Story: Operating Model That Scales
The difference between a tutorial-grade implementation and a production-grade implementation is rarely a single library choice. The difference is operating discipline. Teams that ship durable systems define explicit contracts between product requirements and technical behavior, then enforce those contracts through tests, observability, and rollout controls. This section gives a practical operating model that you can apply immediately inside a starter-kit codebase.
Begin with an execution contract written in plain language. It should state what input is accepted, what output is guaranteed, what failure classes exist, and what user-visible behavior appears for each failure class. Keep the document short enough to read in one sitting. If the contract cannot be understood quickly by product and support, your implementation is too implicit and incident response will be slow.
Next, define the event model for your feature before adding UI polish. Every meaningful state transition should map to one domain event with a stable name. Stable events let you build analytics, alerting, and support timelines without rewriting instrumentation every sprint. They also make future integrations easier because external systems consume event contracts, not implementation details.
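A sketch of such an event model with stable, dot-namespaced event names; the names and payload fields themselves are illustrative:

```typescript
// Analytics, alerting, and support timelines key off `type`,
// never off implementation details.
type DomainEvent =
  | { type: 'analysis.requested'; path: string; at: number }
  | { type: 'analysis.completed'; issues: number; at: number }
  | { type: 'analysis.failed'; reason: string; at: number };

const events: DomainEvent[] = [];

function emit(event: DomainEvent): void {
  events.push(event); // in production: append to an event log or queue
}

// A support timeline is derived from the same events, ordered by time.
function timeline(): string[] {
  return [...events].sort((a, b) => a.at - b.at).map((e) => `${e.at}: ${e.type}`);
}
```

Because consumers depend only on the event contract, you can later swap the in-memory array for a queue or event table without rewriting instrumentation.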
Treat retries as a first-class behavior, not an edge case. In distributed systems, duplicate delivery and delayed execution are routine. Your system should behave correctly when the same command arrives twice, when dependent services return timeouts, and when users refresh or re-submit actions. Idempotency keys, uniqueness constraints, and deterministic retry backoff are the minimum baseline.
Design user feedback around certainty levels. There is a real distinction between "request accepted," "processing," and "completed." Avoid collapsing these states into one success message. Users trust systems that communicate uncertainty honestly and then resolve that uncertainty predictably. If work is async, show pending state and expected completion windows.
Build support tooling in parallel with feature development. The fastest way to reduce engineering interrupts is to give support agents a searchable event timeline with actor, timestamp, payload hash, and current processing state. Include links to related objects such as workspace, subscription, job run, and webhook delivery attempt. This visibility converts vague bug reports into actionable reproduction steps.
For security-sensitive workflows, enforce policy at service boundaries rather than within UI components. UI checks are helpful for user experience, but authorization must be evaluated server-side on each sensitive action. Log denials with enough context to diagnose misconfigured roles. Over time, these denial logs become useful signals for policy refinement.
When working from starter kits, avoid broad rewrites in the first month. Instead, adopt a strangle pattern: keep existing conventions, add capability behind feature flags, and move one slice at a time. This lowers regression risk and keeps diff size manageable for code review. Large cross-cutting rewrites often look elegant in architecture diagrams but carry high delivery risk under real deadlines.
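A feature-flag gate for one strangler-pattern slice might look like this; the flag name and both analyze functions are hypothetical placeholders:

```typescript
// Risky new behavior sits behind a flag, so rollback is a config
// change rather than an emergency deploy.
const flags = new Map<string, boolean>([['analyze.v2', false]]);

function analyzeDocument(text: string): number {
  if (flags.get('analyze.v2')) {
    return analyzeV2(text); // new slice, adopted strangler-style
  }
  return text.split('\n').length; // legacy behavior stays the default
}

function analyzeV2(text: string): number {
  // new implementation: ignore blank lines
  return text.split('\n').filter((l) => l.trim().length > 0).length;
}
```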
Use release rings to limit blast radius. Roll out to internal users first, then a small external cohort, then general availability. Pair each ring with explicit exit criteria: error budget, support ticket volume, and median completion latency. If criteria are not met, pause rollout and fix the dominant failure mode rather than shipping around it.
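Ring membership can be computed from an ordered list of cohorts; this sketch assumes three rings and a simple index comparison, with exit criteria checked out of band before each promotion:

```typescript
// Rings in rollout order: promote currentRing only after the
// previous ring meets its exit criteria (error budget, ticket volume, latency).
const rings = ['internal', 'beta', 'ga'] as const;
type Ring = (typeof rings)[number];

// A user's cohort sees the feature once the rollout has reached
// (or passed) their ring.
function isEnabled(userRing: Ring, currentRing: Ring): boolean {
  return rings.indexOf(userRing) <= rings.indexOf(currentRing);
}
```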
Cost control should be measured, not guessed. Add basic usage and cost telemetry from day one, especially for features that call paid APIs, run background compute, or trigger outbound delivery. Surface these metrics weekly so product and engineering can align on pricing and quota decisions before margin problems emerge.
Finally, codify ownership. Every subsystem needs a directly responsible owner, a backup owner, and an escalation path. Ownership should include runbook maintenance, alert tuning, and dependency upgrades. Features without clear ownership become reliability debt and eventually block roadmap velocity.
Production Readiness Checklist
Use this checklist before declaring a feature complete:
- Contract defined: inputs, outputs, and failure classes documented
- Idempotency enforced for all mutating operations
- Structured logs include correlation IDs and actor metadata
- Alerting configured for error spikes and queue lag
- Admin/support timeline available for debugging user incidents
- Feature flags in place for rapid rollback
- Security boundaries validated at server endpoints
- Data migrations staged for backward compatibility
- Load tests or replay tests run against expected peak traffic
- Incident runbook published with owner and escalation path
If more than two checklist items are missing, treat the feature as beta, not finished. This framing sets realistic expectations with stakeholders and protects your team from rushed launch commitments. Developer tool teams that ship with a complete production readiness checklist from the beginning consistently build stronger user trust: developers evaluate tools with more scrutiny than typical SaaS consumers, and early quality signals disproportionately influence adoption and word-of-mouth in technical communities.