# AGENTS.md — VoicePaste

## Project Overview

VoicePaste: phone as microphone via browser → LAN WebSocket → Go server → Doubao ASR → real-time preview on phone → auto-paste to computer's focused app. Single Go binary with embedded frontend.

## Tech Stack

- **Backend**: Go 1.25+, Fiber v3, fasthttp/websocket, CGO required (robotgo + clipboard)
- **Frontend**: TypeScript, Vite 7, Biome 2, bun (package manager + runtime)
- **Tooling**: Taskfile (not Make), mise (Go + bun + task)
- **ASR**: Doubao Seed-ASR-2.0 via custom binary WebSocket protocol

## Build & Run Commands

```bash
# Install mise tools (go, bun, task)
mise install

# Build everything (frontend + Go binary → dist/)
task

# Build frontend only
task build:frontend

# Run (build + execute)
task run

# Dev mode (go run, skips frontend build)
task dev

# Clean all artifacts
task clean

# Tidy Go modules
task tidy
```

### Frontend (run from `web/`)

```bash
bun install          # Install deps
bun run build        # Vite production build
bun run dev          # Vite dev server
bun run lint         # Biome check (lint + format)
bun run lint:fix     # Biome check --write (auto-fix)
bun run typecheck    # tsc --noEmit
```

### Go

```bash
go vet ./...                    # Lint
go build -o dist/voicepaste .   # Build (add .exe on Windows)
```

No test suite exists yet. No `go test` targets.

## Project Structure

```
main.go                     # Entry point, embed.FS, TLS init, server startup
internal/
  config/config.go          # YAML + env var config, fsnotify hot-reload, atomic global
  server/server.go          # Fiber v3 HTTPS server, static files from embed.FS
  server/net.go             # LAN IP detection
  tls/tls.go                # AnyIP cert download/cache + self-signed fallback
  tls/generate.go           # Self-signed cert generation
  ws/protocol.go            # JSON message types (start/stop/paste/partial/final/pasted/error)
  ws/handler.go             # WS upgrade, token auth, session lifecycle, text accumulation, paste
  asr/protocol.go           # Doubao binary protocol codec (4-byte header, gzip)
  asr/client.go             # WSS client to Doubao, audio streaming, result forwarding
  paste/paste.go            # clipboard.Write + robotgo key simulation (Ctrl+V / Cmd+V)
web/
  app.ts                    # Main app: WS client, audio pipeline, recording, history, UI
  audio-processor.ts        # AudioWorklet: PCM capture, 200ms frame accumulation
  index.html                # Mobile-first UI (all Chinese)
  style.css                 # Dark theme
  vite.config.ts            # Vite config
  biome.json                # Biome config
  tsconfig.json             # TypeScript strict config
```

## Code Style — Go

### Imports

Group in stdlib → external → internal order, separated by blank lines:

```go
import (
	"fmt"
	"log/slog"

	"github.com/gofiber/fiber/v3"

	"github.com/imbytecat/voicepaste/internal/config"
)
```

Use aliases only to avoid collisions: `crypto_tls "crypto/tls"`, `vpTLS "...internal/tls"`, `wsMsg "...internal/ws"`.

### Logging

Use `log/slog` exclusively. Structured key-value pairs:

```go
slog.Info("message", "key", value)
slog.Error("failed to X", "err", err)
```

Per-connection loggers via `slog.With("remote", addr)`.

### Error Handling

- Always wrap with context: `fmt.Errorf("dial doubao: %w", err)`
- Return errors up; log at the boundary (main, handler entry)
- Never suppress errors silently.
  `slog.Warn` for non-fatal, `slog.Error` + exit/return for fatal
- In frontend code, never use `as any`, `@ts-ignore`, or empty catch blocks

### Naming

- Package names: short, lowercase, single word (`asr`, `ws`, `paste`, `config`)
- Exported types: `PascalCase` with doc comments
- Unexported: `camelCase`
- Constants: `PascalCase` for exported, `camelCase` for unexported
- Acronyms stay uppercase: `ASR`, `TLS`, `WS`, `URL`, `IP`

### Patterns

- `sync.Mutex` for shared state, `chan` for goroutine communication
- `atomic.Value` for hot-reloadable config
- Goroutine cleanup: `defer`, `sync.WaitGroup`, `closeCh chan struct{}`
- Fiber v3 middleware pattern for auth checks before WS upgrade

## Code Style — TypeScript (Frontend)

### Formatting (Biome)

- Indent: tabs
- Quotes: double quotes
- Semicolons: default (enabled)
- Organize imports: enabled via Biome assist

### TypeScript Config

- `strict: true`, `noUnusedLocals`, `noUnusedParameters`
- Target: ES2022, module: ESNext, bundler resolution
- DOM + DOM.Iterable libs

### Patterns

- No framework — vanilla TypeScript with direct DOM manipulation
- State object pattern: single `AppState` interface with mutable fields
- Pointer Events for touch/mouse (not touch + mouse separately)
- AudioWorklet for audio capture (not MediaRecorder)
- `?worker&url` Vite import for AudioWorklet files
- WebSocket: binary for audio frames, JSON text for control messages

## Language & Locale

- **UI text**: Chinese (中文) — this app is for family members
- **Git commits**: Chinese, conventional format: `feat:`, `fix:`, `chore:`, `refactor:`
- **Code comments**: English
- **Communication with user**: Chinese (中文)

## Key Constraints

- CGO is required (robotgo, clipboard) — no cross-compilation
- Token auth: read from `config.yaml`; empty = no auth.
  Never auto-generate tokens
- Frontend is embedded via `//go:embed all:web/dist` in `main.go`
- `embed` directive cannot use `../` paths — must be in the package referencing it
- Build output goes to `dist/` (gitignored)
- Frontend ignores (`node_modules`, `dist`) in `web/.gitignore`, not root
- Config file (`config.yaml`) is gitignored; `config.example.yaml` is committed
- `os.UserCacheDir()` for platform-correct cert cache paths
- robotgo paste: `KeyDown(modifier)` → delay → `KeyTap("v")` → delay → `KeyUp(modifier)`