diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..207db04
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,165 @@
+# AGENTS.md — VoicePaste
+
+## Project Overview
+
+VoicePaste: phone as microphone via browser → LAN WebSocket → Go server → Doubao ASR → real-time preview on phone → auto-paste to computer's focused app. Single Go binary with embedded frontend.
+
+## Tech Stack
+
+- **Backend**: Go 1.25+, Fiber v3, fasthttp/websocket, CGO required (robotgo + clipboard)
+- **Frontend**: TypeScript, Vite 7, Biome 2, bun (package manager + runtime)
+- **Tooling**: Taskfile (not Make), mise (Go + bun + task)
+- **ASR**: Doubao Seed-ASR-2.0 via custom binary WebSocket protocol
+
+## Build & Run Commands
+
+```bash
+# Install mise tools (go, bun, task)
+mise install
+
+# Build everything (frontend + Go binary → dist/)
+task
+
+# Build frontend only
+task build:frontend
+
+# Run (build + execute)
+task run
+
+# Dev mode (go run, skips frontend build)
+task dev
+
+# Clean all artifacts
+task clean
+
+# Tidy Go modules
+task tidy
+```
+
+### Frontend (run from `web/`)
+
+```bash
+bun install          # Install deps
+bun run build        # Vite production build
+bun run dev          # Vite dev server
+bun run lint         # Biome check (lint + format)
+bun run lint:fix     # Biome check --write (auto-fix)
+bun run typecheck    # tsc --noEmit
+```
+
+### Go
+
+```bash
+go vet ./...                     # Lint
+go build -o dist/voicepaste .    # Build (add .exe on Windows)
+```
+
+No test suite exists yet. No `go test` targets.
+
+## Project Structure
+
+```
+main.go                     # Entry point, embed.FS, TLS init, server startup
+internal/
+  config/config.go          # YAML + env var config, fsnotify hot-reload, atomic global
+  server/server.go          # Fiber v3 HTTPS server, static files from embed.FS
+  server/net.go             # LAN IP detection
+  tls/tls.go                # AnyIP cert download/cache + self-signed fallback
+  tls/generate.go           # Self-signed cert generation
+  ws/protocol.go            # JSON message types (start/stop/paste/partial/final/pasted/error)
+  ws/handler.go             # WS upgrade, token auth, session lifecycle, text accumulation, paste
+  asr/protocol.go           # Doubao binary protocol codec (4-byte header, gzip)
+  asr/client.go             # WSS client to Doubao, audio streaming, result forwarding
+  paste/paste.go            # clipboard.Write + robotgo key simulation (Ctrl+V / Cmd+V)
+web/
+  app.ts                    # Main app: WS client, audio pipeline, recording, history, UI
+  audio-processor.ts        # AudioWorklet: PCM capture, 200ms frame accumulation
+  index.html                # Mobile-first UI (all Chinese)
+  style.css                 # Dark theme
+  vite.config.ts            # Vite config
+  biome.json                # Biome config
+  tsconfig.json             # TypeScript strict config
+```
+
+## Code Style — Go
+
+### Imports
+Group in stdlib → external → internal order, separated by blank lines:
+```go
+import (
+	"fmt"
+	"log/slog"
+
+	"github.com/gofiber/fiber/v3"
+
+	"github.com/imbytecat/voicepaste/internal/config"
+)
+```
+Use aliases only to avoid collisions: `crypto_tls "crypto/tls"`, `vpTLS "...internal/tls"`, `wsMsg "...internal/ws"`.
+
+### Logging
+Use `log/slog` exclusively. Structured key-value pairs:
+```go
+slog.Info("message", "key", value)
+slog.Error("failed to X", "err", err)
+```
+Per-connection loggers via `slog.With("remote", addr)`.
+
+### Error Handling
+- Always wrap with context: `fmt.Errorf("dial doubao: %w", err)`
+- Return errors up; log at the boundary (main, handler entry)
+- Never suppress errors silently. `slog.Warn` for non-fatal, `slog.Error` + exit/return for fatal
+- Never use `as any`, `@ts-ignore`, or empty catch blocks
+
+### Naming
+- Package names: short, lowercase, single word (`asr`, `ws`, `paste`, `config`)
+- Exported types: `PascalCase` with doc comments
+- Unexported: `camelCase`
+- Constants: `PascalCase` for exported, `camelCase` for unexported
+- Acronyms stay uppercase: `ASR`, `TLS`, `WS`, `URL`, `IP`
+
+### Patterns
+- `sync.Mutex` for shared state, `chan` for goroutine communication
+- `atomic.Value` for hot-reloadable config
+- Goroutine cleanup: `defer`, `sync.WaitGroup`, `closeCh chan struct{}`
+- Fiber v3 middleware pattern for auth checks before WS upgrade
+
+## Code Style — TypeScript (Frontend)
+
+### Formatting (Biome)
+- Indent: tabs
+- Quotes: double quotes
+- Semicolons: default (enabled)
+- Organize imports: enabled via Biome assist
+
+### TypeScript Config
+- `strict: true`, `noUnusedLocals`, `noUnusedParameters`
+- Target: ES2022, module: ESNext, bundler resolution
+- DOM + DOM.Iterable libs
+
+### Patterns
+- No framework — vanilla TypeScript with direct DOM manipulation
+- State object pattern: single `AppState` interface with mutable fields
+- Pointer Events for touch/mouse (not touch + mouse separately)
+- AudioWorklet for audio capture (not MediaRecorder)
+- `?worker&url` Vite import for AudioWorklet files
+- WebSocket: binary for audio frames, JSON text for control messages
+
+## Language & Locale
+
+- **UI text**: Chinese (中文) — this app is for family members
+- **Git commits**: Chinese, conventional format: `feat:`, `fix:`, `chore:`, `refactor:`
+- **Code comments**: English
+- **Communication with user**: Chinese (中文)
+
+## Key Constraints
+
+- CGO is required (robotgo, clipboard) — no cross-compilation
+- Token auth: read from `config.yaml`; empty = no auth. Never auto-generate tokens
+- Frontend is embedded via `//go:embed all:web/dist` in `main.go`
+- `embed` directive cannot use `../` paths — must be in the package referencing it
+- Build output goes to `dist/` (gitignored)
+- Frontend ignores (`node_modules`, `dist`) in `web/.gitignore`, not root
+- Config file (`config.yaml`) is gitignored; `config.example.yaml` is committed
+- `os.UserCacheDir()` for platform-correct cert cache paths
+- robotgo paste: `KeyDown(modifier)` → delay → `KeyTap("v")` → delay → `KeyUp(modifier)`
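
Reviewer note on the token rule under Key Constraints: the behaviour "empty = no auth, never auto-generate" could look roughly like the sketch below. This is only an illustration under assumptions. The function name `tokenAllowed` and its signature are hypothetical and do not come from `internal/ws/handler.go`, and the constant-time comparison is a suggested choice, not something the diff states.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// tokenAllowed is a hypothetical helper: it reports whether a client that
// presented the given token may open a session.
// An empty configured token means auth is disabled, so every client passes.
// No token is ever generated here; the value must come from config.
func tokenAllowed(configured, presented string) bool {
	if configured == "" {
		return true // empty = no auth
	}
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare([]byte(configured), []byte(presented)) == 1
}

func main() {
	fmt.Println(tokenAllowed("", "anything"))     // auth disabled
	fmt.Println(tokenAllowed("s3cret", "s3cret")) // matching token
	fmt.Println(tokenAllowed("s3cret", ""))       // missing token
}
```

A check like this would sit in the middleware that runs before the WS upgrade, so a rejected client never reaches the session code.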