Compare commits: 350e405fac...cead3e42b8 (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | cead3e42b8 | |
| | bfaa792760 | |
165
AGENTS.md
Normal file
@@ -0,0 +1,165 @@
# AGENTS.md — VoicePaste

## Project Overview

VoicePaste: phone as microphone via browser → LAN WebSocket → Go server → Doubao ASR → real-time preview on phone → auto-paste to computer's focused app. Single Go binary with embedded frontend.

## Tech Stack

- **Backend**: Go 1.25+, Fiber v3, fasthttp/websocket, CGO required (robotgo + clipboard)
- **Frontend**: TypeScript, Vite 7, Biome 2, bun (package manager + runtime)
- **Tooling**: Taskfile (not Make), mise (Go + bun + task)
- **ASR**: Doubao Seed-ASR-2.0 via custom binary WebSocket protocol

## Build & Run Commands

```bash
# Install mise tools (go, bun, task)
mise install

# Build everything (frontend + Go binary → dist/)
task

# Build frontend only
task build:frontend

# Run (build + execute)
task run

# Dev mode (go run, skips frontend build)
task dev

# Clean all artifacts
task clean

# Tidy Go modules
task tidy
```

### Frontend (run from `web/`)

```bash
bun install          # Install deps
bun run build        # Vite production build
bun run dev          # Vite dev server
bun run lint         # Biome check (lint + format)
bun run lint:fix     # Biome check --write (auto-fix)
bun run typecheck    # tsc --noEmit
```

### Go

```bash
go vet ./...                     # Lint
go build -o dist/voicepaste .    # Build (add .exe on Windows)
```

No test suite exists yet. No `go test` targets.

## Project Structure

```
main.go                  # Entry point, embed.FS, TLS init, server startup
internal/
  config/config.go       # YAML + env var config, fsnotify hot-reload, atomic global
  server/server.go       # Fiber v3 HTTPS server, static files from embed.FS
  server/net.go          # LAN IP detection
  tls/tls.go             # AnyIP cert download/cache + self-signed fallback
  tls/generate.go        # Self-signed cert generation
  ws/protocol.go         # JSON message types (start/stop/paste/partial/final/pasted/error)
  ws/handler.go          # WS upgrade, token auth, session lifecycle, text accumulation, paste
  asr/protocol.go        # Doubao binary protocol codec (4-byte header, gzip)
  asr/client.go          # WSS client to Doubao, audio streaming, result forwarding
  paste/paste.go         # clipboard.Write + robotgo key simulation (Ctrl+V / Cmd+V)
web/
  app.ts                 # Main app: WS client, audio pipeline, recording, history, UI
  audio-processor.ts     # AudioWorklet: PCM capture, 200ms frame accumulation
  index.html             # Mobile-first UI (all Chinese)
  style.css              # Dark theme
  vite.config.ts         # Vite config
  biome.json             # Biome config
  tsconfig.json          # TypeScript strict config
```

## Code Style — Go

### Imports

Group in stdlib → external → internal order, separated by blank lines:

```go
import (
	"fmt"
	"log/slog"

	"github.com/gofiber/fiber/v3"

	"github.com/imbytecat/voicepaste/internal/config"
)
```

Use aliases only to avoid collisions: `crypto_tls "crypto/tls"`, `vpTLS "...internal/tls"`, `wsMsg "...internal/ws"`.

### Logging

Use `log/slog` exclusively. Structured key-value pairs:

```go
slog.Info("message", "key", value)
slog.Error("failed to X", "err", err)
```

Per-connection loggers via `slog.With("remote", addr)`.

### Error Handling

- Always wrap with context: `fmt.Errorf("dial doubao: %w", err)`
- Return errors up; log at the boundary (main, handler entry)
- Never suppress errors silently. `slog.Warn` for non-fatal, `slog.Error` + exit/return for fatal
- Never use `as any`, `@ts-ignore`, or empty catch blocks

### Naming

- Package names: short, lowercase, single word (`asr`, `ws`, `paste`, `config`)
- Exported types: `PascalCase` with doc comments
- Unexported: `camelCase`
- Constants: `PascalCase` for exported, `camelCase` for unexported
- Acronyms stay uppercase: `ASR`, `TLS`, `WS`, `URL`, `IP`

### Patterns

- `sync.Mutex` for shared state, `chan` for goroutine communication
- `atomic.Value` for hot-reloadable config
- Goroutine cleanup: `defer`, `sync.WaitGroup`, `closeCh chan struct{}`
- Fiber v3 middleware pattern for auth checks before WS upgrade

## Code Style — TypeScript (Frontend)

### Formatting (Biome)

- Indent: tabs
- Quotes: double quotes
- Semicolons: default (enabled)
- Organize imports: enabled via Biome assist

### TypeScript Config

- `strict: true`, `noUnusedLocals`, `noUnusedParameters`
- Target: ES2022, module: ESNext, bundler resolution
- DOM + DOM.Iterable libs

### Patterns

- No framework — vanilla TypeScript with direct DOM manipulation
- State object pattern: single `AppState` interface with mutable fields
- Pointer Events for touch/mouse (not touch + mouse separately)
- AudioWorklet for audio capture (not MediaRecorder)
- `?worker&url` Vite import for AudioWorklet files
- WebSocket: binary for audio frames, JSON text for control messages
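
Seen from the Go side, the binary/text split in the last bullet could be routed along these lines. This is only a sketch: the `clientMsg` shape and its `"type"` JSON field are assumptions, the control names `start`/`stop`/`paste` are taken from the `ws/protocol.go` description, and the frame-type constants are the RFC 6455 opcode values rather than an import from the WebSocket library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Frame types, using the RFC 6455 opcode values (text = 1, binary = 2).
const (
	textMessage   = 1
	binaryMessage = 2
)

// clientMsg is a hypothetical control-message shape.
type clientMsg struct {
	Type string `json:"type"`
}

// route returns a label describing how one incoming frame would be handled.
func route(frameType int, payload []byte) (string, error) {
	switch frameType {
	case binaryMessage:
		// Audio frames are opaque PCM bytes forwarded to the ASR client.
		return fmt.Sprintf("audio:%d bytes", len(payload)), nil
	case textMessage:
		var m clientMsg
		if err := json.Unmarshal(payload, &m); err != nil {
			return "", fmt.Errorf("decode control message: %w", err)
		}
		switch m.Type {
		case "start", "stop", "paste":
			return "control:" + m.Type, nil
		}
		return "", fmt.Errorf("unknown control type %q", m.Type)
	default:
		return "", fmt.Errorf("unsupported frame type %d", frameType)
	}
}

func main() {
	fmt.Println(route(binaryMessage, make([]byte, 6400)))
	fmt.Println(route(textMessage, []byte(`{"type":"start"}`)))
}
```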

## Language & Locale

- **UI text**: Chinese (中文) — this app is for family members
- **Git commits**: Chinese, conventional format: `feat:`, `fix:`, `chore:`, `refactor:`
- **Code comments**: English
- **Communication with user**: Chinese (中文)

## Key Constraints

- CGO is required (robotgo, clipboard) — no cross-compilation
- Token auth: read from `config.yaml`; empty = no auth. Never auto-generate tokens
- Frontend is embedded via `//go:embed all:web/dist` in `main.go`
- `embed` directive cannot use `../` paths — must be in the package referencing it
- Build output goes to `dist/` (gitignored)
- Frontend ignores (`node_modules`, `dist`) in `web/.gitignore`, not root
- Config file (`config.yaml`) is gitignored; `config.example.yaml` is committed
- `os.UserCacheDir()` for platform-correct cert cache paths
- robotgo paste: `KeyDown(modifier)` → delay → `KeyTap("v")` → delay → `KeyUp(modifier)`
@@ -74,7 +74,7 @@ func Dial(cfg Config, resultCh chan<- wsMsg.ServerMsg) (*Client, error) {
 			EnableDDC: true,
 			ShowUtterances: false,
 			ResultType: "single",
-			EndWindowSize: 400,
+			EndWindowSize: 2000,
 		},
 	}
 	data, err := EncodeFullClientRequest(req)
@@ -132,10 +132,15 @@ func (c *Client) readLoop(resultCh chan<- wsMsg.ServerMsg) {
 			resultCh <- wsMsg.ServerMsg{Type: wsMsg.MsgError, Message: resp.ErrMsg}
 			return
 		}
-		// nostream mode: result comes after last audio packet or >15s
+		// nostream mode: may return intermediate results every ~15s
 		text := resp.Text
 		if text != "" {
-			resultCh <- wsMsg.ServerMsg{Type: wsMsg.MsgFinal, Text: text}
+			if resp.IsLast {
+				resultCh <- wsMsg.ServerMsg{Type: wsMsg.MsgFinal, Text: text}
+			} else {
+				// Intermediate result (>15s audio) — preview only, don't paste
+				resultCh <- wsMsg.ServerMsg{Type: wsMsg.MsgPartial, Text: text}
+			}
 		}
 		if resp.IsLast {
 			return
@@ -62,23 +62,32 @@ func (h *Handler) handleConn(c *websocket.Conn) {
 	defer close(resultCh)
 
 	// Writer goroutine: single writer to avoid concurrent writes
+	// Accumulates all result texts; paste is triggered by stop, not by ASR final.
 	var wg sync.WaitGroup
+	var accMu sync.Mutex
+	var accText string
 	wg.Add(1)
 	go func() {
 		defer wg.Done()
 		for msg := range resultCh {
+			// Accumulate text from both partial and final results
+			if msg.Type == MsgPartial || msg.Type == MsgFinal {
+				accMu.Lock()
+				accText += msg.Text
+				// Send accumulated preview to phone
+				preview := ServerMsg{Type: MsgPartial, Text: accText}
+				accMu.Unlock()
+				if err := c.WriteMessage(websocket.TextMessage, preview.Bytes()); err != nil {
+					log.Warn("ws write error", "err", err)
+					return
+				}
+				continue
+			}
+			// Forward other messages (error, pasted) as-is
 			if err := c.WriteMessage(websocket.TextMessage, msg.Bytes()); err != nil {
 				log.Warn("ws write error", "err", err)
 				return
 			}
-			// Auto-paste on final result
-			if msg.Type == MsgFinal && msg.Text != "" && h.pasteFunc != nil {
-				if err := h.pasteFunc(msg.Text); err != nil {
-					log.Error("auto-paste failed", "err", err)
-				} else {
-					_ = c.WriteMessage(websocket.TextMessage, ServerMsg{Type: MsgPasted}.Bytes())
-				}
-			}
 		}
 	}()
 
@@ -119,6 +128,10 @@ func (h *Handler) handleConn(c *websocket.Conn) {
 			if active {
 				continue
 			}
+			// Reset accumulated text for new session
+			accMu.Lock()
+			accText = ""
+			accMu.Unlock()
 			sa, cl, err := h.asrFactory(resultCh)
 			if err != nil {
 				log.Error("asr start failed", "err", err)
@@ -134,12 +147,25 @@ func (h *Handler) handleConn(c *websocket.Conn) {
 			if !active {
 				continue
 			}
+			// Finish ASR session — waits for final result from readLoop
 			if cleanup != nil {
 				cleanup()
 				cleanup = nil
 			}
 			sendAudio = nil
 			active = false
+			// Now paste the accumulated text
+			accMu.Lock()
+			finalText := accText
+			accText = ""
+			accMu.Unlock()
+			if finalText != "" && h.pasteFunc != nil {
+				if err := h.pasteFunc(finalText); err != nil {
+					log.Error("auto-paste failed", "err", err)
+				} else {
+					resultCh <- ServerMsg{Type: MsgPasted}
+				}
+			}
 			log.Info("recording stopped")
 
 		case MsgPaste:
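
The accumulate-then-paste-on-stop behaviour introduced by the handler hunks can be read in isolation as the sketch below. The `accumulator` type and its method names are hypothetical; the diff itself keeps `accMu` and `accText` as local variables in `handleConn`.

```go
package main

import (
	"fmt"
	"sync"
)

// accumulator is a hypothetical wrapper around the accMu/accText pair.
type accumulator struct {
	mu   sync.Mutex
	text string
}

// Append adds one ASR result (partial or final) and returns the full
// preview so far, which is what gets sent to the phone.
func (a *accumulator) Append(s string) string {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.text += s
	return a.text
}

// Take returns the accumulated text and resets it, mirroring what the
// stop branch does right before pasting.
func (a *accumulator) Take() string {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := a.text
	a.text = ""
	return out
}

func main() {
	var acc accumulator
	fmt.Println(acc.Append("hello "))
	fmt.Println(acc.Append("world"))
	fmt.Printf("paste: %q\n", acc.Take())
	fmt.Printf("after stop: %q\n", acc.Take())
}
```

Taking and clearing under one lock means a late ASR result cannot be pasted twice across sessions.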