code.ivysaur.me

bridle

An agentic harness for OpenAI-compatible chat completion endpoints.

Features

  • Readline-based CLI with tab completion
    • Coloured output for sections (-Colour, or NO_COLOR env var)
  • Toggle thinking (/think and /no_think)
  • Tool calling (read_file, write_file, local_shell, apply_patch)
    • Allow providing alternative message (?)
    • User confirmation levels (/checktools or -CheckTools: ask (the default), auto, or yolo)
  • Auto load AGENTS.md and CLAUDE.md files
  • Oneshot mode (-p)
  • Rewind turn history (/rewind and /turns)
  • Support interrupting current message or tool call (^C)
  • Session management (/save and resume via -r)
  • Compaction via context summarization (/compact)
    • Auto compact on a client-side token threshold (-MaxContext) or when the server reports that the context length was exceeded
  • Import local files (wildcard match) into prompt (/read)
  • Use custom model provider (-Host and -Key)
  • Linux sandboxing with Bubblewrap (-Sandbox), use at your own risk
  • Customizable system message, default-tuned for Golang development
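
The auto-compact feature above can be pictured with a small sketch. This is not bridle's actual implementation: the Message type, the estimateTokens heuristic (roughly 4 bytes per token) and the shouldCompact helper are all hypothetical names chosen for illustration. It only shows the decision described in the list: compact when a client-side estimate reaches the -MaxContext threshold, or when the server has already reported that the length was exceeded.

```go
package main

import "fmt"

// Message is a minimal chat message (hypothetical type for this sketch).
type Message struct {
	Role    string
	Content string
}

// estimateTokens is a rough heuristic, assumed here for illustration only:
// about 4 bytes per token, plus one token of overhead per message.
// A real client may prefer the token counts reported by the server.
func estimateTokens(history []Message) int {
	total := 0
	for _, m := range history {
		total += len(m.Content)/4 + 1
	}
	return total
}

// shouldCompact reports whether the history should be summarized before
// the next request. A maxContext of zero or less disables the client-side
// check; a server-side "length exceeded" signal always triggers compaction.
func shouldCompact(history []Message, maxContext int, serverLengthExceeded bool) bool {
	if serverLengthExceeded {
		return true
	}
	return maxContext > 0 && estimateTokens(history) >= maxContext
}

func main() {
	history := []Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Explain goroutines."},
	}
	fmt.Println("estimated tokens:", estimateTokens(history))
	fmt.Println("compact at threshold 8:", shouldCompact(history, 8, false))
	fmt.Println("compact at threshold 100000:", shouldCompact(history, 100000, false))
	fmt.Println("compact after server error:", shouldCompact(history, 100000, true))
}
```

Checking the server-side signal first means compaction still happens when the client-side estimate is too optimistic, which is the main weakness of a byte-count heuristic.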

Usage

# Context size (-c): use 131072 for 128k or 262144 for 256k. Mostly affects prompt processing (pp) speed, not token generation (tg) speed
llama-server -c 262144 --cache-type-k q8_0 --cache-type-v q8_0 -fa on -m /srv/llama/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf &

./bridle

# or
./bridle -CheckTools=yolo -Sandbox

Changelog

➡️ View changelog