LLM Playground

Changelog

Release history for LLM Playground (current: v0.13.1)

v0.13.1
Reverse Wordle stops hitting the rate limit (2026-07-02)
  • Reverse Wordle games no longer die mid-game with a “too many requests” error — full six-guess games now run to completion
v0.13.0
Dark mode (2026-07-02)
  • The app now opens in dark mode by default, with a toggle in the sidebar to switch between dark and light — your choice is remembered
v0.12.0
Big Numbers moves out (2026-07-01)
  • Big Numbers has moved to its new home on lsdhar.com — it is no longer part of this playground
v0.11.1
Big Numbers accuracy (2026-06-24)
  • Big Numbers now reads correctly between exact powers of ten — in-between values are clearly marked as approximate, small numbers like ten and a hundred are named properly, and the arrow keys keep working after you touch the slider
v0.11.0
Big Numbers (2026-06-23)
  • Slide from one to a googol and keep going to 10^200, watching every zero written out in full with each magnitude's exact name and a real-world anchor — and travel between scales with the arrow keys
  • Flip on auto-play to let it drift through every magnitude on its own, back and forth, 1.5 seconds at a time
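The two behaviours described for v0.11.0 can be sketched roughly as follows. This is a minimal illustration, not the playground's actual code: the names `writeOutPowerOfTen`, `nextAutoPlayStep`, and `AUTO_PLAY_INTERVAL_MS` are hypothetical, and stepping one power of ten per tick is an assumption. Only the 10^200 upper bound and the 1.5-second interval come from the notes above.

```typescript
// Hypothetical constants: the 1.5 s tick and the 10^200 ceiling are taken
// from the release notes; the names themselves are made up for this sketch.
const AUTO_PLAY_INTERVAL_MS = 1500;
const MAX_EXPONENT = 200;

type Direction = 1 | -1;

// Write 10^exponent with every zero spelled out, e.g. 3 -> "1000".
// A string is used because values this large do not fit in a number.
function writeOutPowerOfTen(exponent: number): string {
  if (exponent < 0 || Math.floor(exponent) !== exponent) {
    throw new RangeError("exponent must be a non-negative integer");
  }
  let digits = "1";
  for (let i = 0; i < exponent; i++) {
    digits += "0";
  }
  return digits;
}

// Advance auto-play by one magnitude, reversing direction at either end
// so the display drifts back and forth (assumed: one exponent per tick).
function nextAutoPlayStep(
  exponent: number,
  direction: Direction
): { exponent: number; direction: Direction } {
  const next = exponent + direction;
  if (next > MAX_EXPONENT || next < 0) {
    const flipped: Direction = direction === 1 ? -1 : 1;
    return { exponent: exponent + flipped, direction: flipped };
  }
  return { exponent: next, direction };
}
```

In a browser, a timer such as `setInterval(tick, AUTO_PLAY_INTERVAL_MS)` could call `nextAutoPlayStep` on each tick and re-render the result of `writeOutPowerOfTen`.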
v0.10.1
Under-the-hood cleanup (2026-06-05)
  • Internal refactor that removes duplicated code across the experiments — every feature works exactly as before
v0.10.0
Concept Explainer (2026-05-31)
  • Name any concept and get a single interactive explainer built for your level — a familiar hook, a model you can manipulate, the precise definition, then a quick recall check, rendering live as it is generated
  • Keep two 'about me' profiles — one for you, one for your kid — and flip the Adult/Child toggle to pitch each explainer to that audience; Simpler and Go deeper regenerate at an adjacent level
  • Save explainers to a history sidebar and download any of them as a self-contained .html file
v0.9.1
Evals Question Catalog (2026-05-30)
  • Evals includes a new weekday letter-count stumper that checks whether models notice that every day name contains the letter 'd'
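The fact behind this stumper is easy to verify mechanically. The sketch below counts the letter in each English day name. The exact wording of the eval question is not given here, and the helper `countLetter` is a hypothetical name used only for illustration.

```typescript
// The seven English day names the stumper refers to.
const DAY_NAMES: string[] = [
  "Monday", "Tuesday", "Wednesday", "Thursday",
  "Friday", "Saturday", "Sunday",
];

// Count case-insensitive occurrences of a single letter in a word.
function countLetter(word: string, letter: string): number {
  const target = letter.toLowerCase();
  const lower = word.toLowerCase();
  let count = 0;
  for (let i = 0; i < lower.length; i++) {
    if (lower.charAt(i) === target) {
      count++;
    }
  }
  return count;
}

// True only if every day name contains at least one 'd'.
const everyDayHasD: boolean = DAY_NAMES.every(
  (day) => countLetter(day, "d") >= 1
);

// Total number of 'd' characters across all seven names.
let totalDs = 0;
for (const day of DAY_NAMES) {
  totalDs += countLetter(day, "d");
}
```

Every name ends in "day", which is why the check passes for all seven; "Wednesday" is the only name with two.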
v0.9.0
Games vs. the LLM (2026-05-16)
  • Reverse Wordle: pick a model and a 5-letter target, then watch the LLM solve it with green/yellow/gray feedback each turn
  • 20 Questions: think of a thing and answer yes/no/unsure while the LLM narrows down to a final guess within 20 turns
  • Both games let you pick any OpenRouter model as the opponent and persist your in-progress game across page refreshes
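For readers unfamiliar with the green/yellow/gray feedback used in Reverse Wordle, here is a minimal sketch of the conventional Wordle scoring rule. It is not taken from the playground's source: `scoreGuess` and `Mark` are hypothetical names, and the two-pass handling of repeated letters is the standard convention, assumed to apply here.

```typescript
type Mark = "green" | "yellow" | "gray";

// Score a guess against the target using the usual Wordle rules:
// green = right letter, right position; yellow = letter occurs elsewhere
// in the target and is not already accounted for; gray = everything else.
function scoreGuess(guess: string, target: string): Mark[] {
  const g = guess.toLowerCase().split("");
  const t = target.toLowerCase().split("");
  if (g.length !== t.length) {
    throw new Error("guess and target must be the same length");
  }

  const marks: Mark[] = g.map((): Mark => "gray");
  // Target letters not consumed by a green match, with their counts.
  const remaining: { [letter: string]: number } = {};

  // First pass: mark greens and tally the unmatched target letters.
  for (let i = 0; i < t.length; i++) {
    if (g[i] === t[i]) {
      marks[i] = "green";
    } else {
      remaining[t[i]] = (remaining[t[i]] || 0) + 1;
    }
  }

  // Second pass: hand out yellows while unmatched copies remain, so a
  // repeated guess letter is never credited more often than it occurs.
  for (let i = 0; i < g.length; i++) {
    if (marks[i] === "green") continue;
    if ((remaining[g[i]] || 0) > 0) {
      marks[i] = "yellow";
      remaining[g[i]] -= 1;
    }
  }
  return marks;
}
```

The two passes matter for repeated letters: with target "crane" and guess "eerie", only the final "e" is green and the first two stay gray, because the target has a single "e".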
v0.8.0
Evals (2026-04-29)
  • Run any question against many models in parallel and get pass/fail verdicts from a judge model, with OpenRouter cost shown per row and as a run total
  • Three preloaded stumper questions (letters in 'strawberry', drive-or-walk to the car wash, 9.11 vs 9.9) plus a freeform option, with selections and last run persisted to localStorage
  • Refreshed model lineup spans 15 models across OpenAI, Anthropic, Google, Moonshot AI, Qwen, DeepSeek, MiniMax, and Z.AI
v0.7.0
Simplified Scoring (2026-04-08)
  • Scoring replaced with two clear ratings: Accuracy and Conciseness, each shown as low/medium/high
  • Old numeric scores in history are automatically migrated to the new rating format
v0.6.0
Compression Test UI Overhaul (2026-04-07)
  • Compact settings bar, tighter concept card, and smaller page header for less scrolling
  • New Level setting (Easy/Medium/Hard) controls how advanced the generated concepts are
  • Prompts rewritten to explain real mechanisms instead of substituting cute analogies
  • Settings persist across page refreshes via localStorage
  • Concepts batch-prefetched in one LLM call for instant refresh
v0.5.0
Kid-Focused Diff View & Generate Button (2026-04-07)
  • Red/green word diff shows exactly what to change in your explanation after submitting
  • 'Generate for me' button creates a kid-friendly answer when you're stuck
  • Scoring now targets explanations for bright 3-to-5-year-olds instead of adults
v0.4.0
Freestyle Mode & LLM-Generated Concepts (2026-04-07)
  • Freestyle mode with instant concept generation, inline scoring, and always-visible settings
  • Concepts generated on the fly by the selected model, with thumbs up/down voting
  • Record history with sorting, filtering, and copy-to-clipboard
v0.3.0
Mobile Responsive UI (2026-04-07)
  • Hamburger menu and slide-out drawer for mobile navigation
  • All pages now work well on small screens
v0.2.0
Debate Arena (2026-03-29)
  • Two AI models debate a topic while an audience of AI models votes and shifts opinions in real time
  • Configurable topic, models, and rounds, with a live transcript and audience sentiment tracking
v0.1.0
Initial Setup (2026-03-16)
  • Next.js project with OpenRouter API integration and model playground