Request for Feedback on Luthien

Walk through our product as a new user. Think aloud the whole time: we want the good, the bad, and the ugly.

Context: Luthien is an AI safety startup inspired by Redwood Research's AI control agenda. We're requesting feedback on our landing page, README instructions, Quick Start setup, and the open-source tool we've built. Your feedback will help us clarify our onboarding and make setup and trial smoother. Thanks in advance for your help!

Do this async, or book a call to walk through it together.

Part 1: Start Recording & Warm Up ~5 min

Start a screen recording (Loom, Zoom, QuickTime, OBS) with webcam + audio, and an AI note-taker (SuperWhisper, Otter.ai) for a transcript.

Warm up: Introduce yourself: where did you grow up? What type of projects do you like working on? Then give us a 30-second tour of your dock or taskbar: what's open, and what are your pinned apps? No wrong answers, just get your brain-to-mouth pipeline going.

Part 2: Your Dev Workflow & Pain Points ~15 min

Share your screen and walk us through:

Typical day: What does a normal coding or DevOps day look like?
AI in your flow: Where do AI tools (Cursor, Claude Code, etc.) show up? What makes you reach for them vs. doing it manually?
Code review: How do you handle reviews, and who signs off?
Frustrations: Think of the last time an AI assistant was really frustrating. Give us the play-by-play: what happened, what did you try, how did you troubleshoot? What specifically was the most frustrating part? On a scale of 0-10, how painful was it?
Worst case: What keeps you up at night about AI in your workflow? Any security worries? Spending or token costs? Privacy concerns with third-party servers? On a scale of 0-10, how worried are you?
Future: Which of your workflows do you expect to become more agentic soon? Which will still need close supervision? Does that make you nervous?
Workarounds: What do you do today to cope with AI failure modes? How much time do you spend on workarounds?
Magic wand: If you could prevent one AI failure mode, what would it be? How about the top 3? How would your ideal solution work: while you're coding, while you grab coffee, or while you sleep?

Part 3: Landing Page ~10 min

luthienresearch.github.io/luthien-pbc-site/

Scroll through the whole thing. React out loud to whatever grabs you or confuses you.

Part 4: README Quick Start (tell us how long it took)

GitHub README

Read the intro, then follow the Quick Start. Note what clicks and what doesn't. If you get stuck on any step, note where, skip it, and keep going.

When something doesn't work: exact error + screenshot. When it does work: try something simple, check the conversation history UI, and test any policies you set up.

Then try to break it: throw weird prompts at it, stress-test the policies, do things we didn't anticipate.

Part 5: Your Report

Send your recording, transcript, and a Google Doc summarizing your key findings to Scott Wofford (co-founder) at scott@luthienresearch.org.

If you'd like to share your conversation logs, use this template:

⬇ conversation_log_template.csv