Examples
Different domains. Different questions. Same principle.
A 10-second YouTube intro
The prompt: "I want a short intro for my YouTube channel about Pantion."
The result — built from the dialog below.
The dialog
You "I want a 10-second intro for my YouTube channel about Pantion."
Pantion "What's the tone — corporate, playful, minimal, cinematic?"
You "Minimal and confident. No flashy effects." HARD
Pantion "Should the camera move, or stay fixed?"
You "Slow zoom out, revealing the logo." HARD
Pantion "What's the background — solid color, gradient, environment?"
You "Dark, warm — like the Pantion website. Maybe a subtle grain texture." FLEX
Pantion "Audio — music, sound effect, silence?"
You "A soft, single note. Like a gentle chime. Nothing dramatic." HARD
Pantion "Should the manakin mascot appear?"
You "Yes, it should animate in before the logo. Small, centered." HARD
Loneliness
The prompt: "I want an image that radiates loneliness."
The result — generated from the dialog below.
The dialog
You "I want an image that radiates loneliness."
Pantion "What kind of loneliness? Serene solitude, painful isolation, existential smallness — or something else?"
You "Painful isolation." HARD
Pantion "Is there a person in the image, or should the loneliness be felt indirectly — through an empty space, an abandoned object, a trace of absence?"
You "A trace of absence." HARD
Pantion "Is the absence recent — warmth still lingering — or long ago, with decay setting in?"
You "Long ago." HARD
Pantion "Indoor or outdoor? And should there be clues pointing to a specific person, or should it be universal — so the viewer projects themselves into it?"
You "Outdoor and universal." HARD
Pantion "What kind of landscape? Urban — a neglected bench in an empty park. Nature — an overgrown path, nature swallowing something human. Coast — a washed-up object, an empty pier."
You "An overgrown bench." HARD
Pantion "What season and light? Autumn and dusk are obvious — but maybe you want something unexpected?"
You "Autumn at dawn." FLEX
Pantion "Style — photorealistic, painterly, illustrative? Color palette? And most importantly: what should it NOT be?"
You "Photorealistic. Muted ochre, brown, grey with cold morning light. Definitely not hopeful." HARD
Pantion "How far has nature reclaimed the bench — still clearly recognizable, or barely visible?"
You "Almost completely overgrown, but still recognizable as a bench." FLEX
WhatsApp Reminder
The prompt: "I want to send myself reminders via WhatsApp."
The result — built from the dialog below.
The dialog
You "I want to enter a date, time, and reminder text somewhere. When the time comes, I get a WhatsApp message."
Pantion "Where do you enter it?"
You "In WhatsApp itself. I send a message to a contact called Reminder." HARD
Pantion "If you were setting a reminder right now, how would you type it?"
You "Example: '12:15 Piano lesson'. No date means today." HARD
Pantion "When you send a reminder, what do you expect back?"
You "A confirmation showing the interpreted time." HARD
Pantion "What if you don't respond to the confirmation?"
You "The reminder is not scheduled." HARD
Pantion "What if you don't acknowledge the reminder when it fires?"
You "The system resends after 5 minutes, max 3 times." FLEX
Pantion "Is that 3 messages total, or 1 original + 3 retries?"
You "4 total." HARD
Pantion "What if the system is temporarily down when a reminder is due?"
You "A missed reminder is a missed reminder." HARD
Pantion "Can user A see or affect user B's reminders?"
You "No." HARD
Pantion "Timezones. Which 12:00 is '12:00 lunch'?"
You "Amsterdam. Single timezone for MVP." FLEX
Pantion "What if someone enters a time in the past?"
You "Reject it." HARD
Pantion "If someone types '5-6 12:00 lunch' — is that May 6th or June 5th?"
You "June 5th — day first." HARD
Pantion "Can someone have multiple open confirmations at once?"
You "Yes — WhatsApp lets you reply to a specific message. That determines which reminder." HARD
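The parsing rules pinned down in this dialog — time required, day-first date optional, no date means today, single Amsterdam timezone, reject past times — can be sketched in a few lines. This is an illustrative sketch, not the product's implementation; the function name and message format handling are assumptions drawn only from the examples above.

```python
import re
from datetime import datetime
from zoneinfo import ZoneInfo  # single timezone for the MVP (HARD-adjacent FLEX)

TZ = ZoneInfo("Europe/Amsterdam")

def parse_reminder(message: str, now: datetime) -> tuple[datetime, str]:
    """Parse '12:15 Piano lesson' or '5-6 12:00 lunch' (day first: June 5th).

    No date means today. A time in the past is rejected, per the HARD
    constraints in the dialog above.
    """
    m = re.match(r"(?:(\d{1,2})-(\d{1,2})\s+)?(\d{1,2}):(\d{2})\s+(.+)",
                 message.strip())
    if not m:
        raise ValueError("Could not interpret reminder")
    day, month, hour, minute, text = m.groups()
    due = now.replace(
        day=int(day) if day else now.day,
        month=int(month) if month else now.month,
        hour=int(hour), minute=int(minute), second=0, microsecond=0,
    )
    if due <= now:
        raise ValueError("Reminder time is in the past")  # HARD: reject
    return due, text
```

The day-first rule from the dialog means `'5-6 12:00 lunch'` resolves to June 5th, and `'12:15 Piano lesson'` with no date resolves to today.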
After building
After using the system, the user simplified it: removed the confirmation flow, restricted to today-only reminders, and dropped retries. The dialog tracked every change.
Audio-to-Report Pipeline
The prompt: "I want to turn audio recordings of client conversations into structured reports."
A domestic safety agency. Real client conversations. The stakes are high — wrong assumptions don't just waste time, they compromise trust and accountability.
The dialog
You "I want to enter audio recordings somewhere. The system transcribes them and generates a report in our agency's template format."
Pantion "Where do the recordings come from?"
You "My partner records client conversations at a domestic safety agency. She drops audio files into a shared folder on our home NAS." HARD
Pantion "What audio formats does her recording device produce?"
You "Multiple formats — it varies. M4A, WAV, sometimes MP3." FLEX
Pantion "You mentioned you already have a working transcription app. What does it do, and what model does it use?"
You "It's a drag-and-drop app. She drops an audio file, it transcribes using Whisper large-v3 with speaker diarization — all local, no cloud APIs." HARD
Pantion "The transcription is local. What about the report generation — which model generates the report from the transcript?"
You "A local LLM. Currently Qwen 3 8B running via Ollama. Must also be local." HARD
Pantion "You said 'our agency's template format.' What does that template look like? Is it a fixed structure, or does it vary per case type?"
You "Fixed structure. Header with case metadata, then sections: attendees, key topics discussed, agreed actions, risk indicators, and follow-up date." HARD
Pantion "Where does the case metadata come from — the case name, case number, client identifiers? Does the system need to look these up, or are they embedded in the folder name?"
You "The subfolder name is the case identifier. That's it. No lookup, no database." HARD
Pantion "The report has a 'risk indicators' section. How does the system identify risk? Does the LLM infer them from the transcript, or does the template have a fixed checklist?"
You "The LLM infers them. But it should flag uncertainty — if it's not sure, it marks the indicator as 'requires review' rather than asserting." HARD
Pantion "What should happen when something goes wrong? If an audio file is invalid, or transcription fails, or the report can't be generated — how should she know?"
You "For the MVP, moving it to an error folder is enough." FLEX
Pantion "Should the system run continuously as a background process, or does she start it manually?"
You "It runs on a Mac Mini she doesn't have access to. It must run continuously." HARD
Pantion "What happens after a restart? Should it automatically pick up unprocessed files?"
You "Yes. It scans the inbox on startup and processes anything that hasn't been handled." HARD
Pantion "How does it know what's been handled? A tracking file? A database? Or does moving the file out of the inbox serve as the signal?"
You "Moving it out of the inbox. If it's in the inbox, it hasn't been processed." HARD
Pantion "How does the system know which case a new audio file belongs to?"
You "She creates a subfolder in the inbox with the case name. Every audio file in that subfolder belongs to that case." FLEX
Pantion "When she drops a second audio file in the same subfolder — is that a second conversation within the same case?"
You "Yes. Each audio file is one conversation. A case can have many conversations over weeks." HARD
Pantion "Should the report include a verbatim transcript, or just the structured summary?"
You "Both. The full transcript is saved separately. The report references it but contains only the structured summary." HARD
Pantion "Where does the generated report go?"
You "Into the case folder structure: cases/{case-id}/conversations/{conversation-id}/report.docx, next to the recording and transcript." HARD
Pantion "You said .docx. Is that a firm requirement, or would Markdown or PDF also work?"
You ".docx is required — she opens them in Word and sometimes edits before filing." HARD
Pantion "Given the domain — domestic violence, child protection — what needs to be logged? At minimum: which file was processed, timestamps per step, success or failure with reason, and which model version generated the transcript and report. If you switch models later, you need to trace which model produced which report. This matters for accountability."
You "Yes. Log all of that. Append-only. Every step timestamped. Model version in every log entry." HARD
Pantion "What must the system never do?"
You "Never send data to external services. Never delete files — only move them. Never overwrite a report that already exists." HARD
Pantion "Can she drop multiple files at once? Does order matter?"
You "First in, first out. If she drops five files, they process sequentially." FLEX
Pantion "What is your success criterion for the MVP?"
You "An audio file in the inbox leads to a report in the case folder — without manual intervention." HARD
Pantion "Last question. The NAS path — is it a fixed mount point, or should it be configurable?"
You "Configurable via environment variable. Never hardcoded." HARD
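The HARD constraints from this dialog compose into a small set of processing invariants: presence in the inbox means "not yet processed," the subfolder name is the case id, files are moved and never deleted, and an existing report is never overwritten. A minimal sketch of one inbox scan, under those invariants — `transcribe` and `write_report` are hypothetical placeholders standing in for the local Whisper and Ollama steps, and the FIFO ordering uses file modification time as an assumption:

```python
import shutil
from pathlib import Path

def transcribe(audio: Path) -> str:
    # Placeholder: the real system runs Whisper large-v3 locally.
    return f"[transcript of {audio.name}]"

def write_report(transcript: str, report: Path) -> None:
    # Placeholder: the real system prompts a local LLM (Qwen 3 8B via Ollama).
    report.write_text(transcript)

def process_inbox(inbox: Path, cases: Path, errors: Path) -> None:
    """One scan of the inbox: runs at startup and whenever files arrive."""
    # FIFO: oldest dropped file first (mtime as proxy for drop order)
    for audio in sorted(inbox.glob("*/*"), key=lambda p: p.stat().st_mtime):
        case_id = audio.parent.name  # subfolder name is the case identifier
        conv_dir = cases / case_id / "conversations" / audio.stem
        report = conv_dir / "report.docx"
        if report.exists():
            continue  # HARD: never overwrite an existing report
        try:
            conv_dir.mkdir(parents=True, exist_ok=True)
            transcript = transcribe(audio)
            (conv_dir / "transcript.txt").write_text(transcript)
            write_report(transcript, report)
            # Moving out of the inbox is the 'processed' signal
            shutil.move(str(audio), str(conv_dir / audio.name))
        except Exception:
            errors.mkdir(parents=True, exist_ok=True)
            shutil.move(str(audio), str(errors / audio.name))  # move, never delete
```

Because "moved out of the inbox" is the only processing signal, a restart needs no database: rerunning `process_inbox` on startup picks up exactly what was left behind.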
Amendment 1 — after first build
After building and testing with real recordings, three things became clear: long recordings needed chunking, the template needed a confidence score, and error notifications needed more than just a folder.
You "Three things changed after real-world testing. First: some recordings are over two hours. Whisper chokes on those."
Pantion "What's the maximum duration Whisper handles reliably?"
You "About 30 minutes. Anything longer needs to be split into chunks first." HARD
Pantion "Should the system split automatically using silence detection, or fixed-length chunks?"
You "Silence detection. Fixed-length might cut mid-sentence." HARD
Pantion "After splitting and transcribing each chunk, the system needs to stitch the transcripts back together. Should it preserve chunk boundaries in the final transcript, or produce one seamless document?"
You "One seamless transcript. The chunks are an implementation detail." HARD
You "Second change: the report needs an overall confidence score. She needs to know at a glance whether the report is reliable or needs careful review."
Pantion "What drives the confidence score? Audio quality? Transcript completeness? Number of uncertain risk indicators?"
You "All three. Low audio quality, many inaudible segments, or uncertain risk flags should lower the score." HARD
Pantion "How is the score presented? Numeric (0-100), categorical (high/medium/low), or traffic light?"
You "Traffic light. Green, orange, red. With a one-line explanation." FLEX
You "Third: the error folder isn't enough. She doesn't check it. She needs a notification."
Pantion "What kind of notification? Email, push notification, a file in a specific location she does check?"
You "A simple text file in the case folder called PROCESSING_ERROR.txt. She opens the case folder to check for the report — if she sees that file instead, she knows something went wrong." HARD
Pantion "Should the error file contain the technical details, or a human-readable summary?"
You "Human-readable. 'Transcription failed — audio quality too low' is enough. Technical details go in the audit log." HARD
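The three confidence drivers from this amendment — audio quality, inaudible segments, uncertain risk flags — collapse naturally into one traffic light. A sketch of that mapping; the weights and thresholds here are illustrative assumptions, not values from the dialog:

```python
def confidence_light(audio_quality: float,
                     inaudible_ratio: float,
                     uncertain_flags: int) -> tuple[str, str]:
    """Combine the three dialog signals into green/orange/red plus one line.

    audio_quality and inaudible_ratio are in [0, 1]; uncertain_flags counts
    risk indicators marked 'requires review'. Thresholds are assumptions.
    """
    score = audio_quality
    score -= inaudible_ratio        # many inaudible segments lower the score
    score -= 0.1 * uncertain_flags  # each open risk flag lowers it further
    if score >= 0.7:
        return "green", "Reliable: clear audio, no open risk flags."
    if score >= 0.4:
        return "orange", "Review recommended: some uncertainty in the source."
    return "red", "Careful review required: low-quality audio or open flags."
```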
Amendment 2 — after compliance review
The agency's compliance officer reviewed the system. Two new requirements emerged: data retention limits and the ability to regenerate reports when the template changes.
You "Compliance says we need a retention policy. Audio recordings must be deletable after the case closes. But the transcripts and reports stay."
Pantion "That conflicts with the existing HARD constraint: 'never delete files.' Do you want to amend that constraint?"
You "Yes. Amend it to: never delete files during processing. After a case is explicitly marked as closed, audio files may be purged. Transcripts and reports are permanent." HARD
Pantion "How is a case marked as closed? A file in the case folder? A command? A separate admin interface?"
You "She creates a file called CLOSED in the case folder. That's the signal." HARD
Pantion "When should the audio purge happen? Immediately when CLOSED appears, or after a retention period?"
You "After 30 days. Compliance wants a grace period." FLEX
Pantion "Should the purge be logged in the audit trail?"
You "Yes. Every purged file, with timestamp and the hash of the original file." HARD
You "Second thing: the agency is updating their report template. When the template changes, she needs to regenerate reports for open cases."
Pantion "Regenerate from the original transcript, or re-transcribe the audio too?"
You "From the transcript. The transcription doesn't change — only the report structure." HARD
Pantion "When she regenerates, should it overwrite the existing report or create a versioned copy?"
You "Versioned. The original report stays. The new one gets a version suffix: report-v2.docx, report-v3.docx." HARD
Pantion "How does she trigger regeneration? Per case, per conversation, or all open cases at once?"
You "All open cases at once. She drops a file called REGENERATE in the root inbox. The system walks all non-closed cases and regenerates." HARD
Pantion "Should regeneration be logged separately from normal processing?"
You "Yes. The audit log should distinguish between 'generated' and 'regenerated', including which template version was used." HARD
What a coding agent would have assumed
Without this dialog, a coding agent receiving "turn audio recordings into reports" would silently decide:
- Cloud transcription API — Pantion asked → must be local, HARD
- Copy files, keep originals in place — Pantion asked → must move from inbox, never delete during processing
- No audit logging — Pantion asked → legally critical in this domain, append-only with model versions
- Manual restart after reboot — Pantion asked → must auto-recover, HARD
- Delete failed files — Pantion asked → never delete, only move. Error notification in case folder
- Web upload interface — Pantion asked → NAS folder drop, no GUI needed
- Markdown or PDF output — Pantion asked → .docx required, she edits in Word
- No chunking for long audio — discovered after first build → silence detection, seamless stitching
- No confidence score — discovered after first build → traffic light based on audio quality + uncertainty
- No retention policy — discovered after compliance review → purge audio 30 days after case closure
- No report versioning — discovered after compliance review → versioned .docx, never overwrite
Eleven wrong assumptions. Four of them only surfaced after building and using the system — which is exactly why amendments exist.
The original dialog took 20 minutes. The two amendments took 10 minutes each. 40 minutes of dialog replaced weeks of rebuilding the wrong thing — in a domain where wrong assumptions have real consequences for vulnerable people.