Experiment
This experiment explores whether a schema-constrained large language model can reliably transform unstructured, natural conversation into structured, database-ready records with a human deliberately kept in the loop.
Rather than treating AI as an autonomous system, this work tests a hybrid model: conversations are captured automatically, reviewed by an administrator for legitimacy and completeness, and only then parsed by an LLM into normalized name-value pairs for downstream use. The goal is not automation at all costs, but precision, control, and trust in data extracted from real human dialogue.
Experiment goals
Can a schema-guided LLM, when paired with a human validation checkpoint, consistently extract clean, structured data from free-form human conversation without sacrificing accuracy or data integrity?
This experiment intentionally avoids treating the model as an infallible system. Instead, it assumes conversational input is messy, ambiguous, and sometimes invalid — and designs for that reality.
What the code does
At a technical level, this experiment implements an automated pipeline that converts unstructured chat transcripts into structured records suitable for storage and querying. The system is designed to be portable, secure, and deployable as a serverless backend function.
- A user completes a conversation with a digital avatar.
- The raw chat transcript is captured and stored by WordPress.
- A site administrator reviews the transcript to confirm the interaction is legitimate and substantive.
- Upon approval, the transcript is sent to a Google Gemini–powered parsing service.
- The model extracts predefined entities as strict name-value pairs.
- Parsed values are normalized and written to a Supabase database for further processing.
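As a concrete sketch of the last three steps, the serverless parsing function might look like the following. This assumes Gemini's JSON-schema structured output via its REST API and Node-style environment variables; the entity fields (full_name, email, topic), the parsed_conversations table, and the normalization rules are illustrative assumptions, not the experiment's actual schema.

```typescript
import { createClient } from "@supabase/supabase-js";

// Illustrative entity schema. Gemini's structured output constrains the model
// to return exactly these fields as a JSON object: strict name-value pairs.
const responseSchema = {
  type: "OBJECT",
  properties: {
    full_name: { type: "STRING" },
    email: { type: "STRING" },
    topic: { type: "STRING" },
  },
  required: ["full_name", "email", "topic"],
};

async function extractPairs(transcript: string): Promise<Record<string, string>> {
  const res = await fetch(
    "https://generativelanguage.googleapis.com/v1beta/models/" +
      `gemini-1.5-flash:generateContent?key=${process.env.GEMINI_API_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        contents: [
          { parts: [{ text: `Extract the requested fields from this transcript:\n${transcript}` }] },
        ],
        generationConfig: {
          responseMimeType: "application/json",
          responseSchema, // the model cannot return prose, only the declared fields
        },
      }),
    },
  );
  if (!res.ok) throw new Error(`Gemini request failed: ${res.status}`);
  const data = await res.json();
  // Structured output arrives as a JSON string in the first candidate part.
  return JSON.parse(data.candidates[0].content.parts[0].text);
}

async function storeParsed(transcriptId: string, transcript: string): Promise<void> {
  const pairs = await extractPairs(transcript);
  // Normalize before writing: trim stray whitespace, lowercase the email.
  const normalized = {
    transcript_id: transcriptId,
    full_name: pairs.full_name.trim(),
    email: pairs.email.trim().toLowerCase(),
    topic: pairs.topic.trim(),
  };
  const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);
  const { error } = await supabase.from("parsed_conversations").insert(normalized);
  if (error) throw new Error(`Supabase insert failed: ${error.message}`);
}
```

The schema constraint is what makes the output database-ready: the model is not merely asked to format its answer; it is structurally prevented from returning anything other than the declared fields.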
Human-in-the-loop
Human-in-the-loop is not a fallback; it is a design decision. This approach acknowledges a core truth of conversational interfaces: not every interaction deserves to become a database record. Rather than forcing AI to guess intent or completeness, this system allows a human reviewer to:
- Validate that a conversation is real and relevant
- Confirm that required information was actually collected
- Decide when automation should proceed
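One plausible way to wire that checkpoint, assuming raw transcripts land in a Supabase transcripts table with a status column (the table, columns, and function names here are hypothetical, and storeParsed is the parsing sketch above):

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// Approval gate: parsing runs only after an administrator flips a transcript
// from "pending_review" to "approved".
async function approveTranscript(transcriptId: string): Promise<void> {
  // The status filter prevents double-processing: an already-reviewed
  // transcript matches no rows and the update returns nothing.
  const { data, error } = await supabase
    .from("transcripts")
    .update({ status: "approved" })
    .eq("id", transcriptId)
    .eq("status", "pending_review")
    .select()
    .single();
  if (error || !data) throw new Error("Transcript not found or already reviewed");

  // Only past this human checkpoint does automation proceed.
  await storeParsed(data.id, data.raw_text);
}
```

Rejected transcripts simply never reach the parser, so an invalid conversation costs a reviewer a glance rather than the database a bad record.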
Why it is valuable
Conversational interfaces are increasingly replacing traditional forms, yet most backend systems still expect rigid structure. This experiment bridges that gap. Rather than replacing humans, the system reallocates human effort to where it matters most: judgment, not transcription. It demonstrates how:
- AI can extract meaning without requiring users to conform to rigid input patterns
- Schema constraints can dramatically improve consistency and query-ability
- Human validation can reduce error propagation in AI-assisted workflows
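For example, because every approved conversation lands in the same normalized columns, downstream consumers can filter with ordinary queries instead of re-parsing free text (again using the hypothetical parsed_conversations table from the sketch above):

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// Consistent columns mean ordinary filters replace text parsing.
const { data: pricingLeads } = await supabase
  .from("parsed_conversations")
  .select("full_name, email")
  .eq("topic", "pricing");
```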
Reference artifacts