# Let the schema drive extraction, not the prompt

- Category: product
- Author: Mihai (https://indie.md/people/mihai-balint/)
- Source: https://indie.md/events/indie-tm-10-timisoara-june-2026/
- Canonical URL: https://indie.md/advice/schema-driven-extraction-loop/

When you extract structured data with a model, do not prompt-and-pray and then parse the prose. Define the output schema first and let a framework like PydanticAI run the model in a loop, feeding validation errors back in until the result conforms or the retry limit is reached. OCRskill's API takes the fields you ask for and returns a typed object that already matches your schema, so the caller never writes brittle string parsing. The pattern turns an unreliable text generator into a dependable function: the schema is the contract, the loop is the enforcement, and the caller gets data instead of a paragraph. Any time you are tempted to regex an LLM's answer, reach for structured output with a validation loop instead.
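
The loop described above can be sketched in plain Python. This is a minimal standard-library illustration of the idea, not PydanticAI's or OCRskill's actual API: the `Invoice` schema, the `validate` and `extract` helpers, and the `fake_model` stand-in for a real LLM call are all hypothetical names invented for this example.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Invoice:
    # Hypothetical schema, used only for illustration.
    vendor: str
    total: float


def validate(raw: str) -> Invoice:
    """Parse raw model output; raise ValueError if it breaks the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    vendor = data.get("vendor")
    total = data.get("total")
    if not isinstance(vendor, str):
        raise ValueError("field 'vendor' must be a string")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("field 'total' must be a number")
    return Invoice(vendor=vendor, total=float(total))


def extract(model: Callable[[str], str], document: str, max_attempts: int = 3) -> Invoice:
    """Call the model until its output validates, feeding each error back in."""
    prompt = f"Return JSON with fields vendor (string) and total (number).\n\n{document}"
    last_error = None
    for _ in range(max_attempts):
        raw = model(prompt)
        try:
            return validate(raw)
        except ValueError as exc:
            last_error = exc
            # The validation error becomes part of the next prompt.
            prompt += f"\n\nYour previous answer was rejected: {exc}. Return corrected JSON only."
    raise ValueError(f"no valid output after {max_attempts} attempts: {last_error}")


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call: answers in prose first, then JSON once corrected.
    if "rejected" in prompt:
        return '{"vendor": "Acme", "total": 120.5}'
    return "The vendor is Acme and the total is 120.50."


invoice = extract(fake_model, "Invoice from Acme, total due 120.50")
print(invoice)
```

The caller only ever sees an `Invoice` or an exception, never free text. A framework such as PydanticAI would presumably derive both the validation and the corrective message from the schema itself, which is what removes the hand-written checks in `validate`.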
