
LLMs feel random when they’re treated like text boxes. They feel reliable when they’re treated like nodes in a workflow with contracts, control loops, and safety rails. In this article, we’ll explore how to move from prompt luck to system design using four durable patterns: function calling, JSON schemas, retries that repair, and guardrails.
What’s an “LLM node”?
Think of each LLM step as a node in a graph. It takes a structured input, runs a bounded task, and emits a structured output. You can test it. You can measure it. You can swap it without breaking downstream systems. In short, it’s a component.
Pattern 1: Function calling is the backbone
Function calling turns the model from a writer into a router and planner. The model selects tools and populates typed arguments that your system validates before execution.
- Treat every tool as a narrow API with strong typing.
- Keep argument schemas small and explicit.
- Enforce timeouts and circuit breakers around tools.
- Make tool calls idempotent or include request IDs.
- Log the full “reason -> tool -> args -> result” chain for auditability.
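The bullets above can be sketched as a small tool registry. This is a minimal illustration, not a specific framework's API: the `Tool` class, the `get_customer_profile` handler, and the validation rules are all assumptions standing in for your own typed wrappers.

```python
import json
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    arg_types: dict          # argument name -> expected Python type (kept small and explicit)
    handler: Callable
    timeout_s: float = 5.0   # enforce a hard timeout around every tool in production

TOOLS: dict[str, Tool] = {}

def register(tool: Tool) -> None:
    TOOLS[tool.name] = tool

def validate_args(tool: Tool, args: dict) -> list[str]:
    """Return a list of validation errors; empty means the call is safe to run."""
    errors = []
    for key, expected in tool.arg_types.items():
        if key not in args:
            errors.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            errors.append(f"{key}: expected {expected.__name__}, got {type(args[key]).__name__}")
    for key in args:
        if key not in tool.arg_types:
            errors.append(f"unexpected argument: {key}")
    return errors

def call_tool(name: str, args: dict, request_id: str):
    """The model proposes the call; the system validates, logs, then executes."""
    tool = TOOLS[name]
    errors = validate_args(tool, args)
    if errors:
        # Feed these errors back to the model instead of executing blindly.
        raise ValueError(f"invalid call to {name}: {errors}")
    # Log the "tool -> args" part of the audit chain (request_id enables idempotency).
    print(json.dumps({"request_id": request_id, "tool": name, "args": args}))
    return tool.handler(**args)

# Example: a narrow, hypothetical CRM lookup
register(Tool("get_customer_profile", {"account_id": str},
              lambda account_id: {"account_id": account_id, "industry": "SaaS"}))
```

The key design choice: validation failures raise before the handler runs, so a bad argument becomes a repairable error rather than a side effect.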
Pattern 2: JSON schemas make outputs contract-first
Natural language is ambiguous. Contracts aren’t. Ask the model to produce a JSON object that matches a known schema; validate it before anything downstream runs. If it doesn’t validate, you repair or fall back.
Minimal example:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "CampaignBrief",
  "type": "object",
  "additionalProperties": false,
  "required": ["audience", "offer", "channels", "kpis"],
  "properties": {
    "audience": { "type": "string", "minLength": 3 },
    "offer": { "type": "string", "minLength": 3 },
    "channels": {
      "type": "array",
      "minItems": 1,
      "items": { "type": "string", "enum": ["email", "linkedin", "x", "blog", "ads"] }
    },
    "tone": { "type": "string", "enum": ["direct", "casual", "formal"] },
    "kpis": {
      "type": "array",
      "minItems": 1,
      "items": { "type": "string" }
    },
    "constraints": {
      "type": "array",
      "items": { "type": "string" }
    }
  }
}
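Enforcing the contract means parsing and validating before anything downstream consumes the output. In production you would validate against the full schema with a library like jsonschema; the stdlib sketch below checks only the required fields and the channels enum from the schema above, which is enough to show the gate:

```python
import json

REQUIRED_FIELDS = ["audience", "offer", "channels", "kpis"]
CHANNELS_ENUM = ["email", "linkedin", "x", "blog", "ads"]

def validate_brief(raw: str):
    """Parse model output and return (brief, errors).
    A partial check, not a full JSON Schema validator."""
    errors = []
    try:
        brief = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc}"]
    for field in REQUIRED_FIELDS:
        if field not in brief:
            errors.append(f"missing required field: {field}")
    channels = brief.get("channels", [])
    if not isinstance(channels, list) or not channels:
        errors.append("channels must be a non-empty array")
    else:
        bad = [c for c in channels if c not in CHANNELS_ENUM]
        if bad:
            errors.append(f"unknown channels: {bad}")
    # Return the brief only when it is clean; callers repair or fall back otherwise.
    return (brief if not errors else None), errors
```

Note that the errors are specific strings, not booleans: the repair loop in the next pattern feeds them straight back to the model.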
Pattern 3: Retries that repair, not repeat
Plain retries amplify randomness. Useful retries target the failure.
- Detect failure class: schema mismatch, tool error, policy violation, or low confidence.
- Feed the exact error back to the model with a short repair prompt.
- Apply exponential backoff with jitter and a hard cap on attempts.
- Use small, local fixes; don’t regenerate everything if one field is wrong.
- If the node can’t be repaired within N attempts, trigger a deterministic fallback.
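A targeted repair loop following those rules might look like the sketch below. `call_model`, `validate`, and `fallback` are placeholders for your LLM call, schema check, and deterministic fallback; the backoff parameters are illustrative.

```python
import random
import time

MAX_ATTEMPTS = 3  # hard cap on attempts before falling back

def generate_with_repair(call_model, validate, fallback, base_delay_s=0.5):
    """Repair, don't repeat: each retry carries the exact validation errors
    from the previous attempt so the model fixes the failure, not the prompt."""
    errors: list[str] = []
    for attempt in range(MAX_ATTEMPTS):
        output = call_model(errors)   # errors == [] on the first attempt
        errors = validate(output)     # returns a list of specific error strings
        if not errors:
            return output             # valid: done
        # Exponential backoff with jitter before the next repair attempt.
        time.sleep(base_delay_s * (2 ** attempt) * random.uniform(0.5, 1.5))
    return fallback()                 # deterministic fallback after N attempts
```

Because `call_model` receives the previous errors, the repair prompt can be as narrow as "fix these fields", preserving everything that already validated.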
Pattern 4: Guardrails everywhere
Guardrails reduce the chance you need a retry at all.
Put them on inputs, outputs, and tools:
- Input filters for PII, profanity, and obvious prompt injection.
- Output scanners for policy, PII, and disallowed claims.
- Allowlists for tools and arguments; block risky combinations.
- Rate limits and concurrency caps to protect downstream services.
- Human-in-the-loop checkpoints for high-risk actions.
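As a sketch of the first three bullets: input screening and a tool allowlist. Real deployments use dedicated scanners (PII detection services, policy classifiers, injection detectors); these regexes and tool names are deliberately simplistic assumptions.

```python
import re

# Toy patterns standing in for production scanners.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
INJECTION_RE = re.compile(r"ignore (all )?(previous|prior) instructions", re.I)
TOOL_ALLOWLIST = {"get_customer_profile", "fetch_competitive_notes"}

def screen_input(text: str) -> list[str]:
    """Pre-screen a request; a non-empty result blocks or flags it."""
    issues = []
    if EMAIL_RE.search(text):
        issues.append("possible PII: email address")
    if INJECTION_RE.search(text):
        issues.append("possible prompt injection")
    return issues

def allow_tool_call(name: str) -> bool:
    """Deny by default: only explicitly allowlisted tools may execute."""
    return name in TOOL_ALLOWLIST
```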
The reliability pipeline (end to end)
- Normalize the request and attach context from your datastore.
- Pre-screen inputs for policy and injection.
- Ask the LLM to plan, then call functions with typed args.
- Validate tool args before executing, then run the tool.
- Ask the LLM to assemble the final output as JSON.
- Validate JSON against schema; if invalid, run a repair loop.
- Scan output for policy and safety.
- On success, persist with a content hash and trace ID.
- On failure, fall back to a deterministic template or a previous good version.
- Emit metrics and traces; sample payloads for offline review.
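The "persist with a content hash and trace ID" step can be sketched as follows; `store` is a stand-in for your database, and the field names are assumptions.

```python
import hashlib
import json
import uuid

def persist(brief: dict, store: dict) -> str:
    """Store a validated brief under a fresh trace ID, with a content hash
    for deduplication and audit. Sketch only: `store` models a real datastore."""
    # Canonical serialization so identical briefs always hash the same.
    payload = json.dumps(brief, sort_keys=True)
    trace_id = str(uuid.uuid4())
    store[trace_id] = {
        "content_hash": hashlib.sha256(payload.encode()).hexdigest(),
        "brief": brief,
    }
    return trace_id
```

The trace ID ties the stored artifact back to the metrics and traces emitted in the last step, so any output can be traced to the exact run that produced it.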
Example
Task: Create a B2B campaign brief from CRM data.
- Node 1: Plan. The model chooses get_customer_profile(account_id) and fetch_competitive_notes(account_id).
- Node 2: Tools. Both functions return typed objects; timeouts and idempotency enforced.
- Node 3: Synthesis. The model outputs a CampaignBrief JSON that must validate.
- Node 4: Repair. If channels is empty, the validator throws; the model is asked to fill it using the provided CRM industry hints.
- Node 5: Safety. Output scanner blocks disallowed claims and PII leaks.
- Node 6: Publish. The brief is stored with a trace ID, and a human-friendly version is rendered from the JSON.
Starter checklist
- Define schemas for every LLM output you’ll consume.
- Wrap each external capability as a typed function.
- Add pre- and post-filters for safety and injection.
- Implement targeted repair loops with clear error messages.
- Set fallbacks per node; don’t fail the whole flow.
- Log traces with model, prompt template version, and tool payloads.
- Track conformance, retries, and policy events in a dashboard.
- Write unit tests with recorded prompts and golden JSON fixtures.
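The last checklist item, golden JSON fixtures, can look like this. The fixture content and the conformance check are illustrative; a real suite would diff against the full recorded output.

```python
# A known-good output recorded once from the pipeline (illustrative content).
GOLDEN_BRIEF = {
    "audience": "B2B SaaS ops leads",
    "offer": "free workflow audit",
    "channels": ["email", "linkedin"],
    "kpis": ["demo bookings"],
}

def conforms_to_golden(candidate: dict, golden: dict) -> bool:
    """Check the candidate carries every golden field with a matching type,
    so refactors of prompts or tools can't silently break the contract."""
    return all(
        key in candidate and type(candidate[key]) is type(value)
        for key, value in golden.items()
    )
```

Run this in CI with recorded prompts: if a prompt-template change drops a field or changes its type, the fixture test fails before production does.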
Prompts still matter, just differently
Prompts become interfaces, not essays. Keep them short, instruct on structure first, and reserve style tuning for the final render. The model’s job is to produce valid, useful data; your system’s job is to make that data safe and actionable.