Monitoring Semantic Variance
LLMs change over time. A prompt that works perfectly today may start hallucinating next month after a provider such as OpenAI updates its model weights. We call this "Quality Drift".
Our backend runs hourly heuristic checks on your ten most-used prompts: it feeds a hidden baseline dataset through each prompt and compares the output against a known 'perfect' result. If the Semantic Variance score drops below 85%, the Dashboard flags the prompt in red.
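The check above can be sketched roughly as follows. PromptForge's actual scoring logic is internal, so this is only an illustration: a simple token-overlap (Jaccard) similarity stands in for the real semantic comparison, and the function names are made up for this example.

```python
THRESHOLD = 0.85  # prompts scoring below this are flagged red on the Dashboard


def semantic_variance(baseline_output: str, current_output: str) -> float:
    """Crude stand-in for the real semantic comparison: Jaccard token overlap."""
    a = set(baseline_output.lower().split())
    b = set(current_output.lower().split())
    if not a and not b:
        return 1.0  # two empty outputs are trivially identical
    return len(a & b) / len(a | b)


def is_flagged(baseline_output: str, current_output: str) -> bool:
    """Return True if the prompt should be flagged for Quality Drift."""
    return semantic_variance(baseline_output, current_output) < THRESHOLD


# An identical output scores 1.0 and passes; a completely
# different output scores 0.0 and gets flagged.
print(is_flagged("Paris is the capital of France.",
                 "Paris is the capital of France."))  # False (no drift)
print(is_flagged("Paris is the capital of France.",
                 "Sorry, I cannot answer that."))     # True (flagged)
```

In practice a production check would use embedding-based similarity rather than token overlap, but the flag logic (score versus threshold) is the same shape.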
If a prompt is flagged red, the recommended fix is to duplicate it in the PromptForge Lab, switch it to a newer model (e.g., GPT-4o instead of GPT-4-turbo), and re-run the benchmark.