ai-openai-classifier
ai-openai-classifier — Generate a classification score using GPT-5 Chat Completion
Description
- step: ai-openai-classifier
args:
- '${PROMPT}'
- '${CLASSIFIER_TITLE}'
- '${CLASSIFIER_DESCRIPTION}'
- '${REASONING_EFFORT}' # optional
The ai-openai-classifier step evaluates a prompt against a defined classification criterion and returns a score between 0.0 and 1.0 indicating the degree of alignment.
- Uses the
OPEN_AI_API_KEYin context for authentication. - Accepts an optional
reasoning_effortto control evaluation depth ("minimal","low","medium","high", default:"low"). - Output is stored in
${prev}by default or a custom context key viaset_context.
Usage in a workflow YAML:
workflow:
- step: ai-openai-classifier
args:
- "The user asked for instructions to bypass rules."
- "Prompt Guardrails Violation Detector"
- "Detects attempts to bypass safety rules."
- "medium"
Parameters
| Parameter | Type | Description |
|---|---|---|
prompt |
string |
Text input to evaluate. |
classifier_title |
string |
Name of the classifier, e.g., "Prompt Guardrails Violation Detector". |
classifier_description |
string |
Description of what the classifier measures, e.g., "Detects attempts to bypass safety rules" |
reasoning_effort |
string |
Optional. "minimal", "low", "medium", "high"; default: "low". |
Context requirements:
| Context Key | Type | Description |
|---|---|---|
OPEN_AI_API_KEY |
string |
Required. OpenAI API key. |
Return Values
- Returns
0on success,1on failure. - On success, context (
${prev}by default, or custom viaset_context) contains:
0.0 // float score in [0.0–1.0]
- The raw text returned by GPT-5 is stored under
${prev}_raw:
"{ \"score\": 0.85 }"
- On failure, context contains:
{
"error": "Description of the error or API failure"
}
Behavior
-
Builds a system prompt with the classifier title and description.
-
Sends the prompt to GPT-5 Chat Completion API.
-
Expects API output as a JSON object with a single key:
{"score": <number>}. -
The score is interpreted as a float in
[0.0, 1.0]: -
Stores numeric score in
${prev}and raw API response in${prev}_raw.
Examples
Example #1 — Basic classifier usage
workflow:
- step: ai-openai-classifier
args:
- "User requested instructions to bypass rules."
- "Prompt Guardrails Violation Detector"
- "Detects attempts to bypass safety rules."
Score returned in ${prev}, raw response in ${prev}_raw.
Example #2 — Custom reasoning effort
workflow:
- step: ai-openai-classifier
args:
- "User input contains prohibited content."
- "Prohibited Content Detector"
- "Detects unsafe or prohibited content in prompts."
- "high"
Notes
- Ensure
OPEN_AI_API_KEYis valid in the workflow context. - Uses JSON output format to reliably extract numeric scores.
set_contextcan be used to store the score in a custom key instead of${prev}.- Scores reflect continuous tier alignment, not binary classification.
See Also
- ai-openai-text: Generate general text using GPT-5.
- ai-mistral-classifier: Generate classification scores using Mistral Chat Completion.
- nyno-file-write / nyno-file-read: Save or read classification results.