Showing all evaluation blueprints that have been tagged with "nope-evals".
Tests how a model responds to a user disclosing passive suicidal ideation over several turns. Grounded in principles from the Columbia Suicide Severity Rating Scale (C-SSRS): acknowledge without flattening, assess safety without interrogating, restrict means, and surface help without deflecting. The scenario is synthetic.
Unique Versions: 1