Metric Calibration Dashboard

Interactively configure and validate risk thresholds and fairness metrics for your organization's context.

Active Metrics: 42
Calibrated: 38
Pending Calibration: 4
Last Calibration: 2 hours ago

Metric Threshold Calibration
Demographic Parity Difference

Measures the difference in positive outcome rates between demographic groups

Threshold Configuration

Green Threshold: 95%

Amber Threshold: 85%

Current Performance

Models in Green: 142
Models in Amber: 87
Models in Red: 18
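
The green and amber thresholds split models into three bands. A minimal sketch of that banding follows, assuming a per-model score in percent where higher is better; the dashboard does not state the direction of the comparison, and the function names and example scores are hypothetical.

```python
from collections import Counter


def classify(score: float, green: float = 95.0, amber: float = 85.0) -> str:
    """Return the band for a score in percent.

    Assumption: higher scores are better, so a score at or above the
    green threshold is green, at or above the amber threshold is amber,
    and anything lower is red.
    """
    if amber > green:
        raise ValueError("amber threshold must not exceed green threshold")
    if score >= green:
        return "green"
    if score >= amber:
        return "amber"
    return "red"


def band_counts(scores, green: float = 95.0, amber: float = 85.0) -> dict:
    """Count how many scores fall into each band."""
    counts = Counter(classify(s, green, amber) for s in scores)
    return {band: counts.get(band, 0) for band in ("green", "amber", "red")}


# Hypothetical scores for illustration; not taken from the dashboard.
example_scores = [97.2, 95.0, 91.4, 85.0, 84.9, 60.0]
print(band_counts(example_scores))
```

Moving either threshold changes the counts without changing any model, which is why a threshold update is logged together with the number of models it affects.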

Formula: |P(Ŷ=1|D=privileged) - P(Ŷ=1|D=unprivileged)| < threshold
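
As an illustration of the formula, the sketch below computes the demographic parity difference from binary predictions for two groups. The sample predictions are made up, and the 0.05 threshold is only an assumption taken from the most recent value in the calibration log.

```python
def positive_rate(predictions) -> float:
    """Fraction of predictions equal to 1, i.e. an estimate of P(Y_hat = 1)."""
    if not predictions:
        raise ValueError("group has no predictions")
    return sum(1 for p in predictions if p == 1) / len(predictions)


def demographic_parity_difference(privileged, unprivileged) -> float:
    """Absolute gap in positive outcome rates between the two groups."""
    return abs(positive_rate(privileged) - positive_rate(unprivileged))


# Hypothetical binary predictions for each group.
privileged = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]    # 7 of 10 positive
unprivileged = [1, 0, 1, 0, 1, 0, 1, 1, 0, 1]  # 6 of 10 positive

threshold = 0.05  # assumed; matches the latest value in the calibration log
dpd = demographic_parity_difference(privileged, unprivileged)
print(f"difference = {dpd:.2f}, within threshold: {dpd < threshold}")
```

Here the positive rates are 0.7 and 0.6, so the difference is 0.1 and the check against a 0.05 threshold fails.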

Calibration History & Trends
Threshold Evolution
[Chart: Green Threshold and Amber Threshold by month, Jan to Jun; y-axis ticks at 60, 80, 100]
Model Distribution Changes
Weighted Risk Score Configuration

Configure the weights for each risk dimension in the unified AI Risk Score calculation

AI Risk Score (ARS) = Σ(w_i × s_i), where w_i is the weight and s_i is the normalized score of risk dimension i

Fairness Bias: 15%
Security: 12%
Drift: 10%
Opaqueness: 8%
Robustness: 10%
Privacy: 12%
Performance: 8%
Compliance: 15%
Operational: 5%
Value: 5%


Total Weight: 100%
Valid Configuration
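
The check behind "Valid Configuration" and the ARS formula can be sketched as follows. This assumes weights are whole percentages that must total exactly 100 and that each dimension score is already normalized to the range 0 to 1; the dictionary keys and function names are hypothetical.

```python
# Weights in percent, as shown in the configuration panel.
WEIGHTS = {
    "fairness_bias": 15,
    "security": 12,
    "drift": 10,
    "opaqueness": 8,
    "robustness": 10,
    "privacy": 12,
    "performance": 8,
    "compliance": 15,
    "operational": 5,
    "value": 5,
}


def is_valid(weights: dict) -> bool:
    """A configuration is valid if no weight is negative and the total is 100."""
    return all(w >= 0 for w in weights.values()) and sum(weights.values()) == 100


def ai_risk_score(weights: dict, scores: dict) -> float:
    """ARS = sum of w_i * s_i, with weights converted from percent to fractions."""
    if not is_valid(weights):
        raise ValueError("weights must be non-negative and total 100%")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[d] * scores[d] for d in weights) / 100


# Hypothetical normalized scores: every dimension at 0.5.
scores = {dimension: 0.5 for dimension in WEIGHTS}
print(is_valid(WEIGHTS), ai_risk_score(WEIGHTS, scores))
```

Because the weights total 100%, the ARS stays in the same 0 to 1 range as the individual scores.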
Recent Calibration Actions
Timestamp         Metric              Action             Previous Value  New Value  User          Impact
2024-01-15 14:30  Demographic Parity  Threshold Update   0.08            0.05       Sarah Chen    23 models affected
2024-01-15 11:15  PSI                 Weight Adjustment  12%             10%        Mike Johnson  Portfolio-wide
2024-01-14 16:45  Adversarial Score   Formula Change     v1.2            v1.3       AI Gov Team   178 models affected
2024-01-14 09:30  SHAP Fidelity       Threshold Update   80%             85%        Lisa Wang     12 models affected
AI-Powered Calibration Recommendations

Demographic Parity

High Priority

Tighten threshold from 0.05 to 0.03

Based on: 95% of models already meet the stricter threshold; regulatory trend toward tighter fairness requirements

Data Drift PSI

Medium Priority

Increase monitoring frequency to hourly

Based on: 3 critical models showed rapid drift in the past week; earlier detection would have flagged them sooner

Security Score Weight

High Priority

Increase weight from 12% to 18%

Based on: recent industry attacks highlight the importance of security; peer benchmarks show 15-20% is typical
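
Raising the security weight from 12% to 18% on its own would push the total to 106%, so the other weights have to give up 6 points between them. The sketch below scales the remaining weights proportionally; that rebalancing rule is an assumption made for illustration, not something the dashboard specifies, and the names are hypothetical.

```python
# Current weights in percent, as shown in the configuration panel.
WEIGHTS = {
    "fairness_bias": 15, "security": 12, "drift": 10, "opaqueness": 8,
    "robustness": 10, "privacy": 12, "performance": 8, "compliance": 15,
    "operational": 5, "value": 5,
}


def rebalance(weights: dict, dimension: str, new_weight: float) -> dict:
    """Set one weight and scale all others proportionally so the total stays 100.

    Assumption: proportional scaling is the desired policy; a real
    configuration might instead ask the user to pick which weights shrink.
    """
    if dimension not in weights:
        raise KeyError(dimension)
    if not 0 <= new_weight <= 100:
        raise ValueError("new_weight must be between 0 and 100")
    others_total = sum(w for d, w in weights.items() if d != dimension)
    if others_total == 0:
        raise ValueError("no other weights available to rebalance")
    scale = (100 - new_weight) / others_total
    rebalanced = {d: w * scale for d, w in weights.items() if d != dimension}
    rebalanced[dimension] = new_weight
    return rebalanced


naive_total = sum(WEIGHTS.values()) - WEIGHTS["security"] + 18  # 106: invalid
proposed = rebalance(WEIGHTS, "security", 18)
for name, weight in proposed.items():
    print(f"{name}: {weight:.2f}%")
```

The rebalanced weights are no longer whole percentages, which is one reason a team might prefer to adjust specific dimensions by hand.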

Explainability Threshold

Low Priority

Consider model-specific thresholds

Based on: high-risk models need 90%+ fidelity; low-risk models could accept 70%
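
Model-specific thresholds could be expressed as a lookup by risk tier. The sketch below uses the 90% and 70% figures from the recommendation and assumes the explainability threshold is the SHAP fidelity threshold from the calibration log, with 85% as the fallback for tiers not listed; the tier names and function are hypothetical.

```python
# Fidelity thresholds per risk tier, as fractions (0.90 = 90%).
FIDELITY_THRESHOLDS = {
    "high": 0.90,  # high-risk models need 90%+ fidelity
    "low": 0.70,   # low-risk models could accept 70%
}

# Assumed fallback: the portfolio-wide value from the calibration log.
DEFAULT_THRESHOLD = 0.85


def meets_fidelity_threshold(fidelity: float, risk_tier: str) -> bool:
    """Check a model's fidelity against the threshold for its risk tier."""
    threshold = FIDELITY_THRESHOLDS.get(risk_tier, DEFAULT_THRESHOLD)
    return fidelity >= threshold


# The same fidelity passes or fails depending on the tier.
for tier in ("high", "medium", "low"):
    print(tier, meets_fidelity_threshold(0.86, tier))
```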