
Overview

Rotavision’s trust scoring system provides a unified framework for measuring AI system trustworthiness across four dimensions: fairness, reliability, explainability, and privacy. Each dimension is scored from 0 to 100, with higher scores indicating greater trustworthiness.

Trust Dimensions

Fairness

Measures equitable treatment across protected groups

Reliability

Measures consistency and stability of predictions

Explainability

Measures how well predictions can be understood

Privacy

Measures data protection and privacy preservation

Overall Trust Score

The overall trust score is a weighted combination of individual dimensions:
Trust Score = Σ (dimension_score × dimension_weight)
Default weights are:
  • Fairness: 30%
  • Reliability: 30%
  • Explainability: 25%
  • Privacy: 15%
Weights can be customized based on your industry and regulatory requirements. Financial services often increase fairness weight, while healthcare may prioritize explainability.
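As a minimal sketch of the weighted sum above, the following Python computes an overall score from four dimension scores using the default weights. The function name `overall_trust_score` and the example scores are illustrative assumptions, not part of the Rotavision SDK.

```python
# Default weights from this section, expressed as fractions that sum to 1.0.
DEFAULT_WEIGHTS = {
    "fairness": 0.30,
    "reliability": 0.30,
    "explainability": 0.25,
    "privacy": 0.15,
}


def overall_trust_score(scores, weights=None):
    """Weighted combination of 0-100 dimension scores (hypothetical helper)."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1.0, got {total}")
    return sum(scores[name] * weights[name] for name in weights)


# Made-up dimension scores, for illustration only.
example_scores = {
    "fairness": 80,
    "reliability": 90,
    "explainability": 70,
    "privacy": 60,
}
print(round(overall_trust_score(example_scores), 2))
```

Custom weights can be passed as a second argument, for example a dictionary that raises the fairness weight and lowers the others so that the total stays at 1.0.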

Fairness Metrics

Vishwas calculates fairness using industry-standard metrics:
| Metric | Description | Threshold |
| --- | --- | --- |
| Demographic Parity | Equal positive prediction rates across groups | ≥ 0.80 |
| Equalized Odds | Equal TPR and FPR across groups | ≥ 0.80 |
| Calibration | Predicted probabilities match actual outcomes | ≥ 0.80 |
| Individual Fairness | Similar individuals receive similar predictions | ≥ 0.75 |
| Counterfactual Fairness | Predictions unchanged if protected attributes changed | ≥ 0.80 |
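To make the equalized odds row concrete, here is a rough Python sketch that compares the true positive rate (TPR) and false positive rate (FPR) of two groups. The source does not say how Vishwas reduces equalized odds to a single number, so taking the worse of the two min/max ratios is an assumption made here for illustration, as are all function names and the toy data.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary 0/1 data.

    Raises ZeroDivisionError if the data has no positives or no negatives.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def symmetric_ratio(a, b):
    """Smaller value divided by larger value; 1.0 when the two are equal."""
    return 1.0 if a == b else min(a, b) / max(a, b)


def equalized_odds_ratio(y_true, y_pred, groups, group_a, group_b):
    """Assumed scoring rule: the worse of the TPR ratio and the FPR ratio."""
    def rates_for(group):
        rows = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == group]
        return tpr_fpr([t for t, _ in rows], [p for _, p in rows])

    tpr_a, fpr_a = rates_for(group_a)
    tpr_b, fpr_b = rates_for(group_b)
    return min(symmetric_ratio(tpr_a, tpr_b), symmetric_ratio(fpr_a, fpr_b))


# Toy data: group "A" has TPR 0.75, group "B" has TPR 0.50; both have FPR 0.25.
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["A"] * 8 + ["B"] * 8
eo_ratio = equalized_odds_ratio(y_true, y_pred, groups, "A", "B")
print(f"equalized odds ratio: {eo_ratio:.2f}, meets 0.80 threshold: {eo_ratio >= 0.80}")
```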

Calculating Demographic Parity

# Demographic parity ratio
dp_ratio = P(Ŷ=1 | A=minority) / P(Ŷ=1 | A=majority)

# Score (0-100)
fairness_score = min(dp_ratio, 1/dp_ratio) * 100
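The two formulas above can be sketched in Python as follows. The helper names, the handling of zero rates, and the toy predictions are assumptions for illustration and are not taken from the Vishwas API.

```python
def positive_rate(y_pred, groups, group):
    """Share of positive (1) predictions within one group."""
    selected = [p for p, g in zip(y_pred, groups) if g == group]
    if not selected:
        raise ValueError(f"no rows for group {group!r}")
    return sum(selected) / len(selected)


def demographic_parity_score(y_pred, groups, minority, majority):
    """Return (dp_ratio, fairness_score) following the formulas above."""
    rate_min = positive_rate(y_pred, groups, minority)
    rate_maj = positive_rate(y_pred, groups, majority)
    if rate_maj == 0:
        raise ValueError("majority positive rate is zero; the ratio is undefined")
    dp_ratio = rate_min / rate_maj
    if dp_ratio == 0:
        # Assumption: a group with no positive predictions scores 0.
        return dp_ratio, 0.0
    # min(r, 1/r) keeps the score symmetric whichever group is favoured.
    return dp_ratio, min(dp_ratio, 1 / dp_ratio) * 100


# Toy data: 3 of 10 minority rows and 5 of 10 majority rows are predicted positive.
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
groups = ["minority"] * 10 + ["majority"] * 10
dp_ratio, fairness_score = demographic_parity_score(y_pred, groups, "minority", "majority")
print(f"dp_ratio={dp_ratio:.2f}, fairness_score={fairness_score:.1f}")
```

Swapping the minority and majority arguments inverts the ratio but leaves the score unchanged, which is the purpose of the `min(dp_ratio, 1/dp_ratio)` term.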

Multi-Group Fairness

For attributes with multiple groups (e.g., states, languages), Rotavision calculates:
  1. Pairwise ratios between all group pairs
  2. Minimum ratio as the fairness bound
  3. Weighted average based on group sizes
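The three steps above might be sketched like this in Python. The source does not specify how the weighted average is formed, so weighting each pairwise ratio by the combined size of its two groups is an assumption; the function name, group labels, and figures are also illustrative.

```python
from itertools import combinations


def multi_group_fairness(positive_rates, group_sizes):
    """Return (pairwise ratios, minimum ratio, size-weighted average ratio).

    positive_rates: {group: share of positive predictions}
    group_sizes:    {group: number of rows}
    """
    if len(positive_rates) < 2:
        raise ValueError("need at least two groups to compare")

    # Step 1: min/max ratio for every unordered pair of groups.
    pair_ratios = {}
    for a, b in combinations(sorted(positive_rates), 2):
        low, high = sorted((positive_rates[a], positive_rates[b]))
        pair_ratios[(a, b)] = 1.0 if high == 0 else low / high

    # Step 2: the worst pair bounds the fairness of the whole attribute.
    min_ratio = min(pair_ratios.values())

    # Step 3 (assumed weighting): weight each pair by its combined group size.
    weights = {pair: group_sizes[pair[0]] + group_sizes[pair[1]] for pair in pair_ratios}
    weighted_avg = sum(pair_ratios[p] * weights[p] for p in pair_ratios) / sum(weights.values())
    return pair_ratios, min_ratio, weighted_avg


# Made-up positive prediction rates and group sizes for three language groups.
rates = {"hindi": 0.50, "tamil": 0.40, "bengali": 0.45}
sizes = {"hindi": 600, "tamil": 300, "bengali": 100}
pair_ratios, min_ratio, weighted_avg = multi_group_fairness(rates, sizes)
print(f"min ratio={min_ratio:.3f}, weighted average={weighted_avg:.3f}")
```

Reporting the minimum alongside the weighted average is useful because a large, well-treated group can otherwise mask a poorly treated small one.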

Reliability Metrics

Guardian monitors reliability through:
| Metric | Description | Alert Threshold |
| --- | --- | --- |
| Prediction Drift | KL divergence of output distribution | > 0.1 |
| Feature Drift | PSI of input features | > 0.2 |
| Accuracy Decay | Drop in monitored accuracy metric | > 5% |
| Latency P99 | 99th percentile response time | > SLA |
| Error Rate | Percentage of failed predictions | > 1% |
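For the prediction drift row, a minimal Python sketch of KL divergence between a baseline and a current output distribution is shown below. The direction KL(current || baseline), the natural logarithm, and the `eps` floor for empty bins are assumptions; the source does not state which conventions Guardian uses.

```python
import math


def kl_divergence(current, baseline, eps=1e-12):
    """KL(current || baseline) in nats for two discrete distributions.

    Both inputs are lists of probabilities over the same bins.
    """
    if len(current) != len(baseline):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for p, q in zip(current, baseline):
        if p > 0:
            # Floor q at eps so an empty baseline bin does not divide by zero.
            total += p * math.log(p / max(q, eps))
    return total


# Made-up share of negative/positive predictions at baseline and in two later windows.
baseline = [0.7, 0.3]
mild_shift = [0.5, 0.5]
large_shift = [0.4, 0.6]

for name, dist in [("mild", mild_shift), ("large", large_shift)]:
    kl = kl_divergence(dist, baseline)
    print(f"{name}: KL={kl:.3f}, alert={kl > 0.1}")
```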

Drift Detection

# Population Stability Index (PSI)
psi = Σ (actual_% - expected_%) × ln(actual_% / expected_%)

# Interpretation
# PSI < 0.1  → No significant drift
# PSI 0.1-0.2 → Moderate drift (monitor)
# PSI > 0.2  → Significant drift (investigate)
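The PSI formula and its interpretation bands can be sketched in Python as below. The function names, the `eps` floor for empty bins, and the example bin counts are illustrative assumptions rather than Guardian's implementation.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index over pre-binned counts."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    value = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor each share at eps so empty bins do not break the logarithm.
        expected_pct = max(expected / expected_total, eps)
        actual_pct = max(actual / actual_total, eps)
        value += (actual_pct - expected_pct) * math.log(actual_pct / expected_pct)
    return value


def interpret_psi(value):
    """Map a PSI value to the bands listed above."""
    if value < 0.1:
        return "no significant drift"
    if value <= 0.2:
        return "moderate drift (monitor)"
    return "significant drift (investigate)"


# Made-up counts per bin for one feature: training data vs. two production windows.
training = [250, 250, 250, 250]
window_a = [200, 250, 250, 300]
window_b = [100, 200, 300, 400]

for name, window in [("window_a", window_a), ("window_b", window_b)]:
    value = psi(training, window)
    print(f"{name}: PSI={value:.3f} -> {interpret_psi(value)}")
```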

Explainability Scores

Measured through explanation quality metrics:
| Metric | Description |
| --- | --- |
| Faithfulness | How accurately explanations reflect model behavior |
| Stability | Consistency of explanations for similar inputs |
| Comprehensibility | Human-understandable explanation complexity |
| Completeness | Coverage of important features in explanations |

Score Interpretation

Scores fall into four bands, listed here from highest to lowest:
  • Model meets highest trust standards. Suitable for high-stakes decisions with minimal additional oversight.
  • Model is generally trustworthy. Consider targeted improvements for specific dimensions below threshold.
  • Significant trust gaps exist. Recommend human oversight and a remediation plan before production use.
  • Model does not meet minimum trust requirements. Do not deploy without major improvements.

Industry Benchmarks

Based on our analysis of enterprise AI deployments in India:
| Industry | Average Trust Score | Top Quartile |
| --- | --- | --- |
| Banking & Finance | 72 | 85+ |
| Insurance | 68 | 82+ |
| Healthcare | 65 | 80+ |
| E-commerce | 70 | 83+ |
| Telecom | 74 | 86+ |

Regulatory Alignment

Rotavision trust scores map to regulatory requirements:
| Regulation | Relevant Dimensions |
| --- | --- |
| RBI AI Guidelines | Fairness, Explainability |
| DPDP Act 2023 | Privacy, Transparency |
| SEBI ML Circular | Reliability, Auditability |
| IRDAI AI Guidelines | Fairness, Explainability |
Generate compliance-ready reports with vishwas.generate_report() that map your scores to specific regulatory requirements.

Improving Trust Scores

  1. Identify Gaps: Review dimension-level scores to find areas below threshold.
  2. Analyze Root Causes: Use Vishwas explanations to understand why specific metrics are low.
  3. Implement Mitigations: Apply recommended techniques (resampling, threshold adjustment, etc.).
  4. Monitor Continuously: Set up Guardian alerts to catch score degradation early.
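The first step, identifying gaps, could be sketched in plain Python as follows. The `find_gaps` helper, the threshold of 75, and the example scores are hypothetical; use whatever thresholds your own policy defines.

```python
def find_gaps(scores, threshold):
    """Return (dimension, shortfall) pairs below threshold, largest shortfall first."""
    gaps = [
        (name, threshold - score)
        for name, score in scores.items()
        if score < threshold
    ]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Made-up dimension scores and a hypothetical minimum of 75 per dimension.
dimension_scores = {
    "fairness": 68,
    "reliability": 85,
    "explainability": 74,
    "privacy": 90,
}
for name, shortfall in find_gaps(dimension_scores, threshold=75):
    print(f"{name}: {shortfall} points below threshold")
```

Sorting by shortfall puts the dimension that needs the most remediation first, which is a reasonable starting order for root-cause analysis.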