FanFAIR
Semi-automatic assessment of dataset fairness using fuzzy logic.
FanFAIR is a rule-based approach leveraging fuzzy logic to calculate fairness metrics over a dataset and combine them into a single, comprehensive score.
It enables a semi-automatic evaluation of datasets, bridging the gap between raw data and interpretable fairness metrics in algorithmic fairness research.
Key Features
- Fuzzy Metric Fusion: Combines disparate fairness metrics into a single, comprehensive fairness score.
- Semi-Automatic Analysis: Leverages automation for data analysis while allowing for expert qualitative input (e.g., legal compliance).
- Interpretable Results: Provides visualization of linguistic variables to explain how the overall score is reached.
- Automated Reporting: Generates graphical reports, including gauge charts for overall fairness.
Quick Start
FanFAIR is designed to be as automatic as possible. However, the system allows for human intervention to define qualitative aspects like data protection laws and general data quality.
pip install fanfair
from fanfair import FanFAIR
# Initialize FanFAIR with dataset and output column
FF = FanFAIR(dataset="myfile.csv", output_column="output")
# Set qualitative metrics requiring human intervention
FF.set_compliance({
    "data_protection_law": True,
    "copyright_law": True,
    "medical_law": True,
    "non_discrimination_law": False,
    "ethics": False,
})
# Set quantitative data quality score (0.0 to 1.0)
FF.set_quality(0.9)
# Automatically perform analysis and generate report
FF.produce_report()
Technical Highlights
Fuzzy Metric Fusion
Combines multiple disparate fairness metrics into a single, interpretable overall score using fuzzy logic rules.
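To make the idea of fuzzy fusion concrete, here is a minimal sketch in plain Python. It is not FanFAIR's actual rule base: the three linguistic terms, their triangular membership functions, the rule consequents, and the names `fuzzify` and `fuse` are all assumptions chosen for illustration. Each metric in [0, 1] is fuzzified into "low", "medium", and "high", each term fires one rule with a constant consequent, and the overall score is the firing-strength-weighted average (a zero-order Sugeno scheme).

```python
# Illustrative only: terms, rules, and function names are assumptions,
# not FanFAIR's real rule base or API.

def low(x):
    """Membership in 'low': 1 at x=0, falling to 0 at x=0.5."""
    return max(0.0, 1.0 - 2.0 * x)

def medium(x):
    """Membership in 'medium': triangle peaking at x=0.5."""
    return max(0.0, 1.0 - abs(2.0 * x - 1.0))

def high(x):
    """Membership in 'high': 0 at x=0.5, rising to 1 at x=1."""
    return max(0.0, 2.0 * x - 1.0)

TERMS = {"low": low, "medium": medium, "high": high}

# One rule per term: IF metric IS <term> THEN fairness = <constant>.
CONSEQUENTS = {"low": 0.0, "medium": 0.5, "high": 1.0}

def fuzzify(x):
    """Return the membership degree of x in every linguistic term."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("metric values must lie in [0, 1]")
    return {name: mf(x) for name, mf in TERMS.items()}

def fuse(metrics):
    """Combine a dict of metric values into one score in [0, 1]."""
    numerator = 0.0
    denominator = 0.0
    for value in metrics.values():
        for term, degree in fuzzify(value).items():
            numerator += degree * CONSEQUENTS[term]
            denominator += degree
    if denominator == 0.0:
        raise ValueError("no rule fired; pass at least one metric")
    return numerator / denominator

if __name__ == "__main__":
    # Hypothetical metric names and values.
    print(fuse({"balance": 0.5, "compliance": 1.0, "quality": 1.0}))
```

The membership degrees returned by `fuzzify` are the kind of intermediate quantity that a linguistic-variable plot would display, which is what makes the final score traceable back to individual metrics.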
Human-in-the-Loop
Integrates critical qualitative metrics like legal compliance and ethical assessments via expert input.
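One simple way an expert's yes/no answers could be turned into a number that a fusion step can consume is sketched below. How FanFAIR itself processes the compliance dictionary is not described here; the helper `compliance_score` and its equal weighting of all checks are assumptions made purely for illustration.

```python
# Illustrative only: compliance_score is a hypothetical helper, and equal
# weighting of the checks is an assumption, not FanFAIR's documented behavior.

def compliance_score(answers):
    """Return the fraction of compliance checks answered True."""
    if not answers:
        raise ValueError("at least one compliance answer is required")
    passed = sum(1 for ok in answers.values() if ok)
    return passed / len(answers)

if __name__ == "__main__":
    # Same keys and values as the Quick Start example.
    answers = {
        "data_protection_law": True,
        "copyright_law": True,
        "medical_law": True,
        "non_discrimination_law": False,
        "ethics": False,
    }
    print(compliance_score(answers))  # 3 of 5 checks pass -> 0.6
```

A real assessment might weight some checks more heavily than others (for example, treating non-discrimination law as critical); the equal-weight fraction is just the simplest starting point.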
Automated Reporting
Generates comprehensive reports, including gauge charts and linguistic variable plots for analysis.
Key Publications
2025
- Investigating fairness with FanFAIR: is pre-processing useful only for performances? In 2025 IEEE Symposium on Computational Intelligence in Health and Medicine (CIHM), 2025
- FanFAIR: sensitive data sets semi-automatic fairness assessment. BMC Medical Informatics and Decision Making, 2025
2023
- Investigating semi-automatic assessment of data sets fairness by means of fuzzy logic. In 2023 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2023