This repository collects all relevant resources about interpretability in LLMs
MICCAI 2022 (Oral): Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
[KDD'22] Source code for "Graph Rationalization with Environment-based Augmentations"
(ICML 2023) Discover and Cure: Concept-aware Mitigation of Spurious Correlation
Official code for the CVPR 2022 (oral) paper "OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks."
[ICCV 2023] Learning Support and Trivial Prototypes for Interpretable Image Classification
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
[TPAMI 2025] Mixture of Gaussian-distributed Prototypes with Generative Modelling for Interpretable and Trustworthy Image Recognition
[CVPR 2025] Concept Bottleneck Autoencoder (CB-AE) -- efficiently transform any pretrained (black-box) image generative model into an interpretable generative concept bottleneck model (CBM) with minimal concept supervision, while preserving image quality
TraceFL is a novel mechanism for Federated Learning that achieves interpretability by tracking neuron provenance. It identifies clients responsible for global model predictions, achieving 99% accuracy across diverse datasets (e.g., medical imaging) and neural networks (e.g., GPT).
hopwise: A Python Library for Explainable Recommendation based on Path Reasoning over Knowledge Graphs, ACM CIKM '25
Explainable AI: From Simple Rules to Complex Generative Models
Explainable Boosting Machines
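As a minimal, hedged sketch of what working with an Explainable Boosting Machine looks like (assuming the `interpret` package and a scikit-learn toy dataset, which the listed repository does not necessarily use):

```python
# Hedged sketch: train and inspect an Explainable Boosting Machine with the
# `interpret` library; the dataset here is an illustrative placeholder.
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()  # additive model of per-feature shape functions
ebm.fit(X_train, y_train)

print("test accuracy:", ebm.score(X_test, y_test))
show(ebm.explain_global())  # interactive per-feature shape plots and importances
```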
Semi-supervised Concept Bottleneck Models (SSCBM)
This repository contains the official code for the CVPR 2025 paper "Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers".
Explainable Speaker Recognition
Layer-wise Semantic Dynamics (LSD) is a model-agnostic framework for hallucination detection in Large Language Models (LLMs). It analyzes the geometric evolution of hidden-state semantics across transformer layers, using contrastive alignment between model activations and ground-truth embeddings to detect factual drift and semantic inconsistency.
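A rough illustration of the layer-wise alignment idea described above (not the repository's actual code; the model choice, mean pooling, and the use of a reference-text embedding are assumptions made here for the sketch):

```python
# Hedged sketch of layer-wise semantic alignment for hallucination detection:
# compare pooled hidden states from each transformer layer of the generated
# text against a pooled embedding of a ground-truth reference.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def layerwise_alignment(generated: str, reference: str) -> list[float]:
    """Cosine similarity between each layer's pooled hidden state of the
    generated text and the final-layer pooled state of the reference."""
    with torch.no_grad():
        gen_layers = model(**tok(generated, return_tensors="pt")).hidden_states
        ref_last = model(**tok(reference, return_tensors="pt")).hidden_states[-1]
    ref_vec = ref_last.mean(dim=1)                 # [1, hidden]
    sims = []
    for layer in gen_layers:                       # one tensor per layer
        gen_vec = layer.mean(dim=1)
        sims.append(torch.cosine_similarity(gen_vec, ref_vec).item())
    return sims                                    # factual drift shows up as low or declining similarity

scores = layerwise_alignment("The Eiffel Tower is in Berlin.",
                             "The Eiffel Tower is in Paris.")
print([round(s, 3) for s in scores])
```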
Build a neural network from scratch, without Keras or PyTorch, using only NumPy for the numerical computation and pandas for data loading.
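A minimal from-scratch sketch of the same idea (illustrative only, not the repository's code): one hidden layer, sigmoid activations, and plain gradient descent on a toy dataset.

```python
# Minimal NumPy neural net: forward pass, manual backpropagation, gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 2))                                    # toy inputs
y = (X[:, 0] + X[:, 1] > 1).astype(float).reshape(-1, 1)    # toy labels

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass (gradient of binary cross-entropy through the output sigmoid)
    dp = (p - y) / len(X)
    dW2, db2 = h.T @ dp, dp.sum(axis=0)
    dh = dp @ W2.T * h * (1 - h)
    dW1, db1 = X.T @ dh, dh.sum(axis=0)
    # gradient descent update
    W2 -= 1.0 * dW2; b2 -= 1.0 * db2
    W1 -= 1.0 * dW1; b1 -= 1.0 * db1

print("train accuracy:", ((p > 0.5) == y).mean())
```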
Visualization methods to interpret CNNs and Vision Transformers, trained in a supervised or self-supervised way. The methods are based on CAM or on the attention mechanism of Transformers. The results are evaluated qualitatively and quantitatively.
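For the CAM-based family mentioned above, a hedged Grad-CAM-style sketch (one representative method, not the exact code of that repository; the ResNet-18 backbone and random placeholder input are assumptions):

```python
# Hedged Grad-CAM sketch: hooks capture the last conv block's activations and
# gradients, then the activation maps are weighted by the pooled gradients.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1").eval()
feats, grads = {}, {}

layer = model.layer4
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

img = torch.randn(1, 3, 224, 224)             # placeholder input image tensor
logits = model(img)
logits[0, logits.argmax()].backward()         # gradient of the top-scoring class

weights = grads["a"].mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
cam = F.relu((weights * feats["a"]).sum(dim=1))       # weighted sum of activation maps
cam = F.interpolate(cam[None], size=img.shape[-2:], mode="bilinear")[0, 0]
print(cam.shape)                                       # heatmap aligned with the input
```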
Implementation of the gradient-based t-SNE attribution method described in our GLBIO oral presentation: "Towards Computing Attributions for Dimensionality Reduction Techniques"