Can we trust decisions made by AI?
Dr Waku

Published on Jan 7, 2024

As AI systems make more and more significant decisions on behalf of humans, it is important that we understand why they make those decisions. Explainable artificial intelligence (XAI) describes a system that can explain its reasoning to us. This is one of the best ways to engender trust in an AI system.

Unfortunately, neural networks, the basis of most state-of-the-art AI systems today, are opaque: they cannot explain their own reasoning directly. Interpretation must therefore be achieved in some other way, for example by training a proxy model or by systematically perturbing the network. This video has a short section on technical mechanistic interpretability methods (skip it if desired).
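As a rough sketch of the proxy-model idea: train an interpretable surrogate (here a shallow decision tree) to mimic the predictions of an opaque network. The dataset, model sizes, and scikit-learn toolchain below are illustrative assumptions, not the specific method from the video.

# Interpretability via a proxy (surrogate) model: a minimal sketch.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# An opaque "black box": a small neural network on synthetic data.
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
black_box = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500,
                          random_state=0).fit(X, y)

# Fit an interpretable proxy to the network's PREDICTIONS, not the true
# labels: the goal is to explain the model, not the data.
proxy = DecisionTreeClassifier(max_depth=3, random_state=0)
proxy.fit(X, black_box.predict(X))

# Fidelity: how often the proxy agrees with the black box.
fidelity = (proxy.predict(X) == black_box.predict(X)).mean()
print(f"proxy fidelity: {fidelity:.2%}")

# The tree itself is the explanation: a readable set of decision rules.
print(export_text(proxy, feature_names=[f"x{i}" for i in range(8)]))

A proxy is only as good as its fidelity: a surrogate that rarely agrees with the black box explains a model that does not exist.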

Conceptually, neural networks pick up on correlations in their training data. These correlations may reflect the actual causes behind features and classes, or they may be spurious, arising from biases in the training data. It can be difficult to eliminate bias from automated systems, especially if one does not realize the bias is present. Work on explainable AI therefore tries to focus more on causes and less on correlations, since a model with an inaccurate picture of the world will eventually be tripped up.
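To make the correlation-versus-causation point concrete, here is a toy sketch (entirely illustrative, not from the video) in which a model latches onto a feature that tracks the label only in the training data, much like snowy backgrounds in photos of car accidents:

# Toy demonstration of a spurious correlation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# True cause: the "signal" feature actually determines the label.
signal = rng.normal(size=n)
y = (signal > 0).astype(int)

# Spurious feature: correlated with y in the TRAINING data only.
spurious_train = y + 0.1 * rng.normal(size=n)  # closely tracks the label
spurious_test = rng.normal(size=n)             # unrelated at test time

X_train = np.column_stack([signal, spurious_train])
model = LogisticRegression().fit(X_train, y)

# The model leans heavily on the spurious column...
print("coefficients [signal, spurious]:", model.coef_[0])

# ...so accuracy collapses once that correlation disappears.
X_test = np.column_stack([signal, spurious_test])
print("test accuracy:", model.score(X_test, y))

Because the spurious column predicts the label almost perfectly during training, the model relies on it instead of the true cause, and it fails as soon as the correlation breaks.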

Perhaps in the future, other algorithms such as Bayesian inference, which focus more on underlying causes than today's backpropagation-based training does, will come into use. As we work to align AI systems with human values, achieving explainable AI is an important stepping stone.
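For intuition, here is a minimal, self-contained illustration of Bayesian updating over competing hypotheses; the coin example and the Occam-style prior favoring the simpler hypothesis are my own illustrative choices, not from the video:

# Minimal sketch of Bayesian inference over competing hypotheses.
import numpy as np

# Two hypotheses about a coin: fair vs. biased toward heads.
hypotheses = ["fair (p=0.5)", "biased (p=0.8)"]
p_heads = np.array([0.5, 0.8])
prior = np.array([0.9, 0.1])  # Occam-style prior: favor the simpler cause

# Observed data: 8 heads in 10 flips.
heads, flips = 8, 10

# Likelihood of the data under each hypothesis
# (binomial; the shared combinatorial constant cancels out).
likelihood = p_heads**heads * (1 - p_heads)**(flips - heads)

# Bayes' rule: posterior is proportional to likelihood * prior.
posterior = likelihood * prior
posterior /= posterior.sum()

for h, p in zip(hypotheses, posterior):
    print(f"P({h} | data) = {p:.3f}")

Unlike a network trained by backpropagation, the posterior keeps explicit probabilities for every candidate explanation, which is what makes the resulting belief auditable.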

#trustworthy #ai #explainability

Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence
https://www.sciencedirect.com/science...

Artificial cognition: How experimental psychology can help generate explainable artificial intelligence
https://link.springer.com/article/10....

AI Scientists: Safe and Useful AI?
https://yoshuabengio.org/2023/05/07/a...

Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence
https://link.springer.com/article/10....

A transparency and interpretability tech tree
https://www.alignmentforum.org/posts/...

Tackling bias in artificial intelligence (and in humans)
https://www.mckinsey.com/featured-ins...

0:00 Intro
0:28 Contents
0:37 Part 1: How does AI make decisions?
0:50 Example: dogs or muffins?
1:31 Example: car accidents in the snow
2:12 Correlation does not imply causation
2:54 Bias in models due to correlations
3:20 Bias in facial recognition tech
3:36 Example: spam filter misclassification
4:07 Different ways to combat bias
4:46 Part 2: Explainable decision making
4:54 Definition of interpretability and explainability
5:20 What makes a good explanation?
5:33 Explainable algorithms
6:08 Other desirable algorithmic properties
6:36 How do humans make decisions?
7:14 How humans work around bias
7:42 RACI matrix for delegation
8:39 Why AI models can't be accountable
9:11 Part 3: Achieving more explainable AI
9:20 TECHNICAL: Idea 1: Interpretability via proxy model
9:43 TECHNICAL: Ask an introspective model to produce explanations
10:05 TECHNICAL: Saliency maps to determine the most important inputs
10:38 TECHNICAL: Post-hoc analysis via perturbations to weights
11:18 Idea 2: Lessons from cognitive psychology
11:49 Attempt to falsify other explanations
12:27 Steps for the psychological method
12:57 Idea 3: Create more explainable algorithms
13:19 Multiple correlations, multiple theories
13:44 New algorithm based on Occam's razor
14:30 Bayesian inference summary
14:58 Conclusion
15:57 Outro
