Population health thinking with Bayesian networks

“Our comforting conviction that the world makes sense rests on a secure foundation: our almost unlimited ability to ignore our ignorance.” (Daniel Kahneman [1])

Read working paper here: http://bit.ly/ph-thinking

Introduction: data science and decision quality

Population health is a systems framework for studying and improving the health of populations through collective action and learning [2]. Population health data science (PHDS) is the art and science of transforming data into actionable knowledge to improve health [2]. Actionable knowledge is information that informs, influences or optimizes decisions. A decision is a choice between two or more alternatives that involves an irrevocable allocation of resources [3,4]. Every decision has an opportunity cost—the lost net benefit of the better option not chosen. Hence: “The roads we take are more important than the goals we announce. Decisions determine destiny.”

Decisions drive strategic, tactical and operational execution. Decisions are based on causal and probabilistic assumptions (“choosing and doing action \(A\) (over, say, \(B\)) will achieve net effect \(Y\) with probability \(p\)”). This “prior probability” is a prediction. After acting, we evaluate the results against that prediction. Based on our evaluation, we adjust our causal and/or predictive assumptions; this is learning. Learning leads to new decisions and new actions (adaptation). Improvements are adaptations that make processes and/or results better. Continuous improvement is one of the foundational pillars of a learning organization (see footnote 1). Population health improvement requires continuous decision improvement.
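The predict, evaluate and adjust loop described above can be sketched as a Bayesian update. The following is a minimal illustration in base R, not a method taken from the working paper. It assumes that our belief about \(p\) (the probability that action \(A\) achieves effect \(Y\)) can be encoded as a Beta distribution, and every number in it is hypothetical.

```r
# Hypothetical prior belief about p = P(Y | we choose and do A),
# encoded as Beta(a, b). Beta(4, 6) has mean 0.4 (illustrative assumption).
prior_a <- 4
prior_b <- 6

# Hypothetical evaluation data: action A was carried out 20 times and the
# desired effect Y was observed in 14 of them.
successes <- 14
failures  <- 6

# Conjugate Beta-Binomial update: add successes to a and failures to b.
post_a <- prior_a + successes
post_b <- prior_b + failures

prior_mean <- prior_a / (prior_a + prior_b)
post_mean  <- post_a / (post_a + post_b)

# 95% credible interval for p after learning from the evaluation.
post_ci <- qbeta(c(0.025, 0.975), post_a, post_b)

cat("Prior mean of p:     ", round(prior_mean, 3), "\n")
cat("Posterior mean of p: ", round(post_mean, 3), "\n")
cat("95% credible interval:", round(post_ci, 3), "\n")
```

The shift from the prior mean to the posterior mean is the "learning" step; the next decision would start from the updated belief.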

The human brain specializes in prediction (also called concepts, memory, schema, etc.) [6]. In 2002, psychologist Daniel Kahneman won the Nobel Prize in Economics for the pioneering studies that cataloged human cognitive biases and pitfalls and that formed the foundation of the new field of behavioral economics [1]. It turns out that humans are not good at estimating probabilities (probabilistic reasoning), especially for novel circumstances. We also have nonconscious cognitive biases that affect our ability to draw valid causal inferences and to change course when we are wrong. We are prone to defensiveness to protect our ego and to avoid our fears [7]. Improving our decisions will therefore require intellectual humility to acknowledge our innate cognitive limitations, and curiosity to experiment with a new way of computational and inferential thinking [8].
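One common way to see how intuition misjudges probabilities is a screening-test calculation with Bayes' theorem. The sketch below uses base R; the prevalence, sensitivity and specificity values are hypothetical and chosen only for illustration.

```r
# Hypothetical screening scenario (all three inputs are assumptions).
prevalence  <- 0.01  # P(disease)
sensitivity <- 0.90  # P(test positive | disease)
specificity <- 0.95  # P(test negative | no disease)

# Total probability of a positive test (true positives + false positives).
p_positive <- sensitivity * prevalence +
  (1 - specificity) * (1 - prevalence)

# Bayes' theorem: P(disease | test positive), the positive predictive value.
ppv <- sensitivity * prevalence / p_positive

cat("P(test positive):          ", round(p_positive, 4), "\n")
cat("P(disease | test positive):", round(ppv, 4), "\n")
```

With these assumed inputs the positive predictive value is about 15%, far lower than the 90% sensitivity that intuition tends to anchor on, because the condition is rare.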

In epidemiology, analyses are generally classified as descriptive or analytic. In PHDS we extend this to five analytic domains (Table 1), all of which should produce actionable knowledge in service of a strategic, tactical or operational goal or objective.

Table 1: Population health data science analytic domains by level

| Level | Analysis | Population health purpose |
|---|---|---|
| 1 | Description | measuring risk or protective factors, and outcomes |
| 2 | Prediction | early detection and targeting of prevention interventions |
| 3 | Explanation | discovering and estimating causal or intervention effects |
| 4 | Simulation | modeling for epidemiologic or decision insights |
| 5 | Optimization | informing, influencing or optimizing decision making |

Figure 1: Population health data science landscape (source: http://www.bayesia.com/)

The purpose of this paper is to introduce a “Reasoning” framework (see Figure 1) which I call population health thinking (PHT); that is, the conceptual or computational use of Bayesian networks (BNs) for

  1. probabilistic reasoning (PR) with BNs,
  2. causal inference (CI) with causal BNs (i.e., directed acyclic graphs), and
  3. decision quality (DQ) with decision BNs (Table 2).

BNs are probabilistic graphical models that can be drawn using one’s expert knowledge and wisdom. First, to get the most out of PHT, master the BN concepts with pencil and paper. Second, explore deploying computational tools to work your intuition and build your confidence. In this article I illustrate the concepts using R—an open source language and environment for statistical computing and graphics [9]. Advanced PHT usually requires turning to computers or to colleagues for computational support.
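To make probabilistic reasoning with a BN concrete, here is a minimal sketch in base R of a three-node chain (exposure → disease → test result). The network structure, the node names and all probabilities are hypothetical, and the sketch deliberately answers queries by brute-force enumeration of the joint distribution rather than with a dedicated BN package.

```r
# Hypothetical conditional probability tables for binary nodes (1 = yes).
p_exposure <- 0.30                          # P(exposure = 1)
p_disease_given_exposure <- c(0.02, 0.10)   # P(disease = 1 | exposure = 0, 1)
p_test_given_disease     <- c(0.05, 0.90)   # P(test = 1 | disease = 0, 1)

# Probability of a binary value x when P(x = 1) = p.
bern <- function(x, p) ifelse(x == 1, p, 1 - p)

# Enumerate all 2^3 joint states of the network.
grid <- expand.grid(exposure = 0:1, disease = 0:1, test = 0:1)

# BN factorization: P(E, D, T) = P(E) * P(D | E) * P(T | D).
grid$joint <-
  bern(grid$exposure, p_exposure) *
  bern(grid$disease, p_disease_given_exposure[grid$exposure + 1]) *
  bern(grid$test, p_test_given_disease[grid$disease + 1])

# Sanity check: a joint distribution must sum to 1.
stopifnot(isTRUE(all.equal(sum(grid$joint), 1)))

# Queries by enumeration, conditioning on a positive test.
p_test_pos <- sum(grid$joint[grid$test == 1])
p_disease_given_pos <-
  sum(grid$joint[grid$disease == 1 & grid$test == 1]) / p_test_pos
p_exposure_given_pos <-
  sum(grid$joint[grid$exposure == 1 & grid$test == 1]) / p_test_pos

cat("P(test = 1):               ", round(p_test_pos, 4), "\n")
cat("P(disease = 1 | test = 1): ", round(p_disease_given_pos, 4), "\n")
cat("P(exposure = 1 | test = 1):", round(p_exposure_given_pos, 4), "\n")
```

Enumeration is only practical for very small networks, which is one reason larger problems call for computational support.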

For those who only want the minimum core of PHT: (a) commit to intellectual humility and curiosity, (b) use DQ as a checklist, and (c) study program theory. Every public health intervention has a program theory (“theory of change”).

Table 2: Decision quality requirements: A decision is only as strong as its weakest link

| Name | Quality requirements | Key DQ questions |
|---|---|---|
| Frame | Appropriate frame | What are we deciding and why? |
| Reasoning | Sound reasoning | Are we thinking straight? |
| Data | Actionable knowledge | What do we need to know? |
| Results | Clear values & trade-offs | What consequences do we care about? |
| Choices | Creative alternatives | What choices do we have? |
| Commitment | Commitment to action | Is there commitment to action? |
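The "weakest link" idea in Table 2 can be expressed as a simple checklist calculation. The sketch below is in base R; the 0 to 100 self-assessment scale and the scores themselves are assumptions made for illustration, not part of the DQ framework as described here.

```r
# Hypothetical self-assessment scores (0 to 100) for each DQ requirement.
dq_scores <- c(
  Frame      = 80,
  Reasoning  = 60,
  Data       = 70,
  Results    = 90,
  Choices    = 40,
  Commitment = 85
)

# Weakest-link rule: overall decision quality is the lowest score,
# not the average of the scores.
overall_dq   <- min(dq_scores)
weakest_link <- names(dq_scores)[which.min(dq_scores)]

cat("Overall decision quality:", overall_dq, "\n")
cat("Weakest link to improve: ", weakest_link, "\n")
```

Taking the minimum rather than the mean reflects the table's caption: strong reasoning or data cannot compensate for, say, a lack of creative alternatives.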



1. Kahneman D. Thinking, fast and slow. New York: Farrar, Straus and Giroux; 2011.

2. Aragón TJ. We will be the best at getting better: An introduction to population health lean. University of California, eScholarship [Internet]. 2017 Feb; Available from: http://www.escholarship.org/uc/item/825430qn

3. Howard RA, Abbas AE. Foundations of decision analysis. 1st ed. Pearson; 2015.

4. Spetzler C, Winter H, Meyer J. Decision quality: Value creation from better business decisions. 1st ed. Wiley; 2016.

5. Aragón TJ. PDSA problem-solving: With a gentle introduction to double-loop learning, program theory, and causal graphs. University of California, eScholarship [Internet]. 2017; Available from: http://www.escholarship.org/uc/item/8wp451vd

6. Barrett L. How emotions are made: The secret life of the brain. Mariner Books; 2018.

7. Hess E. Humility is the new smart: Rethinking human excellence in the smart machine age. Oakland, CA: Berrett-Koehler Publishers, a BK Business Book; 2017.

8. Adhikari A, DeNero J. Computational and inferential thinking: The foundations of data science [Internet]. Self-published; 2018. Available from: https://www.inferentialthinking.com/

9. R Core Team. R: A Language and Environment for Statistical Computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; Available from: https://www.r-project.org/

Footnote 1. A learning organization requires other components. For details, see Aragón [5].

Tomás J. Aragón
Health Officer, City & County of San Francisco; Director, Population Health Division

