Wednesday, May 6, 2020

The Value of Business Experiments Part 3: Strategy and Alignment

In previous posts I have discussed the value proposition of business experiments from both a classical and behavioral economic perspective. This series of posts has been greatly influenced by Jim Manzi's book 'Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society.' Midway through the book Manzi highlights three important things that experiments in business can do:

1) They provide precision around the tactical implementation of strategy
2) They provide feedback on the performance of a strategy which allows for refinements to be driven by evidence
3) They help achieve organizational and strategic alignment

Manzi explains that within any corporation there are always silos and subcultures advocating competing strategies, with perverse incentives and agendas in pursuit of power and control. How do we know who is right and which programs or ideas are successful, considering the many factors that could be influencing any outcome of interest? Manzi describes any environment where the number of causes of variation is enormous as one of 'high causal density.' We can claim to address this with a data-driven culture, but what does that mean? Modern companies in a digital age of AI and big data are drowning in data, which makes it easy to dress up rhetoric in advanced analytical frameworks. Because data seldom speaks for itself, anyone can speak for the data through wily data storytelling.

As Jim Manzi and Stefan Thomke discuss in Harvard Business Review:

"business experiments can allow companies to look beyond correlation and investigate causality....Without it, executives have only a fragmentary understanding of their businesses, and the decisions they make can easily backfire."

In complex environments with high causal density, we don't know enough about the nature and causes of human behavior, decisions, and the causal paths from actions to outcomes to list them all, let alone measure and account for them, even if we could agree on how to measure them. This is the nature of decision making under uncertainty. But, as R.A. Fisher taught us with his agricultural experiments, randomized tests allow us to account for all of these hidden factors (Manzi calls them hidden conditionals). Only then does our data stand a chance of speaking truth.
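The balancing act that randomization performs can be seen in a few lines of simulation (the scenario, variable names, and effect sizes here are invented for illustration): a hidden factor biases a naive comparison of self-selected groups, while randomized assignment recovers the true effect without ever measuring the hidden factor.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# A hidden conditional we cannot measure (e.g., customer motivation)
hidden = rng.normal(size=n)

true_effect = 2.0

# Self-selection: more motivated people opt in more often (confounded)
opt_in = rng.random(n) < 1 / (1 + np.exp(-hidden))
y_selected = true_effect * opt_in + 3.0 * hidden + rng.normal(size=n)
naive_estimate = y_selected[opt_in].mean() - y_selected[~opt_in].mean()

# Randomization: assignment is independent of the hidden factor
assigned = rng.random(n) < 0.5
y_random = true_effect * assigned + 3.0 * hidden + rng.normal(size=n)
randomized_estimate = y_random[assigned].mean() - y_random[~assigned].mean()

print(f"naive (self-selected): {naive_estimate:.2f}")   # badly biased upward
print(f"randomized:            {randomized_estimate:.2f}")  # close to 2.0
```

The naive comparison attributes the motivated group's better outcomes to the treatment; the randomized comparison does not, because randomization balances the hidden factor across groups on average.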

Causal knowledge helps us identify more informed and calculated risks, as opposed to risks taken on the basis of gut instinct, political motivation, or overly optimistic correlational pattern finding.

Experiments add incremental knowledge and value to business. No single experiment is going to be a 'killer app' that by itself will generate millions in profits. But in aggregate the knowledge created by experiments probably offers the greatest strategic value across an enterprise compared to any other analytic method.

As Luke Froeb writes in Managerial Economics, A Problem Solving Approach (3rd Edition):

"With the benefit of hindsight, it is easy to identify successful strategies (and the reasons for their success) or failed strategies (and the reason for their failures). It's much more difficult to identify successful or failed strategies before they succeed or fail."

Business experiments offer the opportunity to test strategies early on a smaller scale to get causal feedback about potential success or failure before fully committing large amounts of irrecoverable resources. This takes the concept of failing fast to a whole new level.

Achieving the greatest value from business experiments requires leadership commitment. It also demands a culture genuinely open to learning through a blend of trial and error, data-driven decision making, and the infrastructure necessary to implement enough tests and iterations to generate rapid learning and innovation. The result is a corporate culture that allows an organization to formulate, implement, and modify strategy faster and more deftly than its competitors.

See also:
The Value of Business Experiments: The Knowledge Problem
The Value of Business Experiments Part 2: A Behavioral Economics Perspective
Statistics is a Way of Thinking, Not a Box of Tools

Tuesday, April 21, 2020

The Value of Business Experiments Part 2: A Behavioral Economic Perspective

In my previous post I discussed the value proposition of business experiments from a classical economic perspective. In this post I want to view this from a behavioral economic perspective. From this point of view business experiments can prove to be invaluable with respect to challenges related to overconfidence and decision making under uncertainty.

Heuristic Data Driven Decision Making and Data Story Telling

In a fast paced environment, decisions are often made quickly and based on gut instinct. Progressive companies have tried as much as possible to leverage big data and analytics to become data-driven organizations. Ideally, leveraging data would help override the biases, gut instincts, and ulterior motives that may stand behind a scientific hypothesis or business question. One of the many things we have learned from behavioral economics is that humans tend to over-interpret data, finding unreliable patterns that lead to incorrect conclusions. Francis Bacon recognized this over 400 years ago:

"the human understanding is of its own nature prone to suppose the existence of more order and regularity in the world than it finds" 

Decision makers can be easily duped by big data, ML, AI, and various BI tools into thinking that their data is speaking to them. As Jim Manzi and Stefan Thomke state in Harvard Business Review, in the absence of formal randomized testing:

"executives end up misinterpreting statistical noise as causation—and making bad decisions"

Data seldom speaks, and when it does it is often lying. This is the impetus behind the introduction of what became the scientific method. The true art and science of data science is teasing out the truth, or what version of truth can be found in the story being told. I think this is where field experiments are most powerful and create the greatest value in the data science space. 

Decision Making Under Uncertainty, Risk Aversion, and The Dunning-Kruger Effect

Kahneman (in Thinking, Fast and Slow) makes an interesting observation in relation to managerial decision making. Very often managers reward peddlers of even dangerously misleading information while disregarding or even punishing merchants of truth. Confidence in a decision is often based more on the coherence of a story than on the quality of information that supports it. Those who take risks based on bad information are often rewarded when it works out. To quote Kahneman:

"a few lucky gambles can crown a reckless leader with a halo of prescience and boldness"

As Kahneman discusses in Thinking, Fast and Slow, those who take the biggest risks are not necessarily less risk averse; they are often simply less aware of the risks they are actually taking. This leads to overconfidence, a lack of appreciation for uncertainty, and a culture where a solution based on pretended knowledge is often preferred and even rewarded. It's easy to see how the Dunning-Kruger effect would dominate. This feeds a vicious cycle of collective blindness toward risk and uncertainty. It leads to taking risks that should be avoided in many cases, and prevents others from considering better but perhaps less audacious risks. Field experiments can help facilitate taking more educated gambles. Thinking through an experimental design (engaging Kahneman's System 2) provides a structured way of thinking about business problems and how to truly leverage data to solve them. And the data we get from experimental results can be interpreted causally. Identification of causal effects from an experiment helps us distinguish whether outcomes are likely due to a business decision, as opposed to blindly trusting gut instincts, luck, or the noisy patterns we might find in the data.

Just as rapid cycles of experiments in a business setting can aid in the struggle with the knowledge problem, they also provide an objective and structured way of thinking about our data and the conclusions we can reach from it, while avoiding as much as possible these behavioral pitfalls. A business culture that couples risk taking with experimentation will come to value evidence over pretended knowledge. That's valuable.


Monday, April 20, 2020

The Value of Business Experiments and the Knowledge Problem

Why should firms leverage randomized business experiments? With recent advancements in computing power and machine learning, why can't they simply base all of their decisions on historical observational data? Statisticians, econometricians, and others have a simple answer: experiments may be the best (often the gold standard) way of answering causal questions. I certainly can't argue against answering causal questions (just read this blog). However, here I want to focus on a number of more fundamental reasons that experiments are necessary in business settings, from the perspective of both classical and behavioral economics:

1) The Knowledge Problem
2) Behavioral Biases
3) Strategy and Tactics

In this post I want to discuss the value of business experiments from more of a neoclassical economic perspective. The fundamental problem of economics, society, and business is the knowledge problem. In his famous 1945 American Economic Review article The Use of Knowledge in Society, Hayek argues:

"the economic problem of society is not merely a problem of how to allocate 'given' resources...it is a problem of the utilization of knowledge which is not given to anyone in its totality."

A really good parable explaining the knowledge problem is the essay I, Pencil by Leonard E. Read. The fact that no one person possesses the knowledge necessary to make something as seemingly simple as a basic number 2 pencil captures the essence of the problem.

If you remember your principles of economics, you know that the knowledge problem is solved by a spontaneous order guided by prices which reflect tradeoffs based on the disaggregated incomplete and imperfect knowledge and preferences of millions (billions) of individuals. Prices serve both the function of providing information and the incentives to act on that information. It is through this information creation and coordinating process that prices help solve the knowledge problem.

Prices solve the problem of calculation that Hayek alluded to in his essay, and they are what coordinate all of the activities discussed in I, Pencil. The knowledge problem explains how market economies work; centrally planned economies, by contrast, have historically failed to allocate resources without producing shortages, surpluses, and collapse.

In Living Economics: Yesterday, Today, and Tomorrow, Peter J. Boettke discusses the knowledge problem in the context of firms and the work of economist Murray Rothbard:

"firms cannot vertically integrate without facing a calculation problem....vertical integration eliminates the external market for producer goods."

In essence, and this seems consistent with Coase, as firms integrate to eliminate transaction costs they also eliminate the markets which generate the prices that solve the knowledge problem! In a way, firms could be viewed as little islands of socially planned economies in a sea of market competition. As Luke Froeb masterfully illustrates in his text Managerial Economics: A Problem Solving Approach (3rd Ed), decisions within firms in effect create regulations, taxes, and subsidies that destroy wealth-creating transactions. Managers should make decisions that consummate the most wealth-creating transactions (or at least do their best not to destroy, discourage, or prohibit them).

So how do we solve the knowledge problem within firms, absent the information-creating and coordinating role of prices? When mistakes are made, Luke Froeb offers a problem-solving algorithm that asks:

1) Who is making the bad decision?
2) Do they have enough information to make a good decision?
3) Do they have the incentive to make a good decision?

In essence, in the absence of prices, we must try to answer the same questions that prices often resolve. And we can leverage business experiments to address the second question above, since experiments provide important causal decision-making information. While I would never argue that data science, advanced analytics, artificial intelligence, or any field experiment could ever solve the knowledge problem, I will argue that business experiments become extremely valuable precisely because of the knowledge problem within firms.

Going back to I, Pencil and Hayek's essay, the knowledge problem is solved through the spontaneous coordination of multitudes of individual plans via markets. Through a trial and error process where feedback is given through prices, the plans that do the best job coordinating people's choices are adopted. Within firms, the strategies and tactics being pursued amount to only a few plans compared to the multitude tested in markets. But as discussed in Jim Manzi's book Uncontrolled, firms can mimic this trial and error process through iterative experimentation interspersed with theory and subject matter expertise. Experiments help establish causal facts, but it takes theory and subject matter expertise to understand which facts are relevant.

In essence, while experiments don't perfectly emulate the same kind of evolutionary feedback mechanisms prices deliver in market competition, an iterative test and learn culture within a business may provide the best strategy for dealing with the knowledge problem. And that is one of many ways that business experiments are able to contribute value.

See also:

Statistics is a Way of Thinking, Not a Box of Tools

Monday, April 6, 2020

Statistics is a Way of Thinking, Not Just a Box of Tools

If you have taken very many statistics courses you may have gotten the impression that the field is mostly a mixed bag of computations and rules for conducting hypothesis tests, making predictions, or creating forecasts. While this isn't necessarily wrong, it could leave you with the opinion that statistics is mostly just a box of tools for solving problems. Statistics absolutely provides us with important tools for understanding the world, but thinking of statistics as 'just tools' has its pitfalls (beyond the most common pitfall of having a hammer and viewing every problem as a nail).

For one, there is a huge gap between the theoretical 'tools' and real world application. This gap is filled with critical thinking, judgment calls, and various social norms, practices, and expectations that differ from field to field, business to business, and stakeholder to stakeholder. The art and science of statistics is largely about filling this gap. That's considerably more than 'just tools.'

The proliferation of open source programming languages (like R and Python) and point-and-click automated machine learning solutions (like DataRobot and H2O.ai) might give the impression that after you have done your homework framing the business problem and engineering the data and features, all that is left is hyperparameter tuning and plugging a number of algorithms into the data until the 'best' one is found; a mechanical (and, without automated tools, sometimes time consuming) exercise. The fact that a lot of this work can in fact be automated probably contributes to the 'toolbox' mentality when thinking about the much broader field of statistics as a whole. In The Book of Why, Judea Pearl provides an example explaining why statistical inference (particularly causal inference) problems can't be reduced to easily automated mechanical exercises:

"path analysis doesn't lend itself to canned programs......path analysis requires scientific thinking as does every exercise in causal inference. Statistics, as frequently practiced, discourages it and encourages "canned" procedures instead. Scientists will always prefer routine calculations on data to methods that challenge their scientific knowledge."

Indeed, a routine practice that takes a plug-and-play approach with 'tools' can be problematic in many cases of statistical inference. A good example is simply plugging GLM models into a difference-in-differences context, or combining matching with difference-in-differences. While we can get these approaches to 'play well together' under the correct circumstances, it's not as simple as calling the packages and running the code. Viewing methods of statistical inference and experimental design as just a box of tools to be applied to data leaves one open to the plug-and-play fallacy. There are times you might get by using a flathead screwdriver to tighten a Phillips head screw, but inferential methods are not so easily substituted, even when the fit looks snug enough on the surface.
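For concreteness, here is the arithmetic behind the canonical 2x2 difference-in-differences contrast on simulated data (the dataset and effect sizes are invented for illustration). The simplicity is deceptive: this linear contrast is where the identification lives, and swapping in nonlinear models or matching changes what the contrast estimates, which is exactly why it isn't plug-and-play.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

treated = rng.random(n) < 0.5   # group indicator
post = rng.random(n) < 0.5      # time period indicator

# Outcome: group difference + common time trend + true treatment effect of 1.5
y = (2.0 * treated + 1.0 * post
     + 1.5 * (treated & post)
     + rng.normal(size=n))

# DiD: (treated post - treated pre) - (control post - control pre)
did = ((y[treated & post].mean() - y[treated & ~post].mean())
       - (y[~treated & post].mean() - y[~treated & ~post].mean()))
print(f"DiD estimate: {did:.2f}")  # close to 1.5
```

The estimator nets out both the fixed group difference (2.0) and the common time trend (1.0), isolating the treatment effect, but only because the parallel trends assumption is built into the simulated data.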

Understanding the business problem and data storytelling are in fact two other areas of data science that would be difficult to automate. But don't let that fool you into thinking that the remainder of data science, including statistical inference, is simply a mechanical exercise of applying the 'best' algorithm to 'big data'. You might get by with that for the minority of use cases that require a purely predictive or pattern finding solution, but the rest of the world's problems are not so tractable. Statistics is about more than data or the patterns we find in it. It's a way of thinking about the data.

"Causal Analysis is emphatically not just about data; in causal analysis we must incorporate some understanding of the process that produces the data and then we get something that was not in the data to begin with." - Judea Pearl, The Book of Why

Statistics is A Way of Thinking

In their well known advanced text book "Principles and Procedures of Statistics, A Biometrical Approach", Steel and Torrie push back on the attitude that statistics is just about computational tools:

"computations are required in statistics, but that is arithmetic, not mathematics nor statistics...statistics implies for many students a new way of thinking; thinking in terms of uncertainties of probabilities.....this fact is sometimes overlooked and users are tempted to forget that they have to think, that statistics cannot think for them. Statistics can however help research workers design experiments and objectively evaluate the resulting numerical data."

At the end of the day we are talking about leveraging data-driven decision making to override the biases, gut instincts, and ulterior motives that may stand behind a scientific hypothesis or business question; objectively evaluating numerical data, as Steel and Torrie put it above. But what do we actually mean by data-driven decision making? Mastering (if possible) statistics, inference, and experimental design is part of a lifelong process of understanding and interpreting data to solve applied problems in business and the sciences. It's not just about conducting your own analysis and being your own worst critic, but also about interpreting, criticizing, translating, and applying the work of others. Biologist and geneticist Kevin Folta put this well once on a Talking Biotech podcast:

"I've trained for 30 years to be able to understand statistics and experimental design and interpretation...I'll decide based on the quality of the data and the experimental design....that's what we do."

In 'Uncontrolled' Jim Manzi states:

"observing a naturally occurring event always leaves open the possibility of confounded causes...though in reality no experimenter can be absolutely certain that all causes have been held constant the conscious and rigorous attempt to do so is the crucial distinction between an experiment and an observation."

Statistical inference and experimental design provide us with a structured way to think about real world problems and the data we have to solve them, while avoiding as much as possible the gut-based data storytelling that, intentional or not, can be confounded and misleading. As Francis Bacon once stated:

"what is in observation loose and vague is in information deceptive and treacherous"

Statistics provides a rigorous way of thinking that moves us from mere observation to useful information.

*UPDATE: Kevin Gray wrote a very good article that really gets at the spirit of a lot of what I wanted to convey in this post.

See also:

To Explain or Predict

Applied Econometrics

Wednesday, February 12, 2020

Randomized Encouragement: When noncompliance may be a feature and not a bug

Many times in a randomized controlled trial (RCT), issues related to noncompliance arise. Subjects assigned to the treatment fail to comply, while in other cases subjects who were supposed to be in the control group actually receive treatment. Other times we may have a new intervention (perhaps a mobile app, product, service, or employer or government benefit) that, by law, contract, or nature, can be accessed by everyone in our population of interest. We know that if we let nature take its course, users, adopters, or engagers are very likely going to be a self-selected group that differs from others in a number of important ways. In a situation like this it could be very hard to know whether observed outcomes from the new intervention are related to the treatment itself, or explained by other factors related to the characteristics of those who choose to engage.

In a 2008 article in the American Journal of Public Health, alternatives to randomized controlled trials are discussed, and for situations like this the authors discuss randomized encouragement:

 "participants may be randomly assigned to an opportunity or an encouragement to receive a specific treatment, but allowed to choose whether to receive the treatment."

In this scenario, less than full compliance is the norm, a feature and not a bug. The idea is to roll out access in conjunction with randomized encouragement. A randomized nudge.

For example, in Developing a Digital Marketplace for Family Planning: Pilot Randomized Encouragement Trial (Green et al., 2018), randomized encouragement was used to study the impact of a digital health intervention related to family planning:

“women with an unmet need for family planning in Western Kenya were randomized to receive an encouragement to try an automated investigational digital health intervention that promoted the uptake of family planning”

If you have a user base or population already using a mobile app, you could randomize encouragement to utilize new features through the app. In other instances, you could randomize encouragement to use a new product, feature, or treatment through text messaging. Traditionally this has been done through mailers or phone calls.

While treatment assignment or encouragement is random, noncompliance, the choice to engage or not engage, is not! How exactly do we analyze results from a randomized encouragement trial in a way that allows us to infer causal effects? While common approaches include intent-to-treat (ITT) or even per-protocol analysis, treatment effects for a randomized encouragement trial can also be estimated as complier average causal effects (CACE).

CACE compares outcomes for individuals in the treatment group who complied with treatment (engaged as a result of encouragement) with individuals in the control group who would have complied if given the opportunity to do so. This is key. If you think this sounds a lot like local average treatment effects in an instrumental variables framework, that is exactly what we are talking about.

Angrist and Pischke (2015) discuss how instrumental variables can be used in the context of a randomized controlled trial with noncompliance issues:

 "Instrumental variable methods allow us to capture the causal effect of treatment on the treated in spite of the nonrandom compliance decisions made by participants in experiments....Use of randomly assigned intent to treat as an instrumental variable for treatment delivered eliminates this source of selection bias." 

Instrumental variable analysis gives us an estimate of the local average treatment effect (LATE), which is the same as the CACE. In simplest terms, LATE is the average treatment effect for the subpopulation of compliers in an RCT, or the compliers and engagers in a randomized encouragement design.
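A toy simulation (all numbers invented) can make the Wald/IV logic concrete. With one-sided noncompliance, the intent-to-treat effect is the true effect diluted by the compliance rate, and dividing the ITT by the first-stage compliance rate recovers the LATE/CACE:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Randomized encouragement (the instrument)
encouraged = rng.random(n) < 0.5

# One-sided noncompliance: 40% engage when encouraged; no one engages otherwise
complier = rng.random(n) < 0.4
engaged = encouraged & complier

# Outcome: effect of 2.0 for those who actually engage, plus noise
y = 2.0 * engaged + rng.normal(size=n)

# Intent-to-treat effect: diluted by noncompliance
itt = y[encouraged].mean() - y[~encouraged].mean()

# First stage: effect of encouragement on engagement (compliance rate)
first_stage = engaged[encouraged].mean() - engaged[~encouraged].mean()

# Wald/IV estimator: LATE = CACE = ITT / first stage
late = itt / first_stage
print(f"ITT:  {itt:.2f}")    # roughly 0.8 (= 2.0 * 0.4)
print(f"LATE: {late:.2f}")   # roughly 2.0
```

The ITT answers "what did offering the encouragement do on average?"; the LATE answers "what did the treatment do for those who engaged because of it?" Both are useful, but they answer different business questions.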

There are obviously some assumptions involved and more technical details. Please see the references and other links below to read more about the mechanics, assumptions, and details involved as well as some toy examples.


Angrist, J. D., & Pischke, J.-S. (2015). Mastering 'Metrics: The Path from Cause to Effect.

Connell, A. M. (2009). Employing complier average causal effect analytic methods to examine effects of randomized encouragement trials. The American Journal of Drug and Alcohol Abuse, 35(4), 253–259. doi:10.1080/00952990903005882

Green, E. P., Augustine, A., Naanyu, V., Hess, A. K., & Kiwinda, L. (2018). Developing a digital marketplace for family planning: Pilot randomized encouragement trial. Journal of Medical Internet Research, 20(7), e10756.

West, S. G., Duan, N., Pequegnat, W., Gaist, P., Des Jarlais, D. C., Holtgrave, D., Szapocznik, J., Fishbein, M., Rapkin, B., Clatts, M., & Dolan Mullen, P. (2008). Alternatives to the randomized controlled trial. American Journal of Public Health, 98, 1359–1366.

See also: 

Intent to Treat, Instrumental Variables and LATE Made Simple(er) 

Instrumental Variables and LATE 

Instrumental Variables vs. Intent to Treat 

Instrumental Explanations of Instrumental Variables

A Toy Instrumental Variable Application

Other posts on instrumental variables...

Monday, December 16, 2019

Some Recommended Podcasts and Episodes on AI and Machine Learning

Something I have been interested in for some time now is the convergence of big data and genomics, and the convergence of causal inference and machine learning.

I am a big fan of the Talking Biotech Podcast which allows me to keep up with some of the latest issues and research in biotechnology and medicine. A recent episode related to AI and machine learning covered a lot of topics that resonated with me. 

There was excellent discussion of the human element involved in this work, the importance of data prep and feature engineering (the 80% of the work that has to happen before ML/AI can do its job), and the challenges of non-standard 'omics' data, along with the potential biases that researchers and developers can inadvertently introduce in this process. Much more was covered, including applications of machine learning and AI in this space and the best ways to stay up to speed on fast changing technologies without having to be a heads-down programmer.

I've been in a data science role since 2008 and have transitioned from SAS to R to Python. I've been able to keep up within the domain of causal inference to the extent possible, but I keep up with broader trends via podcasts like Talking Biotech. Below is a curated list of my favorites related to data science, with a few favorite episodes highlighted.

1) Casual Inference - This is my new favorite podcast by two biostatisticians covering epidemiology/biostatistics/causal inference - and keeping it casual.

Fairness in Machine Learning with Sherri Rose | Episode 03 -

This episode was the inspiration for my post: When Wicked Problems Meet Biased Data.

#093 Evolutionary Programming - 

#266 - Can we trust scientific discoveries made using machine learning

How social science research can inform the design of AI systems 

#37 Causality and potential outcomes with Irineo Cabreros -  

Andrew Gelman - Social Science, Small Samples, and the Garden of Forking Paths 
James Heckman - Facts, Evidence, and the State of Econometrics

Wednesday, December 11, 2019

When Wicked Problems Meet Biased Data

In "Dissecting racial bias in an algorithm used to manage the health of populations" (Science, Vol. 366, 25 Oct 2019), the authors discuss inherent racial bias in widely adopted algorithms in healthcare. In a nutshell, these algorithms use predicted cost as a proxy for health status. Unfortunately, in healthcare, costs can proxy for other things as well:

"Black patients generate lesser medical expenses, conditional on health, even when we account for specific comorbidities. As a result, accurate prediction of costs necessarily means being racially biased on health."

So what happened? How can it be mitigated? What can be done going forward?

In data science, there are some popular frameworks for solving problems. One widely known approach is the CRISP-DM framework. Alternatively, The Analytics Lifecycle Toolkit proposes a similar process:

(1) - Problem Framing
(2) - Data Sense Making
(3) - Analytics Product Development
(4) - Results Activation

The wrong turn in Albuquerque here may have been at the corner of problem framing and data understanding or data sense making.

The authors state:

"Identifying patients who will derive the greatest benefit from these programs is a challenging causal inference problem that requires estimation of individual treatment effects. To solve this problem health systems make a key assumption: Those with the greatest care needs will benefit the most from the program. Under this assumption, the targeting problem becomes a pure prediction public policy problem."

The distinction between 'predicting' and 'explaining' has been drawn in the literature by multiple authors over the last two decades, and conflating the two has important implications. To quote Galit Shmueli:

"My thesis is that statistical modeling, from the early stages of study design and data collection to data usage and reporting, takes a different path and leads to different results, depending on whether the goal is predictive or explanatory."

Almost a decade before, Leo Breiman encouraged us to think outside the box when solving problems by considering multiple approaches:

"Approaching problems by looking for a data model imposes an a priori straight jacket that restricts the ability of statisticians to deal with a wide range of statistical problems. The best available solution to a data problem might be a data model; then again it might be an algorithmic model. The data and the problem guide the solution. To solve a wider range of data problems, a larger set of tools is needed."

A number of data analysts today may not be cognizant of the differences between predictive and explanatory modeling and statistical inference, or of how those differences impact their work. This could be related to background, training, or the kinds of problems they have worked on. It is also important that we don't compartmentalize so much that we miss opportunities to approach a problem from a number of different angles (Leo Breiman's 'straight jacket'). This is perhaps what happened in the Science article: once the problem was framed as a predictive modeling problem, other modes of thinking may have shut down, even if the developers were aware of all of these distinctions.

The takeaway is that we think differently when doing statistical inference and explaining versus predicting or doing machine learning. Substituting one for the other impacts the way we approach the problem (what we care about, what we consider versus discount, and so on), and this in turn impacts data preparation, modeling, and interpretation.
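A small simulation (hypothetical numbers, not drawn from the Science article) illustrates why the substitution matters: adding a post-treatment variable to a regression improves prediction, but the coefficient on the variable of interest no longer answers the causal question.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

x = rng.normal(size=n)             # variable whose causal effect we want
m = 1.0 * x + rng.normal(size=n)   # post-treatment mediator of x
y = 2.0 * m + rng.normal(size=n)   # total causal effect of x on y is 2.0

def ols(cols, y):
    """Least-squares coefficients with an intercept prepended."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Explanatory goal: total effect of x -> regress y on x alone
coef_x_only = ols([x], y)[1]       # close to 2.0

# Predictive goal: including m predicts y better, but the
# coefficient on x no longer measures the total causal effect
coef_with_m = ols([x, m], y)[1]    # close to 0.0
```

A purely predictive workflow would prefer the second model; an explanatory workflow must think about which variables are on the causal path before ever fitting anything.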

For instance, in the Science article, after framing the problem as a predictive modeling problem, a pivotal focus became the 'labels' or target for prediction.

"The dilemma of which label to choose relates to a growing literature on 'problem formulation' in data science: the task of turning an often amorphous concept we wish to predict into a concrete variable that can be predicted in a given dataset."

As noted in the paper, 'labels are often measured with errors that reflect structural inequalities.'
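To make the label-choice point concrete, here is a small simulation (purely hypothetical numbers, loosely in the spirit of the cost-versus-health distinction in the Science article): when unequal access to care suppresses observed costs for one group, ranking patients by a cost label under-selects that group relative to ranking by a more direct proxy of health need.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, size=n)              # 0 = advantaged, 1 = disadvantaged
need = rng.gamma(shape=2.0, scale=1.0, size=n)  # latent health need

# Label A: total cost. It reflects need, but unequal access to care means the
# disadvantaged group generates less cost at the same level of need.
cost = need * np.where(group == 1, 0.6, 1.0) + rng.normal(scale=0.2, size=n)

# Label B: a more direct proxy of need (e.g., count of active chronic conditions)
conditions = need + rng.normal(scale=0.2, size=n)

def share_group1_in_top(label, k=200):
    """Fraction of the top-k patients (ranked by label) who belong to group 1."""
    return group[np.argsort(label)[-k:]].mean()

print(f"share of group 1 targeted via cost label:       {share_group1_in_top(cost):.2f}")
print(f"share of group 1 targeted via conditions label: {share_group1_in_top(conditions):.2f}")
```

The model fitting step never changes here; the disparity is baked in the moment the amorphous concept 'needs the program' is operationalized as 'will generate high costs.'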

Addressing the issue of label choice comes with a number of challenges, briefly alluded to in the article:

1) deep understanding of the domain, i.e. subject matter expertise
2) identification and extraction of relevant data, i.e. data engineering
3) capacity to iterate and experiment, i.e. statistical programming, simulation, and interdisciplinary collaboration

Data science problems in healthcare are wicked problems defined by interacting complexities with social, economic, and biological dimensions that transcend simply fitting a model to data. Expertise in a number of disciplines is required.

Bias in Risk Adjustment

In the Science article, the specific example concerned predictive models that target patients for disease management programs. However, there are a number of other predictive modeling applications in the healthcare space where these same issues can be prevalent.

In Fair Regression for Health Care Spending, Sherri Rose and Anna Zink discuss these challenges in relation to popular regression-based risk adjustment applications. Aligning with the analytics lifecycle discussed above, they point out that there are several places where issues of bias can be addressed, including the pre-processing, model fitting, and post-processing stages of analysis. In this article they focus largely on the modeling stage, leveraging a number of constrained and penalized regression algorithms designed to optimize fairness. This work looks really promising, but the authors point out a number of challenges related to scalability and to optimizing fairness across a number of metrics or groups.
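A minimal sketch of the penalized idea (not the authors' actual estimators; simulated data and a hypothetical penalty): add to the least-squares objective a penalty on the gap in mean residuals between groups, and trade some prediction error for group fairness by turning the penalty weight up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
group = rng.integers(0, 2, size=n)
x = rng.normal(size=n)
z = 0.5 * group + rng.normal(scale=0.3, size=n)  # weak proxy correlated with group
# Spending depends on x but is systematically lower for group 1
y = 1.5 * x - 0.8 * group + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x, z])
g0, g1 = group == 0, group == 1
dX = X[g0].mean(axis=0) - X[g1].mean(axis=0)

def fit(lam, steps=3000, lr=0.05):
    """Gradient descent on MSE + lam * (gap in mean residuals between groups)^2."""
    beta = np.zeros(3)
    for _ in range(steps):
        r = X @ beta - y
        gap = r[g0].mean() - r[g1].mean()
        grad = 2 * X.T @ r / n + 2 * lam * gap * dX
        beta -= lr * grad
    return beta

for lam in (0.0, 10.0):
    r = X @ fit(lam) - y
    gap = abs(r[g0].mean() - r[g1].mean())
    print(f"lambda = {lam:>4}: |mean-residual gap| = {gap:.3f}")
```

Unpenalized least squares leaves group 1 systematically under-predicted; the penalty closes most of that gap by reweighting the proxy feature, at some cost in raw accuracy. This mirrors, in miniature, the accuracy-fairness trade-offs the authors study at scale.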

Toward Causal AI and ML

Previously I referenced Galit Shmueli's work discussing how differently we approach and think about predictive vs. explanatory modeling. In The Book of Why, Judea Pearl describes causal inferential thinking:

"Causal Analysis is emphatically not just about data; in causal analysis we must incorporate some understanding of the process that produces the data and then we get something that was not in the data to begin with." 

There is currently a lot of work fusing machine learning and causal inference that could create more robust learning algorithms: for example, Susan Athey's work on causal forests, Leon Bottou's work related to causal invariance, and Elias Bareinboim's work on the data fusion problem. This work, including the kind of work mentioned above related to fair regression, will help inform the next generation of predictive modeling, machine learning, and causal inference models in the healthcare space, which hopefully will represent a marked improvement over what is possible today.
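A tiny simulated illustration of Pearl's point (hypothetical numbers, not drawn from any of the cited work): when treatment assignment depends on illness severity, the raw correlation between treatment and outcome can even carry the wrong sign, and only knowledge of the process that produced the data tells us to adjust for severity.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
u = rng.normal(size=n)                            # confounder: illness severity
t = (u + rng.normal(size=n) > 0).astype(float)    # sicker patients are treated more
y = -1.0 * t + 2.0 * u + rng.normal(size=n)       # treatment truly lowers the outcome

# Naive: compare treated vs. untreated means, ignoring how treatment was assigned
naive = y[t == 1].mean() - y[t == 0].mean()

# Adjusted: include severity, which only process knowledge tells us to do
X = np.column_stack([np.ones(n), t, u])
adj = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"naive difference in means: {naive:.2f}")  # biased upward, sign flipped
print(f"adjusted estimate:         {adj:.2f}")    # close to the true effect of -1.0
```

The data alone cannot arbitrate between the two answers; the assignment mechanism, which is "not in the data to begin with," is what licenses the adjustment.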

However, we can't wait half a decade or more while this theory is developed and adopted by practitioners. In the Science article, the authors found that alternative labels for targeting disease management programs, besides total cost, calibrate much more fairly across groups. Bridging the gap in other areas will require both awareness of these issues and creativity throughout the analytics product lifecycle. As the authors conclude:

"careful choice can allow us to enjoy the benefits of algorithmic predictions while minimizing the risks."

References and Additional Reading:

This paper was recently discussed on the Casual Inference podcast.

Krieger, N. Measures of Racism, Sexism, Heterosexism, and Gender Binarism for Health Equity Research: From Structural Injustice to Embodied Harm—an Ecosocial Analysis. Annual Review of Public Health 2020, 41:1.

Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1, 206–215 (2019). doi:10.1038/s42256-019-0048-x.

Breiman, L. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist. Sci. 16 (2001), no. 3, 199–231. doi:10.1214/ss/1009213726.

Shmueli, G., "To Explain or To Predict?", Statistical Science, vol. 25, issue 3, pp. 289-310, 2010.

Zink, A., Rose, S. Fair Regression for Health Care Spending. arXiv:1901.10566v2 [stat.AP].