Sunday, June 11, 2017

Instrumental Variables vs. Intent to Treat

"ITT analysis includes every subject who is randomized according to randomized treatment assignment. It ignores noncompliance, protocol deviations, withdrawal, and anything that happens after randomization. ITT analysis is usually described as “once randomized, always analyzed”."

"ITT analysis avoids overoptimistic estimates of the efficacy of an intervention resulting from the removal of non-compliers by accepting that noncompliance and protocol deviations are likely to occur in actual clinical practice" 
- Gupta, 2011

In Mastering 'Metrics, Angrist and Pischke describe intent-to-treat analysis:

"In randomized trials with imperfect compliance, when treatment assignment differs from treatment delivered, effects of random assignment...are called intention-to-treat (ITT) effects. An ITT analysis captures the causal effect of being assigned to treatment."

While treatment assignment is random, non-compliance is not! Therefore, if we compared those actually treated to those untreated instead of making intent-to-treat comparisons, we would get biased results, because this amounts to an uncontrolled comparison between treated and untreated subjects.

Angrist and Pischke describe how instrumental variables can be used in this context:

 “The instrumental variables (IV) method harnesses partial or incomplete random assignment, whether naturally occurring or generated by researchers"

 "Instrumental variable methods allow us to capture the causal effect of treatment on the treated in spite of the nonrandom compliance decisions made by participants in experiments....Use of randomly assigned intent to treat as an instrumental variable for treatment delivered eliminates this source of selection bias."
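As a compact way to see the idea in the quote above, the simplest IV estimator with a binary instrument is often written as a ratio of two differences in means. The notation here is my own shorthand, not the book's: Z is random assignment, D is treatment actually received, and Y is the outcome.

```latex
\hat{\beta}_{IV}
  = \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}
         {E[D \mid Z = 1] - E[D \mid Z = 0]}
  = \frac{\text{ITT effect on the outcome}}
         {\text{difference in treatment rates by assignment}}
```

The numerator is the ITT effect. When nobody assigned to control can obtain the treatment, the denominator is simply the compliance rate in the assigned group, so the IV estimate rescales the ITT effect by the share of subjects who actually took the treatment.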

In Intent-to-Treat vs. Non-Intent-to-Treat Analyses under Treatment Non-Adherence in Mental Health Randomized Trials (Ten Have et al., 2008) there is a good discussion of ITT and IV methods with applications to clinical research. Below is their treatment of IV in this context:

“Instrumental variables are assumed to emulate randomization variables, unrelated to unmeasured confounders influencing the outcome. In the case of randomized trials, the same randomized treatment assignment variable used in defining treatment groups in the ITT analysis is instead used as the instrumental variable in IV analyses. In particular, the instrumental variable is used to obtain for each patient a predicted probability of receiving the experimental treatment. Under the assumptions of the IV approach, these predicted probabilities of receipt of treatment are unrelated to unmeasured confounders in contrast to the vulnerability of the actually observed receipt of treatment to hidden bias. Therefore, these predicted treatment probabilities replace the observed receipt of treatment or treatment adherence in the AT model to yield an estimate of the as-received treatment effect protected against hidden bias when all of the IV assumptions hold.”
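To make the contrast concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration: the data are simulated, the true effect (2.0), the compliance rates (0.8 and 0.4), and the names simulate_trial and estimates are hypothetical, and controls are assumed unable to obtain the treatment. It compares the ITT estimate, the naive as-treated comparison, and an IV estimate computed as the ratio of the ITT effect to the difference in treatment rates (with a single binary instrument and no covariates, this ratio matches what the two-stage approach described in the quote produces).

```python
import random


def simulate_trial(n=50000, true_effect=2.0, seed=42):
    """Simulate a randomized trial with non-random, one-sided noncompliance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        health = rng.gauss(0, 1)   # unmeasured confounder (hypothetical)
        z = rng.randint(0, 1)      # random assignment: the instrument
        # Healthier subjects are assumed more likely to comply.
        complies = rng.random() < (0.8 if health > 0 else 0.4)
        # Only subjects assigned to treatment can receive it.
        d = 1 if (z == 1 and complies) else 0
        y = health + true_effect * d + rng.gauss(0, 1)
        rows.append((z, d, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def estimates(rows):
    """Return ITT, as-treated, and IV (ratio) estimates of the effect."""
    y_z1 = mean(y for z, d, y in rows if z == 1)
    y_z0 = mean(y for z, d, y in rows if z == 0)
    d_z1 = mean(d for z, d, y in rows if z == 1)
    d_z0 = mean(d for z, d, y in rows if z == 0)

    itt = y_z1 - y_z0
    # Naive comparison of treated vs. untreated, ignoring assignment.
    as_treated = (mean(y for z, d, y in rows if d == 1)
                  - mean(y for z, d, y in rows if d == 0))
    # IV: ITT effect divided by the difference in treatment rates.
    iv = itt / (d_z1 - d_z0)
    return {"itt": itt, "as_treated": as_treated, "iv": iv}


results = estimates(simulate_trial())
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the ITT estimate should be diluted toward the true effect times the compliance rate, the as-treated comparison should overstate the effect because compliers are healthier to begin with, and the IV estimate should land close to the true effect.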

A great example of IV and ITT applied to health care can be found in Finkelstein et al. (2013 & 2014) - See The Oregon Medicaid Experiment, Applied Econometrics, and Causal Inference.

Over at the Incidental Economist, there was a nice discussion of ITT in the context of medical research that does a good job of explaining the rationale as well as when departures from ITT make more sense (such as safety and non-inferiority trials).

See also:  
Instrumental Explanations of Instrumental Variables

A Toy IV Application

Other IV Related Posts


Angrist, J. D., & Pischke, J.-S. Mastering ’Metrics: The Path from Cause to Effect.

Gupta, S. K. (2011). Intention-to-treat concept: A review. Perspectives in Clinical Research, 2(3), 109–112.

Ten Have, T. R., Normand, S.-L. T., Marcus, S. M., Brown, C. H., Lavori, P., & Duan, N. (2008). Intent-to-Treat vs. Non-Intent-to-Treat Analyses under Treatment Non-Adherence in Mental Health Randomized Trials. Psychiatric Annals, 38(12), 772–783.

"The Oregon Experiment--Effects of Medicaid on Clinical Outcomes," by Katherine Baicker, et al. New England Journal of Medicine, 2013; 368:1713-1722.

Medicaid Increases Emergency-Department Use: Evidence from Oregon's Health Insurance Experiment. Sarah L. Taubman, Heidi L. Allen, Bill J. Wright, Katherine Baicker, and Amy N. Finkelstein. Science. Published online 2 January 2014. DOI:10.1126/science.1246183

Detry MA, Lewis RJ. The Intention-to-Treat Principle: How to Assess the True Effect of Choosing a Medical Treatment. JAMA. 2014;312(1):85-86. doi:10.1001/jama.2014.7523

Tuesday, June 6, 2017

Professional Science Master's Degree Programs in Biotechnology and Management

As an undergraduate I always had an interest in biotechnology and molecular genetics. However, lab work did not particularly appeal to me. I also recognized early on that science does not occur in a vacuum; it's subject to social, political, economic, and financial forces. This drew me to the field of economics, specifically public choice theory.

When it came time for graduate school I was still torn. I really wasn't interested in an MBA and didn't really have the background to work in a lab or do field work in genetic research. I really liked economics. The combination of mathematically precise theories (microeconomics/game theory) and empirically sound methods (econometrics) provided a powerful framework for applied problem solving.

I had two advisers make recommendations that got me thinking outside the box. One suggested ultimately I would find a niche that combined both economics and genetics. The other suggested I look at programs like the Bioscience Management program that was being offered at the time at George Mason University (now Bioinformatics Management). While there were not a lot of programs like that being offered at the time, the Agriculture Department at Western Kentucky University provided enough flexibility in their master's program to include courses in biostatistics, genetics, and applied economics. I was able to work on research projects analyzing consumer perceptions of biotechnology and biotech trait resistance management using tools from econometrics, game theory, and population genetics. Additionally, I took courses in applied economics and finance from both the Department of Agriculture and the College of Business, where I was exposed to tools related to investment analysis, options pricing, and analysis and valuation of biotech companies, as well as the impacts of technological change and biotechnology on food and economic development.

With this combination of quantitative training and applied work I have been able to leverage SAS, R, and Python to solve challenging problems across several professional analytics and consulting roles.

Today there are many more professional science master's programs with curricula similar to the programs I contemplated over 10 years ago.

According to the National Professional Science Master’s Association:

"Professional Science Master's (PSMs) are designed for students who are seeking a graduate degree in science or mathematics and understand the need for developing workplace skills valued by top employers. A perfect fit for professionals because it allows you to pursue advanced training and excel in science or math without a Ph.D., while simultaneously developing highly-valued business skills....PSM programs consist of two years of coursework along with a professional component that includes business, communications and/or regulatory affairs."

In 2012 there was an article in Science detailing these degrees, along with some salary data that seemed attractive. According to the article, the first program was officially offered in 1997, growing to 140 programs by 2009 and over 247 at the time of printing.

This commentary from the article corroborates how I feel about my experience:

“There is a tendency for students to buy into the line that if you don't get a Ph.D., you're not a serious professional, that you're wasting your mind,” she says. After spending a decade talking with PSM students and graduates, she is certain that’s not true. “There is so much potential for growth and satisfaction with a PSM degree. You can become a person you didn’t even know you wanted to be.”

Below are some programs that look interesting to me and that students interested in this option should check out (there is a program locator you can find here). Many of these programs are a mash-up of biology/biotech and applied economics and business degrees.

George Mason University- PSM Bioinformatics Management

University of Illinois - Agricultural Production

Cornell- MPS Agriculture and Life Sciences

Washington State University - PSM Molecular Biosciences

Middle Tennessee State University - PSM Biotechnology

California State - MS Biotechnology/MBA 

Johns Hopkins - MBA/MS Biotechnology

Rice - PSM Bioscience and Health Policy

North Carolina State University - MBA (Biosciences Mgt Concentration)

Purdue/Kelley - MS-MBA (not a heavy science emphasis but a very cool degree regardless, from great schools)

See also:
Analytical Translators
Why Study Agricultural/Applied Economics

Monday, June 5, 2017

Game Theory with Python- TalkPython Podcast

Episode 104 of the TalkPython podcast discussed game theory.

Here are a few slices:

"Our guests this week, Vince Knight, Marc Harper, and Owen Campbell are here to discuss their Python project built to study and simulate one of the central problems in game theory, “The Prisoner's Dilemma.”"

"Yeah, so one of the things is how people end up cooperating. If we're all incentivized not to cooperate with each other yet we look around, we see all these situations where people are cooperating, so can we devise strategies that when we play this game repeatedly that coerce or convince our partners that they're better off cooperating with us than defecting against us......Okay, excellent. Give us a sense for some of the, you have some clever names for the different strategies or players, right? Strategy and player is kind of the same thing. You've got the basic ones. The cooperator and the defector, but what else? Probably the most famous one is the tit for tat strategy. Because in Axelrod's original tournament, one of the interesting results that came out with his work was that this strategy was one of the most successful."

And then they get into incorporating machine learning:

"We've extended that method of taking a strategy based on some kind of machine learning algorithm, training it against the other strategies and then adding the fact of the tournaments to see about those. Right now, those are amongst the best players in the library, in terms of performance."

See my previous post for some concepts and examples from game theory that were discussed in this podcast. You can find more references from this podcast including papers, code etc. here.

Game Theory- A Basic Introduction

When someone else’s choices impact you, it helps to have some way to anticipate their behavior. Game Theory provides the tools for doing so (Nicholson, 2002). Game Theory is a mathematical technique developed to study choice under conditions of strategic interaction (Zupan, 1998). It allows for the analysis of interdependent situations.

In game theory, a game is a decision-making situation with interdependent behavior between two or more individuals (Harris, 1999). The individuals involved in making the decisions are the players. The possible choices available to the players are strategies. The outcomes of the strategies played are payoffs. Payoffs are often stated as levels of utility, income, profits, or some other stated objective particular to the game. A general assumption in game theory is that players seek the highest payoff attainable, preferring more utility to less (Nicholson, 2002).

When a decision maker takes into account how other players will respond to his choices, a utility maximizing strategy may be found. It may allow one to predict in advance the actions, responses, and counter responses of others and then choose optimal strategies (Harris, 1999). Such optimal strategies that leave players with no incentive to change their behavior are equilibrium strategies.

Games can be characterized by players, strategies, and payoffs. Below is one way to visualize a game.

Example: Overgrazing Game

                              RANCHER 2:
                        Conserve    Overgraze
RANCHER 1:  Conserve    (20, 20)  |  (0, 30)
            Overgraze   (30, 0)   |  (10, 10)

In this game, the players are rancher '1' and rancher '2'.  They can play one of two strategies, to conserve or overgraze a commonly shared or 'public' pasture. Suppose rancher 1 chooses a strategy (picks a row). Their payoff is depicted by the first number in each cell. Rancher 2 will choose a strategy in return (picking a column). Rancher 2’s payoff is indicated by the second number in each cell. 

In this case, the best strategy for rancher 2 (no matter what rancher 1 chooses to do) is to overgraze because the payoff for rancher 2 (the 2nd number in each cell) associated with overgrazing is always the highest. Likewise, no matter what rancher 2 chooses to do, the best strategy for rancher 1 is to overgraze because the first number in each cell (the payoffs for rancher 1) associated with overgrazing is always the highest. Both players have a dominant strategy to overgraze. This represents an equilibrium strategy of {overgraze, overgraze}.

This outcome is a Nash Equilibrium, and the game itself has the structure of a prisoner’s dilemma. In a Nash equilibrium each player’s choice is the best choice possible taking into consideration the choice of the other players (Zupan, 1998). This concept was generalized by the mathematician John Nash in his 1950 paper “Equilibrium Points in n-Person Games.”

It’s easy to see that if the players would conserve, they could both be made better off because the strategy {conserve, conserve} yields payoffs (20,20) which are much higher than the Nash Equilibrium strategy’s payoff of (10,10). 
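The reasoning above can be checked mechanically. Below is a minimal Python sketch that encodes the overgrazing payoff table and searches for dominant strategies and pure-strategy Nash equilibria. The function names (payoff, dominant_strategy, pure_nash_equilibria) are my own and not from any library.

```python
from itertools import product

STRATEGIES = ["Conserve", "Overgraze"]

# PAYOFFS[(rancher 1 move, rancher 2 move)] = (rancher 1 payoff, rancher 2 payoff)
PAYOFFS = {
    ("Conserve", "Conserve"): (20, 20),
    ("Conserve", "Overgraze"): (0, 30),
    ("Overgraze", "Conserve"): (30, 0),
    ("Overgraze", "Overgraze"): (10, 10),
}


def payoff(player, own, opponent):
    """Payoff to `player` (0 = rancher 1, 1 = rancher 2) for a pair of moves."""
    cell = (own, opponent) if player == 0 else (opponent, own)
    return PAYOFFS[cell][player]


def dominant_strategy(player):
    """Return the strategy that beats every alternative against every
    opponent move, or None if no such strategy exists."""
    for own in STRATEGIES:
        alternatives = [s for s in STRATEGIES if s != own]
        if all(payoff(player, own, opp) > payoff(player, alt, opp)
               for opp in STRATEGIES for alt in alternatives):
            return own
    return None


def pure_nash_equilibria():
    """Return every cell where neither rancher gains by deviating alone."""
    equilibria = []
    for row, col in product(STRATEGIES, repeat=2):
        row_is_best = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0]
                          for alt in STRATEGIES)
        col_is_best = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1]
                          for alt in STRATEGIES)
        if row_is_best and col_is_best:
            equilibria.append((row, col))
    return equilibria


print(dominant_strategy(0), dominant_strategy(1))
print(pure_nash_equilibria())
```

For this table both ranchers have Overgraze as a dominant strategy, and {overgraze, overgraze} is the only cell that survives the best-response check, even though {conserve, conserve} pays both players more.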

Just as competitive market forces elicit cooperation by coordinating behavior through price mechanisms, so too must players in a game find some means of coordinating their behavior if they wish to escape the sub-optimal Nash Equilibrium.  

Some Additional Concepts  

Multiple Period Games - Multiple period games are games that are played more than once, or over more than one time period. If we could imagine playing the prisoner’s dilemma game multiple times we would have a multi-period game. If games are played perpetually they are referred to as infinite games (Harris, 1999).

Punishment Schemes - Punishment schemes are used to elicit cooperation or enforcement of agreements. 

In the game presented above, suppose both players wanted to cooperate to conserve grazing resources. If it turned out that rancher 2 cheated, then in the next period rancher 1 would refuse to cooperate. If the game is played repeatedly, rancher 2 would learn that if he sticks to the deal both players would be better off. In this way punishment schemes in multi-period games can elicit cooperation, allowing an escape from a Nash Equilibrium. This may not be possible in the single period games that we looked at before.

Tit-for-Tat - Tit-for-tat punishment mechanisms are schemes in which if one player fails to cooperate, the other player will refuse to cooperate in the next period. 
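As a rough illustration of how a tit-for-tat punishment scheme plays out, here is a small Python sketch that repeats the overgrazing game. The 10-round horizon and the function names (tit_for_tat, always_overgraze, play) are assumptions made for the example; this is not the podcast guests' library.

```python
# Payoffs from the overgrazing game: (rancher 1 payoff, rancher 2 payoff)
PAYOFFS = {
    ("Conserve", "Conserve"): (20, 20),
    ("Conserve", "Overgraze"): (0, 30),
    ("Overgraze", "Conserve"): (30, 0),
    ("Overgraze", "Overgraze"): (10, 10),
}


def tit_for_tat(opponent_history):
    """Conserve in the first period, then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else "Conserve"


def always_overgraze(opponent_history):
    """Never cooperate, regardless of what the opponent has done."""
    return "Overgraze"


def play(strategy_1, strategy_2, rounds=10):
    """Play the game repeatedly and return each rancher's total payoff."""
    history_1, history_2 = [], []
    total_1 = total_2 = 0
    for _ in range(rounds):
        # Each strategy only sees the other player's past moves.
        move_1 = strategy_1(history_2)
        move_2 = strategy_2(history_1)
        payoff_1, payoff_2 = PAYOFFS[(move_1, move_2)]
        total_1 += payoff_1
        total_2 += payoff_2
        history_1.append(move_1)
        history_2.append(move_2)
    return total_1, total_2


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_overgraze))
```

Over 10 rounds, two tit-for-tat players earn 200 each, while a rancher who always overgrazes against tit-for-tat earns only 120 (30 in the first period, then 10 in each of the remaining nine). Cheating pays once and is punished afterward.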

Trigger Strategy - In infinitely repeated games a trigger strategy involves a promise to play the optimal strategy as long as the other players comply (Nicholson, 2002).  

Grim Trigger Strategy - This is a trigger strategy that involves punishment for many periods if the other player does not cooperate. In other words, if one player defects when he should cooperate, the other player(s) will not offer the chance to cooperate again for a long time. As a result both players will be confined to a Nash Equilibrium for many periods or perpetually (Harris, 1999).

Trembling Hand Trigger Strategy - This is a trigger strategy that allows for mistakes. Suppose in the first instance player 1 does not realize that player 2 is willing to cooperate. Instead of player 1 resorting to a long period of punishment as in the grim trigger strategy, player 1 allows player 2 a second chance to cooperate. It may be the case that instead of playing the grim trigger strategy, player 1 may invoke a single period tit-for-tat punishment scheme in hopes of eliciting cooperation in later periods.

Folk Theorems - Folk theorems result from the conclusion that players can escape the outcome of a Nash Equilibrium if games are played repeatedly, or are infinite period games (Nicholson, 2002). In general, folk theorems state that players will find it in their best interest to maintain trigger strategies in infinitely repeated games.
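One common way to make this concrete is to ask how much players must value the future for a grim trigger strategy to sustain cooperation. The sketch below is an illustration under assumptions not stated above: players discount future payoffs by a factor delta between 0 and 1, a defection is punished with the Nash Equilibrium payoff forever, and the function names are hypothetical. It uses the overgrazing payoffs (30 for cheating, 20 for mutual conservation, 10 for mutual overgrazing).

```python
def cooperate_forever_value(reward, delta):
    """Present value of earning `reward` every period, discounted by delta."""
    return reward / (1 - delta)


def deviate_once_value(temptation, punishment, delta):
    """Present value of cheating once, then earning `punishment` forever."""
    return temptation + delta * punishment / (1 - delta)


def critical_discount_factor(temptation, reward, punishment):
    """Smallest delta at which cooperating is at least as good as cheating.

    Solves reward / (1 - delta) >= temptation + delta * punishment / (1 - delta).
    """
    return (temptation - reward) / (temptation - punishment)


TEMPTATION, REWARD, PUNISHMENT = 30, 20, 10  # from the overgrazing game

threshold = critical_discount_factor(TEMPTATION, REWARD, PUNISHMENT)
print(threshold)

for delta in (0.4, 0.6):
    stay = cooperate_forever_value(REWARD, delta)
    cheat = deviate_once_value(TEMPTATION, PUNISHMENT, delta)
    print(delta, round(stay, 2), round(cheat, 2), stay >= cheat)
```

With these payoffs the threshold works out to 0.5: a rancher who weights next period's payoff at more than half of today's does better by conserving than by cheating once and being punished forever.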

See also:
Matt Bogard. "An Econometric and Game Theoretic Analysis of Producer and Consumer Preferences Toward Agricultural Biotechnology" Western Kentucky University (2005) Available at:

Matt Bogard. "An Introduction to Game Theory: Applications in Environmental Economics and Public Choice with Mathematical Appendix" (2012) Available at:   

Matt Bogard. "Game Theory, A Foundation for Agricultural Economics" (2004) Available at:  


Nicholson, Walter R. “Microeconomic Theory: Basic Principles and Extensions.” Southwestern Thomson Learning. U.S.A. (2002).

Browning, Edward K. and Mark A. Zupan. “Microeconomic Theory and Applications.” 6th Edition. Addison-Wesley Longman Inc. Reading, MA. (1999)

Harris, Frederick H. et al. “Managerial Economics: Applications, Strategy, and Tactics.” Southwestern College Publishing. Cincinnati, OH. (1999).

Saturday, June 3, 2017

In Praise of The Citizen Data Scientist

I read a really good article over at Data Science Central titled "The Data Science Delusion." Here is an interesting slice:

"This democratization of algorithms and platforms, paradoxically, has a downside: the signaling properties of such skills have more or less been lost. Where earlier you needed to read and understand a technical paper or a book to implement a model, now you can just use an off-the-shelf model as a black-box. While this phenomenon affects many disciplines, the vague and multidisciplinary definition of data science certainly exacerbates the problem."

It is true there is some loss of signal. However, companies may need to look for new signals as technological change progresses and new forms of capital complement labor. It's this new labor-complementing role of capital (in the form of open source statistical computing packages and computing power) that is creating demand for those who can leverage these tools competently, without knowing all "the nitty-gritty mathematical academic formulas to everything about support vector machines or Kernels and stuff like that to apply it properly and get results."

Sure, as a result there are a lot of analytics programs popping up out there to take advantage of these advances, but it's also the reason programs like applied economics are becoming so popular. In fact, in promoting its program, Johns Hopkins University almost seems to echo some of the sentiment in the quotes above, but with a positive spin:

"Economic analysis is no longer relegated to academicians and a small number of PhD-trained specialists. Instead, economics has become an increasingly ubiquitous as well as rapidly changing line of inquiry that requires people who are skilled in analyzing and interpreting economic data, and then using it to effect decisions about national and global markets and policy, involving everything from health care to fiscal policy, from foreign aid to the environment, and from financial risk to real risk." 

In fact, I admit for a while I was a little disappointed my alma mater did not embrace the data science/analytics degree trend, or offer more courses in applied programming or incorporate languages like R into more courses. However, now, while I think these things are great, I realize the more important data science skills are related to the analytical thinking and firm theoretical, statistical, and quantitative foundations that programs in economics and finance already offer at the undergraduate and master's level. While formal data science training might be the way of the future, I would venture to say that the vast majority of today's 'data scientists' were academically trained in a quantitative discipline like the above and self-trained (perhaps via Coursera etc.) on the skills and tools most people think of when they think of data science. As I have said before, sometimes you don't need someone with a PhD in computer science or astrophysics. Sometimes you really just need a good MBA who understands regression and the basics of a left join.

The DSC article above concludes with a little jab at data science that I agree with wholeheartedly:

"Great data science work is being done in various places by people who go by other names (analyst, software engineer, product head, or just plain old scientist). It is not necessary to be a card-carrying data scientist to do good data science work. Blasphemy it may be to say so, but only time will tell whether the label itself has value, or is only helping create a delusion." 

See also:

What you really need to know to be a data scientist
Super Data Science podcast - credit scoring
How to think like a data scientist to become one
What makes a great data scientist
Are data scientists going extinct
More on data science from actual data scientists