Copula functions can be used to simulate a dependence structure independently from the marginal distributions.

By Sklar's theorem, a multivariate distribution F with marginal distributions F_{1},…,F_{p} can be represented through a copula C as follows:

F(x_{1},…,x_{p}) = C{ F_{1}(x_{1}),…, F_{p}(x_{p}); θ}

The parameter θ represents the dependence among the marginal distributions F_{1},…,F_{p}; in the bivariate case, this is the dependence between F_{1} and F_{2}. Now let's set up the framework for what we are trying to model.
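
As a brief aside before the setup, the separation between dependence and marginals can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not in the text above: it uses a bivariate Gaussian copula with a hypothetical parameter `rho = 0.7`, and attaches an exponential and a logistic marginal chosen purely for demonstration. All function names are made up for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def gaussian_copula_sample(n, rho, seed=0):
    """Draw n pairs (u1, u2) from a bivariate Gaussian copula with parameter rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build z2 so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to Uniform(0, 1) and keeps the dependence.
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def exponential_quantile(u, rate=1.0):
    """Inverse CDF of an Exponential(rate) marginal (assumed for illustration)."""
    return -math.log(1.0 - u) / rate


def logistic_quantile(u):
    """Inverse CDF of a standard logistic marginal (assumed for illustration)."""
    return math.log(u / (1.0 - u))


def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


uniforms = gaussian_copula_sample(n=5000, rho=0.7)
u1 = [u for u, _ in uniforms]
u2 = [v for _, v in uniforms]

# Same dependence structure, two different choices of marginals.
x1 = [exponential_quantile(u) for u in u1]
x2 = [logistic_quantile(v) for v in u2]

print("correlation of the uniforms:", round(pearson(u1, u2), 3))
print("correlation after applying marginals:", round(pearson(x1, x2), 3))
```

Because the quantile functions are increasing, swapping them changes the marginals of `x1` and `x2` but leaves their ranks, and hence the dependence carried by the copula, unchanged.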

Suppose we want to predict some outcome Y. Let

Y = f(x,D)

where x is a vector of controls and D is a treatment indicator. We are interested in estimating the coefficient on D as our measure of the treatment effect. However, suppose that there is selection bias: those who choose to engage in the program indicated by D are more likely to have higher levels of Y regardless of treatment. (Selection bias of this kind is closely tied to unobserved heterogeneity and endogeneity.)
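
To make the problem concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: a linear outcome with a true treatment effect of 1.0, normal errors, and a hypothetical correlation `rho` between the unobserved driver of selection and the outcome error. It compares the naive difference in means with the true effect.

```python
import math
import random


def naive_treatment_estimate(n=20000, effect=1.0, rho=0.6, seed=1):
    """Difference in mean Y between treated and untreated units.

    Assumed data-generating process (illustrative only):
      v ~ N(0, 1)                  unobserved driver of selection
      e = rho * v + noise          outcome error, correlated with v
      D = 1 if v > 0 else 0        self-selection into treatment
      Y = 1.0 + effect * D + e
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)
        e = rho * v + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        d = 1 if v > 0 else 0
        y = 1.0 + effect * d + e
        (treated if d else untreated).append(y)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


# With rho = 0 there is no selection on unobservables; with rho > 0 there is.
print("naive estimate, no selection:  ", round(naive_treatment_estimate(rho=0.0), 3))
print("naive estimate, with selection:", round(naive_treatment_estimate(rho=0.6), 3))
```

In this particular design the naive comparison overstates the effect by roughly 2ρ·sqrt(2/π), because the treated group has systematically higher outcome errors than the untreated group.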

We can model selection as follows:

D = g(x,z)

where x is a vector of controls and z is an instrument: a variable that shifts the probability of D but affects Y only through D, so it is uncorrelated with the unobserved factors driving the outcome. We can jointly model the outcome and selection functions using copulas, where:

P(Y, D|x,z) = C{ F(.), G(.); θ}

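Here F(.) and G(.) denote the marginal distributions for the outcome and selection equations. As it turns out, the term θ captures the dependence between outcome and selection that remains after conditioning on x and z, and accounting for it allows the treatment effect associated with D to be estimated without selection bias. Han and Vytlacil (2015) extend the results to cases without instruments.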
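
The joint model above can be sketched in code for one concrete special case. The sketch assumes things the text does not specify: a binary outcome Y, logistic marginals for F and G, and a Frank copula for C (chosen only because it has a simple closed form). The index arguments stand in for functions of x and z, and all names are hypothetical. It is not the estimator from the papers cited below.

```python
import math


def logistic_cdf(t):
    """Logistic marginal distribution, assumed for both F and G."""
    return 1.0 / (1.0 + math.exp(-t))


def frank_copula(u, v, theta):
    """Frank copula C(u, v; theta) for theta != 0; theta > 0 gives positive dependence."""
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    den = math.exp(-theta) - 1.0
    return -math.log(1.0 + num / den) / theta


def cell_probabilities(outcome_index, effect, selection_index, theta):
    """Joint probabilities P(Y=y, D=d | x, z), keyed by (y, d).

    Assumed model: Y = 1[outcome_index + effect * D > eps],
                   D = 1[selection_index > nu],
    with (eps, nu) joined by the Frank copula above.
    """
    g = logistic_cdf(selection_index)           # P(D = 1)
    f1 = logistic_cdf(outcome_index + effect)   # F at the outcome index when D = 1
    f0 = logistic_cdf(outcome_index)            # F at the outcome index when D = 0
    c1 = frank_copula(f1, g, theta)
    c0 = frank_copula(f0, g, theta)
    return {
        (1, 1): c1,
        (0, 1): g - c1,
        (1, 0): f0 - c0,
        (0, 0): 1.0 - f0 - g + c0,
    }


def log_likelihood(rows, effect, theta):
    """Sum of log cell probabilities over rows of (y, d, outcome_index, selection_index)."""
    total = 0.0
    for y, d, outcome_index, selection_index in rows:
        probs = cell_probabilities(outcome_index, effect, selection_index, theta)
        total += math.log(probs[(y, d)])
    return total


# Zero treatment effect, but positive dependence between the two equations.
probs = cell_probabilities(outcome_index=0.2, effect=0.0, selection_index=-0.1, theta=3.0)
p_y1_given_d1 = probs[(1, 1)] / (probs[(1, 1)] + probs[(0, 1)])
p_y1_given_d0 = probs[(1, 0)] / (probs[(1, 0)] + probs[(0, 0)])
print("P(Y=1 | D=1):", round(p_y1_given_d1, 3))
print("P(Y=1 | D=0):", round(p_y1_given_d0, 3))
```

Even with `effect=0.0`, the two conditional probabilities differ when `theta` is positive, which is the selection problem described earlier. Maximizing `log_likelihood` over `effect` and `theta` jointly (along with the index coefficients, omitted here) is one way such a model could be fit.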

**References:**

Han, S. and E. Vytlacil (2015). Identification in a generalization of bivariate probit models with dummy endogenous regressors. Working paper, University of Texas at Austin.

Trivedi, P. K. and D. M. Zimmer (2016). A note on identification of discrete bivariate copulas. August 5, 2016.
