Econometric Sense (Matt Bogard, http://www.blogger.com/profile/10510725993509264716): comment feed, updated 2016-03-20 17:16:34 -04:00. 141 comments in total; comments 1 to 25 shown.

[2016-03-15 07:52:57 -04:00]
Should the average marginal effect be a decrease of 2.5% or 2.5 percentage points?
Posted by: Anonymous

[2015-09-21 04:40:57 -04:00]
I still don't understand how can I tell whether 's11' has any impact on Bush approval rating. Is it determined by the p value of 'T1-AR1'?
Posted by: Anonymous (https://www.blogger.com/profile/03793987828101721824)

[2015-09-11 14:17:55 -04:00]
Fantastic post, Matt!

I'm an accounting academic who does a lot of applied econometric work, and I've started reading up on the machine learning literature. As you and Noah Smith note, the focus is on prediction rather than estimating the effect of a particular predictor.

Aside from some basic terminology differences (e.g. I'd never heard of training vs. testing samples until I started reading ML books), I've found most of the techniques quite accessible. And it was heartening to see that deep connections have been established between bread-and-butter econometric techniques like logistic regression and SVM methods.

The only parts of ML that I'm still trying to get comfortable with are methods that might yield better prediction, but whose outputs aren't as readily interpretable, such as tree-based boosting. It seems like you'd have to feel quite confident that the underlying relationships are stable over time if you're going to use these less interpretable methods, no?

But I should already be thankful for ML methods, as they brought me to your blog...I found out about it through a suggested LinkedIn connection, a match developed via a ML-based algorithm. :)

Best,
George Batta
Posted by: George Batta (https://www.cmc.edu/academic/faculty/profile/george-batta)

[2015-09-09 23:05:07 -04:00]
Thanks Matt.

Here's the post by Kling on math in the profession:

http://www.econlib.org/library/Columns/y2015/Klingmit.html
Posted by: Levi Russell (http://www.farmerhayek.com)

[2015-09-09 21:57:06 -04:00]
Thanks. I really think Levi Russell above does a great job covering this in his comments. I'm still thinking about it, but I really like his stuff over at the Farmer Hayek blog and was kind of hoping he would do a post on this as well.
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2015-09-09 01:26:51 -04:00]
Thanks for the shout-out! :-)

"to many economists, causality is a theory driven phenomenon, and can never truly be determined by data. I won't expand on this any further."

I would love to read a follow-up post about this...
Posted by: Noah Smith (https://www.blogger.com/profile/09093917601641588575)

[2015-09-05 20:23:46 -04:00]
While I'm certainly glad that we have more/better data and methods for analyzing it, I don't think we should ever forget the lessons from masters of the past. A lot of their work is increasingly relevant as we are increasingly incentivized to think we can engineer policy (whether private or public) using data analysis. Ultimately the things we analyze as economists are too complex for data to deliver what some seem to expect of it (see Hayek).

I recently read Ronald Coase's paper "The Marginal Cost Controversy." This paper isn't emphasized in grad programs but it serves as a warning: what Coase calls "blackboard economics" can lead us to say things with confidence that are not so certain.

His interview in 2012 with Russ Roberts is definitely illuminating.
http://www.econtalk.org/archives/2012/05/coase_on_extern.html

Arnold Kling, a graduate of MIT, has some interesting thoughts about the mathematization of the profession that's really at the heart of this "data revolution."

I have some posts on this subject on the blog. Type "blackboard economics" into the search bar on the Farmer Hayek blog and you'll get a few of them!
Posted by: Levi Russell (http://www.farmerhayek.com)

[2015-06-17 15:21:31 -04:00]
My answer (so far) to the question has been (c) Stata. :P
Posted by: L R (https://www.blogger.com/profile/09236538070053804833)

[2015-02-27 11:26:44 -05:00]
I need to look more into this Bayesian stuff.
Posted by: L R (https://www.blogger.com/profile/09236538070053804833)

[2014-12-02 18:10:11 -05:00]
Re: software implementation: It's also worth looking into Python, especially for those new to applied work. There is a growing network of people using Python and it is a pretty easy language to pick up.
Posted by: Jonathon Scott (http://about.me/jonathonrscott)

[2014-08-05 06:16:36 -04:00]
thank you for the reply. i really like your post; it helps clarify the difficult jargon in an intuitive way.

i have a quick, related question. is the phrase "correlated unobservables" referring to the same phenomenon as "unobserved heterogeneity"?

thanks again
Posted by: Anonymous

[2014-08-01 06:09:57 -04:00]
YES! THANK YOU! It should say BIASED. I need to correct that.
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-08-01 03:07:00 -04:00]
good stuff

however, there is a typo in the first sentence: "When we estimate a regression such as (1) above and leave out an important variable such as X2 then our estimate of β1 can become unbiased and inconsistent."

It should be "our estimate of β1 can become BIASED and inconsistent".
Posted by: Anonymous

[2014-07-28 11:36:12 -04:00]
Thank you for this. Have you also discussed difference in differences estimation, especially the variety that takes on multiple time periods for a recurring treatment (as opposed to the common case, which involves only two periods)?
Posted by: Anonymous (https://www.blogger.com/profile/10125018467656103799)

[2014-04-15 11:59:21 -04:00]
I really appreciated this excellent exposition, but I'm also unable to reproduce the coefficients in the call to arimax. I'm wondering if the dataset you used was modified. Using the BUSHJOB.dta files from Professor Monogan's site, the ARIMA 0,1,0 model (mod2) gives me an error message and the AR(1) model (mod 2b) yields the following prediction equation:
y.pred <- 56.0327 + 27.6660*bush$s11 + 27.6660*(0.8984^(bush$t-9))*as.numeric(bush$t>9)
which is the same as Professor Monogan's results. I know it's a minor detail, and thanks again for a nice review.
Posted by: Atul Sharma (http://www.aksharma.org)

[2014-04-10 17:04:34 -04:00]
Manski presented at a forum at the University of Kentucky back in 2012 on this topic. The paper is linked here: http://www.nber.org/papers/w16207. I guess this general idea has been something on his agenda for many years, since the similar paper you reference is 12 years earlier.
Posted by: JTP

[2014-01-20 12:43:41 -05:00]
This may also be useful to your readers:

The Last Shall Be First: Matching Patients by Choosing the Least Popular First
Stefanie J. Millar and David J. Pasta
ICON Clinical Research, San Francisco, CA

http://www.wuss.org/proceedings10/HOC/The%20Last%20Shall%20Be%20First%20-%20Matching%20Patients%20by%20Choosing%20the%20Least%20Popular%20First.pdf

Best,
Victoria
Posted by: Anonymous

[2014-01-20 11:59:54 -05:00]
Hi Matt. Wow! Thank you for all the helpful resources. I think the easiest, but not entirely ideal, solution for me is to first determine the smallest follow-up period for my treatment group, and then observe all cases based on that follow-up period (first matching, then analysis). I can then move on to a longer follow-up period and take a look at who is left. It's not the cleanest method, but it appears to be the simplest. If I come up with something nicer, I will be sure to post here! Thanks again for all your help!

Best,
Victoria from Rutgers SCJ
Posted by: Anonymous

[2014-01-17 23:26:02 -05:00]
I always thought a general rule of blogging was to not dominate the comment section. I've violated that convention. But I wanted to point to a recent post that cites some articles that involve PS matching in the context of survival analysis: http://econometricsense.blogspot.com/2014/01/propensity-score-matching-meets.html
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-01-16 23:56:08 -05:00]
Also, in the STATA discussion I previously linked to, one commenter mentioned that time varying propensity scores could introduce bias or other issues if the treatment itself impacts the observables and hence PS in later periods. So, in this regard it would make sense to estimate PS and implement the matches prior to the analysis without later updates. Again, an IPTW approach may be useful if you are concerned with censorship breaking down your matches.
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-01-16 23:50:49 -05:00]
I've also discussed IPTW regression on this blog before, as a much more computationally tractable alternative to PS matching. My first thought when confronted with this issue was to implement some sort of IPTW based survival analysis if possible. This would take care (possibly) of the issue related to censored matches, since IPTW methods utilize PS adjustments without explicit matching. This is discussed using R here: http://www.stata.com/statalist/archive/2012-05/msg00517.html
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-01-16 23:33:29 -05:00]
Sorry for the typos "attaching" vs "matching" above. I'm commenting via my iPad which tends to have latency issues. Regardless, time dependent covariates are challenging enough at least in SAS: http://econometricsense.blogspot.com/2012/04/analysis-of-gmo-vs-non-gmo-corn-hybrid.html
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-01-16 23:28:49 -05:00]
A colleague and I were discussing something very similar just last week. I really don't know, but in the meantime I will work on a solution. It may take a while before I find out. I think it depends. If you have a selection on observables situation that lends itself to attaching, I might venture to say that as long as the observables don't change during followup periods then the propensity scores estimated and initial matching should hold throughout. If the observables and hence propensity scores could somehow be perceived as time varying, then I would think you would have something similar to a survival analysis with time varying covariates, and may require different matches and PS estimated for each period. And, another challenge, regardless of time varying PS or not: what do you do with an initial matched comparison if one of the matches becomes censored and the other doesn't? This may be more of an issue with 1:1 matches. I really need to think more about this. The scenarios I have mentioned would seem to be difficult to code, and I'm not sure if any procedures in SAS would handle them by default. I'm not a STATA user so I'm unsure there. If these are truly issues, I would not doubt someone has written an R package to deal with it.
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)

[2014-01-16 14:53:30 -05:00]
Hi Matt. Thanks for the great PSM resources! I have been doing a lot of reading about this and thought I might as well ask an expert...

When coupling propensity score matching with survival analysis, do varying follow-up periods need to be considered during the matching process or does the survival analysis account for this?

I am inclined to say that the varying follow-up period problem can be tackled during the survival analysis and can be ignored during the matching process because I have used the stset command with the origin() and failure() functions to tell Stata when the study period is to begin for each case and whether or not they were censored. Once matched, I can then use the stpower exponential command for uniform accrual. Basically Stata is now aware that there are varying entry and exit times and can control for that fact.

Do you have any advice here? Thanks!
Posted by: Anonymous

[2014-01-16 12:52:00 -05:00]
I think I understand your question now, and after a week back in the office I'm getting things caught up. DD methods can be thought of as a special case or type of fixed effects or panel estimator. See also: http://econometricsense.blogspot.com/2011/01/mixed-fixed-and-random-effects-models.html and http://econometricsense.blogspot.com/2012/12/difference-in-difference-estimators.html I am hoping at some point to follow up with some posts with actual data and code for estimation either in SAS or R.
Posted by: Matt Bogard (https://www.blogger.com/profile/10510725993509264716)
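
Editorial illustration of the last comment's point that DD can be viewed as a special case of a fixed effects estimator. This is only a minimal sketch in Python, not code from the blog or from any commenter: the panel values and the function names (`did_from_means`, `two_way_demean`, `twfe_estimate`) are hypothetical, and the simple demeaning shortcut assumes a balanced two-period panel. It computes the DD estimate from group mean changes, then the coefficient on the treated-and-post indicator from a two-way (unit and period) fixed effects regression, and the two should agree.

```python
from statistics import mean

# Hypothetical balanced two-period panel (illustrative numbers, not from the blog):
# unit -> (treated flag, outcome before, outcome after)
panel = {
    "c1": (0, 10.0, 12.0),
    "c2": (0, 12.0, 14.0),
    "c3": (0, 8.0, 10.0),
    "t1": (1, 11.0, 16.0),
    "t2": (1, 13.0, 17.0),
    "t3": (1, 9.0, 15.0),
}


def did_from_means(panel):
    """DD: mean change for treated units minus mean change for control units."""
    changes = {0: [], 1: []}
    for treated, y_pre, y_post in panel.values():
        changes[treated].append(y_post - y_pre)
    return mean(changes[1]) - mean(changes[0])


def two_way_demean(rows, col):
    """Subtract unit and period means and add back the grand mean.

    This shortcut removes both sets of fixed effects only in a balanced panel.
    """
    units = {r[0] for r in rows}
    periods = {r[1] for r in rows}
    unit_mean = {u: mean(r[col] for r in rows if r[0] == u) for u in units}
    period_mean = {t: mean(r[col] for r in rows if r[1] == t) for t in periods}
    grand_mean = mean(r[col] for r in rows)
    return [r[col] - unit_mean[r[0]] - period_mean[r[1]] + grand_mean for r in rows]


def twfe_estimate(panel):
    """Coefficient on d = treated * post with unit and period fixed effects."""
    rows = []  # long format: (unit, period, y, d)
    for unit, (treated, y_pre, y_post) in panel.items():
        rows.append((unit, 0, y_pre, 0.0))
        rows.append((unit, 1, y_post, float(treated)))
    y_tilde = two_way_demean(rows, 2)
    d_tilde = two_way_demean(rows, 3)
    # OLS slope of demeaned y on demeaned d (single regressor, no intercept needed).
    return sum(d * y for d, y in zip(d_tilde, y_tilde)) / sum(d * d for d in d_tilde)


print(did_from_means(panel), twfe_estimate(panel))
```

For this toy panel the treated units change by 5 on average and the controls by 2, so both estimates come out to 3 (up to floating-point rounding). With more than two periods or an unbalanced panel the equivalence to a simple difference of mean changes no longer holds in this form, and a regression routine in SAS, R, or Stata would be the more natural tool.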