Monday, November 28, 2016

Gary Gorton, Harald Uhlig, and the Great Crisis

Gary Gorton has made clear that the financial crisis of 2007 was in essence a traditional banking panic, not unlike those of the nineteenth century.  A key corollary is that the root cause of the Panic of 2007 can't be something relatively new, like "Too Big to Fail".  (See this.)  Lots of people blame residential mortgage-backed securities (RMBS's), but they're also too new.  Interestingly, in new work Juan Ospina and Harald Uhlig examine RMBS's directly.  Sure enough, and contrary to popular impression, they performed quite well through the crisis.

Sunday, November 20, 2016

Dense Data for Long Memory

From the last post, you might think that efficient learning about low-frequency phenomena requires tall data. Certainly efficient estimation of trend, as stressed there, does require tall data. But it turns out that efficient estimation of other aspects of low-frequency dynamics sometimes requires only dense data. In particular, consider a pure long memory, or "fractionally integrated", process, \( (1-L)^d x_t = \epsilon_t \), \( 0 < d < 1/2 \). (See, for example, this or this.) In a general \( I(d) \) process, \( d \) governs only low-frequency behavior (the rate of decay of long-lag autocorrelations toward zero, or equivalently, the rate of explosion of the low-frequency spectrum toward infinity), so tall data are needed for efficient estimation of \( d \). But in a pure long-memory process, the single parameter \( d \) governs behavior at all frequencies, including arbitrarily low frequencies, due to the self-similarity ("scaling law") of pure long memory. Hence for pure long memory a short but dense sample can be as informative about \( d \) as a tall sample. (And pure long memory often appears to be a highly accurate approximation to financial asset return volatilities, as for example in ABDL.)
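To make the idea concrete, here is a minimal simulation sketch (my own, not from any of the papers linked above): it simulates a pure long-memory process by truncating the MA(\(\infty\)) representation of \( (1-L)^{-d} \epsilon_t \), then estimates \( d \) by log-periodogram (GPH) regression. The sample size, burn-in, and bandwidth \( m = \sqrt{T} \) are illustrative choices.

```python
# A minimal sketch, assuming made-up sample sizes and the common m = sqrt(T)
# bandwidth: simulate pure long memory (1-L)^d x_t = eps_t via a truncated
# MA(inf) representation, then estimate d by log-periodogram (GPH) regression.
import numpy as np

rng = np.random.default_rng(0)

def simulate_fracnoise(T, d, burn=1000):
    """Simulate I(d) noise: x_t = sum_j psi_j eps_{t-j}, where
    psi_0 = 1 and psi_j = psi_{j-1} * (j - 1 + d) / j."""
    n = T + burn
    psi = np.empty(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    eps = rng.standard_normal(n)
    return np.convolve(eps, psi)[:n][burn:]

def gph_estimate(x, m=None):
    """GPH: regress the log periodogram on log(4 sin^2(lambda/2));
    the estimate of d is minus the slope."""
    T = len(x)
    m = m or int(np.sqrt(T))
    lam = 2 * np.pi * np.arange(1, m + 1) / T
    I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2 * np.pi * T)
    X = np.log(4 * np.sin(lam / 2) ** 2)
    return -np.polyfit(X, np.log(I), 1)[0]

x = simulate_fracnoise(T=4096, d=0.3)
print(f"true d = 0.3, GPH estimate = {gph_estimate(x):.3f}")
```

And because the same \( d \) governs all frequencies under pure long memory, one could just as well fit all \( T/2 \) frequencies (full Whittle) rather than only the lowest \( m \) -- which is exactly why density, not span, is what matters here.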

Monday, November 7, 2016

Big Data for Volatility vs. Trend

Although largely uninformative for some purposes, dense data (high-frequency sampling) are highly informative for others.  The massive example of recent decades is volatility estimation.  The basic insight traces at least to Robert Merton's early work. Roughly put, as we sample returns arbitrarily finely, we can infer underlying volatility (quadratic variation) arbitrarily well.

So, what is it for which dense data are "largely uninformative"?  The massive example of recent decades is long-term trend.  Again roughly put and assuming linearity, long-term trend is effectively a line segment drawn between a sample's first and last observations, so for efficient estimation we need tall data (long calendar span), not dense data.

Assembling everything, for estimating yesterday's stock-market volatility you'd love to have yesterday's 1-minute intra-day returns, but for estimating the expected return on the stock market (the slope of a linear log-price trend) you'd much rather have 100 years of annual returns, despite the fact that a naive count would say that 1 day of 1-minute returns is a much "bigger" sample.
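Here's a quick simulation sketch of the point (my own illustration; the parameter values are made up): one day of 1-minute returns pins down volatility quite well, while the drift is estimated only coarsely even from 100 years of annual returns, and absurdly badly from one day of minutes.

```python
# A minimal sketch, assuming made-up parameters (annualized drift 6%, vol 20%,
# 390 trading minutes/day, 252 trading days/year): dense data nail volatility,
# tall data nail trend.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.06, 0.20

# One day of 1-minute log returns.
dt = 1.0 / (252 * 390)
r_min = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(390)

# 100 years of annual log returns.
r_ann = mu + sigma * rng.standard_normal(100)

print("vol from 1 day of minutes  :", np.sqrt((r_min ** 2).sum() * 252))  # ~0.20
print("drift from 1 day of minutes:", r_min.sum() * 252)                  # +/- ~3 (!)
print("drift from 100 annual obs  :", r_ann.mean())                       # ~0.06 +/- 0.02
```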

So different aspects of Big Data -- in this case dense vs. tall -- are of different value for different things.  Dense data promote accurate volatility estimation, and tall data promote accurate trend estimation.

Thursday, November 3, 2016

StatPrize

Check out this new prize: http://statprize.org/. (Thanks, Dave Giles, for informing me via your tweet.) It should be USD 1 Million, ahead of the Nobel, as statistics is a key part (arguably the key part) of the foundation on which every science builds.

And obviously check out David Cox, the first winner. Every time I've given an Oxford econometrics seminar, he has shown up. It's humbling that he evidently thinks he might have something to learn from me. What an amazing scientist, and what an amazing gentleman.

And also obviously, the new StatPrize can't help but remind me of Ted Anderson's recent passing, not to mention the earlier but recent passings of, for example, Herman Wold, Edmond Malinvaud, and Arnold Zellner. Wow -- sometimes the Stockholm gears just grind too slowly. Moving forward, StatPrize will presumably make such econometric recognition failures less likely.

Monday, October 31, 2016

Econometric Analysis of Recurrent Events


Don Harding and Adrian Pagan have a fascinating new book (HP) that just arrived in the snail mail.  Partly HP has a retro feel (think: Bry-Boschan (BB)) and partly it has a futurist feel (think: taking BB to wildly new places).  Notwithstanding the assertion in the conclusion of HP's first chapter (here), I remain of the Diebold-Rudebusch view that Hamilton-style Markov switching is the most compelling way to think about nonlinear business-cycle events like "expansions" and "recessions" and "peaks" and "troughs".  At the very least, however, HP has significantly heightened my awareness and appreciation of alternative approaches.  Definitely worth a very serious read.
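For readers who haven't seen BB-style dating, here is a deliberately stripped-down sketch (my own toy version, not HP's or Bry-Boschan's actual algorithm): a date is called a peak (trough) if it is the max (min) of a centered window. The real algorithm adds censoring rules, such as minimum phase and cycle lengths, omitted here.

```python
# A minimal sketch of a BB-flavored turning-point rule, assuming a toy
# "local extremum in a +/- k window" criterion only; the real Bry-Boschan
# algorithm adds censoring rules (minimum phase/cycle lengths) omitted here.
import numpy as np

def turning_points(y, k=5):
    """Peaks (troughs) are dates that are the max (min) of a centered window."""
    peaks, troughs = [], []
    for t in range(k, len(y) - k):
        window = y[t - k : t + k + 1]
        if y[t] == window.max():
            peaks.append(t)
        elif y[t] == window.min():
            troughs.append(t)
    return peaks, troughs

rng = np.random.default_rng(2)
y = np.cumsum(rng.standard_normal(200))   # toy "log activity" series
peaks, troughs = turning_points(y)
print("peaks:", peaks, "troughs:", troughs)
```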

Monday, October 24, 2016

Machine Learning vs. Econometrics, IV

Some of my recent posts on this topic emphasized that (1) machine learning (ML) tends to focus on non-causal prediction, whereas econometrics and statistics (E/S) has both non-causal and causal parts, and (2) E/S tends to be more concerned with probabilistic assessment of forecast uncertainty. Here are some related thoughts.

As for (1), it's wonderful to see the ML and E/S literatures beginning to cross-fertilize, driven in significant part by E/S. Names like Athey, Chernozhukov, and Imbens come immediately to mind. See, for example, the material here under "Econometric Theory and Machine Learning", and here under "Big Data: Post-Selection Inference for Causal Effects" and "Big Data: Prediction Methods". 

As for (2) but staying with causal prediction, note that the traditional econometric approach treats causal prediction as an estimation problem (whether by instrumental variables, fully-structural modeling, or whatever...) and focuses not only on point estimates, but also on inference (standard errors, etc.) and hence implicitly on interval prediction of causal effects (by inverting the test statistics).  Similarly, the financial-econometric "event study" approach, which directly compares forecasts of what would have happened in the absence of an intervention to what happened with the intervention, also focuses on inference for the treatment effect, and hence implicitly on interval prediction.
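A tiny simulated sketch of that point (mine, with made-up data): a just-identified IV estimate comes packaged with a standard error, and inverting the usual t-statistic yields an interval prediction of the causal effect.

```python
# A minimal sketch, assuming simulated data and a just-identified linear IV
# model: the point estimate plus its standard error delivers an interval
# prediction of the causal effect.
import numpy as np

rng = np.random.default_rng(3)
n, beta = 2000, 1.5
z = rng.standard_normal(n)                    # instrument
u = rng.standard_normal(n)                    # unobserved confounder
x = 0.8 * z + u + rng.standard_normal(n)      # endogenous regressor
y = beta * x + u + rng.standard_normal(n)     # outcome

b_iv = (z @ y) / (z @ x)                      # IV estimate of the causal effect
e = y - b_iv * x
se = np.sqrt((e ** 2).mean() * (z @ z)) / abs(z @ x)   # homoskedastic IV s.e.
print(f"interval prediction of the effect: {b_iv:.2f} +/- {1.96 * se:.2f}")
```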

Sunday, October 16, 2016

Machine Learning vs. Econometrics, III

I emphasized here that both machine learning (ML) and econometrics (E) prominently feature prediction, one distinction being that ML tends to focus on non-causal prediction, whereas a significant part of E focuses on causal prediction. So they're both focused on prediction, but there's a non-causal vs. causal distinction.  [Alternatively, as Dean Foster notes, you can think of both ML and E as focused on estimation, but with different estimands.  ML tends to focus on estimating conditional expectations, whereas the causal part of E focuses on estimating partial derivatives.]

In any event, there's another key distinction between much of ML and Econometrics/Statistics (E/S):   E/S tends to be more concerned with probabilistic assessment of uncertainty.  Whereas ML is often satisfied with point forecasts, E/S often wants interval, and ultimately density, forecasts.

There are at least two classes of reasons for the difference.  

First, E/S recognizes that uncertainty is often of intrinsic economic interest.  Think market risk, credit risk, counter-party risk, systemic risk, inflation risk, business cycle risk, etc.

Second, E/S is evidently uncomfortable with ML's implicit certainty-equivalence approach of simply plugging point forecasts into decision rules obtained under perfect foresight.  The linear-quadratic-Gaussian world in which certainty equivalence holds resonates less than completely with E/S types.  That sounds right to me.  [By the way, see my earlier piece on optimal prediction under asymmetric loss.]
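A small numerical illustration of why the density forecast matters even for a point decision (my own sketch, using the "linex" loss \( L(e) = \exp(ae) - ae - 1 \) with made-up parameter values): for a Gaussian predictive density the linex-optimal forecast is \( \mu + a\sigma^2/2 \), which moves with the predictive variance, so the point forecast alone is not enough.

```python
# A minimal sketch, assuming linex loss L(e) = exp(a e) - a e - 1 with
# made-up a, mu, sigma: the optimal point forecast of a N(mu, sigma^2)
# variable is mu + a*sigma^2/2, so it depends on the predictive variance.
import numpy as np

mu, sigma, a = 0.0, 2.0, 0.5
draws = np.random.default_rng(4).normal(mu, sigma, 200_000)

def linex_risk(f):
    e = draws - f                              # forecast error y - f
    return np.mean(np.exp(a * e) - a * e - 1)

grid = np.linspace(-2.0, 3.0, 501)
best = grid[np.argmin([linex_risk(f) for f in grid])]
print(f"numerical optimum: {best:.2f}, theory: {mu + a * sigma**2 / 2:.2f}")
```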

Monday, October 10, 2016

Machine Learning vs. Econometrics, II

My last post focused on one key distinction between machine learning (ML) and econometrics (E):   non-causal ML prediction vs. causal E prediction.  I promised later to highlight another, even more important, distinction.  I'll get there in the next post.

But first let me note a key similarity.  ML vs. E in terms of non-causal vs. causal prediction is really only comparing ML to "half" of E (the causal part).  The other part of E (and of course statistics, so let's call it E/S), going back a century or so, focuses on non-causal prediction, just like ML.  The leading example is time-series E/S.  Just take a look at an E/S text like Elliott and Timmermann (contents and first chapter here; index here).  A lot of it looks like parts of ML.  But it's not "E/S people chasing ML ideas"; rather, E/S has been in the game for decades, often well ahead of ML.

For this reason the E/S crowd sometimes wonders whether "ML" and "data science" are just the same old wine in a new bottle.  (The joke goes, Q: What is a data scientist?  A: A statistician who lives in San Francisco.)  ML/DataScience is not the same old wine, but it's a blend, and a significant part of the blend is indeed E/S.

To be continued...

Sunday, October 2, 2016

Machine Learning vs. Econometrics, I

[If you're reading this in email, remember to click through on the title to get the math to render.]

Machine learning (ML) is almost always centered on prediction; think "\(\hat{y}\)".   Econometrics (E) is often, but not always, centered on prediction.  It is also often interested in estimation and associated inference; think "\(\hat{\beta}\)".

Or so the story usually goes. But that misses the real distinction. Both ML and E as described above are centered on prediction.  The key difference is that ML focuses on non-causal prediction (if a new person \(i\) arrives with covariates \(X_i\), what is my minimum-MSE guess of her \(y_i\)?), whereas the part of econometrics highlighted above focuses on causal prediction (if I intervene and give person \(i\) a certain treatment, what is my minimum-MSE guess of \(\Delta y_i\)?).  
It just happens that, assuming linearity, a "minimum-MSE guess of \(\Delta y_i\)" is the same as a "minimum-MSE estimate of \(\beta_i\)".

So there is a ML vs. E distinction here, but it's not "prediction vs. estimation" -- it's all prediction.  Instead, the issue is non-causal prediction vs. causal prediction.
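A toy simulation of the distinction (my own sketch, with made-up numbers): with an unobserved confounder, the OLS slope is exactly what you want for non-causal prediction of \(y\) from observed \(x\), and exactly what you don't want as a guess of \(\Delta y\) from intervening on \(x\).

```python
# A minimal sketch, assuming a made-up linear model with an unobserved
# confounder u: OLS is the right non-causal predictor and the wrong
# causal one.
import numpy as np

rng = np.random.default_rng(5)
n, true_effect = 100_000, 1.0
u = rng.standard_normal(n)                    # unobserved confounder
x = u + rng.standard_normal(n)
y = true_effect * x + 2.0 * u + rng.standard_normal(n)

b_ols = (x @ y) / (x @ x)
print("OLS slope (best for predicting y from observed x):", round(b_ols, 2))  # ~2.0
print("causal effect of intervening on x                :", true_effect)      # 1.0
```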



But there's another ML vs. E difference that's even more fundamental.  TO BE CONTINUED...

Monday, September 26, 2016

Fascinating Conference at Chicago

I just returned from the University of Chicago conference, "Machine Learning: What's in it for Economics?"  Lots of cool things percolating.  I'm teaching a Penn Ph.D. course later this fall on aspects of the ML/econometrics interface.  Feeling really charged.

By the way, I hadn't yet been to the new Chicago economics "cathedral" (Saieh Hall for Economics) and Becker-Friedman Institute.  Wow.  What an institution, both intellectually and physically.