Author Archives: Tom Zimmermann

IMF unsurprises with lower growth forecast

The IMF lowered its forecast for global economic growth for 2018 and 2019, citing structural reasons such as increasing trade protections and a weaker outlook for emerging economies. The Wall Street Journal cites the IMF numbers as weighing on markets on Wednesday (see here).

Nobody should be surprised by downward revisions of the IMF forecasts, however. The IMF’s initial forecasts have been too optimistic each year since the Financial Crisis and have regularly been revised down. While structural economic reasons can plausibly explain the most recent revision, the revision is also consistent with a structural issue in the IMF’s model that leads to forecasts that are initially over-optimistic and require downward revisions afterwards.

To illustrate this point, I downloaded the IMF’s past forecasts for each year since the Financial Crisis, available here.

[Figure: IMF growth forecasts for each target year, as made in successive forecast years]

To read this plot, follow any line (say the red one), and you’ll see the growth for that year (for the red line: 2012) as forecast in different years. You will notice that the final forecast is lower than the initial forecast by a fair amount. In between, downward revisions typically happen gradually.

The point is that one desirable property of a forecast is that it is correct on average, i.e. the forecast is sometimes too high and sometimes too low and the forecast error is 0 on average. Here, though, the forecast revisions have mostly been negative, suggesting that there might be an issue with the forecasting model, making downward revisions over time necessary regardless of the structural economic reasons cited above.
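As a rough sketch of that check, the snippet below computes the total revision (final forecast minus initial forecast) for each target year and then averages across years. The numbers are made up purely for illustration and are not the IMF's figures; `forecasts` and `total_revision` are hypothetical names.

```python
from statistics import mean

# Hypothetical forecast vintages (percent growth), oldest vintage first.
# These values are illustrative only and are NOT actual IMF forecasts.
forecasts = {
    2012: [4.5, 4.0, 3.5, 3.2],
    2013: [4.4, 4.1, 3.3, 3.0],
    2014: [4.1, 3.6, 3.4, 3.4],
}


def total_revision(path):
    """Final forecast minus initial forecast for one target year."""
    return path[-1] - path[0]


revisions = {year: total_revision(path) for year, path in forecasts.items()}

# If revisions were unbiased, they would average out to roughly zero.
mean_revision = mean(revisions.values())
share_negative = mean(1 if r < 0 else 0 for r in revisions.values())

print(revisions)
print(mean_revision, share_negative)
```

A mean revision well below zero, together with a high share of negative revisions, is the pattern described in the text.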

Housing valuations in Germany (part 2)

How is the German housing market doing? In my second take on this question, I compare rent to purchase prices for apartments in Germany.

A common measure of valuation, the price-to-rent ratio, relates purchase price to annual rent (excluding utilities) for comparable apartments, that is, ideally, apartments of the same size, amenities and location. From an investor's perspective, the ratio answers the question: how many years does it take until my investment is amortized (ignoring issues of taxation, maintenance, …), keeping rent fixed?

To compute the price-to-rent ratio, I group apartments by city and size and compute the mean purchase price and mean annual rent in each group. I keep only groups with at least 10 apartments listed for rent and 10 apartments listed for purchase. The mean price-to-rent ratio is about 25.
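The grouping step can be sketched as follows. This is a minimal illustration, not the code behind the post: the tuple layout, the size-bucket labels and the function name `price_to_rent` are assumptions, and the listings are invented.

```python
from collections import defaultdict
from statistics import mean

MIN_LISTINGS = 10  # minimum listings per group on each side, as in the text


def price_to_rent(sales, rentals, min_listings=MIN_LISTINGS):
    """Return {(city, size_bucket): mean price / mean annual rent}.

    sales:   iterable of (city, size_bucket, purchase_price)
    rentals: iterable of (city, size_bucket, monthly_rent_excl_utilities)
    """
    prices = defaultdict(list)
    annual_rents = defaultdict(list)
    for city, bucket, price in sales:
        prices[(city, bucket)].append(price)
    for city, bucket, monthly_rent in rentals:
        annual_rents[(city, bucket)].append(12 * monthly_rent)

    ratios = {}
    # Only groups that appear in both the sales and the rental listings.
    for key in prices.keys() & annual_rents.keys():
        enough_sales = len(prices[key]) >= min_listings
        enough_rentals = len(annual_rents[key]) >= min_listings
        if enough_sales and enough_rentals:
            ratios[key] = mean(prices[key]) / mean(annual_rents[key])
    return ratios


# Invented listings: city "A" has enough of both, city "B" has too few sales.
sales = [("A", "80-110", 300000)] * 10 + [("B", "80-110", 200000)] * 3
rentals = [("A", "80-110", 1000)] * 10 + [("B", "80-110", 800)] * 12
ratios = price_to_rent(sales, rentals)
print(ratios)
```

In the invented data, city "B" is dropped because it has only three purchase listings, which mirrors the minimum-group-size filter described above.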

A first result, illustrated in the figure below, is that the price-to-rent ratio increases with apartment size, so it seems that it is relatively better to rent rather than to buy larger apartments right now (unless purchase and rental listings differ more in other characteristics for larger apartments than for smaller ones).

[Figure: price-to-rent ratio by apartment size]

How do the big cities do? To illustrate, I focus on apartments with living space between 80 and 110 square meters. Munich and Frankfurt are about on par with price-to-rent ratios above 30, while buying seems relatively more attractive in Hamburg or Cologne.

##                City medianPriceToRent
## 1           München             32.62
## 2 Frankfurt am Main             31.09
## 3            Berlin             28.01
## 4           Hamburg             26.92
## 5              Köln             25.95
## 6         Stuttgart             22.11

Where are price-to-rent ratios highest? Perhaps surprisingly, the top 10 miss a few big cities and include a few unexpected ones (Rostock, Solingen?). In each of these cities, buying an apartment costs the equivalent of more than 30 years of rent (without purchase fees, maintenance and such). To put these values in perspective, note that they are about on par with average price-to-rent ratios in cities such as Los Angeles, Seattle or Boston (see here).

##             City medianPriceToRent
## 1         Lübeck             38.94
## 2       Landshut             36.80
## 3        Rostock             36.43
## 4  Halle (Saale)             36.01
## 5       Erlangen             35.42
## 6     Ingolstadt             33.00
## 7        München             32.62
## 8       Solingen             32.56
## 9         Coburg             31.42
## 10       Leipzig             31.27

To provide more context, it would be nice to compare these values to historical price-to-rent ratios in Germany, data that I don't have. Overall, it seems fair to say, though, that buying does not seem like an attractive investment in many places as yields are very poor. Of course, buying comes with other non-monetary utility and benefits which might still make buying a good choice for some.
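To make the yield statement concrete: ignoring costs, taxes and vacancies, the gross rental yield is simply the reciprocal of the price-to-rent ratio. The sketch below applies that to the median ratios for 80 to 110 square meter apartments in the big-city table earlier in this post; the function name is hypothetical, and gross yield is a simplification that overstates what an investor actually earns.

```python
# Median price-to-rent ratios copied from the big-city table in this post.
ratios = {
    "München": 32.62,
    "Frankfurt am Main": 31.09,
    "Berlin": 28.01,
    "Hamburg": 26.92,
    "Köln": 25.95,
    "Stuttgart": 22.11,
}


def gross_yield_percent(price_to_rent):
    """Annual rent as a percentage of purchase price (before any costs)."""
    return 100.0 / price_to_rent


for city, ratio in ratios.items():
    print(f"{city}: {gross_yield_percent(ratio):.2f}%")
```

A ratio of 30, for example, corresponds to a gross yield of about 3.3 percent per year.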

Housing valuations in Germany (part 1)

How is the German housing market doing? This morning, Zeit-Online reported results of a recent survey suggesting that most people looking for houses consider a price range of between EUR 200,000 and EUR 400,000. The price range aligns relatively well with German median income, yet it is difficult to find housing in that price range close to the big cities (the article portrays a family that is willing to take on a two-hour daily commute to find housing in that price range).

To get a sense of valuations, I’m starting a series of posts about housing in Germany. I focus on apartments rather than houses (the apartment market seems more liquid and has more observations of listings).

In the first post, I am trying to get a quick sense of how much apartment rents are driven by apartment characteristics versus location characteristics. I start with a simple model of monthly apartment rent as a function of living space, the number of rooms, and whether or not the apartment has a balcony, a garden or a kitchen. The idea is that the above apartment characteristics (to the extent available) can be viewed as fundamentals and are unrelated to location characteristics. Of course, location is important and can also provide quality of life. You can think of this post as trying to decompose rents into a part given by apartment quality and a part that might reflect location quality.

I omit the model details here, but the model explains about 70% of the variation in apartment rents. For each apartment, I compute the fundamental apartment rent as the predicted value from my model above.
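Since the model details are omitted, the following is only a stripped-down sketch of the idea, assuming a linear fit with a single regressor (living space) instead of the full set of characteristics. The data and function names are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Invented listings: living space in square meters, monthly rent in EUR.
space = [40, 60, 80, 100]
rent = [500, 650, 900, 1050]

intercept, slope = fit_line(space, rent)

# "Fundamental rent" is the model's predicted rent for each apartment.
fundamental = [intercept + slope * s for s in space]
r2 = r_squared(rent, fundamental)
print(intercept, slope, r2)
```

The gap between each apartment's actual rent and its `fundamental` value is the part the model cannot attribute to apartment characteristics.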

The figure below shows the distribution of actual rent and rent based on fundamental value. As one can see, fundamental rent is often higher than actual rent.

[Figure: distribution of actual rent and fundamental rent]

How is this difference between actual and fundamental rent distributed across regions? The figure below shows results. In particular, it shows by how much actual rent deviates from fundamental rent (in percent). Positive values mean actual rent is above fundamental rent and negative values mean the opposite. Not surprisingly, renting in cities is more expensive conditional on fixed apartment characteristics. You can clearly see Munich, Frankfurt, Stuttgart, Cologne and Berlin on the map.

[Figure: map of the percentage deviation of actual rent from fundamental rent, by region]

Which are the cities with the highest share of apartments that are listed at least 50% above fundamental value? The top 25 are in the table below and they do include the usual suspects, in particular Munich and its surrounding region. It’s interesting to note that Cologne appears to have a much higher share of those fundamentally overvalued apartment listings than its neighboring city Düsseldorf. Does that suggest higher quality of life in Cologne?

##                               City ShareMorethan50
## 1                          München            0.76
## 2                  München (Kreis)            0.48
## 3                Starnberg (Kreis)            0.48
## 4         Fürstenfeldbruck (Kreis)            0.47
## 5                Frankfurt am Main            0.45
## 6                        Stuttgart            0.44
## 7                  Lörrach (Kreis)            0.43
## 8                            Fürth            0.35
## 9                       Heidelberg            0.35
## 10                            Köln            0.35
## 11                  Dachau (Kreis)            0.34
## 12                Miesbach (Kreis)            0.33
## 13                        Nürnberg            0.33
## 14                          Berlin            0.31
## 15                        Würzburg            0.31
## 16               Ebersberg (Kreis)            0.29
## 17                      Ingolstadt            0.29
## 18                      Regensburg            0.28
## 19                         Hamburg            0.27
## 20                      Düsseldorf            0.24
## 21                       Karlsruhe            0.23
## 22 Bad Tölz-Wolfratshausen (Kreis)            0.22
## 23                        Erlangen            0.22
## 24                         Münster            0.22
## 25               Oberhavel (Kreis)            0.22
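A share like `ShareMorethan50` in the table above could be computed along the following lines. This is a sketch under assumptions: the tuple layout and function names are hypothetical, the listings are invented, and the deviation is measured relative to fundamental rent.

```python
from collections import defaultdict

THRESHOLD = 0.50  # "at least 50% above fundamental value"


def deviation(actual, fundamental):
    """Relative deviation of actual from fundamental rent (0.5 means 50% above)."""
    return (actual - fundamental) / fundamental


def share_above(listings, threshold=THRESHOLD):
    """listings: iterable of (city, actual_rent, fundamental_rent)."""
    counts = defaultdict(lambda: [0, 0])  # city -> [n_above, n_total]
    for city, actual, fundamental in listings:
        counts[city][1] += 1
        if deviation(actual, fundamental) >= threshold:
            counts[city][0] += 1
    return {city: above / total for city, (above, total) in counts.items()}


# Invented listings for two hypothetical cities.
listings = [
    ("X", 1500, 1000),
    ("X", 1200, 1000),
    ("X", 1600, 1000),
    ("X", 1000, 1000),
    ("Y", 900, 1000),
]
shares = share_above(listings)
print(shares)
```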

Two-way clustering in Stata

How does one cluster standard errors two ways in Stata? This question comes up frequently in time series panel data (i.e. where data are organized by unit ID and time period) but can come up in other data with panel structure as well (e.g. firms by industry and region).

I’ll first show how two-way clustering does not work in Stata. I have seen this occasionally in practice, so I think it’s important to get it out of the way.

The standard regress command in Stata only allows one-way clustering. To get around that restriction, one might be tempted to:

  1. Create a group identifier for the interaction of your two levels of clustering
  2. Run regress and cluster by the newly created group identifier
## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . egen double_cluster=group(idcode year)
## 
## . regress ln_wage age i.race, vce(cluster double_cluster)
## 
## Linear regression                               Number of obs     =     28,510
##                                                 F(3, 28509)       =     905.75
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.0946
##                                                 Root MSE          =     .45494
## 
##                     (Std. Err. adjusted for 28,510 clusters in double_cluster)
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0004233    46.48   0.000     .0188435    .0205028
##              |
##         race |
##       black  |  -.1377638   .0059505   -23.15   0.000    -.1494271   -.1261006
##       other  |   .0666999   .0284081     2.35   0.019     .0110187    .1223812
##              |
##        _cons |   1.141686    .012024    94.95   0.000     1.118119    1.165254
## ------------------------------------------------------------------------------

What goes wrong here? You can see already that something is off because the number of clusters is the same as the number of observations. Since, in this dataset, the combination of idcode and year uniquely identifies each observation, the above approach effectively does not cluster at all. Instead, it gives you heteroskedasticity-robust standard errors, which are typically too small. You can check this by comparing to the output of the same regression as above but with the robust option.

## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . regress ln_wage age i.race, robust
## 
## Linear regression                               Number of obs     =     28,510
##                                                 F(3, 28506)       =     905.75
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.0946
##                                                 Root MSE          =     .45494
## 
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0004233    46.48   0.000     .0188435    .0205028
##              |
##         race |
##       black  |  -.1377638   .0059505   -23.15   0.000    -.1494271   -.1261006
##       other  |   .0666999   .0284081     2.35   0.019     .0110187    .1223812
##              |
##        _cons |   1.141686    .012024    94.95   0.000     1.118119    1.165254
## ------------------------------------------------------------------------------
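A quick diagnostic for this pitfall, sketched here in Python rather than Stata, is to count how many clusters a candidate identifier produces and how large the biggest one is. The helper name `cluster_summary` and the toy panel are made up for illustration.

```python
from collections import Counter


def cluster_summary(keys):
    """Count observations, distinct clusters and the largest cluster size."""
    sizes = Counter(keys)
    return {
        "n_obs": sum(sizes.values()),
        "n_clusters": len(sizes),
        "max_size": max(sizes.values()),
    }


# Toy panel of (id, year) pairs; each pair appears exactly once.
panel = [(1, 1970), (1, 1971), (2, 1970), (2, 1971), (3, 1971)]

by_pair = cluster_summary(panel)                    # interaction of id and year
by_id = cluster_summary(i for i, _ in panel)        # id only
by_year = cluster_summary(t for _, t in panel)      # year only
print(by_pair, by_id, by_year)
```

When `n_clusters` equals `n_obs`, as it does for `by_pair`, every cluster is a single observation and clustering on that identifier collapses to plain heteroskedasticity-robust standard errors.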

So how does two-way clustering in Stata work then? There are a couple of user-written commands that one can use. I recommend reghdfe by Sergio Correia because it is extremely versatile. Give him credit for it if you use the command! Other good options are ivreg2 by Baum, Schaffer and Stillman or cgmreg by Cameron, Gelbach and Miller.

One issue with reghdfe is that the inclusion of fixed effects is a required option. Sometimes you want to explore how results change with and without fixed effects, while still maintaining two-way clustered standard errors. A shortcut to make it work in reghdfe is to absorb a constant. In the example above:

## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . gen temp = 1
## 
## . reghdfe ln_wage age i.race, absorb(temp) cluster(idcode year)
## (converged in 1 iterations)
## 
## HDFE Linear regression                            Number of obs   =     28,510
## Absorbing 1 HDFE group                            F(   3,     14) =      99.06
## Statistics robust to heteroskedasticity           Prob > F        =     0.0000
##                                                   R-squared       =     0.0946
##                                                   Adj R-squared   =     0.0945
## Number of clusters (idcode)  =      4,710         Within R-sq.    =     0.0946
## Number of clusters (year)    =         15         Root MSE        =     0.4549
## 
##                            (Std. Err. adjusted for 15 clusters in idcode year)
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0014594    13.48   0.000     .0165431    .0228032
##              |
##         race |
##       black  |  -.1377638   .0133762   -10.30   0.000     -.166453   -.1090747
##       other  |   .0666999   .0664563     1.00   0.333    -.0758347    .2092346
## ------------------------------------------------------------------------------
## 
## Absorbed degrees of freedom:
## ---------------------------------------------------------------+
##  Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
## -------------+-------------------------------------------------|
##         temp |            1               1              0     | 
## ---------------------------------------------------------------+

Compared to the initial incorrect approach, correctly two-way clustered standard errors differ substantially in this example.

What goes on at a more technical level is that two-way clustering amounts to adding up the variance estimates from clustering by each variable separately and then subtracting the variance estimate from clustering by the interaction of the two levels, see Cameron, Gelbach and Miller for details. The incorrect group ID approach only computes the interaction part.
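To make the add-and-subtract logic tangible, here is a deliberately simplified Python sketch for a regression with a single regressor and no intercept, so that every variance is a scalar. It applies no finite-sample corrections, so it will not reproduce the Stata numbers above, and it is not how reghdfe is implemented; all names and data are invented.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of y = b * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def cluster_variance(x, resid, clusters):
    """Sandwich variance of the slope, with scores summed within clusters."""
    scores = defaultdict(float)
    for xi, ei, g in zip(x, resid, clusters):
        scores[g] += xi * ei
    meat = sum(s * s for s in scores.values())
    bread = sum(xi * xi for xi in x)
    return meat / bread ** 2


def two_way_variance(x, y, ids, years):
    """Return (two-way variance, (v_id, v_year, v_interaction))."""
    b = ols_slope(x, y)
    resid = [yi - b * xi for xi, yi in zip(x, y)]
    v_id = cluster_variance(x, resid, ids)
    v_year = cluster_variance(x, resid, years)
    v_both = cluster_variance(x, resid, list(zip(ids, years)))
    return v_id + v_year - v_both, (v_id, v_year, v_both)


# Invented balanced panel: three units observed in two years.
ids = [1, 1, 2, 2, 3, 3]
years = [1, 2, 1, 2, 1, 2]
x = [1.0, 2.0, -1.0, 0.5, 1.5, -2.0]
y = [1.2, 1.9, -0.7, 0.9, 1.1, -2.3]

v_two_way, (v_id, v_year, v_both) = two_way_variance(x, y, ids, years)
print(v_two_way, v_id, v_year, v_both)
```

Because each (id, year) pair appears once in this toy panel, `v_both` is just the heteroskedasticity-robust variance, which is all the group-identifier approach delivers. In small samples the combined estimate can even turn out negative, which is one reason practical implementations add adjustments.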

European inflation trends in June

Eurostat released updated inflation numbers for Euro area countries last week. While data for some countries are available through July, for most countries data are available through June 2018. Let us first look at annualized inflation based on the Harmonised Index of Consumer Prices (HICP) in June 2018.

[Figure: annualized HICP inflation by country, June 2018]

Overall price growth was close to 2 percent in June but exceeded 2 percent in a number of large economies, including France, Germany and Spain. On the surface, if the trend continues, the ECB might come under pressure to raise rates before summer 2019. Core inflation (price growth excluding energy and food), however, is still low across most economies. Looks like we'll have low nominal rates for a while.

[Figure: core inflation (excluding energy and food) by country]