Today my Quartz column on the changing economic geography of China was published. In this post I intend to cover some extensions of the article that did not make the cut, and in addition go through some of my data analysis procedures so as to provide a resource for fellow students doing similar research.
A central idea is that the base unit of analysis for the Chinese economy should be the province. This is because China's massive size makes its provinces as large as entire countries. For example, Guangdong, a coastal province, has 108 million residents. In comparison, Mexico has only 112 million residents and the entire Western United States has only 71 million. The entire continent of Europe has only around 740 million people -- a little more than half of China's 1.3 billion. As such, lumping all the Chinese provinces together into one entity called "China" papers over a great deal of heterogeneity in income levels and growth rates, resulting in a very misleading picture of the actual economic situation.
To get an idea of these massive income differences and why it's important to look at provincial data, consider the stories of Guangdong and Guangxi, two neighboring provinces in southern China. In 2011, Guangdong, the relatively rich coastal manufacturing center, had nominal per capita income of about 51,000 yuan (~$8,300 USD). Yet Guangxi, an inland province right next door, had nominal per capita income of only 25,200 yuan (~$4,100). Does it really seem plausible that Chinese growth will slow down so suddenly that two neighboring provinces whose names differ by one Chinese character* will maintain such a large income gap in perpetuity? Given that Guangxi's per capita income increased by a factor of 3.36 from 2001 to 2011 and Guangdong's per capita income only increased by a factor of 2.06, I would have to say no. Moreover, even if income levels do not completely converge, income growth rates should. Since Guangxi's income growth rate is still so high, I have to conclude that its growth will likely be sustained for some time. Had I not analyzed the provincial data, I would have instead seen a downward trend in national real GDP growth numbers and concluded that China will suddenly slow down. But by taking into account the way growth rates evolve across provinces, I arrive at a more optimistic GDP number.
Geography is especially important given that many of the arguments made by Krugman and a recent IMF working paper center on Chinese labor markets. The argument is that since China has become richer, China has reached "peak peasant" and can no longer sustain such high levels of growth. But I'm left asking -- which provinces have hit this peak? Given that the inland provinces are still relatively poor, there still seems to be a lot of room for these provinces to grow. Although the move towards manufacturing in inland provinces may be a sign that coastal provinces are facing labor shortages, the "reach for peasants" suggests that inland China still has plenty of labor market slack left. As a result, I am left quite skeptical about these dramatic bear stories of a sudden slowdown in the Chinese economy.
I also want to add one more graphic to this conversation about China's growth. While the scatterplot in the column does a good job of showing convergence, I wanted another plot to show just how much individual Chinese provinces have grown in the 10 years spanning 2001 to 2011. I settled on the chart below. Besides the components in the legend, the small numbers to the left and right of each dot are the nominal per capita income (in thousands) for the specified province and year. The black number in the middle of the band is the ratio of the 2011 level of real GDP to the 2001 level.
The nominal number is useful because it allows relatively quick conversions into U.S. dollars. For example, per capita income in Shanghai is around $13,300 -- a level slightly ahead of Mexico's per capita income of $10,247 and the U.S. poverty line for a single person household of $11,344. The black multiple then emphasizes how much individual Chinese provinces have grown. The above-three multiples correspond to annual growth of roughly 12% or more, so if a child entered elementary school in 2001, then by the time he or she finished elementary school about six years later, GDP in that province would have doubled.
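The arithmetic behind these multiples can be checked in a few lines. The following is a minimal Python sketch, not part of the original replication code (which is in Stata and R); the function names are hypothetical, and it assumes growth compounds at a constant annual rate over the ten years. The multiples 2.06 and 3.36 are the Guangdong and Guangxi factors quoted earlier, and 3.0 is the threshold for an "above-three" multiple.

```python
import math


def annualized_growth(multiple, years=10):
    """Constant annual growth rate that turns 1 into `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed to double at a constant annual growth rate `rate`."""
    return math.log(2.0) / math.log(1.0 + rate)


# 2.06 and 3.36 are the ten-year multiples quoted in the post;
# 3.0 is the cutoff for an "above-three" multiple.
for multiple in (2.06, 3.0, 3.36):
    g = annualized_growth(multiple)
    print(f"{multiple:.2f}x over 10 years: {g:.1%} per year, "
          f"doubling in about {doubling_time(g):.1f} years")
```

Under this assumption, a multiple of exactly 3 works out to a little under 12% per year and a doubling time of a bit more than six years, which is roughly the length of elementary school.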
Of course, there are risks to the bull case that I present in my Quartz column.
Chief among these risks is that an environmental constraint could prevent the inland provinces from reaching the same levels of income as the coastal provinces. The Solow model (on which convergence is based) does not take into account natural resources, so if natural resources run out this process of convergence could fall apart. This does not have to be a hard scientific constraint either -- public outcry against environmental destruction would have a similar effect. While I agree that China does face serious environmental challenges (particularly in air and water pollution), I don't think protests will play as large a role as people suggest. Remember that the recent large scale environmental protests -- in Zhejiang against a petrochemical plant and in Guangdong against a nuclear plant -- have taken place in the richer coast. Therefore inland China still has a way to go before this environmental constraint becomes more severe.
Others may raise the issue that the Chinese provincial data are a dangerous form of "science fiction". Indeed, it is a bit peculiar that the sum of all the provincial GDP numbers does not equal the total national GDP. But as Princeton professor Gregory Chow notes, while year to year GDP growth rates may be easy to manipulate, levels are not. Since the levels are recollected every year, measurement errors accumulate and therefore any kind of fake data becomes unsustainable. As a result, I focused on a 10 year average growth rate to resolve the issue of year to year measurement errors. Moreover, a recent San Francisco Fed economic letter found that national Chinese data seem to be accurate and consistent with a wide variety of indicators. Thus it seems doubtful that the main convergence result was just the product of data manipulation.
The bottom line is that China's great size means that attention needs to be paid to the individual provinces. On the basis of the provincial levels of growth, I am left quite optimistic about the future of Chinese growth.
If you want to try to replicate it (please do), just consult the public dropbox folder. The workflow is to run all the STATA do files first and then move on to the solowQz.R file to draw all the pictures. I have also included a Makefile to go through this workflow. (A Makefile executes all the code in order according to the dependencies. If you plan on doing any major work with code you really should learn a little bit about how to use them.)
The one interesting methodological issue was how I used convergence to forecast future provincial growth. What I did was run a weighted least squares regression of average growth rate against initial log income, in which observations were weighted by population and the estimator minimized the sum of weighted squared residuals. On the basis of this regression, I assumed that the same relationship between initial income and growth would continue over the next ten years and constructed measures of what growth should look like. Once I had per capita income estimates, I assumed that the population in each province would stay constant, and on this basis calculated total real GDP numbers by adding up the GDP in each province.
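For readers who want to see the shape of this calculation, here is a minimal Python sketch of the steps described above. It is not the code in the dropbox folder (that workflow uses Stata and R). The five "provinces" and all of their numbers are made-up toy values for illustration only, the variable and function names are hypothetical, and the average growth rate is assumed to be the average annual log change.

```python
import numpy as np

# Hypothetical toy data for five made-up provinces (NOT the actual dataset).
income_2001 = np.array([8.0, 12.0, 20.0, 30.0, 45.0])   # real per capita income, thousands of yuan
income_2011 = np.array([26.0, 34.0, 48.0, 63.0, 85.0])  # same units, ten years later
population = np.array([50.0, 80.0, 60.0, 100.0, 20.0])  # millions; used as regression weights
years = 10

# Average annual growth rate over the decade (assumed: average log change).
growth = np.log(income_2011 / income_2001) / years


def weighted_least_squares(x, y, w):
    """Return (intercept, slope) minimizing sum(w * (y - a - b * x) ** 2)."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]


# Convergence regression: growth on initial log income, weighted by population.
intercept, slope = weighted_least_squares(np.log(income_2001), growth, population)

# Forecast: assume the same relationship holds for the next ten years,
# now starting from 2011 income levels.
forecast_growth = intercept + slope * np.log(income_2011)
income_2021 = income_2011 * np.exp(forecast_growth * years)

# Hold each province's population fixed and add up provincial GDP.
gdp_2011 = np.sum(income_2011 * population)
gdp_2021 = np.sum(income_2021 * population)
national_growth = (gdp_2021 / gdp_2011) ** (1.0 / years) - 1.0

print(f"slope on initial log income: {slope:.4f}")
print(f"implied average annual growth of total GDP: {national_growth:.1%}")
```

A negative slope is what convergence looks like in this regression: poorer provinces grow faster. Because every province is richer in 2011 than in 2001, the fitted line predicts slower growth for each of them over the next decade, but the aggregate slows gradually rather than abruptly.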
I had a fun time drawing the maps as well. I used R to interface with the GADM databases, and you can look at the code in chinaMap.R to get a better idea of what's going on.
If there are any more questions on code, please reach out. My email can be found on my About Me page.
*Guangxi and Guangdong literally translate to the western and eastern expanses, respectively. They really are two sides of a linguistic coin.