FAQ

Comment on 2020 May update

Published June 4, 2020

The latest C-GIDD update went live on May 11, 2020. There are a number of important changes in this update beyond the usual incorporation of new forecast and historical data:

New base year

The base year in the database has been switched to 2015 for constant local currency, USD, and PPP values. This is in line with the latest UN National Accounts data published in December 2019.

New socioeconomic levels (SELs) definition

The SEL definition has been changed based on the latest study by AMAI in Mexico. The new definition has been applied uniformly across years and geographic regions in the latest update. As a result, the SEL distribution differs from that in older versions. Below is a comparison of world SEL distribution for 2018 in the latest update and the previous update:

Economic forecast

The April edition of the IMF's World Economic Outlook provided economic forecasts for only 2 years because of the uncertainties created by the Covid-19 pandemic. C-GIDD has incorporated the latest outlook and extended the forecast beyond the 2-year period:

Country Reports

Published February 24, 2015

We started selling country reports on Research and Markets in 2014. As of today, the following are available:

  • Angola
  • Brazil
  • Cuba
  • Ethiopia
  • Ghana
  • Iran
  • Mexico
  • Myanmar
  • Nigeria
  • Pakistan
  • Tanzania

The reports can be bought here: http://www.researchandmarkets.com/search.asp?q=canback&cat_id=

Comment on C-GIDD November 11 Update

Published November 17, 2014

The latest C-GIDD update went live on November 11, 2014. There are significant changes beyond the usual incorporation of IMF’s latest economic projections. The three additional main changes are discussed here: 

New Purchasing Power Parity (PPP) Estimates

C-GIDD's PPP/USD multipliers are collected from the World Bank’s International Comparison Program (ICP). In June 2014, ICP released the 2011 round. Compared with the previous 2005 round, some countries’ PPP/USD multipliers have changed significantly. As a result, GDP (gross domestic product) and HCE (household consumption expenditure) levels in PPP$ are not comparable to previous C-GIDD data. In addition, since PPP$ values are used in the socioeconomic level calculation, the socioeconomic class breakdown may change significantly.

The following graph compares the new PPP/USD multiplier with the old multiplier for select countries with large differences. The United States by definition has a PPP/USD multiplier of 1.

 Largest differences to the left in the graph

New city population

C-GIDD collects city population data from various sources. The three main ones are the UN’s Urbanization Prospects, Eurostat, and the US Census Bureau. All three main sources have updated their estimates for city population.

  • The UN used to cover cities with more than 750,000 inhabitants. The 2014 revision covers cities down to 300,000 inhabitants. C-GIDD now uses this instead of alternative sources for most cities outside the US and the EU. City populations may have changed because of the switch in source. The UN also adjusted the population of quite a few cities, which is reflected in C-GIDD as well.
  • Eurostat has released new city population data for EU cities, and city definitions have sometimes been changed.
  • US MSAs have been redefined by the Census Bureau, and some cities now have quite different populations.

The table below lists a few examples where city populations have changed significantly:

COUNTRY        CITY          OLD 2014 POPULATION  NEW 2014 POPULATION  % CHANGE
Spain          Cordoba                   828,007              333,275    -59.7%
Bangladesh     Khulna                  1,973,628            1,032,798    -47.7%
Togo           Lome                    1,756,553              929,017    -47.1%
Nigeria        Warri                   1,145,342              629,996    -45.0%
Iraq           Kirkuk                  1,125,148              638,273    -43.3%
Turkmenistan   Ashgabat                1,241,303              731,828    -41.0%
China          Yichun                  1,088,777              652,971    -40.0%
Nigeria        Kaduna                  1,699,888            1,032,760    -39.2%
United States  Springfield               708,728              628,814    -11.3%
United States  Phoenix                 4,903,766            4,451,666     -9.2%
United States  Charlotte               2,026,547            2,368,127     16.9%
China          Nanning                 2,276,011            3,110,896     36.7%
Indonesia      Jakarta                20,941,035           29,390,394     40.3%
Germany        Dresden                   935,366            1,354,010     44.8%
Egypt          Cairo                  11,722,921           18,399,340     57.0%
Italy          Naples                  2,201,545            3,514,002     59.6%
Bolivia        Cochabamba                687,531            1,202,658     74.9%
Japan          Fukuoka                 2,945,051            5,520,235     87.4%
Liberia        Monrovia                  615,520            1,206,165     96.0%
China          Shaoxing                1,007,073            2,000,044     98.6%
South Africa   Johannesburg            4,051,427            9,139,526    125.6%
Tunisia        Tunis                     834,845            1,976,910    136.8%

With the new data, some cities have been added to the C-GIDD database and some have been removed. Cities are dropped either because they now have fewer than 500,000 inhabitants or because they have been absorbed into other cities. In total, 997 cities are now covered. The added and removed cities are listed here:

NEW C-GIDD CITIES

Argentina: Santa Fe
Belgium: Gent
Benin: Abomey-Calavi
Brazil: Jundiai
Bulgaria: Plovdiv
China: Chengde, Chenzhou, Deyang, Linfen, Liuyang, Luohe, Qingyuan, Tianshui, Wuzhou, Yongzhou, Zhucheng
Colombia: Ibague, Pereira
Congo-Kinshasa: Tshikapa
France: Saint-Etienne
Germany: Aachen, Heidelberg, Mannheim-Ludwigshafen, Osnabruck, Wurzburg
Ghana: Sekondi Takoradi
Israel: Be'er Sheva
Libya: Misratah
Mexico: Celaya
Netherlands: Eindhoven
Nigeria: Nnewi, Osogbo, Owerri, Uyo
Philippines: General Santos City
Russia: Makhachkala, Tula
Saudi Arabia: Hufuf-Mubarraz
Somalia: Hargeysa
Spain: Granada
Syria: Al-Raqqa
Tunisia: Safaqis
United Kingdom: Bournemouth
United States: Deltona-Daytona Beach, Spokane, Winston-Salem
Venezuela: Barcelona-Puerto La Cruz
Viet Nam: Bien Hoa

DROPPED C-GIDD CITIES

Argentina: Cordoba
Austria: Linz
China: Beihai, Cangzhou, Danyang, Danzhou, Dongtai, Dongyang, Feicheng, Fengcheng, Fengfeng, Fuqing, Fuzhou, Gaozhou, Guangyuan, Hebi, Hechuan, Huazhou, Jiangdu, Jiangjin, Jiangmen, Jingdezhen, Jingmen, Jinjiang, Kaiping, Kunshan, Leizhou, Lianjiang, Liaoyuan, Linhai, Longyan, Meizhou, Nanping, Puyang, Qidong, Qinzhou, Sanya, Shanwei, Shishi, Shouguang, Shuangyashan, Songyuan, Suizhou, Tongchuan, Tonghua, Tongzhou, Wuchuan, Wuhai, Wujiang, Xingning, Yakeshi, Yangjiang, Yixing, Yiyang, Yongchuan, Zengcheng, Zhangjiagang
Congo-Kinshasa: Kolwezi
Cote d'Ivoire: Yamoussoukro
Denmark: Aarhus
France: Douai-Lens
Germany: Bielefeld, Magdeburg
Iraq: Nasiriyah
Japan: Kagoshima, Matsuyama, Okayama
Korea, South: Cheonan, Pohang
Malaysia: Kota Kinabalu
Nigeria: Ogbomoso
Poland: Bialystok, Szczecin
South Africa: East Rand
Spain: Badajoz, Santa Cruz de Tenerife
Turkey: Sanliurfa
United States: Poughkeepsie
Yemen: Hodeidah

New US City GDP data

The US Bureau of Economic Analysis has changed its US GDP estimates. However, MSA-level GDP data appear to be disconnected from the top-level GDP value. Hence, city-level GDP data have changed in C-GIDD, in many cases dramatically. Our policy is to trust official sources even when we have doubts, but we had to make an adjustment to the raw data for Massachusetts.

In summary, the latest C-GIDD is significantly different from the previous update because of the combination of the changes above.

How do socioeconomic levels (SELs) correspond to income brackets?

Published September 28, 2014

SELs are highly correlated with the corresponding income brackets, but they are not the same thing. That is, when asked, we are unable to state what income, e.g., C-class households have. It varies significantly between countries and to some extent within countries.

There are two major reasons:

1. Household size varies across the world from a low of 2 people per household to as many as 9 people per household. In larger households there are economies of scale, and thus the marginal spending on household members declines. An example of adjusting for this is the definition of the US poverty line, which varies non-linearly as a function of the number of household members. The implication is that the income bracket for a socioeconomic class differs both when looking at the household and when looking at individual household members.

2. Purchasing power varies across the world. This means that an income expressed in USD in an emerging country buys roughly twice as many goods and services as the same income in the US. Because of this, the income brackets for each SEL differ vastly.

We recommend working with either SELs or income brackets. Combining the two can be done, but requires a few years of experience in working with C-GIDD and similar data.

Comment on Nigeria's rebased national accounts

Published April 19, 2014

Nigeria's National Bureau of Statistics released rebased national accounts on 13 April 2014. This includes a switch from the SNA 1993 to the SNA 2008 definitions. This is the same process as in many other countries, such as the United States a year ago, Ghana in 2010, Tanzania, and Mozambique. Many more countries will follow, in Africa and elsewhere.

The new GDP numbers increase GDP by 56%. The widely reported 89% increase is incorrect because that number also includes a revised (and better) inflation estimate. When taking out this inflation component, the increase is 56%.

The full set of data has not been released yet. The current release only includes 2010-2013 and the production view of the national accounts (which industries contribute to GDP). C-GIDD requires the expenditure view and longer historical time series, as well as purchasing power parity numbers. How do we handle this in the interim?

We have used Ghana's 2010 rebase as our template. Ghana increased its GDP by 60% then in a process similar to what Nigeria is doing now. Ghana's rebase (like Nigeria's) was vetted by the UN and the IMF.

  1. To estimate GDP before 2010, we applied the growth from the previous time series before 2010. The alternative to this (not used in Ghana) would be to say that the previous base year, 1990, represented that year's reality and consequently the years between 1990 and 2010 should have a higher growth rate than the growth in the previous time series. However, this would make the GDP growth rate 1990-2010 unrealistically high, almost at the level of China's 20-year performance.

    Thus, we parallel shift each year's GDP upward by around 56% so that previous growth rates are maintained.

  2. We currently don't have the household consumption expenditure (HCE) data since only the production view of national accounts has been released. We maintain the same ratio between HCE and GDP as the previous time series had. This is likely to change when more detail is released, but we believe that this interim method brings us closer to reality.

  3. We maintain the PPP$/USD multipliers from the International Comparison Programme and apply them to the new GDP and HCE estimates. That is, GDP in PPP terms increases by 56% as well. This is a questionable assumption. PPP estimates should be independent of changes in national currency estimates. However, this is how the Ghana rebase was handled, so we do the same.
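
The three interim steps above can be sketched in a few lines of Python. This is a minimal illustration, not C-GIDD's actual code: the function name, the dictionary layout, and all input figures are hypothetical, and the 56% uplift is the approximate figure quoted in the text.

```python
def apply_interim_rebase(old_gdp, old_hce, ppp_per_usd, uplift=0.56):
    """Parallel-shift a GDP series and derive HCE and PPP values from it.

    old_gdp, old_hce: dicts of year -> value in USD from the previous series.
    ppp_per_usd: dict of year -> PPP$/USD multiplier (kept unchanged).
    uplift: approximate rebase increase; 0.56 follows the text.
    """
    rebased = {}
    for year, gdp in old_gdp.items():
        new_gdp = gdp * (1 + uplift)        # step 1: same uplift in every year
        hce_ratio = old_hce[year] / gdp     # step 2: keep the old HCE/GDP ratio
        new_hce = new_gdp * hce_ratio
        multiplier = ppp_per_usd[year]      # step 3: keep the old multiplier
        rebased[year] = {
            "gdp_usd": new_gdp,
            "hce_usd": new_hce,
            "gdp_ppp": new_gdp * multiplier,
            "hce_ppp": new_hce * multiplier,
        }
    return rebased


# Hypothetical inputs, for illustration only.
old_gdp = {2008: 200.0, 2009: 210.0}
old_hce = {2008: 120.0, 2009: 130.0}
ppp_per_usd = {2008: 1.8, 2009: 1.9}

rebased = apply_interim_rebase(old_gdp, old_hce, ppp_per_usd)
old_growth = old_gdp[2009] / old_gdp[2008] - 1
new_growth = rebased[2009]["gdp_usd"] / rebased[2008]["gdp_usd"] - 1
print(f"old growth {old_growth:.4f}, new growth {new_growth:.4f}")
```

Because every year is multiplied by the same factor, year-on-year growth rates are unchanged, which is the point of the parallel shift.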

We may reconsider the approach above when the full Nigerian rebase report is released. The Nigerian preliminary report is found here: http://www.nigerianstat.gov.ng/report/203

Note on national accounts versus income/expenditure surveys

Published January 20, 2014

There is typically a big difference between household expenditure as measured in the national accounts and household expenditure as measured by household income and expenditure surveys (often called HIES). Using Indian data, we explain the issue and the C-GIDD way of handling it.

This is a very complex issue that academics have spent many years looking into, so our attempt to simplify the reasoning in one note is by definition flawed.

DEFINITIONS

In this note we use the following:

NA = National accounts. Here, this is the expenditure approach to GDP

HCE = Household consumption expenditure (sometimes called Personal consumption expenditure or Private consumption expenditure). What households spend on goods and services according to NA. A macro/top-down view of spending

HIES = Household income and expenditure surveys. Surveys carried out through interviews with households, sometimes every year, usually every 5-10 years

HS = Household spending according to HIES. Should be close to HCE, but isn't

HI = Household income according to HIES = HS + Savings in theory, but essentially meaningless number in practice

(NAs do not capture household income)

ISSUES

It is well known among those who study household income and spending that the NA lens and the HIES lens give different results and the gap is widening. HCE is almost always higher than HS (and HI).

1st issue: Which definition should I use if I want to quantify market opportunities for a category or product?

2nd issue: How should I analyze sub-national data?

3rd issue: What happens when I convert into USD?

India is an excellent example of these issues. The following is from the Indian 2011/12 NA and HIES published data.

According to NA, India's monthly HCE per capita is 3,450 rupees. According to HIES, HS is 1,727 = 50.1% of HCE. That is, HIES claims that spending is half the size of what the national accounts say! (Note that the discrepancy is quite large in affluent countries as well. E.g., 35% in the US, 20% in the UK.)

Moreover, whether we look at this in market exchange rate dollars (USD) or purchasing power parity dollars (PPP) makes a big difference. If we use USD combined with HS, household spending seems to be $37 per month. If we use PPP and HCE, household spending seems to be $176 per month. 4.8 times larger!
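
The arithmetic behind these figures can be checked with a short Python sketch. The four input numbers come from the text; the two conversion rates are not stated in the note and are back-calculated here from the rounded dollar figures, so they are approximations only.

```python
# Monthly per capita figures quoted in the text (India, 2011/12).
HCE_RUPEES = 3450   # national accounts view
HS_RUPEES = 1727    # survey (HIES) view
HS_IN_USD = 37      # HS converted at the market exchange rate
HCE_IN_PPP = 176    # HCE converted at purchasing power parity

# Share of national-accounts spending that the survey captures.
hs_share_of_hce = HS_RUPEES / HCE_RUPEES

# Conversion rates implied by the rounded dollar figures (assumption:
# back-calculated, not taken from any published source).
implied_inr_per_usd = HS_RUPEES / HS_IN_USD
implied_inr_per_ppp = HCE_RUPEES / HCE_IN_PPP

# Gap between the smallest and the largest of the four possible views.
spread = HCE_IN_PPP / HS_IN_USD

print(f"HS as a share of HCE: {hs_share_of_hce:.1%}")
print(f"Implied rupees per USD: {implied_inr_per_usd:.1f}")
print(f"Implied rupees per PPP$: {implied_inr_per_ppp:.1f}")
print(f"PPP+HCE versus USD+HS: {spread:.1f} times larger")
```

The choice of definition (HS or HCE) and the choice of currency basis (USD or PPP) each contribute roughly a factor of two, which is why the combined gap is close to five.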

C-GIDD RESOLUTION

1st issue: C-GIDD uses the NA definition of household spending (HCE) because we find that it reflects reality better. That is, we believe the 3,450 monthly per capita spending number reflects reality best.

This is in line with the Indian view. E.g., the 'National Seminar on NSS 61st Round Survey Results' notes that "The analysis shows that the survey results need to be juxtaposed with other macro variables to understand the patterns properly. This indicates the pitfalls in inter-temporal comparisons of the survey data alone. The survey data need to be appropriately adjusted using the macro data so as to conform to the macro picture." I.e., HIES needs to conform to NA. It is also in line with the view in other countries.

We recommend against using the HIES results without rescaling to the national accounts. In C-GIDD, household spending conforms to HCE (plus government spending on behalf of individuals).

2nd issue: Sub-national household spending data is sometimes only available from HIES (i.e., in HS form). We rescale this to the national level in the following manner, again using India as an example:

HCE as a % of GDP is 57.1%. HS as a % of GDP is 28.6%. The difference is 28.5 percentage points. For each state or other subnational unit, we add 28.5 percentage points to HS (as a % of GDP) as reported in HIES. Why a "shift" rather than a multiplication by 1.997 (57.1%/28.6%)? There is no good answer to this question except that our method gives more sensible results. In any case, an adjustment has to be made.
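
To make the difference between the two options concrete, here is a small Python sketch. It reads the shift as an addition in percentage points of GDP, which is our interpretation of the paragraph above; the two state-level HS shares are hypothetical and chosen only to show how the methods diverge away from the national average.

```python
HCE_SHARE = 0.571   # national HCE as a share of GDP (from the text)
HS_SHARE = 0.286    # national HS as a share of GDP (from the text)


def additive_shift(hs_share):
    """Add the national gap in percentage points (the approach described)."""
    return hs_share + (HCE_SHARE - HS_SHARE)


def multiplicative_scale(hs_share):
    """Multiply by the national HCE/HS ratio (the rejected alternative)."""
    return hs_share * (HCE_SHARE / HS_SHARE)


# Hypothetical subnational HS shares of GDP, for illustration only.
states = {"lower-income state": 0.40, "higher-income state": 0.20}

for name, hs in states.items():
    print(f"{name}: shift {additive_shift(hs):.3f}, "
          f"scale {multiplicative_scale(hs):.3f}")
```

Both methods reproduce the national figure when applied to the national HS share. They differ for units far from the average, where the multiplicative version stretches the spread between units and the additive version preserves it.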

This is an adaptation of formula (7) in Deaton's paper "Measuring Poverty in a Growing World" (2003). As Deaton points out, there is not yet enough research on this issue.

3rd issue: We view both the PPP and USD exchange rate method as valid and the choice depends on the product or service being analyzed, and the reason for the analysis.

In general, we recommend PPP when analyzing demand  and marketing problems and countries are compared (if you only look at one country then local currency is best). However, if the product is a high end good sold at more or less the same price in USD around the world, then we recommend using USD.

For financial analysis, we recommend using USD since consolidated profits are measured in USD (or EUR, or similar).

                                                                                *  *  *

This reflects how C-GIDD looks at household spending. There are many other intricacies. The key point we want to make is: "Don't use household spending (or income) data from local surveys. You will almost always underestimate the spending and income levels, sometimes by a very large factor. Instead, use the national accounts like we do in C-GIDD."

Comment on latest (Dec. 2013) UN national accounts release

Published December 30, 2013

The United Nations is the global repository for national accounts. Once a year (typically in December) one year is added and historical revisions are made based on input from national statistics offices. The latest update was released in December 2013 and covers 1970-2012 (the previous release covered 1970-2011).

It may come as a surprise to the casual observer that GDP and other data series can change dramatically from one year to the next. However, this is the case because statistics offices change calculation methods, get better data, etc.

The latest release in December 2013 was no exception. While most countries have the same historical GDP in the 2013 and 2012 releases, many countries changed substantially. A few examples:

- The United States increased historical GDP by 3-5%

- Saudi Arabia revised the last few years' GDP upward by around 20%

- Nicaragua increased GDP by around 30% from 1970 and onwards

There are many other large changes, especially in later years. We will incorporate these changes in the January 2014 C-GIDD update. A comparison between Dec. 2013 and Dec. 2012 constant GDP and constant HCE data is found at http://canback.com/un_comparison_2013-2012

Comment on Nigeria's GDP and its components

Published January 11, 2013

Over the past few years, customers have often asked us about the Nigerian GDP and income data. There are some major issues with Nigeria's national accounts. Unfortunately they have not yet been resolved, but the Financial Times reports that a major overhaul is due this year.

But even before this, there are some major changes. With the release of the UN's National Accounts Main Aggregates Database (NAMAD) in December 2012 (covering 1970-2011), same-year GDP in 2010 increased around 27% compared to the release in December 2011 (covering 1970-2010), and household consumption increased by more than 100%. All of 2002-2010 shows material increases.

The UN NAMAD is the global repository for national accounts data and feeds into other databases like the World Bank and the IMF, and quality control is strict. Thus, the changes have to do with re-estimates and changes in Nigeria.

As a consequence, our income and socioeconomic data see major changes, and the affluent part of the population has grown tremendously between our November 2012 and January 2013 releases.

This is highly unsatisfactory given the importance of Nigeria. We hope the FT's report is correct and that the overhaul will bring more clarity.

Comment on Argentine subdivision GDP

Published December 9, 2012

Argentina does not have a method for delivering subdivision GDP on a timely basis. In the latest release (Nov. 2012), a few subdivisions have 2011 numbers, many have 2010, but a few small provinces have not been updated since 2005-2008. Ciudad de Buenos Aires and Provincia de Buenos Aires report till 2010. We estimate the missing historical data using the same method as when we create subdivisional forecasts. It is unlikely that reporting from the various statistical organizations in Argentina will improve soon.

Comment on current versus constant values and PPP$ versus US$

Published October 2, 2012

Some of our customers make basic mistakes when buying data: 1) they use current values; and/or 2) they mix PPP$ and US$.

1) Only expert users should buy current value GDP or income. Current values include inflation and sometimes exchange rate fluctuations. This makes the metric meaningless for most analyses. The one good reason to use current values that we have come across is to rescale constant values to a new base year.

  • Current values in local currency (LCU) include local inflation
  • Current values in PPP$ include United States' inflation
  • Current values in US$ include local inflation and exchange rate fluctuations

Instead, always use constant values. The base year is 2005 (this means that 2005 current and constant values are the same). It also means that the 2005 exchange rate is applied to all years for constant values. Our income bracket data are always expressed in constant values.

2) PPP$ and US$ are not the same thing. PPP$ has an adjustment for cost of living. US$ has no such adjustment. In general, PPP$ gives a more accurate view of how large consumer markets are and of how to price products and services, although this is not always the case.
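
The one use of current values mentioned under 1), rescaling constant values to a new base year, can be sketched as follows. This is a generic rebasing calculation under simple assumptions, not a description of C-GIDD internals; the function name and all figures are hypothetical.

```python
def rebase_constant_series(constant, current, new_base_year):
    """Rescale a constant-price series so it equals current values in the new base year.

    constant: dict of year -> value in constant prices of the old base year.
    current: dict of year -> value in current prices.
    """
    factor = current[new_base_year] / constant[new_base_year]
    return {year: value * factor for year, value in constant.items()}


# Hypothetical series: the old base year is 2005, so constant and current
# values coincide in 2005.
constant_2005 = {2005: 100.0, 2010: 120.0, 2015: 150.0}
current = {2005: 100.0, 2010: 150.0, 2015: 210.0}

constant_2015 = rebase_constant_series(constant_2005, current, 2015)
print(constant_2015)
```

After rebasing, the constant and current values coincide in the new base year, while real growth between any two years is the same as before.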

Colombia and Russia subdivision data

Published September 14, 2012

C-GIDD reports Colombian subdivision data at the NUTS level (7 regions) and Russian subdivision data at the Federal District level (8 regions). However, the underlying database works with more detail. For Colombia, we use the 33 departments, for Russia, the 80 subjects. This data is available to customers on request. 

The reasons for aggregating the data on the C-GIDD site are: 

1) The departments and subjects are too small compared to subdivisions in other countries, and no reasonable analysis requires this level of detail

2) Including data at this level would slow the computational speed of the database, which is already at the limit of what is acceptable (an update of C-GIDD takes 2 hours of computation time)

Introducing the C-GIDD long-term database

Published July 26, 2012

Starting 26 July 2012, we offer a new database with 1970-2032 national data. The reason for the long time series is that many customers want longer projections than the 5 years we offered previously. These can, e.g., be used for net present value analyses of investment opportunities.

The difference from the main C-GIDD database is that the long-term database does not contain sub-national data (subdivisions or cities) and income brackets are not customizable. 

Sources for income data are, as before, the UN's National Accounts Main Aggregates Database for historical data and the IMF for the first 5 years of projections. For the 6-20 year projections we use the Economic Research Service data available from the USDA (with a gradual transition from IMF to ERS years 6-10). For population data, we use the UN's World Population Prospects database. Data sources are documented in "FAQ: C-GIDD data sources at the national level (with links)". 

The long-term C-GIDD dataset is equivalent to the main C-GIDD dataset at the national level for overlapping years (currently 1997-2017).
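
One way to picture the gradual transition from IMF to ERS in projection years 6-10 is a weighted blend of growth rates. The note does not say how the transition is computed, so everything in this Python sketch is an assumption: the linear weights, the idea of carrying the IMF's final-year growth rate forward, the function name, and the sample rates.

```python
def blend_growth(imf_final_growth, ers_growth, start=6, end=10):
    """Blend growth rates linearly from IMF to ERS over years start..end.

    imf_final_growth: growth rate from the last IMF projection year,
        carried forward (an assumption of this sketch).
    ers_growth: dict of projection-year index -> ERS growth rate.
    Years after `end` use the ERS rate unchanged.
    """
    steps = end - start + 1
    blended = {}
    for k in sorted(ers_growth):
        if k < start:
            raise ValueError("blending only applies from year `start` onward")
        ers_weight = min(1.0, (k - start + 1) / steps)
        blended[k] = (1 - ers_weight) * imf_final_growth + ers_weight * ers_growth[k]
    return blended


# Hypothetical rates: IMF ends at 5% growth, ERS projects a flat 3%.
ers = {year: 0.03 for year in range(6, 13)}
blended = blend_growth(0.05, ers)
for year, rate in blended.items():
    print(year, f"{rate:.3%}")
```

With these weights the ERS share rises by a fifth each year, so year 6 is still mostly the IMF rate and year 10 onward is purely ERS.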

Definition and source data for cities in C-GIDD

Published July 7, 2012

As of July 2012, C-GIDD covers 1020 cities with population larger than 500,000 in 2010. C-GIDD uses the following city definitions and source data: 

Cities are defined as the continuous  urban area without regard to administrative borders. This, e.g.,  corresponds to the Metropolitan Statistical Area (MSA) in the United States or the Larger Urban Zone (LUZ) in Europe. To illustrate: Boston includes the city of Boston, Cambridge, Quincy, and dozens of other towns and cities. 

The primary population data is the UN’s World Urbanization Prospects. It covers (almost) all cities in the world with population larger than 750,000. For cities between 500,000 and 750,000, we use CityPopulation which is a highly reliable source for city data. 

There are two exceptions: For the United States, we use the Bureau of Economic Analysis’ MSA data. For the European Union, we use Eurostat’s LUZ data. 

There are 20 special cases because of errors in the UN dataset and a few omissions in Eurostat, and one known error that nevertheless is included in C-GIDD until we find a way to correct it (Guatemala City is defined by administrative area in the UN database). 

Finally, note that we (like the UN and CityPopulation) define Chinese cities based on urban areas, and not the Chinese concept of “prefecture-level cities.” A PLC is not a city; it is what in other countries would be called a state or province (see our “Comment on Chinese city definition and population” at http://goo.gl/PX9Dq). It is totally incorrect to think of a PLC as a city, or to use data from the China City Statistical Yearbook to analyze cities.

World Urbanization Prospects:  http://esa.un.org/unpd/wup/index.htm

Definition and source data for urban and rural areas in C-GIDD

Published July 7, 2012

C-GIDD divides countries and subdivisions into 3 parts: Major cities, other urban areas, and rural areas.

Major cities are described in “FAQ: Definition and source data for cities in C-GIDD”. Other urban areas and rural areas are defined by the UN in World Urbanization Prospects. The split between urban and rural varies by country but is roughly defined by communities with more or fewer than 2000-5000 inhabitants. (To clarify, other urban areas = urban areas minus major cities).

At the national level, the urban/rural percentages come directly from World Urbanization Prospects. At the subdivision level, the data comes from each country’s national statistics office. Those numbers are then harmonized with the national number and city populations so that everything adds up. 

Definitions in World Urbanization Prospects  https://esa.un.org/unpd/wup/CD-ROM/
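
The harmonization step, making subdivision numbers add up to the national number, can be illustrated with a proportional rescaling. The note does not state which method is used, so proportional scaling is an assumption here, and the function name and figures are hypothetical.

```python
def harmonize_to_national(subdivision_values, national_total):
    """Scale subdivision values proportionally so they sum to the national total."""
    raw_total = sum(subdivision_values.values())
    if raw_total == 0:
        raise ValueError("subdivision values sum to zero; cannot rescale")
    factor = national_total / raw_total
    return {name: value * factor for name, value in subdivision_values.items()}


# Hypothetical urban populations (millions) from a national statistics
# office, and a national urban total from World Urbanization Prospects.
reported = {"North": 4.0, "South": 5.0}
national_urban_total = 10.0

harmonized = harmonize_to_national(reported, national_urban_total)
print(harmonized)
```

Each subdivision keeps its share of the total; only the level is adjusted to match the national figure.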

C-GIDD data sources at the national level (with links)

Published April 1, 2012

C-GIDD data at the sub-national level adds up to well-recognized national data from the UN, World Bank and IMF. If you need the underlying national data, you can find them at:

Historical national accounts

UN National Accounts Main Aggregates Database 

http://unstats.un.org/unsd/snaama/introduction.asp

Future GDP projections

IMF World Economic Outlook Database 

http://www.imf.org/external/ns/cs.aspx?id=28

Historical and future populations

UN World Population Prospects 

http://esa.un.org/unpd/wpp/

Historical and future urbanization

UN Urbanization Prospects 

http://esa.un.org/unpd/wup/index.htm

Purchasing Power Parity GDP and HCE

World Bank International Comparison Program

http://siteresources.worldbank.org/ICPEXT/Resources/ICP_2011.html

Income Distribution data

UNU-WIDER  World Income Inequality Database 

http://www.wider.unu.edu/research/Database/en_GB/database/

Other useful data sources

World Bank World Development Indicators

http://databank.worldbank.org/ddp/home.do?Step=12&id=4&CNO=2

ERS International Macroeconomic Data Set 

http://www.ers.usda.gov/Data/macroeconomics/

C-GIDD release dates

Published January 2, 2012

Major C-GIDD updates are made thrice a year.

Two updates follow the IMF's semi-annual release of forecasts for 190 countries' economic growth 5 years into the future. The IMF forecasts are typically released in April and October, but  months vary from year to year. C-GIDD is released 1-3 weeks after the IMF release. 

The third C-GIDD update is released in January. This follows the UN's annual release of 210 countries' national accounts. The UN's release is typically at the end of December, 12 months after the latest data year (e.g., 2010 data were released in December 2011). 

In these three releases, data other than the national data may change. E.g., if a country has a new census, then city and subdivision population for that country will change. The same applies to subdivision GDP, income, and Gini data. All such sub-national data are changed on an ad hoc basis and batched with the three releases above. Furthermore, econometric methods change on an ad hoc basis (whenever we find a way to improve the database).

Changes are announced at 'C-GIDD News' on Twitter.

What is meant by "Entire City" in C-GIDD?

Published October 29, 2011

In the C-GIDD geographic unit selection box, some cities are labeled "Entire City". This happens when a city crosses administrative subdivision borders. E.g., the city of New York is located in the states of New York, New Jersey and Pennsylvania according to the common Metropolitan Statistical Area (MSA) definition. This means that: 

New York (Entire City) = New York, NY + New York, NJ + New York, PA 

Currently (October 2011), 45 of the 1018 cities in C-GIDD cross subdivision borders:

Allentown, Athens, Augusta, Berlin, Bogota, Boston, Brasilia, Bremen, Brussels, Buenos Aires, Chandigarh, Charlotte, Chattanooga, Chicago, Cincinnati, Delhi, Frankfurt, Hamburg, Jakarta, Kansas City, Lagos, London, Louisville, Manila, Memphis, Mexico City, Minneapolis-Saint Paul, New York, Omaha, Osaka, Ottawa, Paris, Philadelphia, Portland, Providence, Puebla, Saint Louis, Seoul, Tampico, Teresina, Tokyo, Torreon, Virginia Beach, Washington, Youngstown

Actual versus forecasted years in C-GIDD

Published October 13, 2011

This is written in October 2011: 

1. Population (individuals or households) numbers are actual till the latest census available and estimates thereafter. When the latest census happened varies. India conducted one in 2011, so that is in the dataset. Angola did its last census in 1970. 

However, the post-census population numbers are quite accurate since population growth is fairly stable, so in practical terms one can say that population is actual in the entire database.

2. GDP and HCE numbers are more up to date. The current official actual GDP numbers run till 2009 and full year 2010 numbers are quite close to actual since they are based on 3 quarters of 2010. The 2010 numbers are estimated by the IMF (as are 2011-2016). The UN maintains the global database till 2009 (they will release full year 2010 numbers in November or December). 

3. Subdivision and city data are less up-to-date. Such numbers are typically released 1-2 years after the national numbers, sometimes even later. However, our subdivision/city forecasts add up to the national numbers so one may call them “semi-estimates” for the later years.

African data sources and quality

Published August 21, 2011

We are often asked where we find our African data and what the quality of the data is. In fact, African statistics are not harder to find than those for other low- or middle-income countries. Here are a few observations:

1. African population data are typically from national censuses performed every 5 or 10 years, sometimes with population surveys in between. The censuses are often performed by the NSO or national census organization, but sometimes by the UN. Population data are typically available by subdivisions and cities within a given country. We use interpolation between census years, similar to what all census organizations do, to report yearly data.

An important issue in the database is what is meant by a city. We use the UN’s definition of metropolitan areas as contiguous areas of economically integrated activities (same as MSA in the US and LUZ in Europe), not the administrative/judicial boundaries. In Africa (and some other countries), we verify this by looking at satellite photos of cities (Pietermaritzburg posed a special challenge).
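
Interpolation between census years can be sketched as below. The note does not say which interpolation is used; this example assumes a constant annual growth rate between two censuses (geometric interpolation), and the function name and census figures are hypothetical.

```python
def interpolate_population(year0, pop0, year1, pop1):
    """Fill in yearly populations between two censuses at a constant growth rate."""
    if year1 <= year0:
        raise ValueError("year1 must be later than year0")
    annual_growth = (pop1 / pop0) ** (1 / (year1 - year0))
    return {year0 + i: pop0 * annual_growth ** i
            for i in range(year1 - year0 + 1)}


# Hypothetical censuses two years apart.
series = interpolate_population(2000, 1_000_000, 2002, 1_210_000)
for year, population in series.items():
    print(year, round(population))
```

A linear interpolation would be an equally simple alternative; the geometric form is used here only because population growth tends to compound.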

2. GDP and income data within African countries come from a number of sources.

a) The UNDP, World Bank, or national statistical organizations almost always have estimates of subdivision GDP, HCE, or poverty statistics (poverty percentages are strongly correlated with HCE). We have only come across 3 countries that lack such data: Eritrea, Libya and Tunisia. Note that this kind of data is not measured every year (which is true for most countries in the rest of the world as well).

b) For Eritrea, Libya and Tunisia we use proxy variables at the subdivision/city level. Stunting for the first two, literacy for the last.

Thus, most African countries perform regular household economic surveys. This data is collected in cities, towns and rural areas and often reported at this level. An example is a recent survey in Somalia (2012).

3. Income distribution within African countries is, like overall household income, fairly easy to come by because of the household economic surveys. Distributions are quite stable over time, so there is no pressing need for time series data.

Namibia has the most unequal income distribution in the world and, in general, African income is highly concentrated among the few. 

4. With these technical observations, how do we assess the quality of African data? Clearly, there is tremendous variation. The most affluent African countries like Mauritius, Botswana and South Africa have the best data, followed by progressive countries like Ghana and Uganda. Also, some of the worst economic performers like Liberia and Sierra Leone have excellent data because of the immense effort by NGOs. In general, African data is better than, e.g., Indian data, much better than data from the Middle East, behind Latin America, and far behind China, Russia, Europe and the US. Particularly difficult countries are the conflict countries (e.g., Somalia, Zimbabwe, Congo-Kinshasa) and the North African countries, which, like most Middle Eastern countries, perhaps lack an appreciation of the importance of working with facts.
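As an illustration of the interpolation between census years mentioned in point 1, here is a minimal Python sketch. It assumes a constant annual growth rate between two censuses (geometric interpolation). That is a common convention, not necessarily the method used in C-GIDD, and the function name and census figures are hypothetical.

```python
def interpolate_population(year0, pop0, year1, pop1):
    """Return {year: population} for every year from year0 to year1 inclusive.

    Assumption (not from C-GIDD): the population grows at a constant
    annual rate between the two census years.
    """
    if year1 <= year0:
        raise ValueError("year1 must be later than year0")
    if pop0 <= 0 or pop1 <= 0:
        raise ValueError("populations must be positive")
    span = year1 - year0
    growth = (pop1 / pop0) ** (1.0 / span)  # implied annual growth factor
    return {year0 + i: pop0 * growth ** i for i in range(span + 1)}


# Hypothetical census counts, for illustration only
series = interpolate_population(2000, 1_000_000, 2010, 1_500_000)
```

The series starts at the first census count and ends at the second, with the intermediate years filled in along a smooth growth path.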

C-GIDD anchor points

Published August 21, 2011

C-GIDD numbers are anchored in a reality with which (at least some) people are familiar. 

The first anchor for C-GIDD is the national GDP and population numbers issued by the UN once a year (very similar to the World Bank’s WDI database and what national statistics offices issue). E.g., if you download Senegal’s total GDP and population in 2002, you will find that our numbers are exactly the same as reported by the UN. (The single exception in Africa is Nigeria’s population pre-2006, which for some reason is incorrectly reported in all official reporting and which we have corrected.) Of course, one can debate if the official numbers are correct, but we view them as the best available estimates: always incorrect, but difficult to improve upon. See http://goo.gl/3xMEG

The second anchor is the IMF’s projection into the future. We use this as the benchmark for our 5-year projections at the national level. The IMF projections are released in April and October every year and we update our data shortly thereafter. See http://goo.gl/5q5jM

The third anchor is the International Comparison Program’s 2005 revision, which gives purchasing power parity GDP and income numbers for almost all countries in the world. Thus, e.g., C-GIDD’s 2005 PPP-based GDP numbers for Mozambique are equivalent to the ICP estimate. The ICP's PPP data are updated every 5-10 years. The current revision is being made in 2011 and will probably be released in 2012 and 2013. See http://goo.gl/m7xw0

In sum, national data in C-GIDD are equivalent to well known official sources.

C-GIDD data sources

Published April 15, 2011

C-GIDD incorporates historical data from many sources and in addition uses econometric analysis to harmonize data between countries and to fill data gaps. For projections, C-GIDD uses a combination of other institutions' forecasts (usually at the national level) and our own forecasts (usually at the subnational level).

The database is not just a reporting back of data from different sources. Rather, it uses massively complex algorithms to extract knowledge hidden in the plain data accessible to anyone with patience and language skills. Thus, it is somewhat simplistic to ask what the data sources are because they are only a small part of the database. Yet it is of course important to know the source of the underlying data.

ANCHOR POINTS

It is important to note that we have anchored C-GIDD’s numbers in a reality with which at least some people are familiar. The first anchor for the database is the national GDP and population numbers issued by the UN once a year (very similar to the World Bank’s WDI database and what national statistics offices issue). E.g., if you download Senegal’s total GDP and population in 2002 from C-GIDD, you will find that our numbers are exactly the same as reported by the UN. Of course, one can debate if the official numbers are correct, but we view them as the best available estimates: always incorrect, but impossible to improve upon.

The second anchor is the IMF’s projection into the future. We use this as the benchmark for the current year and 5 years into the future at the national level. Around 25 small countries are not covered by the IMF, and for these we use other sources like the ADB or we make our own forecasts.

The third anchor is the International Comparison Program’s (ICP) “2005 revision” which gives purchasing power parity (PPP) GDP and income numbers for 2005 for almost all countries in the world. As an example, C-GIDD’s 2005 GDP numbers for Mozambique are equivalent to the ICP estimate.

The fourth anchor is the UN's World Population Prospects and World Urbanization Prospects, which contain time series data for population, urbanization, and city populations (cities in the US use census data and cities in Europe use Eurostat data).

In sum, country-level data in C-GIDD are equivalent to well known official sources.

SUB-NATIONAL DATA

We provide three types of sub-national data:

A) Subdivisions for 37 countries

Subdivisions are states, provinces, divisions, regions and other administrative units directly below the national level. Examples: Arkansas in the US, Punjab in India, Tokyo in Japan (the prefecture, not the city), Tierra del Fuego in Argentina.

For subdivisions, we collect data from the national statistics office. We have collected this data for a long time and by now have full data sets going back 20 years for most of the 37 countries. For a few, we have estimated the distant history using econometric analysis. For projections, we also use econometric analysis and impose the condition that the sum of subdivisions should equal the total country’s GDP, population, etc.
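The adding-up condition on projections can be illustrated with a short Python sketch. It rescales raw subdivision forecasts proportionally so that they sum to the national figure. Proportional scaling is only one simple way to impose the constraint and is an assumption here, as are the function name and the region values.

```python
def scale_to_national_total(subdivision_forecasts, national_total):
    """Rescale subdivision forecasts so that they sum to the national total.

    Assumption (not from C-GIDD): every subdivision is adjusted by the
    same factor, which preserves the relative sizes of the subdivisions.
    """
    raw_sum = sum(subdivision_forecasts.values())
    if raw_sum <= 0:
        raise ValueError("subdivision forecasts must sum to a positive number")
    factor = national_total / raw_sum
    return {name: value * factor for name, value in subdivision_forecasts.items()}


# Hypothetical GDP forecasts for three regions and a national benchmark
raw = {"North": 40.0, "South": 70.0, "East": 90.0}
adjusted = scale_to_national_total(raw, national_total=220.0)
```

After the adjustment, the regional numbers add up to the national benchmark while each region keeps its share of the total.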

B) Cities with more than 500,000 inhabitants

We view cities as more important than subdivisions. For cities, we collect historical population numbers from various sources including the UN, Eurostat, country censuses and NSOs. GDP and income numbers are estimated econometrically. The basic approach is to find the GDP and income reported for the smallest administrative unit surrounding the city, usually a sub-subdivision. Examples are counties (below states) in the US, subjects (below federal districts) in Russia, and regencies (below provinces) in Indonesia.

Of the more than 1,000 cities we cover, more than 90% are “well-constrained” by such administrative units. Further, we differentiate the city GDP and income levels from the surrounding rural areas. E.g., GDP per capita for Sao Paulo City is higher than in the remainder of Sao Paulo State (other cities, other urban and rural areas).

C) Urban/rural split

Finally, we provide GDP, income, and socioeconomic profiles for urban and rural areas. This split is based on the widely differing income levels of urban and rural dwellers. E.g., in a typical emerging country, an urban dweller makes 2-3 times as much money as a rural dweller. For many countries (e.g., China) data on this income difference is available at the national or subdivision level. We apply econometric analysis when such data is not available.
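The arithmetic behind such a split can be sketched in Python. If the national average income is the population-weighted mean of urban and rural income, and the urban/rural income ratio is known, both levels follow directly. This is a simplified illustration with hypothetical names and inputs, not the econometric procedure used in C-GIDD.

```python
def split_income(national_per_capita, urban_share, urban_rural_ratio):
    """Return (urban, rural) income per capita.

    Solves  national = s * urban + (1 - s) * rural  with  urban = k * rural,
    where s is the urban population share and k the urban/rural income ratio.
    """
    if not 0.0 <= urban_share <= 1.0:
        raise ValueError("urban_share must be between 0 and 1")
    if urban_rural_ratio <= 0:
        raise ValueError("urban_rural_ratio must be positive")
    rural = national_per_capita / (
        urban_share * urban_rural_ratio + (1.0 - urban_share)
    )
    urban = urban_rural_ratio * rural
    return urban, rural


# Hypothetical country: 40% urban, urban dwellers earn 2.5 times rural dwellers
urban_income, rural_income = split_income(3000.0, 0.4, 2.5)
```

Weighting the two results by the urban and rural population shares recovers the national average, which is a quick consistency check on any such split.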

INCOME DISTRIBUTION

Beyond the geographic income distribution discussed above, we also provide income distribution within each geographic unit (countries, subdivisions, cities, other urban, and rural).

The data for this comes from many sources. Importantly, the WIDER tracks income distribution surveys and serves as a basis for C-GIDD. Further, we collect income distribution statistics from NSOs, often found in “household economic surveys.”

Note that while WIDER and NSOs have income distribution data, converting this data into actual incomes is a very difficult task. We are the only people in the world who can do this correctly and our methods are proprietary.

Finally, we provide socioeconomic levels in terms of number of individuals or households within a given class. We use the Mexican definition from AMAI because we consider it the best SEL schema in the world. We apply the AMAI classification to all countries in a uniform fashion. A description of Mexican SEL definitions is available on the C-GIDD website.

Definition of socioeconomic levels

Published April 4, 2011

We are often asked how we define socioeconomic levels (SEL) around the world. We use the Mexican definition developed by AMAI and apply it consistently to all countries (regardless of whether a particular country has its own SEL definition). This allows for comparability between countries, subdivisions and cities. The reason we use the Mexican definition is that it is, in our opinion, the best defined scheme and it is independent of climate and culture.

The AMAI definition is found at https://www.cgidd.com/socioeconomic_definition.pdf

This link provides useful pictures of typical houses at various SELs: http://www.zonalatina.com/Zldata200.htm

The Mexican SEL definition can fairly easily be converted to other nations' SEL definitions, although we do not offer this service.

Difference between income and wage

Published March 4, 2011

Many people confuse income with (cash) wage. People earn money in many different ways. Wages, typically paid in cash, are an important part of income, but still represent less than 50% of household income in many countries.

A few examples: In Honduras, wages are 38% of income. In Tanzania, 13% of income. Even in the US, wages represent less than 2/3 of income.

What is the rest? It varies significantly from country to country.

In the US (to quote the Census Bureau) it includes interest, dividends, rental or royalty income, income from estates and trusts; Social Security and railroad retirement income; Supplemental Security Income (SSI); public assistance or welfare payments; retirement, survivor, or disability pensions; capital gains, money received from the sale of property (unless the recipient was engaged in the business of selling such property); the value of income ‘‘in kind’’ from food stamps, public housing subsidies, medical care, employer contributions for individuals, etc.; withdrawal of bank deposits; money borrowed; tax refunds; gifts and lump-sum inheritances, insurance payments, and other types of lump-sum receipts.

In poorer countries, a significant part of non-wage income comes from selling/bartering agricultural products, remittances from abroad and self-employment. In Mexico, the national household economic survey records more than 30 types of income.

C-GIDD uses the total household income concept, not the wage-part of income. Thus, our income is larger, and often much larger, than wages. Further, we do not calculate wage per wage earner (average wage) but total income per household or individual (see FAQ: Household vs individual income).

Household vs individual income

Published March 3, 2011

Some customers have asked what the difference is between household and individual income in C-GIDD.

1. The fundamental unit is the household. We calculate income per household regardless of how many income earners there are in the household.

2. The individual income is thereafter calculated as the household income divided by household size. That is, the income is spread equally regardless of whether household members are working, children or elderly. This is sometimes a meaningful number, sometimes not. Certain product purchase decisions are made with the consumer considering income/individual; most are made considering income per household.

3. The best income metric is probably income per household-equivalent (an adjusted household size that takes into account that, e.g., children use up fewer resources). We use this for internal calculations but have found it hard to explain to customers and thus are not providing this metric in C-GIDD. (An example of household-equivalents is the US determination of poverty.)

We recommend that you carefully consider whether your use of C-GIDD income data is for individual or household-driven analysis.
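The three metrics above can be compared in a small Python sketch. The per-individual calculation follows point 2 directly. The household-equivalent calculation uses the OECD-modified equivalence scale purely as an assumption for illustration; the scale C-GIDD uses internally is not stated here. Function names and figures are hypothetical.

```python
def income_per_individual(household_income, household_size):
    """Household income spread equally over all household members."""
    if household_size <= 0:
        raise ValueError("household_size must be positive")
    return household_income / household_size


def income_per_equivalent(household_income, adults, children):
    """Household income per household-equivalent.

    Assumption (not C-GIDD's method): the OECD-modified scale, which counts
    1.0 for the first adult, 0.5 for each further adult, 0.3 for each child.
    """
    if adults < 1 or children < 0:
        raise ValueError("need at least one adult and no negative counts")
    scale = 1.0 + 0.5 * (adults - 1) + 0.3 * children
    return household_income / scale


# Hypothetical household: two adults, two children, total income of 24,000
per_household = 24000.0
per_individual = income_per_individual(per_household, household_size=4)
per_equivalent = income_per_equivalent(per_household, adults=2, children=2)
```

For the same household, the per-equivalent figure falls between the per-individual and per-household figures, which is why the choice of metric matters for the analysis.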

Comment on Chinese city definition and population

Published January 21, 2011

There is a lot of confusion about the size of Chinese cities. 

1. Many people believe the Chinese administrative unit "prefecture-level city" (PLC) is a city. However, a PLC is a geographic region similar to a state or a province in other countries (and below the provincial level in China). A PLC usually contains several cities and towns, as well as rural areas. The area of a PLC is sometimes larger than Denmark. The total population of PLCs equals the total population of China (less a few special administrative units).

2. The four Municipalities (Beijing, Shanghai, Tianjin and Chongqing) are also not cities. They consist of urban and rural areas. In the case of Chongqing, the city of Chongqing (including suburbs) has 9 million inhabitants, while the municipality of Chongqing has 27 million inhabitants. Still, some journalists make the incorrect claim that Chongqing is the world's largest city.

3. This leaves the problem of defining and measuring true cities. This is almost impossible to do because China does not measure city size. One way to estimate city population is to study the level below PLCs. This often, but not always, consists of districts and counties. Districts usually correspond to urban areas and counties to rural areas. Summing up the population of adjacent districts thus gives a reasonable estimate of large city population (>2 million inhabitants).

For cities in the 500,000-2 million range, this method often does not work because there are no districts in many smaller PLCs. The only way to estimate city size is to use the total urban population within a PLC. This will overestimate the city population since many urban dwellers live in smaller cities or towns within the PLC. Still, this is currently the only method available to us.

In sum, prefecture-level cities are not cities. "True" cities are not defined in China but can be estimated. The precision of estimates is good for large cities, but overshoots for medium-sized cities (often by more than 100%).
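The two estimation routes described above can be sketched in Python. The data layout, the function name and all figures are hypothetical, and which districts count as adjacent is assumed to be decided beforehand (for example from a map).

```python
def estimate_city_population(units, adjacent_district_names, plc_urban_population):
    """Estimate the population of the main city within a prefecture-level city.

    units: list of dicts with "name", "kind" ("district" or "county")
           and "population" for the level below the PLC.
    Returns (estimate, method). If adjacent districts exist, their populations
    are summed; otherwise the PLC's total urban population is used, which
    tends to overestimate the city.
    """
    districts = [
        u for u in units
        if u["kind"] == "district" and u["name"] in adjacent_district_names
    ]
    if districts:
        return sum(u["population"] for u in districts), "district_sum"
    return plc_urban_population, "plc_urban_total"


# Hypothetical PLC with two adjacent districts and one rural county
units = [
    {"name": "District A", "kind": "district", "population": 1_400_000},
    {"name": "District B", "kind": "district", "population": 1_100_000},
    {"name": "County C", "kind": "county", "population": 900_000},
]
large_city = estimate_city_population(units, {"District A", "District B"}, 2_900_000)

# Hypothetical smaller PLC without districts: fall back to the urban total
small_city = estimate_city_population(
    [{"name": "County D", "kind": "county", "population": 800_000}], set(), 650_000
)
```

Returning the method alongside the number makes it easy to flag the fallback estimates, which are the ones likely to overshoot.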