Abstract - The literature on returns to immigrants has paid little attention to female immigrants, despite continuous increases in female labor force participation and the distinctive features of female labor supply. Using the 2011 National Household Survey of Canada, this paper investigates the effect of language proficiency on returns to female immigrant groups in Canada and how that effect varies across the wage distribution. Our results show that returns to female immigrant groups increase with the level of language proficiency, and that the language penalty is larger for immigrants at higher quantiles of the wage distribution. We also find that OLS estimates are biased and inconsistent where sample selection problems exist.
The deterioration in entry earnings among recent Canadian immigrant cohorts has attracted the attention of researchers and policymakers alike (see, for example, Aydemir and Skuterud, 2005; Boudarbat and Lemieux, 2010; Green and Worswick, 2012; Skuterud, 2011). Apart from affecting Canada's reputation as a reference point for successful immigrant assimilation policies, declining immigrant earnings may undermine Canada's objective of competitively attracting productive immigrants to help fill its labour demand gap and engender growth. Among other factors, low language proficiency has been identified as a major contributor to the decline in immigrants' earnings (see Skuterud, 2011 for a review). Low language proficiency leads to job-skills mismatch and limits immigrant workers' productivity (Chiswick and Miller, 2003; Imai et al., 2011).
However, most of the extant literature on language proficiency and returns to immigrants focuses only on male immigrants (see, for example, Berman et al., 2003; Chiswick and Miller, 2003, 2007; Budria and Swedberg, 2015; Adsera and Ferrer, 2015). A few studies consider female immigrants, but they do not model their returns appropriately (see, for example, Miranda and Zhu, 2013; Yao and Van Ours, 2015).1 Arguably, the results from these studies may not be applicable to female immigrants. It is well documented in the labor economics literature that female labor supply decisions differ significantly from those of males. Moreover, female immigrants account for almost half of all Canadian immigrants, and female labor force participation across all age groups in Canada has been increasing since the 1970s (Emery and Ferrer, 2009). It is therefore important to pay attention to the earnings of female immigrants and to model the determinants of their earnings properly. This is the motivation for this paper: we examine the effect of language proficiency on returns to female immigrants in Canada.
Of specific interest are two important, but often ignored, issues in the literature on modelling immigrant returns. These relate to the heterogeneity of immigrant groups across source countries and across income levels. First, on average, different immigrant groups have different levels of host country language proficiency (see Dustmann and Fabbri, 2003). Given that language proficiency is an important determinant of earnings, differences in proficiency across immigrant groups have implications for between-group income inequality. Analyzing immigrants as sub-groups therefore promises to be insightful. However, previous studies treat immigrants as single units rather than as non-random cohorts of people.2 Thus, their results may capture neither group differences nor heterogeneity across groups, making it hard to tease out the effect of differences in immigrant groups' language proficiency on earnings. Analysis that captures group differences is more useful for the design of immigration policy, since policies can target groups of immigrants rather than individuals. Therefore, the first objective of this paper is to examine the effect of differences in female immigrant groups' language proficiency on their returns. In other words, do more proficient female immigrant groups do better than less proficient ones in the Canadian labor market?
Second, the effects of many variables on earnings tend to differ across the income distribution. The same may hold for the relationship between language proficiency and returns to immigrants. Thus, estimation at the mean may hide different patterns across the earnings distribution (Boudarbat and Lemieux, 2010). Studies that apply ordinary least squares (OLS) implicitly assume that variation in the effects of the explanatory variables along the conditional distribution of the dependent variable is unimportant (Martin and Pereira, 2004). If this assumption is incorrect, as is the case when the dependent variable is not normally distributed, OLS estimators may be inefficient. Therefore, the second objective of this paper is to examine the effect of language proficiency on female immigrants' earnings across their income distribution. In other words, does language proficiency penalize high earners more, or low earners more? While our results apply to female immigrants, the lessons from analyzing immigrants as groups may be useful for studies focusing on male immigrants too.
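To illustrate why a summary at the mean and a summary at a given quantile can tell different stories, the following is a minimal sketch in Python. It is not the estimator used in this paper: the wages are made-up numbers, the function names are hypothetical, and only a constant (no covariates) is fitted. It uses the check (pinball) loss that underlies quantile regression, whose minimizer over a constant is the corresponding sample quantile.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss of quantile regression at quantile tau."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1 if residual < 0 else 0))


def best_constant(values, tau):
    """Return the observed value that minimizes total pinball loss at tau."""
    return min(
        values,
        key=lambda c: sum(pinball_loss(v - c, tau) for v in values),
    )


# Made-up, right-skewed wages (illustrative only, not survey data).
wages = [10.0, 12.0, 15.0, 20.0, 60.0]

mean_wage = sum(wages) / len(wages)      # what a constant-only OLS fit returns
median_fit = best_constant(wages, 0.5)   # constant-only fit at the median
upper_fit = best_constant(wages, 0.9)    # constant-only fit at the 0.9 quantile

print(mean_wage, median_fit, upper_fit)
```

With skewed data the three summaries differ noticeably, which is the intuition behind estimating effects at several quantiles rather than only at the mean.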
Finally, we address the two objectives in line with the literature on female labour supply, motivated by Heckman's (1979) seminal paper on correcting for possible selection bias. The contributions of this paper are therefore twofold. Empirically, to our knowledge, this is the first paper to analyze the impact of language proficiency on returns to female immigrant groups in Canada, following Adsera and Ferrer (2015), which focuses on men. It is also a pioneering study in examining the effect of language proficiency across the income distribution of immigrants in Canada. Methodologically, the study contributes to the existing literature by applying the method of Buchinsky (1998, 2001) for estimating quantile regression with selectivity bias to female immigrants in Canada, which is also a first attempt.
The next section describes the data while section three presents the models. Section four contains the results and section five concludes.
This study uses data from two sources. The primary source is the 2011 National Household Survey (NHS 2011) Public Use Microdata File (PUMF), Individuals File. The NHS 2011 contains data on the demographic characteristics, labor market participation, income and immigration information of Canadians (both natives and immigrants). It replaces the long-form census questionnaire and thus contains the same items as the 2006 Census questionnaire. The advantage of the NHS 2011 data is that it is large, detailed and representative. It contains over 880 000 individual records and represents about 2.7 percent of the Canadian population. The sample for this analysis contains adult female immigrants aged 18 to 65. We focus on this age range for two reasons. First, it is the labour force age bracket and is thus in line with our interest in individuals who earn income.
Second, we restrict our analysis to immigrants who arrived as adults (i.e. aged 18 and above), since immigrants who arrive as infants or teenagers tend to acquire a better grasp of the host country language as part of their development (see Bleakley and Chin, 2004). Including these younger arrivals may bias our results. We include female immigrants living in all provinces except Quebec because our language variable is measured relative to English. Our final subsample contains over 12,000 female immigrants. A detailed description of the variables used is provided in Table A1 in the Appendix.
For some immigrants, the NHS 2011 data do not provide the country of birth. Instead, as a result of aggregation, only the region in which the country is located is reported (e.g. Southeast Asia, Latin America, North Africa, etc.). We exclude these immigrants from our analysis since we cannot explore cross-country variation in such instances. We limit our analysis to female immigrants from twelve major source countries of Canadian immigrants (i.e. United States, China, Jamaica, India, United Kingdom, Pakistan, Germany, Poland, Portugal, Hong Kong, Italy and Philippines).
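The sample restrictions described above can be summarized in a short Python sketch. The record layout and field names (`sex`, `immigrant`, `age`, `age_at_arrival`, `province`, `birthplace`) are hypothetical and do not correspond to actual NHS 2011 PUMF variable names, and the records are invented purely for illustration.

```python
# Source countries as listed in the text. Field names below are hypothetical.
SOURCE_COUNTRIES = {
    "United States", "China", "Jamaica", "India", "United Kingdom", "Pakistan",
    "Germany", "Poland", "Portugal", "Hong Kong", "Italy", "Philippines",
}


def in_sample(rec):
    """Return True if a record passes every sample restriction in the text."""
    return (
        rec["sex"] == "F"                           # female
        and rec["immigrant"]                        # immigrant, not native-born
        and 18 <= rec["age"] <= 65                  # labour force age bracket
        and rec["age_at_arrival"] >= 18             # arrived as an adult
        and rec["province"] != "Quebec"             # Quebec is excluded
        and rec["birthplace"] in SOURCE_COUNTRIES   # country reported and listed
    )


# Invented records, for illustration only.
records = [
    {"id": 1, "sex": "F", "immigrant": True, "age": 40, "age_at_arrival": 25,
     "province": "Ontario", "birthplace": "India"},
    {"id": 2, "sex": "F", "immigrant": True, "age": 35, "age_at_arrival": 10,
     "province": "Ontario", "birthplace": "China"},           # arrived as a child
    {"id": 3, "sex": "F", "immigrant": True, "age": 50, "age_at_arrival": 30,
     "province": "Quebec", "birthplace": "Italy"},            # lives in Quebec
    {"id": 4, "sex": "F", "immigrant": True, "age": 45, "age_at_arrival": 28,
     "province": "Alberta", "birthplace": "Southeast Asia"},  # region only
]

sample = [r for r in records if in_sample(r)]
print([r["id"] for r in sample])
```

Only the first invented record satisfies all of the restrictions; the others each fail exactly one of them.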
Language Proximity Index
This is the second major data source for our analysis. One characteristic of a good measure of language proficiency is its ability to reflect immigrants' prior exposure to the host country's language (see Chiswick and Miller, 1995). Language proximity indices are a good example. In this paper, we use the measure of language proximity developed in Adsera and Pytlikova (2015). The index captures how much of the linguistic family tree a given language shares with English, that is, its proximity to English. Based on this measure, we compare the major or official language of female immigrants' source countries with English. We observe that the immigrants can be divided into three sub-groups: those whose source country language shares no link with English, those whose source country language shares some link with English, and those whose mother tongue is English.
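The three-way grouping can be sketched as a simple classification rule. This is an illustration only: it assumes the proximity index is scaled so that 0 means no shared branch with English and 1 means the language is English itself, and the function name, group labels and example values are all hypothetical rather than taken from the index.

```python
def proximity_group(index):
    """Assign a language-proximity group from an index assumed to lie in [0, 1]."""
    if not 0.0 <= index <= 1.0:
        raise ValueError("proximity index is assumed to lie in [0, 1]")
    if index == 0.0:
        return "no link"       # shares no branch of the linguistic tree
    if index == 1.0:
        return "English"       # the language is English itself
    return "some link"         # shares part of the linguistic tree


# Placeholder languages with invented index values, for illustration only.
example_indices = {"language_A": 0.0, "language_B": 0.4, "language_C": 1.0}
groups = {name: proximity_group(value) for name, value in example_indices.items()}
print(groups)
```

Grouping on the index rather than on self-reported proficiency keeps the measure tied to the source country language, which is the group-level perspective this paper takes.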
1 These studies capture females by including the usual gender dummies in their regressions. While we argue that this approach is not sufficient, we do not imply that these studies have not done a good job.
2 Rather than examining the effect of group-specific language proficiency measures, studies in this class often use self-reported or other individual-specific measures and then control for differences between immigrant groups by including place-of-birth dummies and other source-country-specific characteristics in their analysis. However, a study that identifies the magnitude of differences in the effects of language proficiency on returns to different groups of immigrants is arguably more insightful for policy design, since policies can be targeted at groups rather than at specific individuals.