Chapter 7 Correlation



1. Introduction

In previous chapters, we learned how to summarize data using measures of central tendency (like the mean) and measures of dispersion (like standard deviation). These summary measures help us understand the characteristics of a single variable.

This chapter moves on to a different kind of analysis: examining the relationship between two variables. This is a fundamental concept in economics and our daily lives. For example, we observe that as temperatures rise, ice cream sales increase. Similarly, when the supply of a vegetable increases in the market, its price tends to fall. Correlation analysis is the statistical tool used to systematically study and measure such relationships.

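Correlation helps us answer key questions: whether two variables are related at all, whether they move in the same or in opposite directions, and how strong the relationship is.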



2. Types Of Relationship

When we observe a relationship between two variables, it's crucial to understand its nature. Not all relationships imply a direct cause and effect.

Some relationships are based on a clear cause-and-effect link. For example, low rainfall (cause) is often related to low agricultural productivity (effect). However, some relationships are purely coincidental. The relationship between the arrival of migratory birds at a sanctuary and the local birth rate is a coincidence; one does not cause the other.

In other cases, a relationship between two variables (say, X and Y) might be driven by a third, hidden variable. For example, a study might find a positive correlation between ice cream sales and the number of drowning deaths. This does not mean that eating ice cream causes people to drown. The third variable at play is temperature. Hot weather (the cause) leads to both increased ice cream sales and more people going swimming, which in turn might lead to more drowning incidents.
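The hidden-variable effect can be illustrated with a small sketch. All figures below are invented for illustration and are not from the chapter: both series are built from temperature alone, with small fixed offsets standing in for unrelated variation, yet they end up strongly correlated with each other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical average temperatures (degrees Celsius) for seven periods.
temperature = [18, 21, 24, 27, 30, 33, 36]

# Each series depends only on temperature plus a small fixed offset.
# Neither series is computed from the other.
ice_cream_sales = [10 * t + e for t, e in zip(temperature, [5, -3, 4, -6, 2, -1, 3])]
drownings = [4 * t + e for t, e in zip(temperature, [-2, 3, -1, 2, -3, 1, 0])]

r_sales_drownings = pearson_r(ice_cream_sales, drownings)
print(round(r_sales_drownings, 3))  # close to +1, although neither causes the other
```

The high value arises only because both series follow temperature, which is the point made in the paragraph above.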

What Does Correlation Measure?

Correlation analysis measures the direction and intensity of the relationship between two variables. It is important to remember a fundamental principle of statistics:

Correlation does not imply causation.

The presence of a correlation between two variables simply means that they exhibit covariation—when one variable changes, the other tends to change in a definite way. It does not prove that one variable is causing the change in the other.

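Types of Correlation: Correlation is commonly classified as positive, when the two variables move in the same direction, or negative, when they move in opposite directions.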



3. Techniques For Measuring Correlation

There are three primary techniques used to study and measure correlation:

  1. Scatter Diagrams: A graphical method to visualize the relationship.
  2. Karl Pearson’s Coefficient of Correlation: A numerical measure of the strength and direction of a linear relationship.
  3. Spearman’s Rank Correlation: A numerical measure used when data is in the form of ranks or when the relationship is not linear.

Scatter Diagram

A scatter diagram is a simple but powerful visual tool used to examine the relationship between two variables. To create one, the paired values of the two variables are plotted as points on graph paper, with one variable on the X-axis and the other on the Y-axis.

By observing the pattern of the plotted points, we can get a good idea of both the nature (positive, negative, or none) and the intensity (strong or weak) of the relationship.

Here are some common patterns and their interpretations:

Fig 7.1 Positive Correlation: The points are scattered around an upward-rising line, indicating that as X increases, Y also tends to increase.

Fig 7.2 Negative Correlation: The points are scattered around a downward-sloping line, indicating that as X increases, Y tends to decrease.

Fig 7.3 No Correlation: The points are spread out randomly, showing no clear relationship between the variables.

Fig 7.4 Perfect Positive Correlation: All points lie exactly on an upward-sloping straight line.

Fig 7.5 Perfect Negative Correlation: All points lie exactly on a downward-sloping straight line.

Fig 7.6 Non-linear Correlation: The points form a clear pattern, but it is a curve, not a straight line. This indicates a non-linear relationship.
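For readers who want to try plotting without graph paper, the following is a rough sketch that draws a scatter diagram as text. The function name `text_scatter` and the temperature and sales figures are hypothetical, chosen only for illustration. A plotting library would give a better picture, but this version needs nothing beyond standard Python.

```python
def text_scatter(xs, ys, width=21, height=11):
    """Return a rough text scatter diagram; '*' marks a plotted point.

    Assumes each variable takes at least two different values,
    so that the axis ranges are not zero.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (X-axis) and a row (Y-axis).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the diagram
    return "\n".join("".join(line).rstrip() for line in grid)

# Hypothetical data: temperature (X) and ice cream sales (Y).
temperature = [20, 22, 25, 27, 30, 32, 35]
ice_cream_sales = [30, 34, 45, 50, 62, 66, 80]
print(text_scatter(temperature, ice_cream_sales))
```

With these figures the stars rise from the lower left to the upper right, which is the pattern of positive correlation described in Fig 7.1.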

Karl Pearson’s Coefficient of Correlation

This method, also known as the product-moment correlation coefficient, provides a precise numerical value for the degree of linear relationship between two variables, X and Y. It is denoted by the symbol 'r'.

It is crucial to first examine a scatter diagram. If the relationship appears linear, then Pearson's 'r' is an appropriate measure. If the relationship is non-linear (curved), using Pearson's 'r' can be misleading.

The formula for Pearson's 'r' can be expressed in several ways:

Using deviations from the means (this form follows from the covariance of X and Y): $r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$

Where $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are the deviations from the mean.
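The link with covariance can be made explicit. As a sketch, assume the usual definitions $\text{Cov}(X,Y) = \frac{\sum xy}{N}$, $\sigma_X = \sqrt{\frac{\sum x^2}{N}}$ and $\sigma_Y = \sqrt{\frac{\sum y^2}{N}}$, which are not restated in this section. The factor $N$ then cancels:

$r = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\frac{1}{N}\sum xy}{\sqrt{\frac{1}{N}\sum x^2} \sqrt{\frac{1}{N}\sum y^2}} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$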

Or using direct values: $r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{N \sum X^2 - (\sum X)^2} \sqrt{N \sum Y^2 - (\sum Y)^2}}$
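The two formulas are algebraically equivalent, so they should give the same value for any dataset. A minimal sketch of both follows. The function names and the height and weight figures are hypothetical, used only to check that the two results agree.

```python
from math import sqrt

def pearson_r_deviations(xs, ys):
    """r from deviations: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)

def pearson_r_direct(xs, ys):
    """r from raw values, without computing the means first."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical data: heights (cm) and weights (kg) of five people.
heights = [150, 155, 160, 165, 170]
weights = [50, 54, 59, 62, 68]

r_dev = pearson_r_deviations(heights, weights)
r_dir = pearson_r_direct(heights, weights)
print(abs(r_dev - r_dir) < 1e-9)  # True: both forms agree
```

The direct form is convenient when the means are not whole numbers, since it avoids working with fractional deviations.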

Properties Of Correlation Coefficient

Example 1. Calculate 'r' between years of schooling of farmers and their annual yield per acre.

Years of Schooling (X) 0 2 4 6 8 10 12
Yield ('000 ₹) (Y) 4 4 6 10 10 8 7

Answer:

To solve this, we need to calculate the deviations from the mean for both variables.

Step 1: Calculate the means, $\bar{X}$ and $\bar{Y}$.

$\bar{X} = \frac{\sum X}{N} = \frac{42}{7} = 6$

$\bar{Y} = \frac{\sum Y}{N} = \frac{49}{7} = 7$

Step 2: Calculate the required sums for the formula.

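| X | Y | $x = (X-\bar{X})$ | $y = (Y-\bar{Y})$ | $x^2$ | $y^2$ | $xy$ |
|---|---|---|---|---|---|---|
| 0 | 4 | -6 | -3 | 36 | 9 | 18 |
| 2 | 4 | -4 | -3 | 16 | 9 | 12 |
| 4 | 6 | -2 | -1 | 4 | 1 | 2 |
| 6 | 10 | 0 | 3 | 0 | 9 | 0 |
| 8 | 10 | 2 | 3 | 4 | 9 | 6 |
| 10 | 8 | 4 | 1 | 16 | 1 | 4 |
| 12 | 7 | 6 | 0 | 36 | 0 | 0 |
| $\sum X=42$ | $\sum Y=49$ | $\sum x=0$ | $\sum y=0$ | $\sum x^2=112$ | $\sum y^2=38$ | $\sum xy=42$ |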

Step 3: Apply the formula.

$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{42}{\sqrt{112 \times 38}} = \frac{42}{\sqrt{4256}} = \frac{42}{65.24} \approx +0.644$

The result of +0.644 indicates a moderately strong positive linear correlation between years of schooling and agricultural yield.
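The arithmetic in Example 1 can be checked with a short script. This is only a verification sketch that follows the same three steps as the worked answer. The variable names are chosen here for readability.

```python
from math import sqrt

# Data from Example 1.
schooling = [0, 2, 4, 6, 8, 10, 12]       # X: years of schooling
yield_per_acre = [4, 4, 6, 10, 10, 8, 7]  # Y: yield in '000 rupees

# Step 1: means.
n = len(schooling)
mean_x = sum(schooling) / n       # 6.0
mean_y = sum(yield_per_acre) / n  # 7.0

# Step 2: deviations and the required sums.
x_dev = [x - mean_x for x in schooling]
y_dev = [y - mean_y for y in yield_per_acre]
sum_x2 = sum(x * x for x in x_dev)                # 112.0
sum_y2 = sum(y * y for y in y_dev)                # 38.0
sum_xy = sum(x * y for x, y in zip(x_dev, y_dev)) # 42.0

# Step 3: apply the formula.
r = sum_xy / sqrt(sum_x2 * sum_y2)
print(round(r, 3))  # 0.644
```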

Spearman’s Rank Correlation

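Developed by C.E. Spearman, this method measures the linear association between the ranks assigned to the values of two variables. It is particularly useful when the data are available only as ranks, or when the variables are qualitative attributes that can be ranked but not measured precisely.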

The formula for Spearman's rank correlation coefficient ($r_s$ or $r_k$) is:

$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$

Where $D$ is the difference between the ranks of the two variables ($R_X - R_Y$), and $n$ is the number of pairs of observations.
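As a minimal sketch of the formula, the function below takes two lists of ranks that are already given. The function name and the two judges' rankings are hypothetical, invented for illustration.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman's rank correlation when ranks are given and none are tied."""
    n = len(ranks_x)
    # D is the difference between the two ranks of each item.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to five contestants.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

r_s = spearman_from_ranks(judge_a, judge_b)
print(round(r_s, 2))  # 0.8, since sum of D^2 = 4 and n(n^2 - 1) = 120
```

Identical rankings give $r_s = +1$ and exactly reversed rankings give $r_s = -1$, which is a quick way to sanity-check any implementation.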

Calculation Of Rank Correlation Coefficient

The calculation differs slightly depending on whether the ranks are given, have to be assigned, or are repeated.

Case 1: Ranks are given. Simply find the differences (D), square them, sum them ($\sum D^2$), and apply the formula.

Case 2: Ranks are not given. You must first assign ranks to the values of each variable separately (usually rank 1 for the highest value, 2 for the next highest, and so on). Then proceed as in Case 1.

Case 3: Ranks are repeated (Tied Ranks). If two or more items have the same value, they are assigned the average of the ranks they would have occupied. For example, if two items are tied for the 3rd and 4th positions, both are assigned the rank $(3+4)/2 = 3.5$. A correction factor is then added to the formula:

$r_s = 1 - \frac{6[\sum D^2 + \frac{m_1(m_1^2 - 1)}{12} + \frac{m_2(m_2^2 - 1)}{12} + \dots]}{n(n^2 - 1)}$

Where $m_1, m_2, \dots$ are the number of times each rank is repeated.
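The three cases can be sketched together in code. The functions below assign ranks (rank 1 to the highest value), give tied values the average of the positions they occupy, and add the correction term for each group of ties. The function names and the sample values are hypothetical. When there are no ties, the correction is zero and the result matches the basic formula.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 for the highest value; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    counts = Counter(values)
    return sum(m * (m ** 2 - 1) / 12 for m in counts.values() if m > 1)

def spearman_with_ties(xs, ys):
    """Spearman's r_s from raw values, using the tie-corrected formula."""
    n = len(xs)
    ranks_x, ranks_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Hypothetical scores: the value 60 appears twice in the first list.
scores_x = [50, 60, 60, 70, 80]
scores_y = [10, 20, 30, 40, 50]

print(average_ranks(scores_x))  # [5.0, 3.5, 3.5, 2.0, 1.0]
r_s_tied = spearman_with_ties(scores_x, scores_y)
print(round(r_s_tied, 2))       # 0.95
```

Here $\sum D^2 = 0.5$ and the single tie with $m = 2$ contributes $\frac{2(2^2 - 1)}{12} = 0.5$, so $r_s = 1 - \frac{6 \times 1}{120} = 0.95$.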



4. Conclusion

This chapter has introduced several techniques for studying the relationship between two variables. The scatter diagram offers a powerful visual representation, suitable for identifying linear and non-linear relationships. For a precise numerical measure of linear association, Karl Pearson’s coefficient is used for quantitative data, while Spearman’s rank correlation is used for ranked or qualitative data.

It is crucial to remember that all these measures indicate the degree and direction of covariation, not causation. A strong correlation provides an idea of how one variable might change when the other changes, but it does not explain why that change occurs. The knowledge of correlation is a vital first step in understanding the complex relationships within economic data.



Exercises

This section contains questions for practice and self-assessment, designed to test the learner's understanding of the concepts discussed in the chapter, such as identifying the type of correlation, interpreting the value of 'r', and calculating both Pearson's and Spearman's correlation coefficients for different datasets.


