Chapter 6 Correlation
This chapter explores Correlation, a statistical tool used to measure and analyse the relationship between two variables. It helps determine if a change in one variable is associated with a change in another, and it quantifies the direction and strength of this linear relationship. The chapter explains the difference between Positive Correlation, where both variables move in the same direction (e.g., income and consumption), and Negative Correlation, where they move in opposite directions (e.g., price and quantity demanded). A crucial point emphasized is that correlation measures association, not causation; it shows that two variables move together, but does not prove that one causes the other.
Three main techniques for measuring correlation are discussed. The Scatter Diagram provides a visual representation of the relationship, allowing for a quick assessment of its nature and strength. For a precise numerical measure, Karl Pearson’s Coefficient of Correlation (r) is used, which ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear relationship. For qualitative data or data containing extreme values, Spearman’s Rank Correlation ($r_k$) is a more suitable alternative, as it measures the association between the ranks of the data rather than their actual values.
Introduction to Correlation
In previous chapters, you've learned how to summarise a single dataset. Now, we will explore how to examine the relationship between two variables. This is the essence of correlation analysis.
In our daily lives, we often observe relationships between different phenomena. For example:
- As the summer temperature rises, the number of visitors to hill stations increases, and so do ice-cream sales.
- As the supply of tomatoes increases in the local market, their price drops.
Correlation analysis provides a systematic means to examine such relationships. It helps answer questions like:
- Is there a relationship between two variables?
- If the value of one variable changes, does the value of the other also change?
- Do both variables move in the same direction or in opposite directions?
- How strong is the relationship?
What Does Correlation Measure?
Correlation studies and measures the direction and intensity of the relationship between variables. It's crucial to understand that correlation measures covariation, not causation. The presence of a correlation between two variables, X and Y, simply means that when X changes, Y also changes in a definite way. It does not mean that a change in X causes a change in Y.
For instance, a brisk sale of ice-creams might be correlated with a higher number of deaths by drowning. This doesn't mean eating ice cream causes drowning. Instead, a third variable—rising temperature—leads to both increased ice-cream sales and more people going swimming, which in turn might increase the number of drowning incidents.
For simplicity, we will assume that the correlation, if it exists, is linear, meaning the relationship can be represented by a straight line on a graph.
Types of Correlation
Correlation is commonly classified based on the direction of the relationship between the two variables.
1. Positive Correlation
The correlation is said to be positive when the two variables move together in the same direction. If one variable increases, the other variable also increases. If one decreases, the other also decreases.
Examples:
- As income rises, consumption also rises.
- As temperature rises, the sale of ice-cream also rises.
2. Negative Correlation
The correlation is negative when the two variables move in opposite directions. If one variable increases, the other variable decreases, and vice versa.
Examples:
- When the price of apples falls, the demand for them increases.
- As you spend more time studying, the chances of failing decline.
3. No Correlation
When there is no discernible relationship between the movements of two variables, it is a case of no correlation. Changes in one variable are not accompanied by any definite change in the other.
Example: There is likely no relationship between the size of your shoes and the amount of money in your pocket. Any apparent association between them would be a coincidence.
Techniques for Measuring Correlation
There are three important tools used to study and measure correlation:
- Scatter Diagrams: A graphical method to visually assess the nature of the relationship.
- Karl Pearson’s Coefficient of Correlation: A numerical measure of the linear relationship between two variables.
- Spearman’s Rank Correlation: A measure of the linear association between the ranks of variables, particularly useful for qualitative data.
Scatter Diagram
A scatter diagram is a graphical technique in which the paired values of two variables are plotted as points on graph paper. This visual representation gives a good idea of the nature and strength of the relationship without calculating any numerical value.
The pattern of the scattered points reveals the relationship:
- If the points cluster around an upward rising line, it indicates a positive correlation (Fig. 6.1).
- If the points cluster around a downward sloping line, it indicates a negative correlation (Fig. 6.2).
- If the points are spread randomly with no discernible pattern, it indicates no correlation (Fig. 6.3).
- If all the points lie exactly on an upward or downward sloping line, it is a case of perfect positive or perfect negative correlation, respectively (Fig. 6.4 and Fig. 6.5).
- The relationship can also be non-linear if the points form a curve (Fig. 6.6 and Fig. 6.7).
Karl Pearson’s Coefficient of Correlation
Also known as the product-moment correlation coefficient, this is a precise numerical measure of the degree of linear relationship between two variables, X and Y. It is denoted by 'r'. It is advisable to first examine a scatter diagram to ensure the relationship is linear before calculating this coefficient.
Formula for Karl Pearson’s Coefficient (r)
The correlation coefficient is the covariance of X and Y divided by the product of their standard deviations.
$ r = \frac{Cov(X,Y)}{\sigma_x \sigma_y} $
Where:
- $Cov(X,Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N} = \frac{\sum xy}{N}$
- $\sigma_x$ is the standard deviation of X.
- $\sigma_y$ is the standard deviation of Y.
The formula can be expressed in various forms for calculation:
$ r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} \quad \text{where } x = X - \bar{X} \text{ and } y = Y - \bar{Y} $
Or, for direct calculation from raw data:
$ r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}} $
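The deviation form and the raw-data form above are algebraic rearrangements of each other, which a short Python sketch can confirm. The function names here are illustrative and do not come from the chapter; the data are the heights of fathers and sons given in Question 18 later in this chapter.

```python
from math import sqrt

def pearson_deviation_form(xs, ys):
    """r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), where x and y are deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_x2 * sum_y2)

def pearson_raw_form(xs, ys):
    """r computed directly from the raw sums, without subtracting the means first."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

# Heights in inches, taken from Question 18
fathers = [65, 66, 57, 67, 68, 69, 70, 72]
sons = [67, 56, 65, 68, 72, 72, 69, 71]

r_dev = pearson_deviation_form(fathers, sons)
r_raw = pearson_raw_form(fathers, sons)
print(round(r_dev, 4), round(r_raw, 4))
```

Both functions should print the same value, about 0.44 for these data, which would indicate a moderate positive linear relationship.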
Properties of the Correlation Coefficient (r)
- No Unit: 'r' is a pure number and has no unit.
- Direction of Relationship:
  - A negative value of r indicates an inverse relationship (variables move in opposite directions).
  - A positive value of r indicates a direct relationship (variables move in the same direction).
- Range of Value: The value of r always lies between -1 and +1 (i.e., $-1 \le r \le +1$).
- Interpretation of Value:
  - If $r = 0$, the variables are uncorrelated (no linear relationship).
  - If $r = +1$ or $r = -1$, the correlation is perfect.
  - A value of r close to +1 or -1 indicates a strong linear relationship.
  - A value of r close to 0 indicates a weak linear relationship.
- Unaffected by Change of Origin and Scale: The value of r is not affected by adding or subtracting a constant from the variables (change of origin), or by multiplying or dividing them by constants of the same sign (change of scale). Scaling only one variable by a negative constant reverses the sign of r but leaves its magnitude unchanged. This property is used in the step-deviation method to simplify calculations.
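The change-of-origin and change-of-scale property can be illustrated with a minimal Python sketch. The data, the assumed origins (200 and 15), the scale factors (50 and 1), and the helper name `pearson_r` are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical data, not taken from the chapter
X = [100, 150, 200, 250, 300]
Y = [12, 15, 14, 19, 20]

# Step deviations: U = (X - A) / h and V = (Y - B) / k
U = [(x - 200) / 50 for x in X]
V = [(y - 15) / 1 for y in Y]

r_xy = pearson_r(X, Y)
r_uv = pearson_r(U, V)
r_flipped = pearson_r([-x for x in X], Y)  # only X is scaled by -1

print(round(r_xy, 4), round(r_uv, 4), round(r_flipped, 4))
```

The first two values should match, since U and V are only shifted and rescaled versions of X and Y. The third should have the same magnitude with the opposite sign, because only one variable was multiplied by a negative constant.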
Spearman’s Rank Correlation
Developed by the British psychologist C.E. Spearman, this method measures the linear association between the ranks assigned to individual items, not their actual values. It is denoted by $r_k$.
When to Use Rank Correlation
Spearman’s rank correlation is particularly useful in the following situations:
- When the data is qualitative and cannot be measured numerically but can be ranked. For example, ranking individuals based on attributes like honesty, beauty, or intelligence.
- When the relationship between variables is clearly non-linear, but the direction of the relationship is consistent.
- When the data contains extreme values (outliers). Because ranks depend only on the order of the observations, an extreme value influences rank correlation far less than it influences Pearson’s correlation.
Formula for Spearman’s Rank Correlation ($r_k$)
The formula for calculating the rank correlation coefficient is:
$ r_k = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} $
Where:
- $D$ = The difference between the ranks assigned to a pair of observations ($R_X - R_Y$).
- $n$ = The number of pairs of observations.
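As a worked illustration with hypothetical ranks (not taken from the chapter), suppose two judges rank five contestants as $R_X = 1, 2, 3, 4, 5$ and $R_Y = 2, 1, 4, 3, 5$. The differences are $D = -1, 1, -1, 1, 0$, so $\sum D^2 = 1 + 1 + 1 + 1 + 0 = 4$ and $n = 5$. Substituting into the formula:

$ r_k = 1 - \frac{6 \times 4}{5(5^2 - 1)} = 1 - \frac{24}{120} = 0.8 $

A value of 0.8 would suggest that the two judges largely agree in their rankings.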
Calculation of Rank Correlation
The process involves three main scenarios:
- When Ranks are Given: Simply calculate the differences (D) between the ranks for each pair, square them ($D^2$), sum them up ($\sum D^2$), and apply the formula.
- When Ranks are Not Given: First, assign ranks to the values of each variable separately. The highest value is typically given rank 1, the next highest rank 2, and so on. Then, proceed as above.
- When Ranks are Repeated (Tied Ranks): If two or more items have the same value, a common rank is assigned to them. This common rank is the average of the ranks they would have occupied if they were slightly different. For example, if two items are tied for the 3rd and 4th positions, both are assigned the rank $ (3+4)/2 = 3.5 $.
When ranks are repeated, a correction factor is added to the $\sum D^2$ term in the formula:
$ r_k = 1 - \frac{6 \left( \sum D^2 + \frac{m_1(m_1^2 - 1)}{12} + \frac{m_2(m_2^2 - 1)}{12} + \dots \right)}{n(n^2 - 1)} $
Where $m_1, m_2, \dots$ are the numbers of items sharing each tied rank. Every group of tied values, in either variable, contributes one correction term.
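The tied-rank procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names and the marks data are hypothetical, and the highest value is given rank 1, as described above.

```python
from collections import Counter

def average_ranks(values):
    """Rank values with the highest as rank 1; tied values share the average rank."""
    order = sorted(values, reverse=True)
    ranks = {}
    for v in set(values):
        first = order.index(v) + 1          # first position occupied by v
        count = order.count(v)              # number of items tied at this value
        ranks[v] = first + (count - 1) / 2  # average of positions first..first+count-1
    return [ranks[v] for v in values]

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_rank(xs, ys):
    """Rank correlation using the corrected sum of squared rank differences."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n * n - 1))

# Hypothetical marks of five students in two subjects
marks_x = [80, 70, 70, 60, 50]
marks_y = [75, 65, 70, 55, 55]

print(average_ranks(marks_x))  # the two 70s share rank 2.5
print(average_ranks(marks_y))  # the two 55s share rank 4.5
print(round(spearman_rank(marks_x, marks_y), 4))
```

For these data $\sum D^2 = 1$, and each variable has one pair of tied values contributing $2(2^2 - 1)/12 = 0.5$, so the corrected sum is 2 and $r_k = 1 - 12/120 = 0.9$.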
NCERT Questions Solution
Question 1. The unit of correlation coefficient between height in feet and weight in kgs is
(i) kg/feet
(ii) percentage
(iii) non-existent
Answer:
Question 2. The range of simple correlation coefficient is
(i) 0 to infinity
(ii) minus one to plus one
(iii) minus infinity to infinity
Answer:
Question 3. If $r_{xy}$ is positive the relation between X and Y is of the type
(i) When Y increases X increases
(ii) When Y decreases X increases
(iii) When Y increases X does not change
Answer:
Question 4. If $r_{xy} = 0$ the variables X and Y are
(i) linearly related
(ii) not linearly related
(iii) independent
Answer:
Question 5. Of the following three measures which can measure any type of relationship
(i) Karl Pearson’s coefficient of correlation
(ii) Spearman’s rank correlation
(iii) Scatter diagram
Answer:
Question 6. If precisely measured data are available the simple correlation coefficient is
(i) more accurate than rank correlation coefficient
(ii) less accurate than rank correlation coefficient
(iii) as accurate as the rank correlation coefficient
Answer:
Question 7. Why is r preferred to covariance as a measure of association?
Answer:
Question 8. Can r lie outside the –1 and 1 range depending on the type of data?
Answer:
Question 9. Does correlation imply causation?
Answer:
Question 10. When is rank correlation more precise than simple correlation coefficient?
Answer:
Question 11. Does zero correlation mean independence?
Answer:
Question 12. Can simple correlation coefficient measure any type of relationship?
Answer:
Question 13. Collect the price of five vegetables from your local market every day for a week. Calculate their correlation coefficients. Interpret the result.
Answer:
Question 14. Measure the height of your classmates. Ask them the height of their benchmate. Calculate the correlation coefficient of these two variables. Interpret the result.
Answer:
Question 15. List some variables where accurate measurement is difficult.
Answer:
Question 16. Interpret the values of r as 1, –1 and 0.
Answer:
Question 17. Why does rank correlation coefficient differ from Pearsonian correlation coefficient?
Answer:
Question 18. Calculate the correlation coefficient between the heights of fathers in inches (X) and their sons (Y)
| X | 65 | 66 | 57 | 67 | 68 | 69 | 70 | 72 |
|---|---|---|---|---|---|---|---|---|
| Y | 67 | 56 | 65 | 68 | 72 | 72 | 69 | 71 |
Answer:
Question 19. Calculate the correlation coefficient between X and Y and comment on their relationship:
| X | –3 | –2 | –1 | 1 | 2 | 3 |
|---|---|---|---|---|---|---|
| Y | 9 | 4 | 1 | 1 | 4 | 9 |
Answer:
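One way to check this question numerically is the Python sketch below, which applies the raw-data form of Karl Pearson’s formula to the table above. The variable names are illustrative only.

```python
from math import sqrt

# Data from the table in Question 19
X = [-3, -2, -1, 1, 2, 3]
Y = [9, 4, 1, 1, 4, 9]

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Raw-data form of Karl Pearson's coefficient
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_xy, r)
print(all(y == x * x for x, y in zip(X, Y)))  # checks whether Y equals X squared in every pair
```

Here $\sum XY = 0$ and $\sum X = 0$, so the numerator is zero and $r = 0$, even though every pair satisfies $Y = X^2$ exactly. The variables are strongly related, but the relationship is non-linear, and r measures only linear association.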
Question 20. Calculate the correlation coefficient between X and Y and comment on their relationship
| X | 1 | 3 | 4 | 5 | 7 | 8 |
|---|---|---|---|---|---|---|
| Y | 2 | 6 | 8 | 10 | 14 | 16 |
Answer: