Latest Economics NCERT Notes, Solutions and Extra Q & A (Class 9th to 12th)



Chapter 3 Organisation Of Data



After collecting raw data, the next essential step in statistical analysis is to organize or classify it. Just as a junk dealer sorts items to manage their trade efficiently or you arrange books by subject for easier access, classifying data brings order to unorganized information. This organization makes the data easier to understand, compare, and analyze using statistical methods. Classification involves arranging data into groups or classes based on specific criteria, ensuring that similar characteristics are placed together.

Introduction

This chapter follows the data collection process by explaining how to classify collected data. Classification is crucial for organizing raw data and making it suitable for statistical analysis. Like sorting junk or arranging books by subject, classifying data brings order, making it easier to manage and retrieve information.




Raw Data

Raw data refers to unclassified or unorganized data. Like the junk dealer's unorganized collection, raw data can be large, cumbersome, and difficult to handle. Extracting meaningful conclusions or insights from raw data is challenging because it is not readily amenable to systematic statistical analysis. Therefore, organizing and classifying raw data is a necessary step after collection and before undertaking any systematic analysis. Tables 3.1 and 3.2 provide examples of raw data, showing unarranged numbers.

Table 3.1 Marks in Mathematics Obtained by 100 Students in an Examination:

47 45 10 60 51 56 66 100 49 40
60 59 56 55 62 48 59 55 51 41
42 69 64 66 50 59 57 65 62 50
64 30 37 75 17 56 20 14 55 90
62 51 55 14 25 34 90 49 56 54
70 47 49 82 40 82 60 85 65 66
49 44 64 69 70 48 12 28 55 65
49 40 25 41 71 80 0 56 14 22
66 53 46 70 43 61 59 12 30 35
45 44 57 76 82 39 32 14 90 25

Table 3.2 Monthly Household Expenditure (in Rupees) on Food of 50 Households:

1904 1559 3473 1735 2760
2041 1612 1753 1855 4439
5090 1085 1823 2346 1523
1211 1360 1110 2152 1183
1218 1315 1105 2628 2712
4248 1812 1264 1183 1171
1007 1180 1953 1137 2048
2025 1583 1324 2621 3676
1397 1832 1962 2177 2575
1293 1365 1146 3222 1396

Raw data is summarized and made comprehensible through classification. Grouping similar facts together simplifies locating information, making comparisons, and drawing inferences. For example, Census data, initially vast and fragmented, becomes understandable when classified by characteristics like gender, education, and occupation.




Classification Of Data

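Data classification involves arranging or organizing data into groups or classes based on certain criteria. The method of classification depends on the purpose of the analysis. Data can be classified in several ways: chronologically (by time, as in Example 1), spatially (by place, as in Example 2), qualitatively (by attributes such as gender or marital status, as in Example 3), or quantitatively (by measurable characteristics such as marks, as in Example 4).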

Example 1: Population of India (in crores) - Chronological Classification

Year Population (Crores)
1951 35.7
1961 43.8
1971 54.6
1981 68.4
1991 81.8
2001 102.7
2011 121.0

Example 2: Yield of Wheat for Different Countries (2013) - Spatial Classification

Country Yield of wheat (kg/hectare)
Canada 3594
China 5055
France 7254
Germany 7998
India 3154
Pakistan 2787

Example 3: Qualitative Classification of Population by Gender and Marital Status

Population
Male: Married, Unmarried
Female: Married, Unmarried

Example 4: Frequency Distribution of Marks in Mathematics of 100 Students - Quantitative Classification

Marks Frequency
0–10 1
10–20 8
20–30 6
30–40 7
40–50 21
50–60 23
60–70 19
70–80 6
80–90 5
90–100 4
Total 100



Variables: Continuous And Discrete

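Variables can be classified as continuous or discrete based on the values they can take. A continuous variable can take any numerical value within a range, including fractional values (e.g., height or weight). A discrete variable can take only certain values and changes by finite jumps (e.g., the number of students in a class).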




What Is A Frequency Distribution?

A frequency distribution is a way to classify raw data of a quantitative variable. It shows how frequently different values or ranges of values of a variable occur within specific classes.

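In a frequency distribution table (like Example 4), each class is bounded by a lower class limit and an upper class limit. The class interval (or class width) is the difference between the two limits, and the class mark (or class midpoint) is (upper class limit + lower class limit) / 2. The frequency of a class is the number of observations that fall in it.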

Table 3.3: The Lower Class Limits, the Upper Class Limits and the Class Mark:

Class Frequency Lower Class Limit Upper Class Limit Class Mark
0–10 1 0 10 5
10–20 8 10 20 15
20–30 6 20 30 25
30–40 7 30 40 35
40–50 21 40 50 45
50–60 23 50 60 55
60–70 19 60 70 65
70–80 6 70 80 75
80–90 5 80 90 85
90–100 4 90 100 95
Total 100

A **Frequency Curve** is a diagrammatic presentation of a frequency distribution, plotting class marks on the X-axis and frequencies on the Y-axis (Fig. 3.1).
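The class marks in Table 3.3, and the points a frequency curve would pass through, can be computed directly from the class limits. The following is a minimal sketch; the names `class_mark` and `curve_points` are illustrative and not from the chapter.

```python
# Class limits and frequencies copied from Table 3.3.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]


def class_mark(lower, upper):
    """Class mark = (upper class limit + lower class limit) / 2."""
    return (upper + lower) / 2


# Each point is (class mark, frequency): X and Y of the frequency curve.
curve_points = [(class_mark(lo, hi), f)
                for (lo, hi), f in zip(classes, frequencies)]

for mark, freq in curve_points:
    print(mark, freq)
```

Plotting these pairs with any charting tool and joining them with a smooth line gives the curve described above.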


How To Prepare A Frequency Distribution?

Constructing a frequency distribution involves addressing several decisions:


Should We Have Equal Or Unequal Sized Class Intervals?

Unequal intervals are used when the data range is very wide (e.g., income from near zero to very high values) or when observations are highly concentrated in a small part of the range. Otherwise, equal-sized intervals are preferred.


How Many Classes Should We Have?

The number of classes is usually between 6 and 15. With equal intervals, it can be estimated by dividing the range (difference between largest and smallest values) by the class interval size.


What Should Be The Size Of Each Class?

This depends on the number of classes and the data range, as these are interlinked decisions. In Example 4, with a range of 100 and 10 classes, the equal class interval is 10.
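The link between range, class size, and number of classes can be sketched as two small helper functions. The function names are hypothetical, and rounding up when the range does not divide evenly is an assumption rather than a rule stated in the chapter.

```python
import math


def number_of_classes(smallest, largest, class_size):
    """Estimate how many equal-sized classes cover the range.

    Assumption: round up so that the largest value is still covered.
    """
    data_range = largest - smallest
    return math.ceil(data_range / class_size)


def class_size(smallest, largest, n_classes):
    """Class size implied by a chosen number of equal classes."""
    return (largest - smallest) / n_classes


# Marks in Example 4 run from 0 to 100.
n = number_of_classes(0, 100, 10)
size = class_size(0, 100, 10)
print(n, size)
```

Fixing either the class size or the number of classes determines the other, which is why the two decisions are described as interlinked.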


How Should We Determine The Class Limits?

Class limits should be clear and definite, ideally avoiding open-ended classes. They should be set so that observations within a class are concentrated around the class midpoint. Class intervals can be Inclusive (limits included in the class) or Exclusive (one limit, usually upper, excluded). Exclusive intervals are common for continuous variables to maintain continuity.

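Example of Inclusive Class Intervals (Discrete Variable, Marks 0-100): 0–10, 11–20, 21–30, and so on up to 91–100. Both limits belong to the class.

Example of Exclusive Class Intervals (Discrete Variable, Marks 0-100): 0–10, 10–20, 20–30, and so on up to 90–100, as in Example 4. A mark of 20 is counted in 20–30, not in 10–20.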

Example of Inclusive Class Intervals (Continuous Variable, Weight):

For continuous variables represented using the inclusive method, an adjustment is needed to create continuity between classes.


Adjustment In Class Interval

To adjust inclusive class intervals for continuous data (e.g., Table 3.4), find the gap between the upper limit of a class and the lower limit of the next class, divide by two, subtract from all lower limits, and add to all upper limits. This creates adjusted class limits (e.g., Table 3.5) and adjusted class marks.

Table 3.4: Frequency Distribution of Incomes of 550 Employees of a Company (Inclusive)

Income (Rs) Number of Employees
800–899 50
900–999 100
1000–1099 200
1100–1199 150
1200–1299 40
1300–1399 10
Total 550

Table 3.5: Frequency Distribution of Incomes of 550 Employees of a Company (Adjusted/Exclusive)

Income (Rs) Number of Employees
799.5–899.5 50
899.5–999.5 100
999.5–1099.5 200
1099.5–1199.5 150
1199.5–1299.5 40
1299.5–1399.5 10
Total 550
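The adjustment that turns Table 3.4 into Table 3.5 can be written out step by step. This is a minimal sketch; the function name `adjust_inclusive` is illustrative, and it assumes the gap between consecutive classes is the same throughout, as it is in Table 3.4.

```python
# Inclusive class limits copied from Table 3.4.
inclusive = [(800, 899), (900, 999), (1000, 1099),
             (1100, 1199), (1200, 1299), (1300, 1399)]


def adjust_inclusive(classes):
    """Convert inclusive class limits into continuous (exclusive) ones."""
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = classes[1][0] - classes[0][1]   # 900 - 899 = 1
    half_gap = gap / 2                    # 0.5
    # Subtract half the gap from lower limits, add it to upper limits.
    return [(lo - half_gap, hi + half_gap) for lo, hi in classes]


adjusted = adjust_inclusive(inclusive)
for lo, hi in adjusted:
    print(f"{lo}–{hi}")
```

After the adjustment, the upper limit of each class equals the lower limit of the next, so the classes no longer leave gaps.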

How Should We Get The Frequency For Each Class?

Frequency of an observation is how many times it appears. Class frequency is the number of observations in a class. This is determined by **tally marks**.


Finding Class Frequency by Tally Marking

A tally (/) is marked for each observation falling into a class. Tallies are grouped in fives for easier counting: four tallies are drawn and the fifth is placed across them, shown here as /////. The total number of tallies in a class is its frequency (Table 3.6).

Table 3.6: Tally Marking of Marks of 100 Students in Mathematics:

Class Observations Tally Marks Frequency Class Mark
0–10 0 / 1 5
10–20 10, 14, 17, 12, 14, 12, 14, 14 ///// /// 8 15
20–30 25, 25, 20, 22, 25, 28 ///// / 6 25
30–40 30, 37, 34, 39, 32, 30, 35 ///// // 7 35
40–50 47, 42, 49, 49, 45, 45, 47, 44, 40, 44, 49, 46, 41, 40, 43, 48, 48, 49, 49, 40, 41 ///// ///// ///// ///// / 21 45
50–60 59, 51, 53, 56, 55, 57, 55, 51, 50, 56, 59, 56, 59, 57, 59, 55, 56, 51, 55, 56, 55, 50, 54 ///// ///// ///// ///// /// 23 55
60–70 60, 64, 62, 66, 69, 64, 64, 60, 66, 69, 62, 61, 66, 60, 65, 62, 65, 66, 65 ///// ///// ///// //// 19 65
70–80 70, 75, 70, 76, 70, 71 ///// / 6 75
80–90 82, 82, 82, 80, 85 ///// 5 85
90–100 90, 100, 90, 90 //// 4 95
Total 100
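The same counting can be done programmatically on the raw marks of Table 3.1. The sketch below assumes exclusive classes of width 10, with the mark of 100 placed in the last class as Table 3.6 does. The helper names are illustrative, and each group of five tallies is printed as `/////`.

```python
# Marks of 100 students, copied row by row from Table 3.1.
MARKS = [
    47, 45, 10, 60, 51, 56, 66, 100, 49, 40,
    60, 59, 56, 55, 62, 48, 59, 55, 51, 41,
    42, 69, 64, 66, 50, 59, 57, 65, 62, 50,
    64, 30, 37, 75, 17, 56, 20, 14, 55, 90,
    62, 51, 55, 14, 25, 34, 90, 49, 56, 54,
    70, 47, 49, 82, 40, 82, 60, 85, 65, 66,
    49, 44, 64, 69, 70, 48, 12, 28, 55, 65,
    49, 40, 25, 41, 71, 80, 0, 56, 14, 22,
    66, 53, 46, 70, 43, 61, 59, 12, 30, 35,
    45, 44, 57, 76, 82, 39, 32, 14, 90, 25,
]


def class_frequencies(values, lower=0, upper=100, width=10):
    """Count observations in exclusive classes [lower, lower + width), ..."""
    n_classes = (upper - lower) // width
    counts = [0] * n_classes
    for v in values:
        # The maximum value (100) has no class of its own, so it is
        # placed in the last class, following Table 3.6.
        index = min((v - lower) // width, n_classes - 1)
        counts[index] += 1
    return counts


def tally(count):
    """Render a count as tally marks in groups of five."""
    groups = ["/////"] * (count // 5)
    if count % 5:
        groups.append("/" * (count % 5))
    return " ".join(groups)


frequencies = class_frequencies(MARKS)
for i, f in enumerate(frequencies):
    print(f"{i * 10}–{(i + 1) * 10}: {tally(f)} ({f})")
```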

Loss Of Information

Classifying data into a frequency distribution involves a loss of detailed information. Once data is grouped, individual observation values are lost, and only the class frequency and class mark are used in further calculations. While this summarizes data, it means less detailed information is available compared to raw data.
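A small check on one class illustrates the loss. The sketch below takes the six observations listed for the 20–30 class in Table 3.6 and compares their actual total with the total implied by treating every observation as equal to the class mark. The variable names are illustrative.

```python
# Observations in the 20–30 class, as listed in Table 3.6.
observations = [25, 25, 20, 22, 25, 28]

class_mark = (20 + 30) / 2                       # 25.0
actual_total = sum(observations)                 # uses individual values
grouped_total = class_mark * len(observations)   # uses class mark x frequency

actual_mean = actual_total / len(observations)

print(actual_total, grouped_total)
print(round(actual_mean, 2), class_mark)
```

The grouped figure is close to the actual one but not identical, because the individual values are no longer visible once only the class mark and frequency are kept.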


Frequency Distribution With Unequal Classes

Frequency distributions can have unequal class intervals, especially when data is concentrated in certain ranges. This allows for more representative class marks in those ranges (e.g., splitting wider classes into narrower ones where data is dense - Table 3.7).

Table 3.7: Frequency Distribution of Unequal Classes:

Class Observations Frequency Class Mark
0–10 0 1 5
10–20 10, 14, 17, 12, 14, 12, 14, 14 8 15
20–30 25, 25, 20, 22, 25, 28 6 25
30–40 30, 37, 34, 39, 32, 30, 35 7 35
40–45 42, 44, 40, 44, 41, 40, 43, 40, 41 9 42.5
45–50 47, 49, 49, 45, 45, 47, 49, 46, 48, 48, 49, 49 12 47.5
50–55 51, 53, 51, 50, 51, 50, 54 7 52.5
55–60 59, 56, 55, 57, 55, 56, 59, 56, 59, 57, 59, 55, 56, 55, 56, 55 16 57.5
60–65 60, 64, 62, 64, 64, 60, 62, 61, 60, 62 10 62.5
65–70 66, 69, 66, 69, 66, 65, 65, 66, 65 9 67.5
70–80 70, 75, 70, 76, 70, 71 6 75
80–90 82, 82, 82, 80, 85 5 85
90–100 90, 100, 90, 90 4 95
Total 100
Frequency Curve with Unequal Classes.
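Splitting a dense class into narrower ones can be sketched with the 21 observations that Table 3.6 lists for the 40–50 class. The example below divides them into 40–45 and 45–50, as Table 3.7 does; the function name `split_counts` is illustrative.

```python
from bisect import bisect_right

# Observations in the 40–50 class, as listed in Table 3.6.
obs_40_50 = [47, 42, 49, 49, 45, 45, 47, 44, 40, 44, 49,
             46, 41, 40, 43, 48, 48, 49, 49, 40, 41]

# New, narrower class boundaries (exclusive upper limits).
edges = [40, 45, 50]


def split_counts(values, edges):
    """Count values in each class [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # bisect_right finds the first edge greater than v, so a value
        # equal to a boundary (e.g. 45) goes to the class that starts there.
        counts[bisect_right(edges, v) - 1] += 1
    return counts


counts = split_counts(obs_40_50, edges)
marks = [(edges[i] + edges[i + 1]) / 2 for i in range(len(edges) - 1)]
print(counts, marks)
```

The two narrower classes have class marks of 42.5 and 47.5, each closer to the observations it represents than the single mark of 45.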

Frequency Array

For a discrete variable, the classification of its data is known as a **Frequency Array**. Because a discrete variable takes only certain distinct values, the array shows the frequency of each of those values rather than of classes (Table 3.8).

Table 3.8: Frequency Array of the Size of Households:

Size of the Household Number of Households
1 5
2 15
3 25
4 35
5 10
6 5
7 3
8 2
Total 100
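A frequency array is a count of how often each distinct value occurs, which `collections.Counter` does directly. The raw household sizes are not given in the text, so the sketch below rebuilds a list of 100 observations from Table 3.8 purely to have something to count. That reconstruction is an assumption made for illustration.

```python
from collections import Counter

# Frequencies copied from Table 3.8: household size -> number of households.
table_3_8 = {1: 5, 2: 15, 3: 25, 4: 35, 5: 10, 6: 5, 7: 3, 8: 2}

# Rebuild one observation per household (assumed, for illustration only).
household_sizes = [size for size, n in table_3_8.items() for _ in range(n)]

# Counting each distinct value gives the frequency array.
frequency_array = Counter(household_sizes)

for size in sorted(frequency_array):
    print(size, frequency_array[size])
print("Total", sum(frequency_array.values()))
```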



Bivariate Frequency Distribution

When data is collected for two variables from each unit of a sample (bivariate data), it can be summarized in a **Bivariate Frequency Distribution**. This distribution shows the frequency of observations for combinations of classes of the two variables (Table 3.9).

Table 3.9: Bivariate Frequency Distribution of Sales (in Lakh Rs) and Advertisement Expenditure (in Thousand Rs) of 20 Firms:

Advertisement Expenditure (Thousand Rs) Sales (Lakh Rs) Total
115–125 125–135 135–145 145–155 155–165 165–175
62–64 2 1 3
64–66 1 3 4
66–68 1 1 2 1 5
68–70 2 2 4
70–72 1 1 1 1 4
Total 4 5 6 3 1 1 20
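Building a bivariate frequency distribution amounts to assigning each unit to a pair of classes and counting the pairs. The firm-level data behind Table 3.9 is not reproduced in the text, so the six (sales, advertisement expenditure) pairs below are hypothetical values invented for illustration. The class boundaries follow Table 3.9, and the helper `class_of` is an illustrative name.

```python
from collections import Counter


def class_of(value, lower, width):
    """Return the exclusive class (start, end) that contains value."""
    start = lower + ((value - lower) // width) * width
    return (start, start + width)


# Hypothetical firms: (sales in lakh Rs, advertisement expenditure in thousand Rs).
firms = [(118, 62), (121, 63), (130, 65), (140, 67), (150, 69), (160, 71)]

# Key: (advertisement class, sales class), matching rows and columns of Table 3.9.
bivariate = Counter(
    (class_of(ad, 62, 2), class_of(sales, 115, 10)) for sales, ad in firms
)

for (ad_class, sales_class), count in sorted(bivariate.items()):
    print(ad_class, sales_class, count)
```

Summing the counts along one variable's classes gives the row and column totals shown in the margins of a table like Table 3.9.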



Conclusion

Raw data collected from various sources needs to be classified for effective statistical analysis. Classification organizes the data, making it ordered and manageable. A frequency distribution is a comprehensive method for classifying quantitative data, showing how values are distributed across classes with their frequencies. Understanding techniques like forming classes (equal/unequal intervals, number of classes, limits), adjusting for continuity, and tally marking is crucial for constructing frequency distributions. While classifying data involves some loss of detail, the gain in making the data comprehensible for analysis outweighs this. Frequency arrays are used for discrete variables, and bivariate frequency distributions summarize data for two variables simultaneously.

