
Chapter 15 Statistics

Welcome to the solutions for Chapter 15: Statistics. While previous encounters with statistics likely focused on summarizing data using measures of central tendency (like mean, median, and mode), which describe the 'typical' value within a dataset, this chapter delves into another crucial aspect of data analysis: understanding its dispersion or variability. Central tendency measures alone provide an incomplete picture. For instance, two datasets might have the exact same mean but differ vastly in how spread out their values are. One set might cluster tightly around the mean, while the other might have values scattered widely. Measuring this spread, or dispersion, is essential for comprehending the distribution's nature, assessing consistency, comparing different datasets reliably, and making informed inferences. This chapter introduces several key statistical tools designed specifically to quantify the extent to which data points deviate from the average or spread out across the range of observations. We will explore methods applicable to both ungrouped (raw) data and grouped data presented in frequency distributions.

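The solutions explore various measures of dispersion, starting with the simplest and progressing to more robust and widely used metrics: the range, the mean deviation (about the mean and about the median), the variance, and the standard deviation.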

Beyond calculation, the solutions emphasize the interpretation of these measures. For instance, they cover the analysis of frequency distributions that might share the same mean but exhibit different variances, illustrating how standard deviation effectively quantifies the consistency or spread within each dataset – a smaller $\sigma$ indicates data points are clustered more closely around the mean (more consistent), while a larger $\sigma$ signifies greater variability. For comparing the relative variability of two or more datasets, especially if they have different means or different units, the Coefficient of Variation (CV) is introduced. It's a unit-less measure calculated as $CV = \left(\frac{\sigma}{\bar{x}}\right) \times 100\%$. A lower CV indicates greater consistency relative to the mean. These tools provide a far more comprehensive understanding of data characteristics than central tendency alone.
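The coefficient of variation formula above can be sketched in Python as follows. This is a minimal illustration, not part of the textbook solutions: the function name `coefficient_of_variation` and the two small datasets are hypothetical, and $\sigma$ is taken to be the population standard deviation (division by $n$).

```python
from statistics import mean, pstdev


def coefficient_of_variation(data):
    """CV = (sigma / mean) * 100, with sigma the population standard deviation."""
    return pstdev(data) / mean(data) * 100


# Hypothetical datasets chosen only for illustration: both have mean 50.
tight = [48, 50, 52]
spread = [20, 50, 80]

print(round(coefficient_of_variation(tight), 2))   # 3.27
print(round(coefficient_of_variation(spread), 2))  # 48.99
```

Both datasets share the same mean, yet the second has a far larger CV, which is the sense in which it is less consistent.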



Example 1 to 7 (Before Exercise 15.1)

Example 1: Find the mean deviation about the mean for the following data:

6 7 10 12 13 4 8 12

Answer:

The given data is: 6, 7, 10, 12, 13, 4, 8, 12.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$

$\overline{x} = \frac{72}{8} = 9$

The mean of the data is 9.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}|$.

$|6 - 9| = |-3| = 3$

$|7 - 9| = |-2| = 2$

$|10 - 9| = |1| = 1$

$|12 - 9| = |3| = 3$

$|13 - 9| = |4| = 4$

$|4 - 9| = |-5| = 5$

$|8 - 9| = |-1| = 1$

$|12 - 9| = |3| = 3$


Now, we find the sum of the absolute deviations.

$\sum |x_i - \overline{x}| = 3 + 2 + 1 + 3 + 4 + 5 + 1 + 3 = 22$


Finally, we calculate the mean deviation about the mean.

Mean Deviation about the mean = $\frac{\sum |x_i - \overline{x}|}{n}$

MD($\overline{x}$) = $\frac{22}{8} = \frac{11}{4} = 2.75$

The mean deviation about the mean for the given data is 2.75.
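The steps above (mean, absolute deviations, their average) can be checked with a short Python sketch. The function name `mean_deviation_about_mean` is an illustrative choice, not something defined in the textbook.

```python
def mean_deviation_about_mean(data):
    """Return the mean deviation about the mean for ungrouped data."""
    n = len(data)
    mean = sum(data) / n
    # Average of the absolute deviations |x_i - mean|
    return sum(abs(x - mean) for x in data) / n


data = [6, 7, 10, 12, 13, 4, 8, 12]
print(mean_deviation_about_mean(data))  # 2.75
```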

Example 2: Find the mean deviation about the mean for the following data :

12 3 18 17 4 9 17 19 20 15
8 17 2 3 16 11 3 1 0 5

Answer:

The given data is:

12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5.

The number of observations is $n$. By counting the data points, we have $n = 20$.


First, we find the mean of the data ($\overline{x}$).

$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum x_i}{n}$

Sum of observations ($\sum x_i$) = $12 + 3 + 18 + 17 + 4 + 9 + 17 + 19 + 20 + 15 + 8 + 17 + 2 + 3 + 16 + 11 + 3 + 1 + 0 + 5 = 200$

$\overline{x} = \frac{200}{20} = 10$

The mean of the data is 10.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.

$|12 - 10| = 2$

$|3 - 10| = 7$

$|18 - 10| = 8$

$|17 - 10| = 7$

$|4 - 10| = 6$

$|9 - 10| = 1$

$|17 - 10| = 7$

$|19 - 10| = 9$

$|20 - 10| = 10$

$|15 - 10| = 5$

$|8 - 10| = 2$

$|17 - 10| = 7$

$|2 - 10| = 8$

$|3 - 10| = 7$

$|16 - 10| = 6$

$|11 - 10| = 1$

$|3 - 10| = 7$

$|1 - 10| = 9$

$|0 - 10| = 10$

$|5 - 10| = 5$


Now, we find the sum of the absolute deviations.

$\sum |x_i - \overline{x}| = 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 + 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 = 124$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum |x_i - \overline{x}|}{n} = \frac{124}{20}$

MD($\overline{x}$) = $6.2$

The mean deviation about the mean for the given data is 6.2.

Example 3: Find the mean deviation about the median for the following data:

3 9 5 3 12 10 18 4 7 19
21

Answer:

The given data is: 3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21.

The number of observations is $n = 11$.


First, we need to arrange the data in ascending order to find the median.

Arranged data: 3, 3, 4, 5, 7, 9, 10, 12, 18, 19, 21.


Since the number of observations ($n = 11$) is odd, the median (M) is the value of the $\left(\frac{n+1}{2}\right)^{\text{th}}$ observation.

Median (M) = $\left(\frac{11+1}{2}\right)^{\text{th}}$ observation = $6^{\text{th}}$ observation.

From the arranged data, the $6^{\text{th}}$ observation is 9.

So, the median M = 9.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 9|$.

$|3 - 9| = |-6| = 6$

$|3 - 9| = |-6| = 6$

$|4 - 9| = |-5| = 5$

$|5 - 9| = |-4| = 4$

$|7 - 9| = |-2| = 2$

$|9 - 9| = |0| = 0$

$|10 - 9| = |1| = 1$

$|12 - 9| = |3| = 3$

$|18 - 9| = |9| = 9$

$|19 - 9| = |10| = 10$

$|21 - 9| = |12| = 12$


Now, we find the sum of the absolute deviations.

$\sum |x_i - M| = 6 + 6 + 5 + 4 + 2 + 0 + 1 + 3 + 9 + 10 + 12 = 58$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum |x_i - M|}{n} = \frac{58}{11}$

MD(M) $\approx 5.27$

The mean deviation about the median for the given data is $\frac{58}{11}$ or approximately 5.27.
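The same calculation can be sketched in Python using the standard library's `statistics.median`, which sorts the data and, for an even number of observations, averages the two middle values. The function name `mean_deviation_about_median` is an illustrative choice.

```python
from statistics import median


def mean_deviation_about_median(data):
    """Return the mean deviation about the median for ungrouped data."""
    m = median(data)  # middle value, or mean of the two middle values if n is even
    return sum(abs(x - m) for x in data) / len(data)


data = [3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21]
print(round(mean_deviation_about_median(data), 2))  # 5.27
```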

Example 4: Find mean deviation about the mean for the following data :

$x_i$ 2 5 6 8 10 12
$f_i$ 2 8 10 7 8 5

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
2 2
5 8
6 10
8 7
10 8
12 5

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value, along with the total frequency $\sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
2 2 $2 \times 2 = 4$
5 8 $8 \times 5 = 40$
6 10 $10 \times 6 = 60$
8 7 $7 \times 8 = 56$
10 8 $8 \times 10 = 80$
12 5 $5 \times 12 = 60$
Total $\sum f_i = 40$ $\sum f_i x_i = 300$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{300}{40} = \frac{30}{4} = 7.5$

The mean of the data is 7.5.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 7.5|$, and the product $f_i |x_i - 7.5|$.

$x_i$ $f_i$ $|x_i - 7.5|$ $f_i |x_i - 7.5|$
2 2 $|2 - 7.5| = 5.5$ $2 \times 5.5 = 11.0$
5 8 $|5 - 7.5| = 2.5$ $8 \times 2.5 = 20.0$
6 10 $|6 - 7.5| = 1.5$ $10 \times 1.5 = 15.0$
8 7 $|8 - 7.5| = 0.5$ $7 \times 0.5 = 3.5$
10 8 $|10 - 7.5| = 2.5$ $8 \times 2.5 = 20.0$
12 5 $|12 - 7.5| = 4.5$ $5 \times 4.5 = 22.5$
Total $\sum f_i = 40$ $\sum f_i |x_i - 7.5| = 92.0$

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum f_i |x_i - \overline{x}|}{\sum f_i}$

MD($\overline{x}$) = $\frac{92.0}{40} = \frac{92}{40} = \frac{23}{10} = 2.3$

The mean deviation about the mean for the given data is 2.3.
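For a discrete frequency distribution, each deviation is weighted by its frequency. A minimal Python sketch of the calculation above, with the hypothetical function name `md_about_mean_freq`:

```python
def md_about_mean_freq(values, freqs):
    """Mean deviation about the mean for a discrete frequency distribution."""
    total = sum(freqs)                                         # sum of f_i
    mean = sum(f * x for x, f in zip(values, freqs)) / total   # sum(f_i x_i) / sum(f_i)
    # Frequency-weighted average of |x_i - mean|
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / total


values = [2, 5, 6, 8, 10, 12]
freqs = [2, 8, 10, 7, 8, 5]
print(md_about_mean_freq(values, freqs))  # 2.3
```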

Example 5: Find the mean deviation about the median for the following data:

$x_i$ 3 6 9 12 13 15 21 22
$f_i$ 3 4 5 2 4 5 4 3

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
3 3
6 4
9 5
12 2
13 4
15 5
21 4
22 3

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum f_i$ and the cumulative frequencies (c.f.).

$x_i$ $f_i$ Cumulative Frequency (c.f.)
3 3 3
6 4 $3 + 4 = 7$
9 5 $7 + 5 = 12$
12 2 $12 + 2 = 14$
13 4 $14 + 4 = 18$
15 5 $18 + 5 = 23$
21 4 $23 + 4 = 27$
22 3 $27 + 3 = 30$
Total $N = \sum f_i = 30$

The total number of observations is $N = 30$, which is an even number.

For an even number of observations, the median is the average of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.

$\frac{N}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.

$\frac{N}{2} + 1 = 15 + 1 = 16^{\text{th}}$ observation.

From the cumulative frequency table, the $15^{\text{th}}$ observation lies at the first value whose c.f. is at least 15. That c.f. is 18 (the preceding c.f. is 14), which corresponds to $x_i = 13$.

The $16^{\text{th}}$ observation lies at the same value, since $14 < 16 \leq 18$, so it is also $x_i = 13$.

So, the median (M) = $\frac{13 + 13}{2} = 13$.

The median of the data is 13.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 13|$, and the product $f_i |x_i - 13|$.

$x_i$ $f_i$ $|x_i - 13|$ $f_i |x_i - 13|$
3 3 $|3 - 13| = 10$ $3 \times 10 = 30$
6 4 $|6 - 13| = 7$ $4 \times 7 = 28$
9 5 $|9 - 13| = 4$ $5 \times 4 = 20$
12 2 $|12 - 13| = 1$ $2 \times 1 = 2$
13 4 $|13 - 13| = 0$ $4 \times 0 = 0$
15 5 $|15 - 13| = 2$ $5 \times 2 = 10$
21 4 $|21 - 13| = 8$ $4 \times 8 = 32$
22 3 $|22 - 13| = 9$ $3 \times 9 = 27$
Total $\sum f_i = 30$ $\sum f_i |x_i - 13| = 149$

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{149}{30}$

MD(M) $\approx 4.97$

The mean deviation about the median for the given data is $\frac{149}{30}$ or approximately 4.97.
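Locating the median through cumulative frequencies can be sketched in Python as below. The helper names (`kth_value`, `median_freq`, `md_about_median_freq`) are hypothetical, and the sketch assumes the values are listed in ascending order, as in the table above.

```python
from itertools import accumulate


def kth_value(values, freqs, k):
    """Value of the k-th observation (1-indexed); values must be ascending."""
    for x, cf in zip(values, accumulate(freqs)):
        if cf >= k:  # first cumulative frequency that reaches k
            return x
    raise ValueError("k exceeds the total frequency")


def median_freq(values, freqs):
    """Median of a discrete frequency distribution."""
    n = sum(freqs)
    if n % 2 == 1:
        return kth_value(values, freqs, (n + 1) // 2)
    # Even N: average of the (N/2)-th and (N/2 + 1)-th observations
    return (kth_value(values, freqs, n // 2) + kth_value(values, freqs, n // 2 + 1)) / 2


def md_about_median_freq(values, freqs):
    """Mean deviation about the median for a discrete frequency distribution."""
    m = median_freq(values, freqs)
    return sum(f * abs(x - m) for x, f in zip(values, freqs)) / sum(freqs)


values = [3, 6, 9, 12, 13, 15, 21, 22]
freqs = [3, 4, 5, 2, 4, 5, 4, 3]
print(median_freq(values, freqs))                     # 13.0
print(round(md_about_median_freq(values, freqs), 2))  # 4.97
```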

Example 6: Find the mean deviation about the mean for the following data

Marks obtained 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Number of students 2 3 8 14 8 3 2

Answer:

The given data is a grouped frequency distribution:

Marks obtained (Class Interval) Number of students ($f_i$)
10-20 2
20-30 3
30-40 8
40-50 14
50-60 8
60-70 3
70-80 2

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

For grouped data, the mean is calculated using the midpoints of the class intervals.

Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.

Calculate the midpoints ($x_i$) and the product $f_i x_i$:

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $f_i x_i$
10-20 2 $\frac{10+20}{2} = 15$ $2 \times 15 = 30$
20-30 3 $\frac{20+30}{2} = 25$ $3 \times 25 = 75$
30-40 8 $\frac{30+40}{2} = 35$ $8 \times 35 = 280$
40-50 14 $\frac{40+50}{2} = 45$ $14 \times 45 = 630$
50-60 8 $\frac{50+60}{2} = 55$ $8 \times 55 = 440$
60-70 3 $\frac{60+70}{2} = 65$ $3 \times 65 = 195$
70-80 2 $\frac{70+80}{2} = 75$ $2 \times 75 = 150$
Total $\sum f_i = 40$ $\sum f_i x_i = 1800$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{1800}{40} = 45$

The mean of the data is 45.


Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 45|$, and the product $f_i |x_i - 45|$.

Class Interval $x_i$ $f_i$ $|x_i - 45|$ $f_i |x_i - 45|$
10-20 15 2 $|15 - 45| = |-30| = 30$ $2 \times 30 = 60$
20-30 25 3 $|25 - 45| = |-20| = 20$ $3 \times 20 = 60$
30-40 35 8 $|35 - 45| = |-10| = 10$ $8 \times 10 = 80$
40-50 45 14 $|45 - 45| = |0| = 0$ $14 \times 0 = 0$
50-60 55 8 $|55 - 45| = |10| = 10$ $8 \times 10 = 80$
60-70 65 3 $|65 - 45| = |20| = 20$ $3 \times 20 = 60$
70-80 75 2 $|75 - 45| = |30| = 30$ $2 \times 30 = 60$
Total $\sum f_i = 40$ $\sum f_i |x_i - 45| = 400$

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum f_i |x_i - \overline{x}|}{\sum f_i}$

MD($\overline{x}$) = $\frac{400}{40} = 10$

The mean deviation about the mean for the given data is 10.
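For grouped data, the only change is that each class is represented by its midpoint. A minimal Python sketch, assuming classes are given as `(lower, upper)` pairs and using the hypothetical function name `md_about_mean_grouped`:

```python
def md_about_mean_grouped(classes, freqs):
    """Mean deviation about the mean for grouped data, using class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in classes]  # x_i = midpoint of each class
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / total
    return sum(f * abs(x - mean) for x, f in zip(mids, freqs)) / total


classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [2, 3, 8, 14, 8, 3, 2]
print(md_about_mean_grouped(classes, freqs))  # 10.0
```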

Example 7: Calculate the mean deviation about median for the following data :

Class 0-10 10-20 20-30 30-40 40-50 50-60
Frequency 6 7 15 16 4 2

Answer:

The given data is a grouped frequency distribution:

Class Interval Frequency ($f_i$)
0-10 6
10-20 7
20-30 15
30-40 16
40-50 4
50-60 2

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum f_i$.

Class Interval Frequency ($f_i$) Cumulative Frequency (c.f.)
0-10 6 6
10-20 7 $6 + 7 = 13$
20-30 15 $13 + 15 = 28$
30-40 16 $28 + 16 = 44$
40-50 4 $44 + 4 = 48$
50-60 2 $48 + 2 = 50$
Total $N = \sum f_i = 50$

The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.

So, the median class is 20-30.

For the median class (20-30):

Lower boundary (L) = 20

Frequency of the median class (f) = 15

Cumulative frequency of the class preceding the median class (c.f.) = 13 (c.f. of 10-20 class)

Class size (h) = $30 - 20 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 20 + \frac{25 - 13}{15} \times 10$

$M = 20 + \frac{12}{15} \times 10$

$M = 20 + \frac{4}{5} \times 10$

$M = 20 + 4 \times 2$

$M = 20 + 8$

$M = 28$

The median of the data is 28.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - 28|$, and the product $f_i |x_i - 28|$.

Class Interval $f_i$ Midpoint ($x_i$) $|x_i - 28|$ $f_i |x_i - 28|$
0-10 6 $\frac{0+10}{2} = 5$ $|5 - 28| = |-23| = 23$ $6 \times 23 = 138$
10-20 7 $\frac{10+20}{2} = 15$ $|15 - 28| = |-13| = 13$ $7 \times 13 = 91$
20-30 15 $\frac{20+30}{2} = 25$ $|25 - 28| = |-3| = 3$ $15 \times 3 = 45$
30-40 16 $\frac{30+40}{2} = 35$ $|35 - 28| = |7| = 7$ $16 \times 7 = 112$
40-50 4 $\frac{40+50}{2} = 45$ $|45 - 28| = |17| = 17$ $4 \times 17 = 68$
50-60 2 $\frac{50+60}{2} = 55$ $|55 - 28| = |27| = 27$ $2 \times 27 = 54$
Total $\sum f_i = 50$ $\sum f_i |x_i - 28| = 508$

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{508}{50} = 10.16$

The mean deviation about the median for the given data is 10.16.
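The median formula $M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$ and the deviation step can be sketched in Python as follows. The function names `grouped_median` and `md_about_median_grouped` are hypothetical, and the sketch assumes continuous classes given as `(lower, upper)` pairs in ascending order.

```python
def grouped_median(classes, freqs):
    """Median of grouped data: M = L + ((N/2 - c.f.) / f) * h."""
    half = sum(freqs) / 2
    cum = 0  # cumulative frequency of the classes before the current one
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= half:  # first class whose c.f. reaches N/2: the median class
            return lo + (half - cum) / f * (hi - lo)
        cum += f
    raise ValueError("empty distribution")


def md_about_median_grouped(classes, freqs):
    """Mean deviation about the median, with deviations taken from class midpoints."""
    m = grouped_median(classes, freqs)
    mids = [(lo + hi) / 2 for lo, hi in classes]
    return sum(f * abs(x - m) for x, f in zip(mids, freqs)) / sum(freqs)


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [6, 7, 15, 16, 4, 2]
print(grouped_median(classes, freqs))           # 28.0
print(md_about_median_grouped(classes, freqs))  # 10.16
```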



Exercise 15.1

Find the mean deviation about the mean for the data in Exercises 1 and 2.

Question 1.

4 7 8 9 10 12 13 17

Answer:

The given data is: 4, 7, 8, 9, 10, 12, 13, 17.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $4 + 7 + 8 + 9 + 10 + 12 + 13 + 17 = 80$

$\overline{x} = \frac{80}{8} = 10$

The mean of the data is 10.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.

$|4 - 10| = |-6| = 6$

$|7 - 10| = |-3| = 3$

$|8 - 10| = |-2| = 2$

$|9 - 10| = |-1| = 1$

$|10 - 10| = |0| = 0$

$|12 - 10| = |2| = 2$

$|13 - 10| = |3| = 3$

$|17 - 10| = |7| = 7$


Now, we find the sum of the absolute deviations.

$\sum_{i=1}^{8} |x_i - \overline{x}| = 6 + 3 + 2 + 1 + 0 + 2 + 3 + 7 = 24$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum_{i=1}^{n} |x_i - \overline{x}|}{n}$

MD($\overline{x}$) = $\frac{24}{8} = 3$

The mean deviation about the mean for the given data is 3.

Question 2.

38 70 48 40 42 55 63 46 54 44

Answer:

The given data is: 38, 70, 48, 40, 42, 55, 63, 46, 54, 44.

The number of observations is $n$. By counting the data points, we have $n = 10$.


First, we find the mean of the data ($\overline{x}$).

$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum x_i}{n}$

Sum of observations ($\sum x_i$) = $38 + 70 + 48 + 40 + 42 + 55 + 63 + 46 + 54 + 44 = 500$

$\overline{x} = \frac{500}{10} = 50$

The mean of the data is 50.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 50|$.

$|38 - 50| = |-12| = 12$

$|70 - 50| = |20| = 20$

$|48 - 50| = |-2| = 2$

$|40 - 50| = |-10| = 10$

$|42 - 50| = |-8| = 8$

$|55 - 50| = |5| = 5$

$|63 - 50| = |13| = 13$

$|46 - 50| = |-4| = 4$

$|54 - 50| = |4| = 4$

$|44 - 50| = |-6| = 6$


Now, we find the sum of the absolute deviations.

$\sum |x_i - \overline{x}| = 12 + 20 + 2 + 10 + 8 + 5 + 13 + 4 + 4 + 6 = 84$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum |x_i - \overline{x}|}{n} = \frac{84}{10}$

MD($\overline{x}$) = $8.4$

The mean deviation about the mean for the given data is 8.4.

Find the mean deviation about the median for the data in Exercises 3 and 4.

Question 3.

13 17 16 14 11 13 10 16 11 18
12 17

Answer:

The given data is: 13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17.

The number of observations is $n$. By counting the data points, we have $n = 12$.


To find the mean deviation about the median, we first need to calculate the median (M).

We arrange the data in ascending order:

10, 11, 11, 12, 13, 13, 14, 16, 16, 17, 17, 18.


Since the number of observations ($n = 12$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.

$\frac{n}{2} = \frac{12}{2} = 6^{\text{th}}$ observation.

$\frac{n}{2} + 1 = 6 + 1 = 7^{\text{th}}$ observation.

The $6^{\text{th}}$ observation in the arranged data is 13.

The $7^{\text{th}}$ observation in the arranged data is 14.

Median (M) = $\frac{6^{\text{th}} \text{ observation} + 7^{\text{th}} \text{ observation}}{2} = \frac{13 + 14}{2} = \frac{27}{2} = 13.5$

The median of the data is 13.5.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 13.5|$.

$|10 - 13.5| = 3.5$

$|11 - 13.5| = 2.5$

$|11 - 13.5| = 2.5$

$|12 - 13.5| = 1.5$

$|13 - 13.5| = 0.5$

$|13 - 13.5| = 0.5$

$|14 - 13.5| = 0.5$

$|16 - 13.5| = 2.5$

$|16 - 13.5| = 2.5$

$|17 - 13.5| = 3.5$

$|17 - 13.5| = 3.5$

$|18 - 13.5| = 4.5$


Now, we find the sum of the absolute deviations.

$\sum_{i=1}^{12} |x_i - M| = 3.5 + 2.5 + 2.5 + 1.5 + 0.5 + 0.5 + 0.5 + 2.5 + 2.5 + 3.5 + 3.5 + 4.5 = 28$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum_{i=1}^{n} |x_i - M|}{n}$

MD(M) = $\frac{28}{12} = \frac{7}{3}$

MD(M) $\approx 2.33$

The mean deviation about the median for the given data is $\frac{7}{3}$ or approximately 2.33.

Question 4.

36 72 46 42 60 45 53 46 51 49

Answer:

The given data is: 36, 72, 46, 42, 60, 45, 53, 46, 51, 49.

The number of observations is $n$. By counting the data points, we have $n = 10$.


To find the mean deviation about the median, we first need to calculate the median (M).

We arrange the data in ascending order:

36, 42, 45, 46, 46, 49, 51, 53, 60, 72.


Since the number of observations ($n = 10$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.

$\frac{n}{2} = \frac{10}{2} = 5^{\text{th}}$ observation.

$\frac{n}{2} + 1 = 5 + 1 = 6^{\text{th}}$ observation.

The $5^{\text{th}}$ observation in the arranged data is 46.

The $6^{\text{th}}$ observation in the arranged data is 49.

Median (M) = $\frac{5^{\text{th}} \text{ observation} + 6^{\text{th}} \text{ observation}}{2} = \frac{46 + 49}{2} = \frac{95}{2} = 47.5$

The median of the data is 47.5.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 47.5|$.

$|36 - 47.5| = |-11.5| = 11.5$

$|42 - 47.5| = |-5.5| = 5.5$

$|45 - 47.5| = |-2.5| = 2.5$

$|46 - 47.5| = |-1.5| = 1.5$

$|46 - 47.5| = |-1.5| = 1.5$

$|49 - 47.5| = |1.5| = 1.5$

$|51 - 47.5| = |3.5| = 3.5$

$|53 - 47.5| = |5.5| = 5.5$

$|60 - 47.5| = |12.5| = 12.5$

$|72 - 47.5| = |24.5| = 24.5$


Now, we find the sum of the absolute deviations.

$\sum_{i=1}^{10} |x_i - M| = 11.5 + 5.5 + 2.5 + 1.5 + 1.5 + 1.5 + 3.5 + 5.5 + 12.5 + 24.5 = 70.0$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum_{i=1}^{n} |x_i - M|}{n}$

MD(M) = $\frac{70.0}{10} = 7.0$

The mean deviation about the median for the given data is 7.

Find the mean deviation about the mean for the data in Exercises 5 and 6.

Question 5.

$x_i$ 5 10 15 20 25
$f_i$ 7 4 6 3 5

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
5 7
10 4
15 6
20 3
25 5

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
5 7 $7 \times 5 = 35$
10 4 $4 \times 10 = 40$
15 6 $6 \times 15 = 90$
20 3 $3 \times 20 = 60$
25 5 $5 \times 25 = 125$
Total $\sum f_i = 25$ $\sum f_i x_i = 350$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{350}{25} = 14$

The mean of the data is 14.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 14|$, and the product $f_i |x_i - 14|$.

$x_i$ $f_i$ $|x_i - 14|$ $f_i |x_i - 14|$
5 7 $|5 - 14| = 9$ $7 \times 9 = 63$
10 4 $|10 - 14| = 4$ $4 \times 4 = 16$
15 6 $|15 - 14| = 1$ $6 \times 1 = 6$
20 3 $|20 - 14| = 6$ $3 \times 6 = 18$
25 5 $|25 - 14| = 11$ $5 \times 11 = 55$
Total $\sum f_i = 25$ $\sum f_i |x_i - 14| = 158$

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum f_i |x_i - \overline{x}|}{\sum f_i}$

MD($\overline{x}$) = $\frac{158}{25}$

MD($\overline{x}$) = $6.32$

The mean deviation about the mean for the given data is 6.32.

Question 6.

$x_i$ 10 30 50 70 90
$f_i$ 4 24 28 16 8

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
10 4
30 24
50 28
70 16
90 8

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
10 4 $4 \times 10 = 40$
30 24 $24 \times 30 = 720$
50 28 $28 \times 50 = 1400$
70 16 $16 \times 70 = 1120$
90 8 $8 \times 90 = 720$
Total $\sum f_i = 80$ $\sum f_i x_i = 4000$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{4000}{80} = 50$

The mean of the data is 50.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 50|$, and the product $f_i |x_i - 50|$.

$x_i$ $f_i$ $|x_i - 50|$ $f_i |x_i - 50|$
10 4 $|10 - 50| = 40$ $4 \times 40 = 160$
30 24 $|30 - 50| = 20$ $24 \times 20 = 480$
50 28 $|50 - 50| = 0$ $28 \times 0 = 0$
70 16 $|70 - 50| = 20$ $16 \times 20 = 320$
90 8 $|90 - 50| = 40$ $8 \times 40 = 320$
Total $\sum f_i = 80$ $\sum f_i |x_i - 50| = 1280$

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum f_i |x_i - \overline{x}|}{\sum f_i}$

MD($\overline{x}$) = $\frac{1280}{80} = 16$

The mean deviation about the mean for the given data is 16.

Find the mean deviation about the median for the data in Exercises 7 and 8.

Question 7.

$x_i$ 5 7 9 10 12 15
$f_i$ 8 6 2 2 2 6

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
5 8
7 6
9 2
10 2
12 2
15 6

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum f_i$ and the cumulative frequencies (c.f.).

$x_i$ $f_i$ Cumulative Frequency (c.f.)
5 8 8
7 6 $8 + 6 = 14$
9 2 $14 + 2 = 16$
10 2 $16 + 2 = 18$
12 2 $18 + 2 = 20$
15 6 $20 + 6 = 26$
Total $N = \sum f_i = 26$

The total number of observations is $N = 26$, which is an even number.

For an even number of observations in a discrete frequency distribution, the median is the average of the values of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.

$\frac{N}{2} = \frac{26}{2} = 13^{\text{th}}$ observation.

$\frac{N}{2} + 1 = 13 + 1 = 14^{\text{th}}$ observation.

From the cumulative frequency table, the $13^{\text{th}}$ observation lies at the first value whose c.f. is at least 13. That c.f. is 14 (the preceding c.f. is 8), which corresponds to $x_i = 7$.

The $14^{\text{th}}$ observation lies at the same value, since $8 < 14 \leq 14$, so it is also $x_i = 7$.

So, the median (M) = $\frac{7 + 7}{2} = 7$.

The median of the data is 7.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 7|$, and the product $f_i |x_i - 7|$.

$x_i$ $f_i$ $|x_i - 7|$ $f_i |x_i - 7|$
5 8 $|5 - 7| = 2$ $8 \times 2 = 16$
7 6 $|7 - 7| = 0$ $6 \times 0 = 0$
9 2 $|9 - 7| = 2$ $2 \times 2 = 4$
10 2 $|10 - 7| = 3$ $2 \times 3 = 6$
12 2 $|12 - 7| = 5$ $2 \times 5 = 10$
15 6 $|15 - 7| = 8$ $6 \times 8 = 48$
Total $\sum f_i = 26$ $\sum f_i |x_i - 7| = 84$

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{84}{26} = \frac{42}{13}$

MD(M) $\approx 3.23$

The mean deviation about the median for the given data is $\frac{42}{13}$ or approximately 3.23.

Question 8.

$x_i$ 15 21 27 30 35
$f_i$ 3 5 6 7 8

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
15 3
21 5
27 6
30 7
35 8

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum f_i$ and the cumulative frequencies (c.f.).

$x_i$ $f_i$ Cumulative Frequency (c.f.)
15 3 3
21 5 $3 + 5 = 8$
27 6 $8 + 6 = 14$
30 7 $14 + 7 = 21$
35 8 $21 + 8 = 29$
Total $N = \sum f_i = 29$

The total number of observations is $N = 29$, which is an odd number.

For an odd number of observations in a discrete frequency distribution, the median is the value of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation.

$\frac{N+1}{2} = \frac{29+1}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.

From the cumulative frequency table, the $15^{\text{th}}$ observation lies at the first value whose c.f. is at least 15. That c.f. is 21 (the preceding c.f. is 14), which corresponds to $x_i = 30$.

So, the median (M) = 30.

The median of the data is 30.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 30|$, and the product $f_i |x_i - 30|$.

$x_i$ $f_i$ $|x_i - 30|$ $f_i |x_i - 30|$
15 3 $|15 - 30| = 15$ $3 \times 15 = 45$
21 5 $|21 - 30| = 9$ $5 \times 9 = 45$
27 6 $|27 - 30| = 3$ $6 \times 3 = 18$
30 7 $|30 - 30| = 0$ $7 \times 0 = 0$
35 8 $|35 - 30| = 5$ $8 \times 5 = 40$
Total $\sum f_i = 29$ $\sum f_i |x_i - 30| = 148$

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{148}{29}$

MD(M) $\approx 5.103$

The mean deviation about the median for the given data is $\frac{148}{29}$ or approximately 5.103.

Find the mean deviation about the mean for the data in Exercises 9 and 10.

Question 9.

Income per day in ₹ 0-100 100-200 200-300 300-400 400-500 500-600 600-700 700-800
Number of persons 4 8 9 10 7 5 4 3

Answer:

The given data is a grouped frequency distribution:

Income per day in $\textsf{₹}$ (Class Interval) Number of persons ($f_i$)
0-100 4
100-200 8
200-300 9
300-400 10
400-500 7
500-600 5
600-700 4
700-800 3

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

For grouped data, the mean is calculated using the midpoints of the class intervals.

Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.

Calculate the midpoints ($x_i$) and the product $f_i x_i$:

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $f_i x_i$
0-100 4 $\frac{0+100}{2} = 50$ $4 \times 50 = 200$
100-200 8 $\frac{100+200}{2} = 150$ $8 \times 150 = 1200$
200-300 9 $\frac{200+300}{2} = 250$ $9 \times 250 = 2250$
300-400 10 $\frac{300+400}{2} = 350$ $10 \times 350 = 3500$
400-500 7 $\frac{400+500}{2} = 450$ $7 \times 450 = 3150$
500-600 5 $\frac{500+600}{2} = 550$ $5 \times 550 = 2750$
600-700 4 $\frac{600+700}{2} = 650$ $4 \times 650 = 2600$
700-800 3 $\frac{700+800}{2} = 750$ $3 \times 750 = 2250$
Total $\sum f_i = 50$ $\sum f_i x_i = 17900$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{17900}{50} = \frac{1790}{5} = 358$

The mean income per day is $\textsf{₹}$ 358.


Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 358|$, and the product $f_i |x_i - 358|$.

Class Interval $x_i$ $f_i$ $|x_i - 358|$ $f_i |x_i - 358|$
0-100 50 4 $|50 - 358| = |-308| = 308$ $4 \times 308 = 1232$
100-200 150 8 $|150 - 358| = |-208| = 208$ $8 \times 208 = 1664$
200-300 250 9 $|250 - 358| = |-108| = 108$ $9 \times 108 = 972$
300-400 350 10 $|350 - 358| = |-8| = 8$ $10 \times 8 = 80$
400-500 450 7 $|450 - 358| = |92| = 92$ $7 \times 92 = 644$
500-600 550 5 $|550 - 358| = |192| = 192$ $5 \times 192 = 960$
600-700 650 4 $|650 - 358| = |292| = 292$ $4 \times 292 = 1168$
700-800 750 3 $|750 - 358| = |392| = 392$ $3 \times 392 = 1176$
Total $\sum f_i = 50$ $\sum f_i |x_i - 358| = 7896$

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum f_i |x_i - \overline{x}|}{\sum f_i}$

MD($\overline{x}$) = $\frac{7896}{50} = \frac{3948}{25} = 157.92$

The mean deviation about the mean for the given data is $\textsf{₹}$ 157.92.

Question 10.

Height in cms 95-105 105-115 115-125 125-135 135-145 145-155
Number of boys 9 13 26 30 12 10

Answer:

The given data is a grouped frequency distribution of height and number of boys:

Height in cms (Class Interval) Number of boys ($f_i$)
95-105 9
105-115 13
115-125 26
125-135 30
135-145 12
145-155 10

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum f_i$.

Class Interval Frequency ($f_i$) Cumulative Frequency (c.f.)
95-10599
105-11513$9 + 13 = 22$
115-12526$22 + 26 = 48$
125-13530$48 + 30 = 78$
135-14512$78 + 12 = 90$
145-15510$90 + 10 = 100$
Total $N = \sum f_i = 100$

The total number of observations is $N = 100$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{100}{2} = 50^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 50 is 78, which corresponds to the class interval 125-135.

So, the median class is 125-135.

For the median class (125-135):

Lower boundary (L) = 125

Frequency of the median class (f) = 30

Cumulative frequency of the class preceding the median class (c.f.) = 48 (c.f. of 115-125 class)

Class size (h) = $135 - 125 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 125 + \frac{50 - 48}{30} \times 10$

$M = 125 + \frac{2}{30} \times 10$

$M = 125 + \frac{1}{15} \times 10$

$M = 125 + \frac{10}{15} = 125 + \frac{2}{3}$

$M = \frac{125 \times 3 + 2}{3} = \frac{375 + 2}{3} = \frac{377}{3}$

The median of the data is $\frac{377}{3}$.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - \frac{377}{3}|$, and the product $f_i |x_i - \frac{377}{3}|$.

Class Interval $f_i$ Midpoint ($x_i$) $|x_i - \frac{377}{3}|$ $f_i |x_i - \frac{377}{3}|$
95-105 9 100 $|100 - \frac{377}{3}| = |\frac{300 - 377}{3}| = \frac{77}{3}$ $9 \times \frac{77}{3} = 3 \times 77 = 231$
105-115 13 110 $|110 - \frac{377}{3}| = |\frac{330 - 377}{3}| = \frac{47}{3}$ $13 \times \frac{47}{3} = \frac{611}{3}$
115-125 26 120 $|120 - \frac{377}{3}| = |\frac{360 - 377}{3}| = \frac{17}{3}$ $26 \times \frac{17}{3} = \frac{442}{3}$
125-135 30 130 $|130 - \frac{377}{3}| = |\frac{390 - 377}{3}| = \frac{13}{3}$ $30 \times \frac{13}{3} = 10 \times 13 = 130$
135-145 12 140 $|140 - \frac{377}{3}| = |\frac{420 - 377}{3}| = \frac{43}{3}$ $12 \times \frac{43}{3} = 4 \times 43 = 172$
145-155 10 150 $|150 - \frac{377}{3}| = |\frac{450 - 377}{3}| = \frac{73}{3}$ $10 \times \frac{73}{3} = \frac{730}{3}$
Total $\sum f_i = 100$ $\sum f_i |x_i - \frac{377}{3}| = 231 + \frac{611}{3} + \frac{442}{3} + 130 + 172 + \frac{730}{3}$

Sum of $f_i |x_i - \frac{377}{3}| = (231 + 130 + 172) + (\frac{611 + 442 + 730}{3})$

$= 533 + \frac{1783}{3} = \frac{533 \times 3 + 1783}{3} = \frac{1599 + 1783}{3} = \frac{3382}{3}$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{\frac{3382}{3}}{100} = \frac{3382}{3 \times 100} = \frac{3382}{300}$

MD(M) = $\frac{1691}{150}$

MD(M) $\approx 11.2733...$

The mean deviation about the median for the given data is $\frac{1691}{150}$ or approximately 11.27.
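The interpolation step used for the median above can be sketched in Python. This is only an illustrative check: the function name `grouped_median` and the list layout are our own choices, and `Fraction` is used so that the result stays exact.

```python
from fractions import Fraction

def grouped_median(classes, freqs):
    """Median of a continuous grouped distribution by linear interpolation.

    classes: list of (lower, upper) class boundaries
    freqs:   matching list of frequencies
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty distribution")
    half = Fraction(n, 2)
    cum = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= half:
            # M = L + ((N/2 - c.f.) / f) * h
            return lower + (half - cum) / f * (upper - lower)
        cum += f

heights = [(95, 105), (105, 115), (115, 125), (125, 135), (135, 145), (145, 155)]
boys = [9, 13, 26, 30, 12, 10]
median = grouped_median(heights, boys)
print(median)  # 377/3
```

The loop stops at the first class whose cumulative frequency reaches $\frac{N}{2}$, which is the same rule used to pick the median class by hand.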

Question 11. Find the mean deviation about median for the following data :

Marks 0-10 10-20 20-30 30-40 40-50 50-60
Number of Girls 6 8 14 16 4 2

Answer:

The given data is a grouped frequency distribution of marks obtained by girls:

Marks (Class Interval) Number of Girls ($f_i$)
0-10 6
10-20 8
20-30 14
30-40 16
40-50 4
50-60 2

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum f_i$.

Class Interval Frequency ($f_i$) Cumulative Frequency (c.f.)
0-10 6 6
10-20 8 $6 + 8 = 14$
20-30 14 $14 + 14 = 28$
30-40 16 $28 + 16 = 44$
40-50 4 $44 + 4 = 48$
50-60 2 $48 + 2 = 50$
Total $N = \sum f_i = 50$

The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.

So, the median class is 20-30.

For the median class (20-30):

Lower boundary (L) = 20

Frequency of the median class (f) = 14

Cumulative frequency of the class preceding the median class (c.f.) = 14 (c.f. of 10-20 class)

Class size (h) = $30 - 20 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 20 + \frac{25 - 14}{14} \times 10$

$M = 20 + \frac{11}{14} \times 10$

$M = 20 + \frac{110}{14} = 20 + \frac{55}{7}$

$M = \frac{20 \times 7 + 55}{7} = \frac{140 + 55}{7} = \frac{195}{7}$

The median of the data is $\frac{195}{7}$.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - \frac{195}{7}|$, and the product $f_i |x_i - \frac{195}{7}|$.

Note: $\frac{195}{7} \approx 27.857$

Class Interval $f_i$ Midpoint ($x_i$) $|x_i - \frac{195}{7}|$ $f_i |x_i - \frac{195}{7}|$
0-10 6 5 $|5 - \frac{195}{7}| = |\frac{35 - 195}{7}| = \frac{160}{7}$ $6 \times \frac{160}{7} = \frac{960}{7}$
10-20 8 15 $|15 - \frac{195}{7}| = |\frac{105 - 195}{7}| = \frac{90}{7}$ $8 \times \frac{90}{7} = \frac{720}{7}$
20-30 14 25 $|25 - \frac{195}{7}| = |\frac{175 - 195}{7}| = \frac{20}{7}$ $14 \times \frac{20}{7} = 2 \times 20 = 40$
30-40 16 35 $|35 - \frac{195}{7}| = |\frac{245 - 195}{7}| = \frac{50}{7}$ $16 \times \frac{50}{7} = \frac{800}{7}$
40-50 4 45 $|45 - \frac{195}{7}| = |\frac{315 - 195}{7}| = \frac{120}{7}$ $4 \times \frac{120}{7} = \frac{480}{7}$
50-60 2 55 $|55 - \frac{195}{7}| = |\frac{385 - 195}{7}| = \frac{190}{7}$ $2 \times \frac{190}{7} = \frac{380}{7}$
Total $\sum f_i = 50$ $\sum f_i |x_i - \frac{195}{7}| = \frac{960}{7} + \frac{720}{7} + 40 + \frac{800}{7} + \frac{480}{7} + \frac{380}{7}$

Sum of $f_i |x_i - \frac{195}{7}| = \frac{960 + 720 + 800 + 480 + 380}{7} + 40$

$= \frac{3340}{7} + 40 = \frac{3340 + 40 \times 7}{7} = \frac{3340 + 280}{7} = \frac{3620}{7}$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{\frac{3620}{7}}{50} = \frac{3620}{7 \times 50} = \frac{362}{7 \times 5} = \frac{362}{35}$

MD(M) $\approx 10.34$

The mean deviation about the median for the given data is $\frac{362}{35}$ or approximately 10.34.
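As a cross-check of the arithmetic, the weighted mean of the absolute deviations can be computed exactly. The sketch below assumes the median $\frac{195}{7}$ found above; the helper name `mean_deviation` is ours, not part of any library.

```python
from fractions import Fraction

def mean_deviation(values, freqs, centre):
    """Frequency-weighted mean of |x_i - centre|."""
    total = sum(f * abs(x - centre) for x, f in zip(values, freqs))
    return total / sum(freqs)

midpoints = [5, 15, 25, 35, 45, 55]   # class midpoints x_i
girls = [6, 8, 14, 16, 4, 2]          # frequencies f_i
median = Fraction(195, 7)             # median derived in the solution

md = mean_deviation(midpoints, girls, median)
print(md, round(float(md), 2))
```

Passing the mean instead of the median as `centre` would give the mean deviation about the mean with the same function.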

Question 12. Calculate the mean deviation about median age for the age distribution of 100 persons given below:

Age (in years) 16-20 21-25 26-30 31-35 36-40 41-45 46-50 51-55
Number 5 6 12 14 26 12 16 9

[Hint: Convert the given data into continuous frequency distribution by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class interval]

Answer:

The given data has discrete class intervals. As per the hint, we convert it into a continuous frequency distribution by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class interval.

The continuous frequency distribution is:

Age (in years) (Continuous Class Interval) Number ($f_i$)
15.5-20.5 5
20.5-25.5 6
25.5-30.5 12
30.5-35.5 14
35.5-40.5 26
40.5-45.5 12
45.5-50.5 16
50.5-55.5 9

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum f_i$.

Class Interval Frequency ($f_i$) Cumulative Frequency (c.f.)
15.5-20.5 5 5
20.5-25.5 6 $5 + 6 = 11$
25.5-30.5 12 $11 + 12 = 23$
30.5-35.5 14 $23 + 14 = 37$
35.5-40.5 26 $37 + 26 = 63$
40.5-45.5 12 $63 + 12 = 75$
45.5-50.5 16 $75 + 16 = 91$
50.5-55.5 9 $91 + 9 = 100$
Total $N = \sum f_i = 100$

The total number of observations is $N = 100$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{100}{2} = 50^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 50 is 63, which corresponds to the class interval 35.5-40.5.

So, the median class is 35.5-40.5.

For the median class (35.5-40.5):

Lower boundary (L) = 35.5

Frequency of the median class (f) = 26

Cumulative frequency of the class preceding the median class (c.f.) = 37 (c.f. of 30.5-35.5 class)

Class size (h) = $40.5 - 35.5 = 5$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 35.5 + \frac{50 - 37}{26} \times 5$

$M = 35.5 + \frac{13}{26} \times 5$

$M = 35.5 + \frac{1}{2} \times 5$

$M = 35.5 + 2.5$

$M = 38$

The median age is 38 years.


Next, we calculate the midpoints ($x_i$) of each continuous class interval, the absolute deviation from the median $|x_i - 38|$, and the product $f_i |x_i - 38|$.

Class Interval $f_i$ Midpoint ($x_i$) $|x_i - 38|$ $f_i |x_i - 38|$
15.5-20.5 5 $\frac{15.5+20.5}{2} = 18$ $|18 - 38| = |-20| = 20$ $5 \times 20 = 100$
20.5-25.5 6 $\frac{20.5+25.5}{2} = 23$ $|23 - 38| = |-15| = 15$ $6 \times 15 = 90$
25.5-30.5 12 $\frac{25.5+30.5}{2} = 28$ $|28 - 38| = |-10| = 10$ $12 \times 10 = 120$
30.5-35.5 14 $\frac{30.5+35.5}{2} = 33$ $|33 - 38| = |-5| = 5$ $14 \times 5 = 70$
35.5-40.5 26 $\frac{35.5+40.5}{2} = 38$ $|38 - 38| = |0| = 0$ $26 \times 0 = 0$
40.5-45.5 12 $\frac{40.5+45.5}{2} = 43$ $|43 - 38| = |5| = 5$ $12 \times 5 = 60$
45.5-50.5 16 $\frac{45.5+50.5}{2} = 48$ $|48 - 38| = |10| = 10$ $16 \times 10 = 160$
50.5-55.5 9 $\frac{50.5+55.5}{2} = 53$ $|53 - 38| = |15| = 15$ $9 \times 15 = 135$
Total $\sum f_i = 100$ $\sum f_i |x_i - 38| = 735$

Sum of $f_i |x_i - 38| = 100 + 90 + 120 + 70 + 0 + 60 + 160 + 135 = 735$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

MD(M) = $\frac{735}{100} = 7.35$

The mean deviation about the median age for the given data is 7.35 years.
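The continuity correction from the hint can be sketched as follows. The helper `to_continuous` and its `gap` parameter are illustrative names of our own; the median 38 is taken from the working above rather than recomputed.

```python
def to_continuous(classes, gap=1):
    """Widen inclusive classes such as 16-20 by half the gap on each side."""
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in classes]

ages = [(16, 20), (21, 25), (26, 30), (31, 35),
        (36, 40), (41, 45), (46, 50), (51, 55)]
persons = [5, 6, 12, 14, 26, 12, 16, 9]

bounds = to_continuous(ages)                      # (15.5, 20.5), (20.5, 25.5), ...
midpoints = [(lo + hi) / 2 for lo, hi in bounds]  # 18, 23, ..., 53
median = 38                                       # median age found above

# Sum of f_i * |x_i - M|, then divide by the total frequency.
total = sum(f * abs(x - median) for x, f in zip(midpoints, persons))
md = total / sum(persons)
print(bounds[0], total, md)
```

Note that the midpoints are unchanged by the correction (18 is the midpoint of both 16-20 and 15.5-20.5); only the class boundaries and the class size used in the median formula are affected.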



Example 8 to 12 (Before Exercise 15.2)

Example 8: Find the variance of the following data:

6 8 10 12 14 16 18 20 22 24

Answer:

The given data is: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24.

The number of observations is $n = 10$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum x_i}{n}$

Sum of observations ($\sum x_i$) = $6 + 8 + 10 + 12 + 14 + 16 + 18 + 20 + 22 + 24 = 150$

$\overline{x} = \frac{150}{10} = 15$

The mean of the data is 15.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).

$x_i$ $x_i - \overline{x} = x_i - 15$ $(x_i - \overline{x})^2$
6 $6 - 15 = -9$ $(-9)^2 = 81$
8 $8 - 15 = -7$ $(-7)^2 = 49$
10 $10 - 15 = -5$ $(-5)^2 = 25$
12 $12 - 15 = -3$ $(-3)^2 = 9$
14 $14 - 15 = -1$ $(-1)^2 = 1$
16 $16 - 15 = 1$ $1^2 = 1$
18 $18 - 15 = 3$ $3^2 = 9$
20 $20 - 15 = 5$ $5^2 = 25$
22 $22 - 15 = 7$ $7^2 = 49$
24 $24 - 15 = 9$ $9^2 = 81$
Total $\sum (x_i - \overline{x}) = 0$ $\sum (x_i - \overline{x})^2 = 330$

The variance ($\sigma^2$) for ungrouped data is given by the formula:

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n}$

$\sigma^2 = \frac{330}{10}$

$\sigma^2 = 33$

The variance of the given data is 33.
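The same calculation can be reproduced in a few lines of Python. The sketch below computes the variance directly from the definition and compares it with `statistics.pvariance` from the standard library, which also divides by $n$ (the population variance used throughout this chapter) rather than by $n - 1$.

```python
from statistics import mean, pvariance

data = [6, 8, 10, 12, 14, 16, 18, 20, 22, 24]

x_bar = mean(data)
# Population variance: sum of squared deviations divided by n (not n - 1).
variance = sum((x - x_bar) ** 2 for x in data) / len(data)

print(x_bar, variance, pvariance(data))
```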

Example 9: Find the variance and standard deviation for the following data:

$x_i$ 4 8 11 17 20 24 32
$f_i$ 3 5 9 5 4 3 1

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
4 3
8 5
11 9
17 5
20 4
24 3
32 1

To find the variance and standard deviation, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
4 3 $3 \times 4 = 12$
8 5 $5 \times 8 = 40$
11 9 $9 \times 11 = 99$
17 5 $5 \times 17 = 85$
20 4 $4 \times 20 = 80$
24 3 $3 \times 24 = 72$
32 1 $1 \times 32 = 32$
$\sum f_i = 30$ $\sum f_i x_i = 420$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i}$

$\overline{x} = \frac{420}{30} = 14$

The mean of the data is 14.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$), the squared deviations ($(x_i - \overline{x})^2$), and the product $f_i (x_i - \overline{x})^2$.

$x_i$ $f_i$ $x_i - 14$ $(x_i - 14)^2$ $f_i (x_i - 14)^2$
4 3 $4 - 14 = -10$ $(-10)^2 = 100$ $3 \times 100 = 300$
8 5 $8 - 14 = -6$ $(-6)^2 = 36$ $5 \times 36 = 180$
11 9 $11 - 14 = -3$ $(-3)^2 = 9$ $9 \times 9 = 81$
17 5 $17 - 14 = 3$ $3^2 = 9$ $5 \times 9 = 45$
20 4 $20 - 14 = 6$ $6^2 = 36$ $4 \times 36 = 144$
24 3 $24 - 14 = 10$ $10^2 = 100$ $3 \times 100 = 300$
32 1 $32 - 14 = 18$ $18^2 = 324$ $1 \times 324 = 324$
$\sum f_i = 30$ $\sum f_i (x_i - 14)^2 = 1374$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{\sum f_i}$

$\sigma^2 = \frac{1374}{30} = \frac{137.4}{3} = 45.8$

The variance of the data is 45.8.


The standard deviation ($\sigma$) is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{45.8}$

Calculating the square root:

$\sqrt{45.8} \approx 6.76757$

The standard deviation is approximately 6.77.
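For a discrete frequency distribution, each squared deviation is weighted by its frequency. A minimal sketch of this, with a helper name `freq_variance` that is our own, is:

```python
from math import sqrt

def freq_variance(values, freqs):
    """Return (mean, variance) of a discrete frequency distribution."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n
    var = sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / n
    return x_bar, var

x = [4, 8, 11, 17, 20, 24, 32]
f = [3, 5, 9, 5, 4, 3, 1]

x_bar, var = freq_variance(x, f)
sd = sqrt(var)
print(x_bar, var, round(sd, 2))
```

The same function applies to grouped data if the class midpoints are passed as `values`.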

Example 10: Calculate the mean, variance and standard deviation for the following distribution :

Class 30-40 40-50 50-60 60-70 70-80 80-90 90-100
Frequency 3 7 12 15 8 3 2

Answer:

The given data is a grouped frequency distribution:

Class Interval Frequency ($f_i$)
30-40 3
40-50 7
50-60 12
60-70 15
70-80 8
80-90 3
90-100 2

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum f_i$ and the sum $\sum f_i x_i$.

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $f_i x_i$
30-40 3 $\frac{30+40}{2} = 35$ $3 \times 35 = 105$
40-50 7 $\frac{40+50}{2} = 45$ $7 \times 45 = 315$
50-60 12 $\frac{50+60}{2} = 55$ $12 \times 55 = 660$
60-70 15 $\frac{60+70}{2} = 65$ $15 \times 65 = 975$
70-80 8 $\frac{70+80}{2} = 75$ $8 \times 75 = 600$
80-90 3 $\frac{80+90}{2} = 85$ $3 \times 85 = 255$
90-100 2 $\frac{90+100}{2} = 95$ $2 \times 95 = 190$
Total $N = \sum f_i = 50$ $\sum f_i x_i = 3100$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum f_i x_i}{N}$

$\overline{x} = \frac{3100}{50} = 62$

The mean of the distribution is 62.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

Class Interval $x_i$ $f_i$ $x_i - \overline{x} = x_i - 62$ $(x_i - \overline{x})^2$ $f_i (x_i - \overline{x})^2$
30-40 35 3 $35 - 62 = -27$ $(-27)^2 = 729$ $3 \times 729 = 2187$
40-50 45 7 $45 - 62 = -17$ $(-17)^2 = 289$ $7 \times 289 = 2023$
50-60 55 12 $55 - 62 = -7$ $(-7)^2 = 49$ $12 \times 49 = 588$
60-70 65 15 $65 - 62 = 3$ $3^2 = 9$ $15 \times 9 = 135$
70-80 75 8 $75 - 62 = 13$ $13^2 = 169$ $8 \times 169 = 1352$
80-90 85 3 $85 - 62 = 23$ $23^2 = 529$ $3 \times 529 = 1587$
90-100 95 2 $95 - 62 = 33$ $33^2 = 1089$ $2 \times 1089 = 2178$
Total $N = 50$ $\sum f_i (x_i - \overline{x}) = 0$ $\sum f_i (x_i - \overline{x})^2 = 10050$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{10050}{50}$

$\sigma^2 = 201$

The variance of the distribution is 201.


Finally, we calculate the standard deviation ($\sigma$), which is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{201}$

Using a calculator, $\sqrt{201} \approx 14.177$

The standard deviation is approximately 14.18.

Example 11: Find the standard deviation for the following data :

$x_i$ 3 8 13 18 23
$f_i$ 7 10 15 10 6

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
3 7
8 10
13 15
18 10
23 6

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
3 7 $7 \times 3 = 21$
8 10 $10 \times 8 = 80$
13 15 $15 \times 13 = 195$
18 10 $10 \times 18 = 180$
23 6 $6 \times 23 = 138$
$\sum f_i = 48$ $\sum f_i x_i = 614$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{614}{48} = \frac{307}{24}$

The mean of the data is $\frac{307}{24} \approx 12.79$.


Next, we calculate the variance ($\sigma^2$) and standard deviation ($\sigma$).

We can use the formula $\sigma^2 = \frac{1}{N} \sum f_i x_i^2 - (\overline{x})^2$. For this, we need $x_i^2$ and $f_i x_i^2$.

$x_i$ $f_i$ $x_i^2$ $f_i x_i^2$
3 7 $3^2 = 9$ $7 \times 9 = 63$
8 10 $8^2 = 64$ $10 \times 64 = 640$
13 15 $13^2 = 169$ $15 \times 169 = 2535$
18 10 $18^2 = 324$ $10 \times 324 = 3240$
23 6 $23^2 = 529$ $6 \times 529 = 3174$
$N = \sum f_i = 48$ $\sum f_i x_i^2 = 9652$

The variance ($\sigma^2$) is:

$\sigma^2 = \frac{\sum f_i x_i^2}{N} - (\overline{x})^2$

$\sigma^2 = \frac{9652}{48} - \left(\frac{614}{48}\right)^2$

$\sigma^2 = \frac{2413}{12} - \left(\frac{307}{24}\right)^2$

$\sigma^2 = \frac{2413}{12} - \frac{94249}{576}$

To combine the fractions, we find a common denominator, which is 576 ($12 \times 48 = 576$).

$\sigma^2 = \frac{2413 \times 48}{12 \times 48} - \frac{94249}{576}$

$\sigma^2 = \frac{115824}{576} - \frac{94249}{576}$

$\sigma^2 = \frac{115824 - 94249}{576} = \frac{21575}{576}$

The variance is $\frac{21575}{576}$.


The standard deviation ($\sigma$) is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{21575}{576}} = \frac{\sqrt{21575}}{\sqrt{576}} = \frac{\sqrt{21575}}{24}$

Calculating the square root of 21575:

$\sqrt{21575} \approx 146.8843$

$\sigma \approx \frac{146.8843}{24} \approx 6.1202$

The standard deviation is approximately 6.12.
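Because the mean here is not a whole number, the formula $\sigma^2 = \frac{\sum f_i x_i^2}{N} - (\overline{x})^2$ is convenient. The sketch below keeps the arithmetic exact with `Fraction` and only converts to a decimal at the square-root step; the variable names are our own.

```python
from fractions import Fraction
from math import sqrt

x = [3, 8, 13, 18, 23]
f = [7, 10, 15, 10, 6]

n = sum(f)
sum_fx = sum(fi * xi for xi, fi in zip(x, f))        # sum of f_i * x_i
sum_fx2 = sum(fi * xi * xi for xi, fi in zip(x, f))  # sum of f_i * x_i^2

# sigma^2 = (sum f x^2) / N - (sum f x / N)^2, kept exact with Fraction
variance = Fraction(sum_fx2, n) - Fraction(sum_fx, n) ** 2
sd = sqrt(variance)
print(variance, round(sd, 2))
```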

Example 12: Calculate mean, variance and standard deviation for the following distribution.

Classes 30-40 40-50 50-60 60-70 70-80 80-90 90-100
Frequency 3 7 12 15 8 3 2

Answer:

The given data is a grouped frequency distribution:

Class Interval Frequency ($f_i$)
30-40 3
40-50 7
50-60 12
60-70 15
70-80 8
80-90 3
90-100 2

We will use the step-deviation method to calculate the mean, variance, and standard deviation.

First, calculate the midpoints ($x_i$) for each class interval. The class size is $h = 40 - 30 = 10$. Let's take the assumed mean $A = 65$ (midpoint of class 60-70).

Calculate $u_i = \frac{x_i - A}{h} = \frac{x_i - 65}{10}$, and the products $f_i u_i$ and $f_i u_i^2$. We also find the total frequency $N = \sum f_i$.

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $u_i = \frac{x_i - 65}{10}$ $f_i u_i$ $u_i^2$ $f_i u_i^2$
30-40 3 35 $\frac{35 - 65}{10} = -3$ $3 \times (-3) = -9$ $(-3)^2 = 9$ $3 \times 9 = 27$
40-50 7 45 $\frac{45 - 65}{10} = -2$ $7 \times (-2) = -14$ $(-2)^2 = 4$ $7 \times 4 = 28$
50-60 12 55 $\frac{55 - 65}{10} = -1$ $12 \times (-1) = -12$ $(-1)^2 = 1$ $12 \times 1 = 12$
60-70 15 65 $\frac{65 - 65}{10} = 0$ $15 \times 0 = 0$ $0^2 = 0$ $15 \times 0 = 0$
70-80 8 75 $\frac{75 - 65}{10} = 1$ $8 \times 1 = 8$ $1^2 = 1$ $8 \times 1 = 8$
80-90 3 85 $\frac{85 - 65}{10} = 2$ $3 \times 2 = 6$ $2^2 = 4$ $3 \times 4 = 12$
90-100 2 95 $\frac{95 - 65}{10} = 3$ $2 \times 3 = 6$ $3^2 = 9$ $2 \times 9 = 18$
Total $N = \sum f_i = 50$ $\sum f_i u_i = -15$ $\sum f_i u_i^2 = 105$

Mean ($\overline{x}$)

The mean is given by the formula: $\overline{x} = A + \frac{\sum f_i u_i}{N} \times h$

$\overline{x} = 65 + \frac{-15}{50} \times 10$

$\overline{x} = 65 - \frac{150}{50}$

$\overline{x} = 65 - 3 = 62$

The mean of the distribution is 62.


Variance ($\sigma^2$)

The variance is given by the formula: $\sigma^2 = h^2 \left[ \frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2 \right]$

$\sigma^2 = 10^2 \left[ \frac{105}{50} - \left(\frac{-15}{50}\right)^2 \right]$

$\sigma^2 = 100 \left[ 2.1 - \left(-\frac{3}{10}\right)^2 \right]$

$\sigma^2 = 100 \left[ 2.1 - \frac{9}{100} \right]$

$\sigma^2 = 100 [2.1 - 0.09]$

$\sigma^2 = 100 [2.01]$

$\sigma^2 = 201$

The variance of the distribution is 201.


Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{201}$

Using a calculator, $\sqrt{201} \approx 14.1774...$

The standard deviation is approximately 14.18.
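The step-deviation formulas can be sketched as a small Python function. The name `step_deviation` and its parameters are illustrative. The second call uses a different assumed mean ($A = 55$) to suggest that the final mean and variance do not depend on which $A$ is chosen.

```python
def step_deviation(midpoints, freqs, assumed_mean, width):
    """Mean and variance via u_i = (x_i - A) / h."""
    n = sum(freqs)
    u = [(x - assumed_mean) / width for x in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))
    mean = assumed_mean + width * sum_fu / n
    variance = width ** 2 * (sum_fu2 / n - (sum_fu / n) ** 2)
    return mean, variance

mids = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]

mean, variance = step_deviation(mids, freqs, assumed_mean=65, width=10)
mean_alt, variance_alt = step_deviation(mids, freqs, assumed_mean=55, width=10)
print(mean, variance)
print(mean_alt, variance_alt)
```

Floating-point arithmetic may show a tiny rounding error in the last digits; the exact values are 62 and 201.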



Exercise 15.2

Find the mean and variance for each of the data in Exercises 1 to 5.

Question 1.

6 7 10 12 13 4 8 12

Answer:

The given data is: 6, 7, 10, 12, 13, 4, 8, 12.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$

$\overline{x} = \frac{72}{8} = 9$

The mean of the data is 9.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).

$x_i$ $x_i - \overline{x} = x_i - 9$ $(x_i - \overline{x})^2$
6 $6 - 9 = -3$ $(-3)^2 = 9$
7 $7 - 9 = -2$ $(-2)^2 = 4$
10 $10 - 9 = 1$ $1^2 = 1$
12 $12 - 9 = 3$ $3^2 = 9$
13 $13 - 9 = 4$ $4^2 = 16$
4 $4 - 9 = -5$ $(-5)^2 = 25$
8 $8 - 9 = -1$ $(-1)^2 = 1$
12 $12 - 9 = 3$ $3^2 = 9$
Total $\sum (x_i - \overline{x}) = 0$ $\sum (x_i - \overline{x})^2 = 74$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n}$

$\sigma^2 = \frac{74}{8} = \frac{37}{4} = 9.25$

The mean of the data is 9 and the variance is 9.25.

Question 2. First n natural numbers

Answer:

The data consists of the first $n$ natural numbers: $1, 2, 3, \dots, n$.

The number of observations is $n$.


First, we find the mean of the data.

The sum of the first $n$ natural numbers is given by the formula $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum_{i=1}^{n} i}{n}$

$\overline{x} = \frac{\frac{n(n+1)}{2}}{n} = \frac{n(n+1)}{2n}$

$\overline{x} = \frac{n+1}{2}$

The mean of the first $n$ natural numbers is $\frac{n+1}{2}$.


Next, we find the variance ($\sigma^2$).

The variance can be calculated using the formula $\sigma^2 = \frac{\sum_{i=1}^{n} x_i^2}{n} - (\overline{x})^2$.

The sum of the squares of the first $n$ natural numbers is given by the formula $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.

Substituting the values for $\sum x_i^2$ and $\overline{x}$ into the variance formula:

$\sigma^2 = \frac{\frac{n(n+1)(2n+1)}{6}}{n} - \left(\frac{n+1}{2}\right)^2$

$\sigma^2 = \frac{n(n+1)(2n+1)}{6n} - \frac{(n+1)^2}{4}$

$\sigma^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}$

To subtract the fractions, we find a common denominator, which is 12.

$\sigma^2 = \frac{2(n+1)(2n+1)}{12} - \frac{3(n+1)^2}{12}$

$\sigma^2 = \frac{(n+1)[2(2n+1) - 3(n+1)]}{12}$

$\sigma^2 = \frac{(n+1)[4n + 2 - 3n - 3]}{12}$

$\sigma^2 = \frac{(n+1)(n - 1)}{12}$

$\sigma^2 = \frac{n^2 - 1}{12}$

The variance of the first $n$ natural numbers is $\frac{n^2 - 1}{12}$.

The mean is $\frac{n+1}{2}$ and the variance is $\frac{n^2 - 1}{12}$.
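The closed form can be spot-checked numerically for a few values of $n$. This is a sanity check rather than a proof; the function name `variance_first_n` is our own.

```python
from fractions import Fraction

def variance_first_n(n):
    """Exact population variance of 1, 2, ..., n from the definition."""
    data = range(1, n + 1)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / n

# Compare the definition with the closed form (n^2 - 1) / 12.
for n in (1, 2, 5, 10, 100):
    assert variance_first_n(n) == Fraction(n * n - 1, 12)

print(variance_first_n(10))  # 33/4
```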

Question 3. First 10 multiples of 3

Answer:

The data consists of the first 10 multiples of 3: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30.

The number of observations is $n = 10$.


First, we find the mean of the data.

Sum of observations ($\sum x_i$) = $3 + 6 + 9 + 12 + 15 + 18 + 21 + 24 + 27 + 30$

$\sum x_i = 165$

Mean ($\overline{x}$) = $\frac{\sum x_i}{n}$

$\overline{x} = \frac{165}{10} = 16.5$

The mean of the data is 16.5.


Next, we calculate the variance ($\sigma^2$). We will use the formula $\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$.

We calculate the square of each observation ($x_i^2$) and their sum ($\sum x_i^2$).

$x_i$ $x_i^2$
3 $3^2 = 9$
6 $6^2 = 36$
9 $9^2 = 81$
12 $12^2 = 144$
15 $15^2 = 225$
18 $18^2 = 324$
21 $21^2 = 441$
24 $24^2 = 576$
27 $27^2 = 729$
30 $30^2 = 900$
$\sum x_i^2 = 3465$

The mean squared is $(\overline{x})^2 = (16.5)^2 = 272.25$.

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$

$\sigma^2 = \frac{3465}{10} - 272.25$

$\sigma^2 = 346.5 - 272.25$

$\sigma^2 = 74.25$

The mean of the data is 16.5 and the variance is 74.25.
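This result can also be cross-checked against the variance of the first 10 natural numbers, $\frac{10^2 - 1}{12} = 8.25$: multiplying every observation by 3 multiplies the variance by $3^2 = 9$, and $9 \times 8.25 = 74.25$. A short sketch using the standard library's `pvariance`:

```python
from statistics import pvariance

naturals = list(range(1, 11))           # 1, 2, ..., 10
multiples = [3 * k for k in naturals]   # 3, 6, ..., 30

var_naturals = pvariance(naturals)
var_multiples = pvariance(multiples)

# Scaling every observation by 3 scales the variance by 3^2 = 9.
print(var_naturals, var_multiples, 9 * var_naturals)
```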

Question 4.

$x_i$ 6 10 14 18 24 28 30
$f_i$ 2 4 7 12 8 4 3

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
6 2
10 4
14 7
18 12
24 8
28 4
30 3

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $N = \sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
6 2 $2 \times 6 = 12$
10 4 $4 \times 10 = 40$
14 7 $7 \times 14 = 98$
18 12 $12 \times 18 = 216$
24 8 $8 \times 24 = 192$
28 4 $4 \times 28 = 112$
30 3 $3 \times 30 = 90$
$N = \sum f_i = 40$ $\sum f_i x_i = 760$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{N}$

$\overline{x} = \frac{760}{40} = 19$

The mean of the data is 19.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

$x_i$ $f_i$ $x_i - \overline{x} = x_i - 19$ $(x_i - \overline{x})^2$ $f_i (x_i - \overline{x})^2$
6 2 $6 - 19 = -13$ $(-13)^2 = 169$ $2 \times 169 = 338$
10 4 $10 - 19 = -9$ $(-9)^2 = 81$ $4 \times 81 = 324$
14 7 $14 - 19 = -5$ $(-5)^2 = 25$ $7 \times 25 = 175$
18 12 $18 - 19 = -1$ $(-1)^2 = 1$ $12 \times 1 = 12$
24 8 $24 - 19 = 5$ $5^2 = 25$ $8 \times 25 = 200$
28 4 $28 - 19 = 9$ $9^2 = 81$ $4 \times 81 = 324$
30 3 $30 - 19 = 11$ $11^2 = 121$ $3 \times 121 = 363$
Total $N = 40$ $\sum f_i (x_i - \overline{x}) = 0$ $\sum f_i (x_i - \overline{x})^2 = 1736$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{1736}{40}$

$\sigma^2 = \frac{173.6}{4} = 43.4$

The mean of the data is 19 and the variance is 43.4.

Question 5.

$x_i$ 92 93 97 98 102 104 109
$f_i$ 3 2 3 2 6 3 3

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
92 3
93 2
97 3
98 2
102 6
104 3
109 3

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $N = \sum f_i$ and the sum $\sum f_i x_i$.

$x_i$ $f_i$ $f_i x_i$
92 3 $3 \times 92 = 276$
93 2 $2 \times 93 = 186$
97 3 $3 \times 97 = 291$
98 2 $2 \times 98 = 196$
102 6 $6 \times 102 = 612$
104 3 $3 \times 104 = 312$
109 3 $3 \times 109 = 327$
$N = \sum f_i = 22$ $\sum f_i x_i = 2200$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum f_i x_i}{N}$

$\overline{x} = \frac{2200}{22} = 100$

The mean of the data is 100.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

$x_i$ $f_i$ $x_i - \overline{x} = x_i - 100$ $(x_i - \overline{x})^2$ $f_i (x_i - \overline{x})^2$
92 3 $92 - 100 = -8$ $(-8)^2 = 64$ $3 \times 64 = 192$
93 2 $93 - 100 = -7$ $(-7)^2 = 49$ $2 \times 49 = 98$
97 3 $97 - 100 = -3$ $(-3)^2 = 9$ $3 \times 9 = 27$
98 2 $98 - 100 = -2$ $(-2)^2 = 4$ $2 \times 4 = 8$
102 6 $102 - 100 = 2$ $2^2 = 4$ $6 \times 4 = 24$
104 3 $104 - 100 = 4$ $4^2 = 16$ $3 \times 16 = 48$
109 3 $109 - 100 = 9$ $9^2 = 81$ $3 \times 81 = 243$
Total $N = 22$ $\sum f_i (x_i - \overline{x}) = 0$ $\sum f_i (x_i - \overline{x})^2 = 640$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{640}{22} = \frac{320}{11}$

$\sigma^2 \approx 29.09$

The mean of the data is 100 and the variance is $\frac{320}{11}$ or approximately 29.09.

Question 6. Find the mean and standard deviation using short-cut method.

$x_i$ 60 61 62 63 64 65 66 67 68
$f_i$ 2 1 12 29 25 12 10 4 5

Answer:

The given data is a discrete frequency distribution:

$x_i$ $f_i$
60 2
61 1
62 12
63 29
64 25
65 12
66 10
67 4
68 5

We will use the short-cut method to find the mean and standard deviation.

Let the assumed mean be $A = 64$. We calculate the deviations $d_i = x_i - A = x_i - 64$, and the products $f_i d_i$ and $f_i d_i^2$. We also find the total frequency $N = \sum f_i$.

$x_i$ $f_i$ $d_i = x_i - 64$ $f_i d_i$ $d_i^2$ $f_i d_i^2$
60 2 -4 $2 \times (-4) = -8$ 16 $2 \times 16 = 32$
61 1 -3 $1 \times (-3) = -3$ 9 $1 \times 9 = 9$
62 12 -2 $12 \times (-2) = -24$ 4 $12 \times 4 = 48$
63 29 -1 $29 \times (-1) = -29$ 1 $29 \times 1 = 29$
64 25 0 $25 \times 0 = 0$ 0 $25 \times 0 = 0$
65 12 1 $12 \times 1 = 12$ 1 $12 \times 1 = 12$
66 10 2 $10 \times 2 = 20$ 4 $10 \times 4 = 40$
67 4 3 $4 \times 3 = 12$ 9 $4 \times 9 = 36$
68 5 4 $5 \times 4 = 20$ 16 $5 \times 16 = 80$
Total $N = \sum f_i = 100$ $\sum f_i d_i = 0$ $\sum f_i d_i^2 = 286$

Mean ($\overline{x}$)

The mean is given by the formula: $\overline{x} = A + \frac{\sum f_i d_i}{N}$

$\overline{x} = 64 + \frac{0}{100}$

$\overline{x} = 64 + 0 = 64$

The mean of the data is 64.


Variance ($\sigma^2$)

The variance is given by the formula: $\sigma^2 = \frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2$

$\sigma^2 = \frac{286}{100} - \left(\frac{0}{100}\right)^2$

$\sigma^2 = 2.86 - 0^2 = 2.86$

The variance of the data is 2.86.


Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{2.86}$

Using a calculator, $\sqrt{2.86} \approx 1.691$

The standard deviation is approximately 1.691.

Find the mean and variance for the following frequency distributions in Exercises 7 and 8.

Question 7.

Classes 0-30 30-60 60-90 90-120 120-150 150-180 180-210
Frequencies 2 3 5 10 3 5 2

Answer:

The given data is a grouped frequency distribution:

Class Interval Frequency ($f_i$)
0-30 2
30-60 3
60-90 5
90-120 10
120-150 3
150-180 5
180-210 2

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum f_i$ and the sum $\sum f_i x_i$.

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $f_i x_i$
0-30 2 $\frac{0+30}{2} = 15$ $2 \times 15 = 30$
30-60 3 $\frac{30+60}{2} = 45$ $3 \times 45 = 135$
60-90 5 $\frac{60+90}{2} = 75$ $5 \times 75 = 375$
90-120 10 $\frac{90+120}{2} = 105$ $10 \times 105 = 1050$
120-150 3 $\frac{120+150}{2} = 135$ $3 \times 135 = 405$
150-180 5 $\frac{150+180}{2} = 165$ $5 \times 165 = 825$
180-210 2 $\frac{180+210}{2} = 195$ $2 \times 195 = 390$
Total $N = \sum f_i = 30$ $\sum f_i x_i = 3210$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum f_i x_i}{N}$

$\overline{x} = \frac{3210}{30} = \frac{321}{3} = 107$

The mean of the distribution is 107.


Next, we calculate the variance ($\sigma^2$).

We can use the formula $\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$ or $\sigma^2 = \frac{\sum f_i x_i^2}{N} - (\overline{x})^2$. Let's use the first formula by calculating the deviations from the mean $(x_i - \overline{x})$ and their squares.

Class Interval $x_i$ $f_i$ $x_i - \overline{x} = x_i - 107$ $(x_i - \overline{x})^2$ $f_i (x_i - \overline{x})^2$
0-30 15 2 $15 - 107 = -92$ $(-92)^2 = 8464$ $2 \times 8464 = 16928$
30-60 45 3 $45 - 107 = -62$ $(-62)^2 = 3844$ $3 \times 3844 = 11532$
60-90 75 5 $75 - 107 = -32$ $(-32)^2 = 1024$ $5 \times 1024 = 5120$
90-120 105 10 $105 - 107 = -2$ $(-2)^2 = 4$ $10 \times 4 = 40$
120-150 135 3 $135 - 107 = 28$ $28^2 = 784$ $3 \times 784 = 2352$
150-180 165 5 $165 - 107 = 58$ $58^2 = 3364$ $5 \times 3364 = 16820$
180-210 195 2 $195 - 107 = 88$ $88^2 = 7744$ $2 \times 7744 = 15488$
Total $N = 30$ $\sum f_i (x_i - \overline{x}) = 0$ $\sum f_i (x_i - \overline{x})^2 = 68280$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{68280}{30}$

$\sigma^2 = \frac{6828}{3} = 2276$

The mean of the distribution is 107 and the variance is 2276.
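The solution mentions two equivalent variance formulas. A quick way to see that they agree on this data is to evaluate both; the variable names below are our own.

```python
mids = [15, 45, 75, 105, 135, 165, 195]   # class midpoints x_i
freqs = [2, 3, 5, 10, 3, 5, 2]            # frequencies f_i

n = sum(freqs)
mean = sum(f * x for x, f in zip(mids, freqs)) / n

# Formula 1: average squared deviation from the mean.
by_deviations = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n
# Formula 2: mean of the squares minus the square of the mean.
by_squares = sum(f * x * x for x, f in zip(mids, freqs)) / n - mean ** 2

print(mean, by_deviations, by_squares)
```

The second formula avoids computing each deviation, which is helpful when the mean is not a convenient number.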

Question 8.

Classes 0-10 10-20 20-30 30-40 40-50
Frequencies 5 8 15 16 6

Answer:

The given data is a grouped frequency distribution:

Class Interval Frequency ($f_i$)
0-10 5
10-20 8
20-30 15
30-40 16
40-50 6

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum f_i$ and the sum $\sum f_i x_i$.

Class Interval Frequency ($f_i$) Midpoint ($x_i$) $f_i x_i$
0-10 5 $\frac{0+10}{2} = 5$ $5 \times 5 = 25$
10-20 8 $\frac{10+20}{2} = 15$ $8 \times 15 = 120$
20-30 15 $\frac{20+30}{2} = 25$ $15 \times 25 = 375$
30-40 16 $\frac{30+40}{2} = 35$ $16 \times 35 = 560$
40-50 6 $\frac{40+50}{2} = 45$ $6 \times 45 = 270$
Total $N = \sum f_i = 50$ $\sum f_i x_i = 1350$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum f_i x_i}{N}$

$\overline{x} = \frac{1350}{50} = \frac{135}{5} = 27$

The mean of the distribution is 27.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

Class Interval $x_i$ $f_i$ $x_i - \overline{x} = x_i - 27$ $(x_i - \overline{x})^2$ $f_i (x_i - \overline{x})^2$
0-10 5 5 $5 - 27 = -22$ $(-22)^2 = 484$ $5 \times 484 = 2420$
10-20 15 8 $15 - 27 = -12$ $(-12)^2 = 144$ $8 \times 144 = 1152$
20-30 25 15 $25 - 27 = -2$ $(-2)^2 = 4$ $15 \times 4 = 60$
30-40 35 16 $35 - 27 = 8$ $8^2 = 64$ $16 \times 64 = 1024$
40-50 45 6 $45 - 27 = 18$ $18^2 = 324$ $6 \times 324 = 1944$
Total $N = 50$ $\sum f_i (x_i - \overline{x}) = 0$ $\sum f_i (x_i - \overline{x})^2 = 6600$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{6600}{50}$

$\sigma^2 = \frac{660}{5} = 132$

The mean of the distribution is 27 and the variance is 132.

Question 9. Find the mean, variance and standard deviation using short-cut method

Height in cms 70-75 75-80 80-85 85-90 90-95 95-100 100-105 105-110 110-115
No. of children 3 4 7 7 15 9 6 6 3

Answer:

The given data is a grouped frequency distribution of height and number of children:

Height in cms (Class Interval) Number of children ($f_i$)
70-75 3
75-80 4
80-85 7
85-90 7
90-95 15
95-100 9
100-105 6
105-110 6
110-115 3

We will use the short-cut (step-deviation) method to find the mean, variance, and standard deviation.

First, calculate the midpoints ($x_i$) for each class interval. The class size is $h = 75 - 70 = 5$. Let's choose the assumed mean $A = 92.5$ (midpoint of the class 90-95 with the highest frequency).

Calculate $u_i = \frac{x_i - A}{h} = \frac{x_i - 92.5}{5}$, and the products $f_i u_i$ and $f_i u_i^2$. We also find the total frequency $N = \sum f_i$.

Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $u_i = \frac{x_i - 92.5}{5}$ | $f_i u_i$ | $u_i^2$ | $f_i u_i^2$
70-75 | 3 | 72.5 | $\frac{72.5 - 92.5}{5} = -4$ | $3 \times (-4) = -12$ | $(-4)^2 = 16$ | $3 \times 16 = 48$
75-80 | 4 | 77.5 | $\frac{77.5 - 92.5}{5} = -3$ | $4 \times (-3) = -12$ | $(-3)^2 = 9$ | $4 \times 9 = 36$
80-85 | 7 | 82.5 | $\frac{82.5 - 92.5}{5} = -2$ | $7 \times (-2) = -14$ | $(-2)^2 = 4$ | $7 \times 4 = 28$
85-90 | 7 | 87.5 | $\frac{87.5 - 92.5}{5} = -1$ | $7 \times (-1) = -7$ | $(-1)^2 = 1$ | $7 \times 1 = 7$
90-95 | 15 | 92.5 | $\frac{92.5 - 92.5}{5} = 0$ | $15 \times 0 = 0$ | $0^2 = 0$ | $15 \times 0 = 0$
95-100 | 9 | 97.5 | $\frac{97.5 - 92.5}{5} = 1$ | $9 \times 1 = 9$ | $1^2 = 1$ | $9 \times 1 = 9$
100-105 | 6 | 102.5 | $\frac{102.5 - 92.5}{5} = 2$ | $6 \times 2 = 12$ | $2^2 = 4$ | $6 \times 4 = 24$
105-110 | 6 | 107.5 | $\frac{107.5 - 92.5}{5} = 3$ | $6 \times 3 = 18$ | $3^2 = 9$ | $6 \times 9 = 54$
110-115 | 3 | 112.5 | $\frac{112.5 - 92.5}{5} = 4$ | $3 \times 4 = 12$ | $4^2 = 16$ | $3 \times 16 = 48$
Total | $N = \sum f_i = 60$ | | | $\sum f_i u_i = 6$ | | $\sum f_i u_i^2 = 254$

Mean ($\overline{x}$)

The mean is given by the formula: $\overline{x} = A + \frac{\sum f_i u_i}{N} \times h$

$\overline{x} = 92.5 + \frac{6}{60} \times 5$

$\overline{x} = 92.5 + \frac{1}{10} \times 5$

$\overline{x} = 92.5 + 0.5 = 93$

The mean height is 93 cms.


Variance ($\sigma^2$)

The variance is given by the formula: $\sigma^2 = h^2 \left[ \frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2 \right]$

$\sigma^2 = 5^2 \left[ \frac{254}{60} - \left(\frac{6}{60}\right)^2 \right]$

$\sigma^2 = 25 \left[ \frac{127}{30} - \left(\frac{1}{10}\right)^2 \right]$

$\sigma^2 = 25 \left[ \frac{127}{30} - \frac{1}{100} \right]$

To subtract the fractions inside the brackets, find a common denominator (300):

$\sigma^2 = 25 \left[ \frac{127 \times 10}{300} - \frac{1 \times 3}{300} \right]$

$\sigma^2 = 25 \left[ \frac{1270 - 3}{300} \right]$

$\sigma^2 = 25 \left[ \frac{1267}{300} \right]$

$\sigma^2 = \frac{1267}{12}$

$\sigma^2 \approx 105.5833...$

The variance is $\frac{1267}{12}$ or approximately 105.58.


Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1267}{12}}$

$\sigma = \sqrt{105.5833...}$

Using a calculator, $\sqrt{105.5833...} \approx 10.2753...$

The standard deviation is approximately 10.28 cms.
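The step-deviation calculation above can be sketched in Python as follows. This is an illustrative check only: the function name `step_deviation_stats` is our own, and exact fractions are used so that the variance comes out as a rational number rather than a rounded decimal.

```python
from fractions import Fraction

# Lower class limits 70, 75, ..., 110 with class size h = 5.
h = 5
midpoints = [Fraction(2 * lo + h, 2) for lo in range(70, 115, h)]
freqs = [3, 4, 7, 7, 15, 9, 6, 6, 3]

def step_deviation_stats(xs, fs, assumed_mean, h):
    """Return (mean, variance) using u_i = (x_i - A) / h."""
    n = sum(fs)
    us = [(x - assumed_mean) / h for x in xs]
    sum_fu = sum(f * u for f, u in zip(fs, us))
    sum_fu2 = sum(f * u * u for f, u in zip(fs, us))
    mean = assumed_mean + sum_fu / n * h
    variance = h ** 2 * (sum_fu2 / n - (sum_fu / n) ** 2)
    return mean, variance

mean, variance = step_deviation_stats(midpoints, freqs, Fraction(185, 2), h)
std_dev = float(variance) ** 0.5
print(mean, variance, round(std_dev, 2))
```

Any midpoint can serve as the assumed mean $A$; the final mean and variance do not depend on that choice, only the intermediate sums do.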

Question 10. The diameters of circles (in mm) drawn in a design are given below:

Diameters 33-36 37-40 41-44 45-48 49-52
No. of circles 15 17 21 22 25

Calculate the standard deviation and mean diameter of the circles.

[Hint: First make the data continuous by making the classes as 32.5-36.5, 36.5-40.5, 40.5-44.5, 44.5 - 48.5, 48.5 - 52.5 and then proceed.]

Answer:

The given classes are inclusive (discontinuous): consecutive classes are separated by a gap of 1 (for example, between 36 and 37). As per the hint, we convert the data into a continuous frequency distribution by subtracting half the gap, 0.5, from each lower limit and adding 0.5 to each upper limit.

The continuous frequency distribution is:

Diameters (Continuous Class Interval) | Number of circles ($f_i$)
32.5-36.5 | 15
36.5-40.5 | 17
40.5-44.5 | 21
44.5-48.5 | 22
48.5-52.5 | 25

We will use the step-deviation method to find the mean and standard deviation.

First, calculate the midpoints ($x_i$) for each class interval. The class size is $h = 36.5 - 32.5 = 4$. Let's choose the assumed mean $A = 42.5$ (midpoint of the class 40.5-44.5).

Calculate $u_i = \frac{x_i - A}{h} = \frac{x_i - 42.5}{4}$, and the products $f_i u_i$ and $f_i u_i^2$. We also find the total frequency $N = \sum f_i$.

Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $u_i = \frac{x_i - 42.5}{4}$ | $f_i u_i$ | $u_i^2$ | $f_i u_i^2$
32.5-36.5 | 15 | $\frac{32.5+36.5}{2} = 34.5$ | $\frac{34.5 - 42.5}{4} = -2$ | $15 \times (-2) = -30$ | $(-2)^2 = 4$ | $15 \times 4 = 60$
36.5-40.5 | 17 | $\frac{36.5+40.5}{2} = 38.5$ | $\frac{38.5 - 42.5}{4} = -1$ | $17 \times (-1) = -17$ | $(-1)^2 = 1$ | $17 \times 1 = 17$
40.5-44.5 | 21 | $\frac{40.5+44.5}{2} = 42.5$ | $\frac{42.5 - 42.5}{4} = 0$ | $21 \times 0 = 0$ | $0^2 = 0$ | $21 \times 0 = 0$
44.5-48.5 | 22 | $\frac{44.5+48.5}{2} = 46.5$ | $\frac{46.5 - 42.5}{4} = 1$ | $22 \times 1 = 22$ | $1^2 = 1$ | $22 \times 1 = 22$
48.5-52.5 | 25 | $\frac{48.5+52.5}{2} = 50.5$ | $\frac{50.5 - 42.5}{4} = 2$ | $25 \times 2 = 50$ | $2^2 = 4$ | $25 \times 4 = 100$
Total | $N = \sum f_i = 100$ | | | $\sum f_i u_i = 25$ | | $\sum f_i u_i^2 = 199$

Mean ($\overline{x}$)

The mean is given by the formula: $\overline{x} = A + \frac{\sum f_i u_i}{N} \times h$

$\overline{x} = 42.5 + \frac{25}{100} \times 4$

$\overline{x} = 42.5 + \frac{1}{4} \times 4$

$\overline{x} = 42.5 + 1 = 43.5$

The mean diameter of the circles is 43.5 mm.


Variance ($\sigma^2$)

The variance is given by the formula: $\sigma^2 = h^2 \left[ \frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2 \right]$

$\sigma^2 = 4^2 \left[ \frac{199}{100} - \left(\frac{25}{100}\right)^2 \right]$

$\sigma^2 = 16 \left[ 1.99 - \left(\frac{1}{4}\right)^2 \right]$

$\sigma^2 = 16 \left[ 1.99 - \frac{1}{16} \right]$

$\sigma^2 = 16 [1.99 - 0.0625]$

$\sigma^2 = 16 [1.9275]$

$\sigma^2 = 30.84$

The variance of the diameters is 30.84.


Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{30.84}$

Using a calculator, $\sqrt{30.84} \approx 5.5533...$

The standard deviation is approximately 5.55 mm.
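The continuity correction and the resulting statistics can be sketched in Python as below. The variable names are our own, and the sketch uses the direct deviation formula rather than step deviation, so it serves as an independent check of the result.

```python
from math import sqrt

# Inclusive classes as given in the question, with their frequencies.
classes = [(33, 36), (37, 40), (41, 44), (45, 48), (49, 52)]
freqs = [15, 17, 21, 22, 25]

# Gap between consecutive classes (37 - 36 = 1); half of it is moved
# onto each boundary to make the classes continuous.
gap = classes[1][0] - classes[0][1]
continuous = [(lo - gap / 2, hi + gap / 2) for lo, hi in classes]
midpoints = [(lo + hi) / 2 for lo, hi in continuous]

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
std_dev = sqrt(variance)
print(continuous[0], mean, round(variance, 2), round(std_dev, 2))
```

Note that the correction leaves the midpoints unchanged (34.5, 38.5, ...); what it changes is the class size, which becomes 4 instead of 3.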



Example 13 to 15 (Before Exercise 15.3)

Example 13: Two plants A and B of a factory show the following results about the number of workers and the wages paid to them:

A B
No. of workers 5000 6000
Average monthly wages ₹ 2500 ₹ 2500
Variance of distribution of wages 81 100

In which plant, A or B is there greater variability in individual wages?

Answer:

Given information for Plant A and Plant B:

Quantity | Plant A | Plant B
Number of workers ($N$) | 5000 | 6000
Average monthly wages ($\overline{x}$) | $\textsf{₹}$ 2500 | $\textsf{₹}$ 2500
Variance of distribution of wages ($\sigma^2$) | 81 | 100

To compare the variability in individual wages, we need to calculate the Coefficient of Variation (C.V.) for each plant.

The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean (average wages).

First, calculate the standard deviation ($\sigma$) from the variance ($\sigma^2 = \text{Variance}$).

For Plant A:

Standard Deviation ($\sigma_A$) = $\sqrt{\text{Variance}_A} = \sqrt{81} = 9$

Coefficient of Variation ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100$

$C.V._A = \frac{9}{2500} \times 100 = \frac{900}{2500} = \frac{9}{25} = 0.36$

Coefficient of Variation for Plant A is 0.36%.


For Plant B:

Standard Deviation ($\sigma_B$) = $\sqrt{\text{Variance}_B} = \sqrt{100} = 10$

Coefficient of Variation ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100$

$C.V._B = \frac{10}{2500} \times 100 = \frac{1000}{2500} = \frac{10}{25} = 0.4$

Coefficient of Variation for Plant B is 0.4%.


Comparing the Coefficients of Variation:

$C.V._A = 0.36$

$C.V._B = 0.4$

Since $C.V._B > C.V._A$ ($0.4 > 0.36$), there is greater variability in the wages in Plant B compared to Plant A.

Conclusion: There is greater variability in individual wages in Plant B.
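The comparison can be expressed as a small Python sketch. The helper `coefficient_of_variation` is a hypothetical name introduced here for illustration; it simply encodes $C.V. = \frac{\sigma}{\overline{x}} \times 100$.

```python
from math import sqrt

def coefficient_of_variation(mean, variance):
    """Return the C.V. as a percentage: (sigma / mean) * 100."""
    return sqrt(variance) / mean * 100

cv_a = coefficient_of_variation(mean=2500, variance=81)
cv_b = coefficient_of_variation(mean=2500, variance=100)
more_variable = "B" if cv_b > cv_a else "A"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```

Because both plants have the same mean wage, comparing the standard deviations (9 and 10) alone would lead to the same conclusion.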

Example 14: Coefficients of variation of two distributions are 60 and 70, and their standard deviations are 21 and 16, respectively. What are their arithmetic means?

Answer:

The formula for the Coefficient of Variation (C.V.) is:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the arithmetic mean.

We can rearrange this formula to find the arithmetic mean:

$\overline{x} = \frac{\sigma}{C.V.} \times 100$


For the first distribution:

Given: $C.V._1 = 60$, $\sigma_1 = 21$

Arithmetic Mean ($\overline{x}_1$) = $\frac{\sigma_1}{C.V._1} \times 100$

$\overline{x}_1 = \frac{21}{60} \times 100$

$\overline{x}_1 = \frac{7}{20} \times 100$

$\overline{x}_1 = 7 \times 5 = 35$

The arithmetic mean of the first distribution is 35.


For the second distribution:

Given: $C.V._2 = 70$, $\sigma_2 = 16$

Arithmetic Mean ($\overline{x}_2$) = $\frac{\sigma_2}{C.V._2} \times 100$

$\overline{x}_2 = \frac{16}{70} \times 100$

$\overline{x}_2 = \frac{160}{7}$

$\overline{x}_2 \approx 22.857$

The arithmetic mean of the second distribution is $\frac{160}{7}$ or approximately 22.86.
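The rearranged formula $\overline{x} = \frac{\sigma}{C.V.} \times 100$ can be checked with exact fractions. This is a minimal sketch; `mean_from_cv` is our own helper name.

```python
from fractions import Fraction

def mean_from_cv(std_dev, cv_percent):
    """Invert C.V. = (sigma / mean) * 100 to recover the mean."""
    return Fraction(std_dev) * 100 / cv_percent

mean_1 = mean_from_cv(21, 60)
mean_2 = mean_from_cv(16, 70)
print(mean_1, mean_2, round(float(mean_2), 2))
```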

Example 15: The following values are calculated in respect of heights and weights of the students of a section of Class XI :

Measure | Height | Weight
Mean | 162.6 cm | 52.36 kg
Variance | 127.69 cm$^2$ | 23.1361 kg$^2$

Can we say that the weights show greater variation than the heights?

Answer:

The given information about the heights and weights of the students is:

For Height:

Mean ($\overline{x}_{\text{Height}}$) = 162.6 cm

Variance ($\sigma^2_{\text{Height}}$) = 127.69 cm$^2$

For Weight:

Mean ($\overline{x}_{\text{Weight}}$) = 52.36 kg

Variance ($\sigma^2_{\text{Weight}}$) = 23.1361 kg$^2$


To compare the variability of two distributions when they are measured in different units (cm and kg) or have significantly different means, we use the Coefficient of Variation (C.V.). A higher Coefficient of Variation indicates greater relative variability.

The formula for the Coefficient of Variation is given by:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.


First, we need to calculate the standard deviation ($\sigma$) for both height and weight from their respective variances ($\sigma = \sqrt{\sigma^2}$).

For Height:

Standard Deviation ($\sigma_{\text{Height}}$) = $\sqrt{\text{Variance}_{\text{Height}}} = \sqrt{127.69}$

$\sigma_{\text{Height}} = 11.3$ cm

For Weight:

Standard Deviation ($\sigma_{\text{Weight}}$) = $\sqrt{\text{Variance}_{\text{Weight}}} = \sqrt{23.1361}$

$\sigma_{\text{Weight}} = 4.81$ kg


Now, we calculate the Coefficient of Variation for both height and weight.

For Height:

$C.V._{\text{Height}} = \frac{\sigma_{\text{Height}}}{\overline{x}_{\text{Height}}} \times 100$

$C.V._{\text{Height}} = \frac{11.3}{162.6} \times 100$

$C.V._{\text{Height}} \approx 0.069495 \times 100 \approx 6.95\%$

For Weight:

$C.V._{\text{Weight}} = \frac{\sigma_{\text{Weight}}}{\overline{x}_{\text{Weight}}} \times 100$

$C.V._{\text{Weight}} = \frac{4.81}{52.36} \times 100$

$C.V._{\text{Weight}} \approx 0.091864 \times 100 \approx 9.19\%$


Comparing the Coefficients of Variation:

$C.V._{\text{Height}} \approx 6.95\%$

$C.V._{\text{Weight}} \approx 9.19\%$

Since $C.V._{\text{Weight}} > C.V._{\text{Height}}$ ($9.19\% > 6.95\%$), the weights show greater relative variation than the heights.

Conclusion: Yes, the weights show greater variation than the heights because the Coefficient of Variation for weights is greater than that for heights.



Exercise 15.3

Question 1. From the data given below state which group is more variable, A or B?

Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Group A 9 17 32 33 40 10 9
Group B 10 20 30 25 43 15 7

Answer:

To compare the variability of the two groups, A and B, we will calculate the Coefficient of Variation (C.V.) for each group. The group with the higher Coefficient of Variation is considered more variable.

The formula for the Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.

First, we calculate the mean ($\overline{x}$) and standard deviation ($\sigma$) for each group using the grouped frequency distribution method. The classes are continuous. The class size is $h = 20 - 10 = 10$. We calculate the midpoints ($x_i$) for each class.

Midpoints ($x_i$): 15, 25, 35, 45, 55, 65, 75.

We will use the step-deviation method with assumed mean $A = 45$ and $h = 10$.

Let $u_i = \frac{x_i - A}{h} = \frac{x_i - 45}{10}$.

$u_i$ values: -3, -2, -1, 0, 1, 2, 3.

$u_i^2$ values: 9, 4, 1, 0, 1, 4, 9.


For Group A:

Frequencies ($f_{iA}$): 9, 17, 32, 33, 40, 10, 9

Total frequency $N_A = \sum f_{iA} = 9 + 17 + 32 + 33 + 40 + 10 + 9 = 150$

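Class | $x_i$ | $f_{iA}$ | $u_i = \frac{x_i - 45}{10}$ | $f_{iA} u_i$ | $u_i^2$ | $f_{iA} u_i^2$
10-20 | 15 | 9 | $-3$ | $-27$ | 9 | 81
20-30 | 25 | 17 | $-2$ | $-34$ | 4 | 68
30-40 | 35 | 32 | $-1$ | $-32$ | 1 | 32
40-50 | 45 | 33 | 0 | 0 | 0 | 0
50-60 | 55 | 40 | 1 | 40 | 1 | 40
60-70 | 65 | 10 | 2 | 20 | 4 | 40
70-80 | 75 | 9 | 3 | 27 | 9 | 81
Total | | $N_A = 150$ | | $\sum f_{iA} u_i = -6$ | | $\sum f_{iA} u_i^2 = 342$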

Mean for Group A ($\overline{x}_A$) = $A + \frac{\sum f_{iA} u_i}{N_A} \times h = 45 + \frac{-6}{150} \times 10 = 45 - \frac{60}{150} = 45 - 0.4 = 44.6$

Variance for Group A ($\sigma_A^2$) = $h^2 \left[ \frac{\sum f_{iA} u_i^2}{N_A} - \left(\frac{\sum f_{iA} u_i}{N_A}\right)^2 \right] = 10^2 \left[ \frac{342}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.28 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.28 - 0.0016] = 100 [2.2784] = 227.84$

Standard Deviation for Group A ($\sigma_A$) = $\sqrt{227.84} \approx 15.09437$

Coefficient of Variation for Group A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{15.09437}{44.6} \times 100 \approx 33.84\%$


For Group B:

Frequencies ($f_{iB}$): 10, 20, 30, 25, 43, 15, 7

Total frequency $N_B = \sum f_{iB} = 10 + 20 + 30 + 25 + 43 + 15 + 7 = 150$

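Class | $x_i$ | $f_{iB}$ | $u_i = \frac{x_i - 45}{10}$ | $f_{iB} u_i$ | $u_i^2$ | $f_{iB} u_i^2$
10-20 | 15 | 10 | $-3$ | $-30$ | 9 | 90
20-30 | 25 | 20 | $-2$ | $-40$ | 4 | 80
30-40 | 35 | 30 | $-1$ | $-30$ | 1 | 30
40-50 | 45 | 25 | 0 | 0 | 0 | 0
50-60 | 55 | 43 | 1 | 43 | 1 | 43
60-70 | 65 | 15 | 2 | 30 | 4 | 60
70-80 | 75 | 7 | 3 | 21 | 9 | 63
Total | | $N_B = 150$ | | $\sum f_{iB} u_i = -6$ | | $\sum f_{iB} u_i^2 = 366$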

Mean for Group B ($\overline{x}_B$) = $A + \frac{\sum f_{iB} u_i}{N_B} \times h = 45 + \frac{-6}{150} \times 10 = 45 - 0.4 = 44.6$

Variance for Group B ($\sigma_B^2$) = $h^2 \left[ \frac{\sum f_{iB} u_i^2}{N_B} - \left(\frac{\sum f_{iB} u_i}{N_B}\right)^2 \right] = 10^2 \left[ \frac{366}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.44 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.44 - 0.0016] = 100 [2.4384] = 243.84$

Standard Deviation for Group B ($\sigma_B$) = $\sqrt{243.84} \approx 15.61537$

Coefficient of Variation for Group B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{15.61537}{44.6} \times 100 \approx 35.01\%$


Comparing the Coefficients of Variation:

$C.V._A \approx 33.84\%$

$C.V._B \approx 35.01\%$

Since the Coefficient of Variation for Group B ($35.01\%$) is greater than the Coefficient of Variation for Group A ($33.84\%$), Group B is more variable.

Conclusion: Group B is more variable than Group A.
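The two-group comparison can be reproduced with a short Python sketch. It uses the direct formula on the midpoints instead of step deviation, so it is an independent check; the function name `grouped_stats` is our own.

```python
from math import sqrt

midpoints = [15, 25, 35, 45, 55, 65, 75]
group_a = [9, 17, 32, 33, 40, 10, 9]
group_b = [10, 20, 30, 25, 43, 15, 7]

def grouped_stats(xs, fs):
    """Return (mean, variance, C.V. in percent) for grouped data."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, variance, sqrt(variance) / mean * 100

mean_a, var_a, cv_a = grouped_stats(midpoints, group_a)
mean_b, var_b, cv_b = grouped_stats(midpoints, group_b)
print(round(cv_a, 2), round(cv_b, 2))
```

Since both groups turn out to have the same mean (44.6), the group with the larger variance is automatically the one with the larger C.V.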

Question 2. From the prices of shares X and Y below, find out which is more stable in value:

X 35 54 52 53 56 58 52 50 51 49
Y 108 107 105 105 106 107 104 103 104 101

Answer:

Given:

Prices of shares X: 35, 54, 52, 53, 56, 58, 52, 50, 51, 49

Prices of shares Y: 108, 107, 105, 105, 106, 107, 104, 103, 104, 101

To Find: Which share is more stable in value.


Solution:

To compare the variability (stability) of two sets of data with different means, we use the Coefficient of Variation (C.V.). A lower C.V. indicates less variability and thus greater stability.

The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.


Calculations for Share X:

Number of observations, $n_X = 10$.

Sum of prices: $\sum x_i = 35 + 54 + 52 + 53 + 56 + 58 + 52 + 50 + 51 + 49 = 510$

Mean ($\overline{x}_X$) = $\frac{\sum x_i}{n_X} = \frac{510}{10} = 51$

The mean price for Share X is 51.

Calculate the deviations from the mean ($x_i - \overline{x}_X$) and their squares $(x_i - \overline{x}_X)^2$:

$x_i$ | $x_i - 51$ | $(x_i - 51)^2$
35 | $35 - 51 = -16$ | $(-16)^2 = 256$
54 | $54 - 51 = 3$ | $3^2 = 9$
52 | $52 - 51 = 1$ | $1^2 = 1$
53 | $53 - 51 = 2$ | $2^2 = 4$
56 | $56 - 51 = 5$ | $5^2 = 25$
58 | $58 - 51 = 7$ | $7^2 = 49$
52 | $52 - 51 = 1$ | $1^2 = 1$
50 | $50 - 51 = -1$ | $(-1)^2 = 1$
51 | $51 - 51 = 0$ | $0^2 = 0$
49 | $49 - 51 = -2$ | $(-2)^2 = 4$
Total | $\sum (x_i - 51) = 0$ | $\sum (x_i - 51)^2 = 350$

Variance ($\sigma_X^2$) = $\frac{\sum (x_i - \overline{x}_X)^2}{n_X} = \frac{350}{10} = 35$

Standard Deviation ($\sigma_X$) = $\sqrt{\sigma_X^2} = \sqrt{35} \approx 5.916$

Coefficient of Variation ($C.V._X$) = $\frac{\sigma_X}{\overline{x}_X} \times 100 = \frac{\sqrt{35}}{51} \times 100 \approx \frac{5.916}{51} \times 100 \approx 11.60\%$


Calculations for Share Y:

Number of observations, $n_Y = 10$.

Sum of prices: $\sum y_i = 108 + 107 + 105 + 105 + 106 + 107 + 104 + 103 + 104 + 101 = 1050$

Mean ($\overline{x}_Y$) = $\frac{\sum y_i}{n_Y} = \frac{1050}{10} = 105$

The mean price for Share Y is 105.

Calculate the deviations from the mean ($y_i - \overline{x}_Y$) and their squares $(y_i - \overline{x}_Y)^2$:

$y_i$ | $y_i - 105$ | $(y_i - 105)^2$
108 | $108 - 105 = 3$ | $3^2 = 9$
107 | $107 - 105 = 2$ | $2^2 = 4$
105 | $105 - 105 = 0$ | $0^2 = 0$
105 | $105 - 105 = 0$ | $0^2 = 0$
106 | $106 - 105 = 1$ | $1^2 = 1$
107 | $107 - 105 = 2$ | $2^2 = 4$
104 | $104 - 105 = -1$ | $(-1)^2 = 1$
103 | $103 - 105 = -2$ | $(-2)^2 = 4$
104 | $104 - 105 = -1$ | $(-1)^2 = 1$
101 | $101 - 105 = -4$ | $(-4)^2 = 16$
Total | $\sum (y_i - 105) = 0$ | $\sum (y_i - 105)^2 = 40$

Variance ($\sigma_Y^2$) = $\frac{\sum (y_i - \overline{x}_Y)^2}{n_Y} = \frac{40}{10} = 4$

Standard Deviation ($\sigma_Y$) = $\sqrt{\sigma_Y^2} = \sqrt{4} = 2$

Coefficient of Variation ($C.V._Y$) = $\frac{\sigma_Y}{\overline{x}_Y} \times 100 = \frac{2}{105} \times 100 = \frac{200}{105} = \frac{40}{21} \approx 1.90\%$


Comparison of Coefficients of Variation:

$C.V._X \approx 11.60\%$

$C.V._Y \approx 1.90\%$

Since $C.V._Y < C.V._X$ ($1.90\% < 11.60\%$), the prices of Share Y show less relative variability than the prices of Share X. Therefore, Share Y is more stable in value.

Conclusion: Share Y is more stable in value.
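For raw (ungrouped) data like this, Python's standard `statistics` module already provides the population standard deviation as `pstdev`. The sketch below uses it to recompute both coefficients of variation; `cv_percent` is our own helper name.

```python
from statistics import mean, pstdev

share_x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
share_y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

def cv_percent(data):
    """Coefficient of variation in percent, using the population SD."""
    return pstdev(data) / mean(data) * 100

cv_x = cv_percent(share_x)
cv_y = cv_percent(share_y)
more_stable = "Y" if cv_y < cv_x else "X"
print(round(cv_x, 1), round(cv_y, 1), more_stable)
```

`pstdev` divides by $n$, which matches the variance formula used in this chapter; `stdev` would divide by $n - 1$ and give slightly larger values.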

Question 3. An analysis of monthly wages paid to workers in two firms A and B, belonging to the same industry, gives the following results:

(i) Which firm A or B pays larger amount as monthly wages?

(ii) Which firm, A or B, shows greater variability in individual wages?

Firm A Firm B
No. of wage earners 586 648
Mean of monthly wages 5253 5253
Variance of the distribution of wages 100 121

Answer:

Given:

Information about two firms A and B:

Quantity | Firm A | Firm B
No. of wage earners ($N$) | 586 | 648
Mean of monthly wages ($\overline{x}$) | $\textsf{₹}$ 5253 | $\textsf{₹}$ 5253
Variance of the distribution of wages ($\sigma^2$) | 100 | 121

To Find:

(i) Which firm pays a larger total amount as monthly wages.

(ii) Which firm shows greater variability in individual wages.


Solution (i): Larger amount as monthly wages

The total amount paid as monthly wages by a firm is the product of the number of wage earners and the mean monthly wage.

Total monthly wages paid by Firm A = (No. of wage earners in A) $\times$ (Mean monthly wages in A)

Total monthly wages paid by Firm A = $586 \times \textsf{₹}\,5253$

Total monthly wages paid by Firm A = $\textsf{₹}\,3078258$

Total monthly wages paid by Firm B = (No. of wage earners in B) $\times$ (Mean monthly wages in B)

Total monthly wages paid by Firm B = $648 \times \textsf{₹}\,5253$

Total monthly wages paid by Firm B = $\textsf{₹}\,3403944$

Comparing the total monthly wages:

$\textsf{₹}\,3403944 > \textsf{₹}\,3078258$

Therefore, Firm B pays a larger amount as monthly wages.

Answer (i): Firm B pays a larger amount as monthly wages.


Solution (ii): Greater variability in individual wages

Variability in individual wages can be compared using the standard deviation or the Coefficient of Variation (C.V.).

Standard deviation ($\sigma$) is the square root of the variance ($\sigma^2$).

For Firm A:

Variance ($\sigma_A^2$) = 100

Standard Deviation ($\sigma_A$) = $\sqrt{100} = 10$

For Firm B:

Variance ($\sigma_B^2$) = 121

Standard Deviation ($\sigma_B$) = $\sqrt{121} = 11$

Since the mean wages are the same for both firms ($\textsf{₹}$ 5253), we can directly compare the standard deviations. A larger standard deviation indicates greater variability.

Comparing standard deviations: $\sigma_B = 11$ and $\sigma_A = 10$.

Since $\sigma_B > \sigma_A$, Firm B shows greater variability in individual wages.

Alternatively, we can calculate the Coefficient of Variation ($C.V. = \frac{\sigma}{\overline{x}} \times 100$). A higher C.V. indicates greater variability.

For Firm A:

$C.V._A = \frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{10}{5253} \times 100 \approx 0.19037\%$

For Firm B:

$C.V._B = \frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{11}{5253} \times 100 \approx 0.20940\%$

Since $C.V._B > C.V._A$, Firm B shows greater variability in individual wages.

Answer (ii): Firm B shows greater variability in individual wages.

Question 4. The following is the record of goals scored by team A in a football session:

No. of goals scored 0 1 2 3 4
No. of matches 1 9 7 5 3

For the team B, mean number of goals scored per match was 2 with a standard deviation 1.25 goals. Find which team may be considered more consistent?

Answer:

Given:

Data for Team A goals scored:

No. of goals scored ($x_i$) | No. of matches ($f_i$)
0 | 1
1 | 9
2 | 7
3 | 5
4 | 3

Data for Team B:

Mean number of goals scored ($\overline{x}_B$) = 2

Standard deviation ($\sigma_B$) = 1.25


To Find: Which team is more consistent.


Solution:

Consistency is the opposite of variability: a lower Coefficient of Variation (C.V.) indicates lower variability and thus higher consistency. We will calculate the C.V. for both teams and compare them.

The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$.


Calculations for Team A:

We need to calculate the mean ($\overline{x}_A$) and standard deviation ($\sigma_A$) from the frequency distribution.

Total number of matches ($N_A$) = $\sum f_i = 1 + 9 + 7 + 5 + 3 = 25$

Calculate $\sum f_i x_i$:

$x_i$ | $f_i$ | $f_i x_i$
0 | 1 | $1 \times 0 = 0$
1 | 9 | $9 \times 1 = 9$
2 | 7 | $7 \times 2 = 14$
3 | 5 | $5 \times 3 = 15$
4 | 3 | $3 \times 4 = 12$
Total | $N_A = 25$ | $\sum f_i x_i = 50$

Mean ($\overline{x}_A$) = $\frac{\sum f_i x_i}{N_A} = \frac{50}{25} = 2$

The mean number of goals scored per match for Team A is 2.

Calculate variance ($\sigma_A^2$). We use the formula $\sigma_A^2 = \frac{\sum f_i x_i^2}{N_A} - (\overline{x}_A)^2$.

Calculate $\sum f_i x_i^2$:

$x_i$ | $f_i$ | $x_i^2$ | $f_i x_i^2$
0 | 1 | 0 | $1 \times 0 = 0$
1 | 9 | 1 | $9 \times 1 = 9$
2 | 7 | 4 | $7 \times 4 = 28$
3 | 5 | 9 | $5 \times 9 = 45$
4 | 3 | 16 | $3 \times 16 = 48$
Total | $N_A = 25$ | | $\sum f_i x_i^2 = 130$

Variance ($\sigma_A^2$) = $\frac{130}{25} - (2)^2 = 5.2 - 4 = 1.2$

Standard Deviation ($\sigma_A$) = $\sqrt{1.2} \approx 1.0954$

Coefficient of Variation for Team A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{1.0954}{2} \times 100 \approx 54.77\%$


Calculations for Team B:

Mean ($\overline{x}_B$) = 2

Standard Deviation ($\sigma_B$) = 1.25

Coefficient of Variation for Team B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{1.25}{2} \times 100 = 0.625 \times 100 = 62.5\%$


Comparison of Coefficients of Variation:

$C.V._A \approx 54.77\%$

$C.V._B = 62.5\%$

Since $C.V._A < C.V._B$ ($54.77\% < 62.5\%$), Team A has lower relative variability in the number of goals scored per match compared to Team B. Therefore, Team A is more consistent.

Conclusion: Team A may be considered more consistent.
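The shortcut formula $\sigma^2 = \frac{\sum f_i x_i^2}{N} - (\overline{x})^2$ used for Team A can be sketched in Python with exact fractions, which avoids rounding in the subtraction. Variable names here are our own.

```python
from fractions import Fraction

goals = [0, 1, 2, 3, 4]
matches = [1, 9, 7, 5, 3]

n = sum(matches)
mean_a = Fraction(sum(f * x for x, f in zip(goals, matches)), n)
var_a = Fraction(sum(f * x * x for x, f in zip(goals, matches)), n) - mean_a ** 2

cv_a = float(var_a) ** 0.5 / float(mean_a) * 100
cv_b = 1.25 / 2 * 100  # Team B: mean 2, standard deviation 1.25 (given)
more_consistent = "A" if cv_a < cv_b else "B"
print(mean_a, var_a, round(cv_a, 2), cv_b, more_consistent)
```

Both teams have a mean of 2 goals per match, so comparing the standard deviations directly (about 1.095 against 1.25) gives the same answer.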

Question 5. The sum and sum of squares corresponding to length x (in cm) and weight y (in gm) of 50 plant products are given below:

$\sum\limits_{i=1}^{50} x_i = 212 \;,\; \sum\limits_{i=1}^{50} x_i^2 = 902.8 \ ,$ $ \sum\limits_{i=1}^{50} y_i = 261 \;,\; \sum\limits_{i=1}^{50} y_i^2 = 1457.6$

Which is more varying, the length or weight?

Answer:

Given:

Number of plant products, $n = 50$.

Sum of lengths: $\sum\limits_{i=1}^{50} x_i = 212$ cm

Sum of squares of lengths: $\sum\limits_{i=1}^{50} x_i^2 = 902.8$ cm$^2$

Sum of weights: $\sum\limits_{i=1}^{50} y_i = 261$ gm

Sum of squares of weights: $\sum\limits_{i=1}^{50} y_i^2 = 1457.6$ gm$^2$


To Find: Which is more varying, the length or weight.


Solution:

To compare the variability of two distributions that are measured in different units (cm and gm), we calculate the Coefficient of Variation (C.V.) for each distribution. The distribution with the higher C.V. is considered more varying.

The Coefficient of Variation is given by the formula:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.

The standard deviation is the square root of the variance ($\sigma = \sqrt{\sigma^2}$). The variance is calculated as $\sigma^2 = \frac{\sum z_i^2}{n} - (\overline{z})^2$, where $z$ represents the variable (either $x$ or $y$) and $\overline{z} = \frac{\sum z_i}{n}$.


Calculations for Length (x):

Mean of length ($\overline{x}_x$) = $\frac{\sum x_i}{n} = \frac{212}{50} = 4.24$ cm

Variance of length ($\sigma_x^2$) = $\frac{\sum x_i^2}{n} - (\overline{x}_x)^2$

$\sigma_x^2 = \frac{902.8}{50} - (4.24)^2$

$\sigma_x^2 = 18.056 - 17.9776 = 0.0784$ cm$^2$

Standard Deviation of length ($\sigma_x$) = $\sqrt{0.0784} = 0.28$ cm

Coefficient of Variation for length ($C.V._x$) = $\frac{\sigma_x}{\overline{x}_x} \times 100$

$C.V._x = \frac{0.28}{4.24} \times 100 = \frac{28}{424} \times 100 = \frac{7}{106} \times 100 = \frac{700}{106} = \frac{350}{53} \approx 6.60\%$


Calculations for Weight (y):

Mean of weight ($\overline{x}_y$) = $\frac{\sum y_i}{n} = \frac{261}{50} = 5.22$ gm

Variance of weight ($\sigma_y^2$) = $\frac{\sum y_i^2}{n} - (\overline{x}_y)^2$

$\sigma_y^2 = \frac{1457.6}{50} - (5.22)^2$

$\sigma_y^2 = 29.152 - 27.2484 = 1.9036$ gm$^2$

Standard Deviation of weight ($\sigma_y$) = $\sqrt{1.9036} \approx 1.3797$ gm

Coefficient of Variation for weight ($C.V._y$) = $\frac{\sigma_y}{\overline{x}_y} \times 100$

$C.V._y \approx \frac{1.3797}{5.22} \times 100 \approx 26.43\%$


Comparison of Coefficients of Variation:

$C.V._x \approx 6.60\%$

$C.V._y \approx 26.43\%$

Since $C.V._y > C.V._x$ ($26.43\% > 6.60\%$), the weight shows greater relative variability than the length.

Conclusion: The weight is more varying than the length.
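When only $n$, $\sum z_i$ and $\sum z_i^2$ are known, the C.V. can still be computed without the raw data. The following Python sketch applies the formulas from this solution; `cv_from_sums` is a hypothetical helper name.

```python
from math import sqrt

def cv_from_sums(n, total, total_sq):
    """C.V. in percent from the count, sum, and sum of squares."""
    mean = total / n
    variance = total_sq / n - mean ** 2
    return sqrt(variance) / mean * 100

cv_length = cv_from_sums(50, 212, 902.8)
cv_weight = cv_from_sums(50, 261, 1457.6)
more_varying = "weight" if cv_weight > cv_length else "length"
print(round(cv_length, 1), round(cv_weight, 1), more_varying)
```

The subtraction $\frac{\sum z_i^2}{n} - \overline{z}^2$ involves two nearly equal numbers, so with floating-point arithmetic the last few digits of the variance should not be trusted; the comparison itself is unaffected here.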



Example 16 to 19 - Miscellaneous Examples

Example 16: The variance of 20 observations is 5. If each observation is multiplied by 2, find the new variance of the resulting observations

Answer:

Given:

Number of observations, $n = 20$.

Variance of the original observations ($\sigma^2$) = 5.


Let the original observations be $x_1, x_2, \dots, x_{20}$.

The mean of the original observations is $\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$.

The variance of the original observations is given by:

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2$

We are given $\sigma^2 = 5$.


Each observation is multiplied by 2. Let the new observations be $y_i$.

$y_i = 2x_i$, for $i = 1, 2, \dots, 20$.

The number of new observations is still $n = 20$.


Let the mean of the new observations be $\overline{y}$.

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_i = \frac{1}{20} \sum_{i=1}^{20} (2x_i)$

$\overline{y} = \frac{2}{20} \sum_{i=1}^{20} x_i = 2 \left(\frac{1}{20} \sum_{i=1}^{20} x_i\right)$

$\overline{y} = 2\overline{x}$

The new mean is twice the original mean.


The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \overline{y})^2$

Substitute $y_i = 2x_i$ and $\overline{y} = 2\overline{x}$ into the formula:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum_{i=1}^{20} (2x_i - 2\overline{x})^2$

Factor out 2 from the term inside the square:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum_{i=1}^{20} (2(x_i - \overline{x}))^2$

Square the term $2(x_i - \overline{x})$:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum_{i=1}^{20} 4(x_i - \overline{x})^2$

Factor out the constant 4 from the summation:

$\sigma_{\text{new}}^2 = 4 \left( \frac{1}{20} \sum_{i=1}^{20} (x_i - \overline{x})^2 \right)$

The expression in the parenthesis is the original variance, $\sigma^2$.

$\sigma_{\text{new}}^2 = 4 \times \sigma^2$


Substitute the given value of the original variance ($\sigma^2 = 5$):

$\sigma_{\text{new}}^2 = 4 \times 5$

$\sigma_{\text{new}}^2 = 20$

The new variance of the resulting observations is 20.
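The result $\sigma_{\text{new}}^2 = 4\sigma^2$ holds for any data set, not just the 20 observations in the question. As a numerical illustration, the Python sketch below doubles a small sample; the sample values are hypothetical and chosen only for the demonstration.

```python
from statistics import pvariance

sample = [3, 7, 8, 12, 15]          # hypothetical observations
scaled = [2 * x for x in sample]    # each observation multiplied by 2

original_var = pvariance(sample)
scaled_var = pvariance(scaled)
ratio = scaled_var / original_var
print(original_var, scaled_var, ratio)
```

More generally, multiplying every observation by a constant $k$ multiplies the variance by $k^2$ and the standard deviation by $|k|$.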

Example 17: The mean of 5 observations is 4.4 and their variance is 8.24. If three of the observations are 1, 2 and 6, find the other two observations.

Answer:

Given:

Number of observations, $n = 5$.

Mean of observations ($\overline{x}$) = 4.4

Variance of observations ($\sigma^2$) = 8.24

Three of the observations are 1, 2, and 6.


To Find: The other two observations.


Solution:

Let the five observations be $x_1, x_2, x_3, x_4, x_5$.

We are given $x_1 = 1$, $x_2 = 2$, $x_3 = 6$. Let the other two observations be $a$ and $b$. So, the observations are 1, 2, 6, $a$, $b$.


The mean of the observations is given by $\overline{x} = \frac{\sum x_i}{n}$.

$\sum x_i = 1 + 2 + 6 + a + b = 9 + a + b$

We are given $\overline{x} = 4.4$ and $n = 5$.

$4.4 = \frac{9 + a + b}{5}$

Multiply both sides by 5:

$4.4 \times 5 = 9 + a + b$

$22 = 9 + a + b$

$a + b = 22 - 9$

$a + b = 13$ ... (i)


The variance of the observations is given by the formula $\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$.

We need to calculate the sum of the squares of the observations, $\sum x_i^2$.

$\sum x_i^2 = 1^2 + 2^2 + 6^2 + a^2 + b^2$

$\sum x_i^2 = 1 + 4 + 36 + a^2 + b^2 = 41 + a^2 + b^2$

We are given $\sigma^2 = 8.24$, $\overline{x} = 4.4$, and $n = 5$.

$8.24 = \frac{41 + a^2 + b^2}{5} - (4.4)^2$

Calculate $(4.4)^2$: $4.4 \times 4.4 = 19.36$

$8.24 = \frac{41 + a^2 + b^2}{5} - 19.36$

Add 19.36 to both sides:

$8.24 + 19.36 = \frac{41 + a^2 + b^2}{5}$

$27.60 = \frac{41 + a^2 + b^2}{5}$

Multiply both sides by 5:

$27.60 \times 5 = 41 + a^2 + b^2$

$138 = 41 + a^2 + b^2$

$a^2 + b^2 = 138 - 41$

$a^2 + b^2 = 97$ ... (ii)


Now we have a system of two equations with two variables $a$ and $b$:

From (i): $a + b = 13$. We can express $b$ in terms of $a$: $b = 13 - a$.

Substitute this expression for $b$ into equation (ii):

$a^2 + (13 - a)^2 = 97$

Expand $(13 - a)^2$ using the formula $(p - q)^2 = p^2 - 2pq + q^2$:

$a^2 + (13^2 - 2 \times 13 \times a + a^2) = 97$

$a^2 + 169 - 26a + a^2 = 97$

Combine like terms:

$2a^2 - 26a + 169 - 97 = 0$

$2a^2 - 26a + 72 = 0$

Divide the entire equation by 2:

$a^2 - 13a + 36 = 0$


This is a quadratic equation in $a$. We can solve it by factoring. We need two numbers that multiply to 36 and add up to -13. These numbers are -4 and -9.

So, we can factor the quadratic equation as:

$(a - 4)(a - 9) = 0$

This gives two possible values for $a$:

$a - 4 = 0 \implies a = 4$

or

$a - 9 = 0 \implies a = 9$


Case 1: If $a = 4$, substitute this into equation (i) to find $b$:

$4 + b = 13 \implies b = 13 - 4 = 9$

In this case, the other two observations are 4 and 9.

Case 2: If $a = 9$, substitute this into equation (i) to find $b$:

$9 + b = 13 \implies b = 13 - 9 = 4$

In this case, the other two observations are 9 and 4.

Both cases give the same pair of numbers for the remaining observations, just in a different order.

Let's verify these values using equation (ii): $a^2 + b^2 = 97$.

If $a=4$ and $b=9$, then $4^2 + 9^2 = 16 + 81 = 97$. This is correct.

If $a=9$ and $b=4$, then $9^2 + 4^2 = 81 + 16 = 97$. This is also correct.

The other two observations are 4 and 9.

The five observations are 1, 2, 6, 4, 9.

Let's check the mean: $\frac{1+2+6+4+9}{5} = \frac{22}{5} = 4.4$ (Correct).

Let's check the variance: $\sigma^2 = \frac{\sum x_i^2}{n} - \overline{x}^2 = \frac{1^2+2^2+6^2+4^2+9^2}{5} - (4.4)^2 = \frac{1+4+36+16+81}{5} - 19.36 = \frac{138}{5} - 19.36 = 27.6 - 19.36 = 8.24$ (Correct).

The other two observations are 4 and 9.
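
The substitution steps above can also be carried out mechanically. The following is a minimal Python sketch rather than part of the textbook solution. The helper name `missing_pair` is hypothetical, and it assumes that exactly two observations are missing and that the variance uses the divisor $n$. The decimal inputs are passed as strings so that `Fraction` keeps them exact.

```python
from fractions import Fraction
from math import isqrt


def missing_pair(known, n, mean, variance):
    """Return the two missing observations (smaller first) as Fractions.

    Hypothetical helper: assumes two unknowns and population variance.
    """
    mean, variance = Fraction(mean), Fraction(variance)
    s = n * mean - sum(known)                                    # a + b
    q = n * (variance + mean ** 2) - sum(x * x for x in known)   # a^2 + b^2
    p = (s * s - q) / 2                                          # ab
    disc = s * s - 4 * p          # discriminant of t^2 - s t + p = 0
    if disc < 0:
        raise ValueError("no real solution")
    num, den = isqrt(disc.numerator), isqrt(disc.denominator)
    if num * num != disc.numerator or den * den != disc.denominator:
        raise ValueError("solutions are irrational")
    root = Fraction(num, den)
    return (s - root) / 2, (s + root) / 2


pair = missing_pair([1, 2, 6], 5, "4.4", "8.24")
print(*pair)  # 4 9
```

Here `s`, `q` and `p` correspond to $a + b = 13$, $a^2 + b^2 = 97$ and $ab = 36$, so the quadratic solved is the same $a^2 - 13a + 36 = 0$ obtained by hand.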

Example 18: If each of the observations $x_1, x_2, \dots, x_n$ is increased by $a$, where $a$ is a negative or positive number, show that the variance remains unchanged.

Answer:

Given:

A set of $n$ observations: $x_1, x_2, \dots, x_n$.

Each observation is increased by a constant 'a', where 'a' is a negative or positive number, resulting in new observations $y_1, y_2, \dots, y_n$.


To Show:

The variance of the new observations remains unchanged compared to the variance of the original observations.


Proof:

Let the original observations be $x_1, x_2, \dots, x_n$.

The number of observations is $n$.

The mean of the original observations is:

$\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

The variance of the original observations is:

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2$

Now, each observation is increased by 'a'. The new observations are $y_i = x_i + a$ for $i = 1, 2, \dots, n$.

The number of new observations is still $n$.

Let the mean of the new observations be $\overline{y}$.

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$

Substitute $y_i = x_i + a$:

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} (x_i + a)$

Using the property of summation $\sum (z_i + k) = \sum z_i + \sum k$:

$\overline{y} = \frac{1}{n} \left( \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} a \right)$

Since 'a' is a constant, $\sum_{i=1}^{n} a = n \cdot a$.

$\overline{y} = \frac{1}{n} \left( \sum_{i=1}^{n} x_i + na \right)$

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} x_i + \frac{na}{n}$

$\overline{y} = \overline{x} + a$

... (i)

The mean of the new observations is the original mean increased by 'a'.

The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \overline{y})^2$

Substitute $y_i = x_i + a$ and $\overline{y} = \overline{x} + a$ into the formula:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum_{i=1}^{n} ((x_i + a) - (\overline{x} + a))^2$

Simplify the term inside the square:

$(x_i + a) - (\overline{x} + a) = x_i + a - \overline{x} - a = x_i - \overline{x}$

So, the difference between each new observation and the new mean is equal to the difference between the corresponding original observation and the original mean.

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2$

This expression is exactly the formula for the original variance $\sigma^2$.

$\sigma_{\text{new}}^2 = \sigma^2$

Thus, the variance of the resulting observations is the same as the variance of the original observations.

Conclusion: When each observation is increased by a constant 'a', the variance remains unchanged.
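
A numerical spot check can make the result concrete. The sketch below is only an illustration under stated assumptions: the data set and the shift values are arbitrary choices, not taken from the problem, and `mean` and `variance` are small helper functions written for this example using the divisor $n$.

```python
from fractions import Fraction


def mean(xs):
    # Arithmetic mean as an exact fraction.
    return Fraction(sum(xs), len(xs))


def variance(xs):
    # Population variance: average squared deviation from the mean.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


data = [1, 2, 6, 4, 9]        # arbitrary illustrative observations
for a in (-7, 3):             # one negative and one positive shift
    shifted = [x + a for x in data]
    print(a, mean(shifted) - mean(data), variance(shifted) == variance(data))
```

Each printed row shows the shift, the change in the mean (equal to the shift) and `True` for the unchanged variance. A check on one data set does not replace the proof; it only illustrates it.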

Example 19: The mean and standard deviation of 100 observations were calculated as 40 and 5.1, respectively by a student who took by mistake 50 instead of 40 for one observation. What are the correct mean and standard deviation?

Answer:

Given:

Number of observations, $n = 100$.

Incorrect mean ($\overline{x}_{\text{incorrect}}$) = 40.

Incorrect standard deviation ($\sigma_{\text{incorrect}}$) = 5.1.

Incorrect observation used = 50.

Correct observation = 40.


To Find: The correct mean and standard deviation.


Solution:

First, we find the incorrect sum of observations using the incorrect mean.

Incorrect mean ($\overline{x}_{\text{incorrect}}$) = $\frac{\text{Incorrect sum of observations}}{\text{Number of observations}}$

Incorrect sum of observations ($\sum x_{\text{incorrect}}$) = $\overline{x}_{\text{incorrect}} \times n$

$\sum x_{\text{incorrect}} = 40 \times 100 = 4000$


Now, we find the correct sum of observations.

Correct sum of observations ($\sum x_{\text{correct}}$) = Incorrect sum - Incorrect observation + Correct observation

$\sum x_{\text{correct}} = 4000 - 50 + 40 = 4000 - 10 = 3990$


Calculate the correct mean ($\overline{x}_{\text{correct}}$).

$\overline{x}_{\text{correct}} = \frac{\text{Correct sum of observations}}{\text{Number of observations}}$

$\overline{x}_{\text{correct}} = \frac{3990}{100} = 39.9$

The correct mean is 39.9.


Next, we find the correct standard deviation. We first need the incorrect variance.

Incorrect variance ($\sigma_{\text{incorrect}}^2$) = $(\sigma_{\text{incorrect}})^2 = (5.1)^2 = 26.01$

The variance formula $\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$ can be rearranged to find the incorrect sum of squares.

Incorrect variance ($\sigma_{\text{incorrect}}^2$) = $\frac{\text{Incorrect sum of squares}}{\text{Number of observations}} - (\text{Incorrect mean})^2$

Incorrect sum of squares ($\sum x^2_{\text{incorrect}}$) = $n \times (\sigma_{\text{incorrect}}^2 + \overline{x}_{\text{incorrect}}^2)$

$\sum x^2_{\text{incorrect}} = 100 \times (26.01 + (40)^2)$

$\sum x^2_{\text{incorrect}} = 100 \times (26.01 + 1600)$

$\sum x^2_{\text{incorrect}} = 100 \times 1626.01 = 162601$


Now, we find the correct sum of squares.

Correct sum of squares ($\sum x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$ + (Correct observation)$^2$

$\sum x^2_{\text{correct}} = 162601 - (50)^2 + (40)^2$

$\sum x^2_{\text{correct}} = 162601 - 2500 + 1600$

$\sum x^2_{\text{correct}} = 162601 - 900 = 161701$


Calculate the correct variance ($\sigma_{\text{correct}}^2$).

$\sigma_{\text{correct}}^2 = \frac{\text{Correct sum of squares}}{\text{Number of observations}} - (\text{Correct mean})^2$

$\sigma_{\text{correct}}^2 = \frac{161701}{100} - (39.9)^2$

$\sigma_{\text{correct}}^2 = 1617.01 - (39.9 \times 39.9)$

Calculate $39.9 \times 39.9$:

$\begin{array}{rrrrrr} & & & 3 & 9 & 9 \\ \times & & & 3 & 9 & 9 \\ \hline & & 3 & 5 & 9 & 1 \\ & 3 & 5 & 9 & 1 & \times \\ 1 & 1 & 9 & 7 & \times & \times \\ \hline 1 & 5 & 9 & 2 & 0 & 1 \\ \hline \end{array}$

So, $(39.9)^2 = 1592.01$

$\sigma_{\text{correct}}^2 = 1617.01 - 1592.01 = 25.00$

The correct variance is 25.


Calculate the correct standard deviation ($\sigma_{\text{correct}}$).

$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{25} = 5$

Since the standard deviation is defined as the non-negative square root of the variance, the correct standard deviation is 5.

The correct mean is 39.9 and the correct standard deviation is 5.
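
The correction procedure used above (recover the sums, swap the wrong value for the right one, recompute) can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed method; the function name `correct_after_replacement` is hypothetical, and the variance is assumed to use the divisor $n$.

```python
from fractions import Fraction
from math import sqrt


def correct_after_replacement(n, mean, sd, wrong, right):
    """Return (corrected mean, corrected variance) as exact fractions."""
    mean, sd = Fraction(mean), Fraction(sd)
    total = n * mean - wrong + right                         # corrected sum of x
    total_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2  # corrected sum of x^2
    new_mean = total / n
    new_var = total_sq / n - new_mean ** 2
    return new_mean, new_var


m, v = correct_after_replacement(100, 40, "5.1", 50, 40)
print(float(m), sqrt(v))  # 39.9 5.0
```

Passing the standard deviation as the string `"5.1"` keeps the arithmetic exact, so the corrected variance comes out as exactly 25.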



Miscellaneous Exercise On Chapter 15

Question 1. The mean and variance of eight observations are 9 and 9.25, respectively. If six of the observations are 6, 7, 10, 12, 12 and 13, find the remaining two observations.

Answer:

Given:

Number of observations, $n = 8$.

Mean of observations ($\overline{x}$) = 9.

Variance of observations ($\sigma^2$) = 9.25.

Six of the observations are 6, 7, 10, 12, 12, and 13.


To Find: The remaining two observations.


Solution:

Let the two remaining observations be $a$ and $b$. The eight observations are 6, 7, 10, 12, 12, 13, $a$, and $b$.

The mean of the observations is given by the formula:

$\overline{x} = \frac{\sum x_i}{n}$

The sum of the eight observations is:

$\sum x_i = 6 + 7 + 10 + 12 + 12 + 13 + a + b = 60 + a + b$

We are given $\overline{x} = 9$ and $n = 8$. Substitute these values into the mean formula:

$9 = \frac{60 + a + b}{8}$

Multiply both sides by 8:

$9 \times 8 = 60 + a + b$

$72 = 60 + a + b$

Subtract 60 from both sides:

$a + b = 12$

... (i)


The variance of the observations is given by the formula:

$\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$

The sum of the squares of the eight observations is:

$\sum x_i^2 = 6^2 + 7^2 + 10^2 + 12^2 + 12^2 + 13^2 + a^2 + b^2$

$\sum x_i^2 = 36 + 49 + 100 + 144 + 144 + 169 + a^2 + b^2$

$\sum x_i^2 = 642 + a^2 + b^2$

We are given $\sigma^2 = 9.25$ and $\overline{x} = 9$. Substitute these values into the variance formula:

$9.25 = \frac{642 + a^2 + b^2}{8} - (9)^2$

$9.25 = \frac{642 + a^2 + b^2}{8} - 81$

Add 81 to both sides:

$9.25 + 81 = \frac{642 + a^2 + b^2}{8}$

$90.25 = \frac{642 + a^2 + b^2}{8}$

Multiply both sides by 8:

$90.25 \times 8 = 642 + a^2 + b^2$

$722 = 642 + a^2 + b^2$

Subtract 642 from both sides:

$a^2 + b^2 = 80$

... (ii)


Now we have a system of two equations with two variables $a$ and $b$:

(i) $a + b = 12$

(ii) $a^2 + b^2 = 80$

From equation (i), we can express $b$ in terms of $a$: $b = 12 - a$.

Substitute this expression for $b$ into equation (ii):

$a^2 + (12 - a)^2 = 80$

Expand $(12 - a)^2$:

$a^2 + (144 - 24a + a^2) = 80$

Combine like terms:

$2a^2 - 24a + 144 = 80$

Subtract 80 from both sides:

$2a^2 - 24a + 144 - 80 = 0$

$2a^2 - 24a + 64 = 0$

Divide the entire equation by 2:

$a^2 - 12a + 32 = 0$


This is a quadratic equation in $a$. We can factor this equation. We look for two numbers that multiply to 32 and add up to -12. These numbers are -4 and -8.

So, we can factor the quadratic equation as:

$(a - 4)(a - 8) = 0$

This gives two possible values for $a$:

$a - 4 = 0 \implies a = 4$

or

$a - 8 = 0 \implies a = 8$


Case 1: If $a = 4$, substitute this into equation (i) to find $b$:

$4 + b = 12 \implies b = 12 - 4 = 8$

In this case, the other two observations are 4 and 8.

Case 2: If $a = 8$, substitute this into equation (i) to find $b$:

$8 + b = 12 \implies b = 12 - 8 = 4$

In this case, the other two observations are 8 and 4.

Both cases result in the same pair of numbers for the remaining observations.

Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 80$.

If $a=4$ and $b=8$, then $4^2 + 8^2 = 16 + 64 = 80$. This is correct.

The remaining two observations are 4 and 8.
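
The verification above only checks equation (ii). A fuller check recomputes the mean and variance of all eight observations. The short Python sketch below does this with exact fractions; it assumes the population variance (divisor $n$) used throughout this chapter.

```python
from fractions import Fraction

# The six given observations followed by the two values found above.
observations = [6, 7, 10, 12, 12, 13, 4, 8]
n = len(observations)

mean = Fraction(sum(observations), n)
variance = Fraction(sum(x * x for x in observations), n) - mean ** 2

print(mean, variance)  # 9 37/4
```

The fraction $\frac{37}{4}$ equals 9.25, matching the given variance.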

Question 2. The mean and variance of 7 observations are 8 and 16, respectively. If five of the observations are 2, 4, 10, 12, 14, find the remaining two observations.

Answer:

Given:

Number of observations, $n = 7$.

Mean of observations ($\overline{x}$) = 8.

Variance of observations ($\sigma^2$) = 16.

Five of the observations are 2, 4, 10, 12, and 14.


To Find: The remaining two observations.


Solution:

Let the two remaining observations be $a$ and $b$. The seven observations are 2, 4, 10, 12, 14, $a$, and $b$.

The mean of the observations is given by the formula:

$\overline{x} = \frac{\sum x_i}{n}$

The sum of the seven observations is:

$\sum x_i = 2 + 4 + 10 + 12 + 14 + a + b = 42 + a + b$

We are given $\overline{x} = 8$ and $n = 7$. Substitute these values into the mean formula:

$8 = \frac{42 + a + b}{7}$

Multiply both sides by 7:

$8 \times 7 = 42 + a + b$

$56 = 42 + a + b$

Subtract 42 from both sides:

$a + b = 14$

... (i)


The variance of the observations is given by the formula:

$\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$

The sum of the squares of the seven observations is:

$\sum x_i^2 = 2^2 + 4^2 + 10^2 + 12^2 + 14^2 + a^2 + b^2$

$\sum x_i^2 = 4 + 16 + 100 + 144 + 196 + a^2 + b^2$

$\sum x_i^2 = 460 + a^2 + b^2$

We are given $\sigma^2 = 16$ and $\overline{x} = 8$. Substitute these values into the variance formula:

$16 = \frac{460 + a^2 + b^2}{7} - (8)^2$

$16 = \frac{460 + a^2 + b^2}{7} - 64$

Add 64 to both sides:

$16 + 64 = \frac{460 + a^2 + b^2}{7}$

$80 = \frac{460 + a^2 + b^2}{7}$

Multiply both sides by 7:

$80 \times 7 = 460 + a^2 + b^2$

$560 = 460 + a^2 + b^2$

Subtract 460 from both sides:

$a^2 + b^2 = 100$

... (ii)


Now we have a system of two equations with two variables $a$ and $b$:

(i) $a + b = 14$

(ii) $a^2 + b^2 = 100$

From equation (i), we can express $b$ in terms of $a$: $b = 14 - a$.

Substitute this expression for $b$ into equation (ii):

$a^2 + (14 - a)^2 = 100$

Expand $(14 - a)^2$ using the formula $(p - q)^2 = p^2 - 2pq + q^2$:

$a^2 + (14^2 - 2 \times 14 \times a + a^2) = 100$

$a^2 + 196 - 28a + a^2 = 100$

Combine like terms:

$2a^2 - 28a + 196 = 100$

Subtract 100 from both sides:

$2a^2 - 28a + 196 - 100 = 0$

$2a^2 - 28a + 96 = 0$

Divide the entire equation by 2:

$a^2 - 14a + 48 = 0$


This is a quadratic equation in $a$. We can solve it by factoring. We look for two numbers that multiply to 48 and add up to -14. These numbers are -6 and -8.

So, we can factor the quadratic equation as:

$(a - 6)(a - 8) = 0$

This gives two possible values for $a$:

$a - 6 = 0 \implies a = 6$

or

$a - 8 = 0 \implies a = 8$


Case 1: If $a = 6$, substitute this into equation (i) to find $b$:

$6 + b = 14 \implies b = 14 - 6 = 8$

In this case, the other two observations are 6 and 8.

Case 2: If $a = 8$, substitute this into equation (i) to find $b$:

$8 + b = 14 \implies b = 14 - 8 = 6$

In this case, the other two observations are 8 and 6.

Both cases result in the same pair of numbers for the remaining observations.

Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 100$.

If $a=6$ and $b=8$, then $6^2 + 8^2 = 36 + 64 = 100$. This is correct.

The remaining two observations are 6 and 8.
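
As an alternative cross-check, the pair can be found by a simple search instead of factoring. The sketch below assumes the missing observations are non-negative integers, which the question does not state; it is an illustration only, and the variable names are chosen for this example.

```python
known = [2, 4, 10, 12, 14]
n, mean, variance = 7, 8, 16

# Targets implied by the mean and variance formulas.
target_sum = n * mean - sum(known)                                   # a + b
target_sq = n * (variance + mean ** 2) - sum(x * x for x in known)   # a^2 + b^2

# Try every integer a with a <= b and a + b = target_sum.
pairs = [
    (a, target_sum - a)
    for a in range(target_sum // 2 + 1)
    if a * a + (target_sum - a) ** 2 == target_sq
]
print(pairs)  # [(6, 8)]
```

Only one pair survives the search, which agrees with the factored solution.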

Question 3. The mean and standard deviation of six observations are 8 and 4, respectively. If each observation is multiplied by 3, find the new mean and new standard deviation of the resulting observations.

Answer:

Given:

Number of observations, $n = 6$.

Mean of original observations ($\overline{x}_{\text{original}}$) = 8.

Standard deviation of original observations ($\sigma_{\text{original}}$) = 4.

Each observation is multiplied by 3.


To Find:

The new mean and new standard deviation.


Solution:

Let the original observations be $x_1, x_2, \dots, x_6$.

The mean of the original observations is $\overline{x}_{\text{original}} = \frac{1}{6} \sum\limits_{i=1}^{6} x_i = 8$.

The standard deviation of the original observations is $\sigma_{\text{original}} = \sqrt{\frac{\sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2}{6}} = 4$.

The new observations $y_i$ are obtained by multiplying each $x_i$ by a constant $k=3$. So, $y_i = 3x_i$ for $i = 1, 2, \dots, 6$.


New Mean:

The new mean ($\overline{y}_{\text{new}}$) is given by:

$\overline{y}_{\text{new}} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$

Substitute $y_i = 3x_i$ and $n=6$:

$\overline{y}_{\text{new}} = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i)$

Using the property of summation $\sum k z_i = k \sum z_i$:

$\overline{y}_{\text{new}} = 3 \left( \frac{1}{6} \sum\limits_{i=1}^{6} x_i \right)$

The term in the parenthesis is the original mean $\overline{x}_{\text{original}}$.

$\overline{y}_{\text{new}} = 3 \times \overline{x}_{\text{original}}$

Substitute the given value of $\overline{x}_{\text{original}} = 8$:

$\overline{y}_{\text{new}} = 3 \times 8 = 24$

The new mean is 24.


New Standard Deviation:

The variance of the original observations is $\sigma_{\text{original}}^2 = (\sigma_{\text{original}})^2 = 4^2 = 16$.

The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y}_{\text{new}})^2$

Substitute $y_i = 3x_i$ and $\overline{y}_{\text{new}} = 3\overline{x}_{\text{original}}$:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i - 3\overline{x}_{\text{original}})^2$

Factor out 3 from the term inside the square:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3(x_i - \overline{x}_{\text{original}}))^2$

Square the term $3(x_i - \overline{x}_{\text{original}})$:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} 9(x_i - \overline{x}_{\text{original}})^2$

Factor out the constant 9 from the summation:

$\sigma_{\text{new}}^2 = 9 \left( \frac{1}{6} \sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2 \right)$

The expression in the parenthesis is the original variance $\sigma_{\text{original}}^2$.

$\sigma_{\text{new}}^2 = 9 \times \sigma_{\text{original}}^2$

Substitute the value of $\sigma_{\text{original}}^2 = 16$:

$\sigma_{\text{new}}^2 = 9 \times 16 = 144$

The new variance is 144.

The new standard deviation ($\sigma_{\text{new}}$) is the square root of the new variance:

$\sigma_{\text{new}} = \sqrt{\sigma_{\text{new}}^2} = \sqrt{144} = 12$

Alternate Method using Property:

If each observation $x_i$ is multiplied by a constant $k$, the new mean is $\overline{y} = k\overline{x}$ and the new standard deviation is $\sigma_y = |k|\sigma_x$.

Here, $k=3$.

New mean = $3 \times \text{Original Mean} = 3 \times 8 = 24$.

New standard deviation = $|3| \times \text{Original Standard Deviation} = 3 \times 4 = 12$.

Both methods yield the same result.

The new mean of the resulting observations is 24 and the new standard deviation is 12.
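
The question does not list the six observations, so the sketch below uses a hypothetical data set chosen to have mean 8 and standard deviation 4. It is only a numerical illustration of the scaling result, with `mean` and `std` written as small helpers for this example.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def std(xs):
    # Population standard deviation (divisor n).
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


original = [4, 4, 4, 12, 12, 12]      # hypothetical data: mean 8, SD 4
scaled = [3 * x for x in original]    # each observation multiplied by 3

print(mean(original), std(original))  # 8.0 4.0
print(mean(scaled), std(scaled))      # 24.0 12.0
```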

Question 4. Given that $\overline{x}$ is the mean and $\sigma^2$ is the variance of $n$ observations $x_1, x_2, \dots, x_n$. Prove that the mean and variance of the observations $ax_1, ax_2, ax_3, \dots, ax_n$ are $a\overline{x}$ and $a^2 \sigma^2$, respectively ($a \neq 0$).

Answer:

Given:

A set of $n$ observations: $x_1, x_2, \dots, x_n$.

Mean of these observations = $\overline{x}$.

Variance of these observations = $\sigma^2$.

A new set of observations is created by multiplying each original observation by a non-zero constant 'a', resulting in $y_1 = ax_1, y_2 = ax_2, \dots, y_n = ax_n$.


To Prove:

The mean of the new observations is $a\overline{x}$.

The variance of the new observations is $a^2 \sigma^2$.


Proof for the Mean:

The mean of the original observations is defined as:

$\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Let the mean of the new observations be $\overline{y}$. By definition:

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$

Substitute $y_i = ax_i$:

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} (ax_i)$

Using the property of summation $\sum k z_i = k \sum z_i$:

$\overline{y} = a \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)$

The expression in the parenthesis is the original mean $\overline{x}$.

$\overline{y} = a\overline{x}$

Thus, the mean of the observations $ax_1, ax_2, \dots, ax_n$ is $a\overline{x}$.


Proof for the Variance:

The variance of the original observations is defined as:

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2$

Let the variance of the new observations be $\sigma_y^2$. By definition:

$\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \overline{y})^2$

Substitute $y_i = ax_i$ and the new mean $\overline{y} = a\overline{x}$ (proved above):

$\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} (ax_i - a\overline{x})^2$

Factor out 'a' from the term inside the square:

$\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} (a(x_i - \overline{x}))^2$

Square the term $a(x_i - \overline{x})$:

$\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} a^2 (x_i - \overline{x})^2$

Since $a^2$ is a constant, we can factor it out of the summation:

$\sigma_y^2 = a^2 \left( \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2 \right)$

The expression in the parenthesis is the original variance $\sigma^2$.

$\sigma_y^2 = a^2 \sigma^2$

Thus, the variance of the observations $ax_1, ax_2, \dots, ax_n$ is $a^2 \sigma^2$.

We have shown that the mean and variance of the observations $ax_1, ax_2, \dots, ax_n$ are $a\overline{x}$ and $a^2 \sigma^2$, respectively, given that $a \neq 0$.
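
The identities can be spot-checked for a negative and a fractional multiplier. The sketch below is illustrative only: the observations and the values of $a$ are arbitrary assumptions, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def mean(xs):
    return sum(xs) / Fraction(len(xs))


def variance(xs):
    # Population variance (divisor n), computed exactly.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


xs = [2, 4, 10, 12, 14]                 # arbitrary illustrative observations
for a in (Fraction(-5, 2), 3):          # a negative fraction and a positive integer
    ys = [a * x for x in xs]
    assert mean(ys) == a * mean(xs)             # mean scales by a
    assert variance(ys) == a ** 2 * variance(xs)  # variance scales by a^2
print("scaling identities hold")
```

Because the variance scales by $a^2$, the sign of $a$ does not matter for the spread, which is why the standard deviation scales by $|a|$.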

Question 5. The mean and standard deviation of 20 observations are found to be 10 and 2, respectively. On rechecking, it was found that an observation 8 was incorrect. Calculate the correct mean and standard deviation in each of the following cases:

(i) If wrong item is omitted.

(ii) If it is replaced by 12.

Answer:

Given:

Incorrect number of observations ($n_{\text{incorrect}}$) = 20.

Incorrect mean ($\overline{x}_{\text{incorrect}}$) = 10.

Incorrect standard deviation ($\sigma_{\text{incorrect}}$) = 2.

Incorrect observation recorded = 8.


To Find: The correct mean and standard deviation for two cases.


Solution:

From the incorrect mean, we can find the incorrect sum of observations:

$\overline{x}_{\text{incorrect}} = \frac{\sum x_{\text{incorrect}}}{n_{\text{incorrect}}}$

... (A)

$\sum x_{\text{incorrect}} = \overline{x}_{\text{incorrect}} \times n_{\text{incorrect}}$

$\sum x_{\text{incorrect}} = 10 \times 20 = 200$

The incorrect sum of observations is 200.

From the incorrect standard deviation, we can find the incorrect variance:

$\sigma_{\text{incorrect}}^2 = (\sigma_{\text{incorrect}})^2 = 2^2 = 4$

The formula for variance is $\sigma^2 = \frac{\sum x_i^2}{n} - (\overline{x})^2$.

Using this, we find the incorrect sum of squares:

$\sigma_{\text{incorrect}}^2 = \frac{\sum x^2_{\text{incorrect}}}{n_{\text{incorrect}}} - (\overline{x}_{\text{incorrect}})^2$

... (B)

$4 = \frac{\sum x^2_{\text{incorrect}}}{20} - (10)^2$

$4 = \frac{\sum x^2_{\text{incorrect}}}{20} - 100$

$\frac{\sum x^2_{\text{incorrect}}}{20} = 100 + 4 = 104$

$\sum x^2_{\text{incorrect}} = 104 \times 20 = 2080$

The incorrect sum of squares is 2080.


Case (i) If wrong item is omitted:

The incorrect observation (8) is removed from the data.

New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} - 1 = 20 - 1 = 19$

Correct sum of observations ($\sum x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation

$\sum x_{\text{correct}} = 200 - 8 = 192$

Calculate the correct mean:

$\overline{x}_{\text{correct}} = \frac{\sum x_{\text{correct}}}{n_{\text{new}}}$

$\overline{x}_{\text{correct}} = \frac{192}{19}$

The correct mean is $\frac{192}{19}$.

Correct sum of squares ($\sum x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$

$\sum x^2_{\text{correct}} = 2080 - 8^2 = 2080 - 64 = 2016$

Calculate the correct variance:

$\sigma_{\text{correct}}^2 = \frac{\sum x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$

$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \left(\frac{192}{19}\right)^2$

$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \frac{36864}{361}$

To combine the fractions, find a common denominator (361):

$\sigma_{\text{correct}}^2 = \frac{2016 \times 19}{361} - \frac{36864}{361} = \frac{38304}{361} - \frac{36864}{361} = \frac{1440}{361}$

Calculate the correct standard deviation:

$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{\frac{1440}{361}} = \frac{\sqrt{144 \times 10}}{\sqrt{361}} = \frac{12\sqrt{10}}{19}$

Using $\sqrt{10} \approx 3.162$: $\sigma_{\text{correct}} \approx \frac{12 \times 3.162}{19} \approx \frac{37.944}{19} \approx 1.997$

The correct mean is $\frac{192}{19} \approx 10.11$ and the correct standard deviation is $\frac{12\sqrt{10}}{19} \approx 1.997$ if the wrong item is omitted.


Case (ii) If it is replaced by 12:

The incorrect observation (8) is replaced by the correct observation (12).

New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} = 20$ (since one item is replaced, the number of observations remains the same).

Correct sum of observations ($\sum x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation + Correct observation

$\sum x_{\text{correct}} = 200 - 8 + 12 = 204$

Calculate the correct mean:

$\overline{x}_{\text{correct}} = \frac{\sum x_{\text{correct}}}{n_{\text{new}}}$

$\overline{x}_{\text{correct}} = \frac{204}{20} = 10.2$

The correct mean is 10.2.

Correct sum of squares ($\sum x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$ + (Correct observation)$^2$

$\sum x^2_{\text{correct}} = 2080 - 8^2 + 12^2 = 2080 - 64 + 144 = 2080 + 80 = 2160$

Calculate the correct variance:

$\sigma_{\text{correct}}^2 = \frac{\sum x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$

$\sigma_{\text{correct}}^2 = \frac{2160}{20} - (10.2)^2$

$\sigma_{\text{correct}}^2 = 108 - 104.04$

$\sigma_{\text{correct}}^2 = 3.96$

Calculate the correct standard deviation:

$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{3.96}$

Using a calculator, $\sqrt{3.96} \approx 1.98997$

The correct standard deviation is approximately 1.99.

The correct mean is 10.2 and the correct standard deviation is $\sqrt{3.96} \approx 1.99$ if the wrong item is replaced by 12.
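
Both cases follow the same pattern: recover $\sum x$ and $\sum x^2$ from the reported summary, adjust them, and recompute. A minimal Python sketch of that pattern is given below; the helper names `summary_to_sums` and `mean_and_sd` are hypothetical, and the divisor $n$ is assumed for the variance.

```python
from fractions import Fraction
from math import sqrt


def summary_to_sums(n, mean, sd):
    # Recover sum of x and sum of x^2 from n, mean and standard deviation.
    return n * mean, n * (sd ** 2 + mean ** 2)


def mean_and_sd(n, total, total_sq):
    # Exact mean, and standard deviation as a float.
    mean = Fraction(total, n)
    return mean, sqrt(Fraction(total_sq, n) - mean ** 2)


total, total_sq = summary_to_sums(20, 10, 2)        # 200 and 2080

# Case (i): the wrong item 8 is omitted, leaving 19 observations.
omit = mean_and_sd(19, total - 8, total_sq - 8 ** 2)

# Case (ii): the wrong item 8 is replaced by 12, still 20 observations.
replace = mean_and_sd(20, total - 8 + 12, total_sq - 8 ** 2 + 12 ** 2)

print(omit[0], round(omit[1], 3))
print(replace[0], round(replace[1], 2))
```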

Question 6. The mean and standard deviation of marks obtained by 50 students of a class in three subjects, Mathematics, Physics and Chemistry are given below:

Subject Mathematics Physics Chemistry
Mean 42 32 40.9
Standard deviation 12 15 20

Which of the three subjects shows the highest variability in marks and which shows the lowest?

Answer:

Given:

The mean and standard deviation for the marks of 50 students in three subjects are given in the table:

Subject Mean ($\overline{x}$) Standard Deviation ($\sigma$)
Mathematics 42 12
Physics 32 15
Chemistry 40.9 20

To Find:

We need to find which subject shows the highest variability in marks and which shows the lowest.


Solution:

To compare the variability of two or more data sets with different means, we use the Coefficient of Variation (CV).

The formula for Coefficient of Variation is:

$CV = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\%$

Let's calculate the Coefficient of Variation for each subject:

Coefficient of Variation for Mathematics:

$CV_{Math} = \frac{\sigma_{Math}}{\overline{x}_{Math}} \times 100\%$

$CV_{Math} = \frac{12}{42} \times 100\%$

$CV_{Math} = \frac{2}{7} \times 100\%$

$CV_{Math} \approx 0.2857 \times 100\%$

$CV_{Math} \approx 28.57\%$

Coefficient of Variation for Physics:

$CV_{Physics} = \frac{\sigma_{Physics}}{\overline{x}_{Physics}} \times 100\%$

$CV_{Physics} = \frac{15}{32} \times 100\%$

$CV_{Physics} = 0.46875 \times 100\%$

$CV_{Physics} = 46.875\%$

$CV_{Physics} \approx 46.88\%$

Coefficient of Variation for Chemistry:

$CV_{Chemistry} = \frac{\sigma_{Chemistry}}{\overline{x}_{Chemistry}} \times 100\%$

$CV_{Chemistry} = \frac{20}{40.9} \times 100\%$

$CV_{Chemistry} = \frac{2000}{40.9}\%$

$CV_{Chemistry} \approx 48.90\%$


Comparing the Coefficients of Variation:

$CV_{Math} \approx 28.57\%$

$CV_{Physics} \approx 46.88\%$

$CV_{Chemistry} \approx 48.90\%$

A higher Coefficient of Variation indicates greater variability.

The highest Coefficient of Variation is for Chemistry ($48.90\%$).

The lowest Coefficient of Variation is for Mathematics ($28.57\%$).

Therefore, Chemistry shows the highest variability in marks, and Mathematics shows the lowest variability.
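
The comparison can be reproduced with a short script. The sketch below is an illustration in Python; the dictionary layout and variable names are choices made for this example, while the means and standard deviations are the ones given in the question.

```python
# Subject -> (mean, standard deviation), as given in the question.
subjects = {
    "Mathematics": (42, 12),
    "Physics": (32, 15),
    "Chemistry": (40.9, 20),
}

# Coefficient of variation in percent: (SD / mean) * 100.
cv = {name: sd / mean * 100 for name, (mean, sd) in subjects.items()}

for name, value in cv.items():
    print(f"{name}: {value:.2f}%")

most = max(cv, key=cv.get)     # subject with the highest variability
least = min(cv, key=cv.get)    # subject with the lowest variability
print(most, least)  # Chemistry Mathematics
```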

Question 7. The mean and standard deviation of a group of 100 observations were found to be 20 and 3, respectively. Later on it was found that three observations were incorrect, which were recorded as 21, 21 and 18. Find the mean and standard deviation if the incorrect observations are omitted.

Answer:

Given:

Number of observations, $n_{old} = 100$

Old Mean, $\overline{x}_{old} = 20$

Old Standard Deviation, $\sigma_{old} = 3$

Incorrect observations are 21, 21, and 18.


To Find:

The new mean and standard deviation after omitting the incorrect observations.


Solution:

We know that the mean is given by $\overline{x} = \frac{\sum x_i}{n}$.

The sum of the old observations is $\sum x_{old} = n_{old} \times \overline{x}_{old}$.

$\sum x_{old} = 100 \times 20 = 2000$

The incorrect observations are 21, 21, and 18.

The sum of incorrect observations $= 21 + 21 + 18 = 60$.

The correct sum of the remaining observations is $\sum x_{new} = \sum x_{old} - \text{Sum of incorrect observations}$.

$\sum x_{new} = 2000 - 60 = 1940$

The new number of observations is $n_{new} = n_{old} - \text{Number of incorrect observations}$.

$n_{new} = 100 - 3 = 97$

The new mean is $\overline{x}_{new} = \frac{\sum x_{new}}{n_{new}}$.

$\overline{x}_{new} = \frac{1940}{97} = 20$

$\overline{x}_{new} = 20$

... (i)

Now, we need to find the new standard deviation. The formula for standard deviation is $\sigma = \sqrt{\frac{\sum x_i^2}{n} - \overline{x}^2}$.

Squaring the standard deviation, we get the variance: $\sigma^2 = \frac{\sum x_i^2}{n} - \overline{x}^2$.

From the old data, we have $\sigma_{old}^2 = \frac{\sum x_{old}^2}{n_{old}} - \overline{x}_{old}^2$.

$3^2 = \frac{\sum x_{old}^2}{100} - 20^2$

$9 = \frac{\sum x_{old}^2}{100} - 400$

$9 + 400 = \frac{\sum x_{old}^2}{100}$

$409 = \frac{\sum x_{old}^2}{100}$

$\sum x_{old}^2 = 409 \times 100 = 40900$

The sum of squares of incorrect observations is $21^2 + 21^2 + 18^2 = 441 + 441 + 324 = 1206$.

The correct sum of squares of the remaining observations is $\sum x_{new}^2 = \sum x_{old}^2 - \text{Sum of squares of incorrect observations}$.

$\sum x_{new}^2 = 40900 - 1206 = 39694$

Now we can calculate the new variance, $\sigma_{new}^2$, using the new sum of squares and the new mean.

$\sigma_{new}^2 = \frac{\sum x_{new}^2}{n_{new}} - \overline{x}_{new}^2$

$\sigma_{new}^2 = \frac{39694}{97} - 20^2$

$\sigma_{new}^2 = \frac{39694}{97} - 400$

$\sigma_{new}^2 = \frac{39694 - 400 \times 97}{97}$

$400 \times 97 = 38800$

$\sigma_{new}^2 = \frac{39694 - 38800}{97}$

$\sigma_{new}^2 = \frac{894}{97}$

... (ii)

Finally, the new standard deviation is $\sigma_{new} = \sqrt{\sigma_{new}^2}$.

$\sigma_{new} = \sqrt{\frac{894}{97}}$

$\sigma_{new} \approx \sqrt{9.216494845}$

$\sigma_{new} \approx 3.035866$

Rounding to two decimal places, $\sigma_{new} \approx 3.04$.

$\sigma_{new} \approx 3.04$

... (iii)

Thus, the new mean is 20 and the new standard deviation is approximately 3.04.
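
The same bookkeeping extends to any number of omitted observations. The following Python sketch is a hedged illustration: the function name `omit_observations` is hypothetical, and it assumes the reported standard deviation uses the divisor $n$.

```python
from fractions import Fraction
from math import sqrt


def omit_observations(n, mean, sd, dropped):
    """Return (new mean, new variance) after removing the dropped values."""
    total = n * mean - sum(dropped)                                   # new sum of x
    total_sq = n * (sd ** 2 + mean ** 2) - sum(x * x for x in dropped)  # new sum of x^2
    m = n - len(dropped)                                              # new count
    new_mean = Fraction(total, m)
    new_var = Fraction(total_sq, m) - new_mean ** 2
    return new_mean, new_var


new_mean, new_var = omit_observations(100, 20, 3, [21, 21, 18])
print(new_mean, new_var, round(sqrt(new_var), 2))  # 20 894/97 3.04
```

The mean stays at 20 because the three omitted values themselves average 20, while the variance rises slightly because those values lay closer to the mean than a typical observation.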