My name is Alexander FufaeV and here I write about:

Statistics: Expected Value and Standard Deviation


The expected value is the mean that theory predicts for a measurement. The expected value $ \class{red}{\mu} $ of a random variable $X$ that can take the $n$ values $x_1$, $x_2$, $x_3$, ..., $x_n$ with corresponding probabilities $p_1$, $p_2$, $p_3$, ..., $p_n$ is given by the following formula: $$ \class{red}{\mu} ~=~ x_1 \, p_1 ~+~ x_2 \, p_2 ~+~ ... ~+~ x_n \, p_n $$

In an experiment, the probabilities are usually unknown, so we use the empirical mean as an estimate of the expected value. The empirical mean $ \class{red}{\bar{x}} $ is the sum of the measurements $x_1$, $x_2$, $x_3$, ..., $x_n$ divided by the number of measurements $n$: $$ \class{red}{\bar{x}} ~=~ \frac{x_1 ~+~ x_2 ~+~ ... ~+~ x_n}{n} $$

The variance $ \sigma^2 $ is the probability-weighted sum of the squared deviations $ (x_1 - \class{red}{\mu})^2 $, $ (x_2 - \class{red}{\mu})^2 $, and so on, from the expected value $ \class{red}{\mu} $: $$ \sigma^2 ~=~ (x_1 - \class{red}{\mu})^2 p_1 ~+~ (x_2 - \class{red}{\mu})^2 p_2 ~+~ ... ~+~ (x_n - \class{red}{\mu})^2 p_n $$
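To make the two formulas concrete, here is a minimal Python sketch. The fair six-sided die and the helper names `expected_value` and `variance` are assumptions chosen for illustration; they do not come from the text. Exact fractions are used so that no rounding hides the result.

```python
from fractions import Fraction


def expected_value(values, probs):
    """Probability-weighted sum of the possible values."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """Probability-weighted sum of squared deviations from the expected value."""
    mu = expected_value(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


# Assumed example: a fair die, each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = expected_value(faces, probs)        # 7/2
sigma_squared = variance(faces, probs)   # 35/12
print(float(mu), float(sigma_squared))
```

For the die, the expected value is 3.5 and the variance is 35/12, roughly 2.92.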

We can calculate the empirical variance $ \sigma_{\text e}^2 $ in an experiment as follows: $$ \sigma_{\text e}^2 ~=~ \frac{(x_1 - \class{red}{\bar{x}})^2 ~+~ (x_2 - \class{red}{\bar{x}})^2 ~+~ ... ~+~ (x_n - \class{red}{\bar{x}})^2}{n-1}$$

The square root of the variance yields the standard deviation $ \sigma $; the square root of the empirical variance yields the empirical standard deviation $ \sigma_{\text e} $. The standard deviation indicates how much the measurements $x_1$, $x_2$, $x_3$, ..., $x_n$ typically deviate from the expected value $ \class{red}{\mu} $ or from the mean $ \class{red}{\bar{x}} $: $$ \sigma ~=~ \sqrt{(x_1 - \class{red}{\mu})^2 p_1 ~+~ (x_2 - \class{red}{\mu})^2 p_2 ~+~ ... ~+~ (x_n - \class{red}{\mu})^2 p_n} $$ $$ \sigma_{\text e} ~=~ \sqrt{\frac{(x_1 - \class{red}{\bar{x}})^2 ~+~ (x_2 - \class{red}{\bar{x}})^2 ~+~ ... ~+~ (x_n - \class{red}{\bar{x}})^2}{n-1}}$$

If $ \class{red}{\bar{x}} $ is the mean and the measurements are normally distributed, then:

68% of all measurements lie within $ \class{red}{\bar{x}} \pm \sigma $.
95.4% of all measurements lie within $ \class{red}{\bar{x}} \pm 2\sigma $.
99.7% of all measurements lie within $ \class{red}{\bar{x}} \pm 3\sigma $.
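
These percentages can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `coverage` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k):
    """Probability that a normally distributed value lies within k sigma of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * coverage(k):.1f} %")
```

Because the coverage depends only on the number of standard deviations $k$, the same percentages hold for any mean and any $\sigma$.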

The standard deviation $ \sigma(\class{red}{\bar{x}}) $ of the mean $ \class{red}{\bar{x}} $ for $n$ measurements is given by the following formula: $$ \sigma(\class{red}{\bar{x}}) ~=~ \frac{\sigma_{\text e}}{\sqrt{n}} $$

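Because $ \sigma(\class{red}{\bar{x}}) $ decreases only with $ 1/\sqrt{n} $, doubling the accuracy (halving the standard deviation of the mean) requires quadrupling the number of measurements!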
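
A short sketch of this scaling, assuming a hypothetical empirical standard deviation of 0.5 (the value and the function name `standard_error` are illustrative only):

```python
import math


def standard_error(sigma_e, n):
    """Standard deviation of the mean for n measurements."""
    return sigma_e / math.sqrt(n)


sigma_e = 0.5  # assumed empirical standard deviation, not from the text

se_10 = standard_error(sigma_e, 10)   # 10 measurements
se_40 = standard_error(sigma_e, 40)   # four times as many measurements

# The ratio is 2 up to floating-point rounding: 4x the data, half the error.
print(se_10, se_40, se_10 / se_40)
```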

For multiplication $ \class{red}{\bar{x}_1} \cdot \class{red}{\bar{x}_2}$ and division $ \frac{\class{red}{\bar{x}_1}}{\class{red}{\bar{x}_2}} $ of two means, their relative errors $f_1$ and $f_2$ add up to a total relative error: $f = f_1 + f_2$.
For addition $ \class{red}{\bar{x}_1} + \class{red}{\bar{x}_2} $ and subtraction $ \class{red}{\bar{x}_1} - \class{red}{\bar{x}_2} $ of two means, their absolute errors $ \Delta x_1 $ and $ \Delta x_2 $ add up to a total absolute error: $ \Delta x = \Delta x_1 + \Delta x_2 $.
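
The two rules can be sketched in a few lines of Python. The function names and the numerical values are hypothetical and serve only to illustrate the rules as stated above.

```python
def relative_error_of_product(x1, dx1, x2, dx2):
    """Total relative error of x1 * x2 (or x1 / x2): the relative errors add."""
    return abs(dx1 / x1) + abs(dx2 / x2)


def absolute_error_of_sum(dx1, dx2):
    """Total absolute error of x1 + x2 (or x1 - x2): the absolute errors add."""
    return abs(dx1) + abs(dx2)


# Hypothetical means with their absolute errors (not from the text).
x1, dx1 = 10.0, 0.2   # relative error 2 %
x2, dx2 = 5.0, 0.1    # relative error 2 %

f = relative_error_of_product(x1, dx1, x2, dx2)   # relative error of x1 * x2
dx = absolute_error_of_sum(dx1, dx2)              # absolute error of x1 + x2

print(f"x1 * x2 = {x1 * x2:.1f} with relative error {100 * f:.1f} %")
print(f"x1 + x2 = {x1 + x2:.1f} ± {dx:.1f}")
```

With these assumed numbers, the product carries a relative error of about 4 % and the sum an absolute error of about 0.3.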

Exercises with Solutions


Exercise #1: Mean and Standard Deviation of a Measurement

10 measurements were taken:

Number $ i $	Measurement $ x_i $
1	45.0
2	45.7
3	44.6
4	45.2
5	45.6
6	44.5
7	44.9
8	45.2
9	45.8
10	44.7

What is the mean $ \class{red}{\bar{x}} $ of the sample?
What is the empirical standard deviation $ s $ of the sample?
How much does the mean deviate from the true value $ x $ with 95% confidence?

Tips:

Use the formula for the mean.
Standard deviation is given by: \[ s ~=~ \sqrt{ \frac{1}{N-1} \sum_{i=1}^N (x_i - \bar x)^2 } \]
Use Student's t-distribution. For $ N = 10 $ measurements and 95% confidence, take $ t = 2.30 $. Calculate: \[ x ~=~ \bar{x} \pm \frac{s}{\sqrt N} \, t \]

Solution to Exercise #1.1

Using the formula for the mean: \[ \bar x ~=~ \frac{1}{N} \sum_{i=1}^N x_i \] By substituting the 10 measurements from the table: \[ \bar x ~=~ \frac{1}{10} \cdot (45.0 + 45.7 + 44.6 + 45.2 + 45.6 + 44.5 + 44.9 + 45.2 + 45.8 + 44.7) \] Typed into the calculator, the mean is $ \bar x ~=~ 45.12 $

Solution to Exercise #1.2

The formula for the standard deviation from the tips gives the empirical standard deviation of the sample: \[ s ~=~ \sqrt{ \frac{1}{N-1} \sum_{i=1}^N (x_i - \bar x)^2 } \] Insert the mean calculated in Exercise #1.1 and the measurements from the table: \[ s ~=~ \sqrt{ \frac{1}{9} \sum_{i=1}^{10} (x_i - 45.12)^2 } \]

Entered into the calculator: \[ s ~=~ 0.463 \]
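
Instead of a calculator, both results can be reproduced with Python's standard `statistics` module. This is only a sketch; `statistics.stdev` divides by $N - 1$, which matches the formula from the tips.

```python
import statistics

# The 10 measurements from the table of Exercise #1.
measurements = [45.0, 45.7, 44.6, 45.2, 45.6, 44.5, 44.9, 45.2, 45.8, 44.7]

mean = statistics.mean(measurements)   # sum of the values divided by N
s = statistics.stdev(measurements)     # sample standard deviation, divides by N - 1

print(f"mean = {mean:.2f}")
print(f"s    = {s:.4f}")
```

The mean comes out as 45.12 and the standard deviation as about 0.46, in agreement with the values above.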

Solution to Exercise #1.3

To find out how much the mean calculated in Exercise #1.1 deviates from the true value with a confidence of 95%, use the t-value from the t-distribution that matches your sample, that is, $ N = 10 $ and $ P = 95 $%. The t-distribution is used whenever the standard deviation of the population is not known, which is the case for a sample like the one in this task.

For $ N = 10 $ and $ P = 95 $%, the t-value is $ t = 2.30 $. Inserting it into the formula from the tips, \[ x ~=~ \bar{x} \pm \frac{s}{\sqrt N} \, t \] shows how far the true value $ x $ may lie from the mean $ \bar{x} $: \[ x ~=~ 45.12 \pm \frac{0.463}{\sqrt{10}} \cdot 2.30 \] So the deviation is about $ \pm 0.337 $.
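
The same arithmetic as a small Python sketch. The function name `confidence_half_width` is an assumption; the numbers, including $ t = 2.30 $, are simply taken from the exercise text. Standard t-tables list a slightly smaller value (about 2.26) for 9 degrees of freedom at 95%, which would change the result only in the third decimal place.

```python
import math


def confidence_half_width(s, n, t):
    """Half-width of the confidence interval for the mean: t * s / sqrt(n)."""
    return t * s / math.sqrt(n)


# Values as quoted in the exercise: s = 0.463, N = 10, t = 2.30.
half_width = confidence_half_width(s=0.463, n=10, t=2.30)

print(f"x = 45.12 ± {half_width:.3f}")
```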

Exercise #2: Frequency Distribution - Relative Cumulative Frequency

A sample of 200 capacitors was taken from production for quality control of their capacitances $ C_i $. The capacitances were measured and grouped into classes, each represented by its class midpoint, as shown in the following table.

Class	Class Midpoint in $ \text{nF} $	Number of Capacitors
1	841	3
2	842	4
3	843	3
4	844	10
5	845	2
6	846	35
7	847	70
8	848	50
9	849	23

Determine the relative frequencies $ h_i $ in percent.
Determine the relative cumulative frequencies $ H_i $ in percent.

Tips: The relative frequency $ h_i $ indicates what percentage of the total sample falls into the $i$-th class.

The relative cumulative frequency $ H_i $ is the sum of all relative frequencies up to the $i$-th class midpoint.

Solution to Exercise #2.1

The relative frequency $ h_i $ is calculated for a sample of 200 capacitors as follows: \[ h_i ~=~ \frac{\text{Number in a class}}{200} ~\cdot~ 100 \]

For example, for the 1st class: \begin{align} h_1 &~=~ \frac{3}{200} ~\cdot~ 100 \, \% \\ &~=~ \frac{3}{2} \, \% \\ &~=~ 1.5 \, \% \end{align}

If you do the same for each class, you get the following table with relative frequencies:

Class	Number of Capacitors	Relative Frequency $ h_i $ in %
1	3	1.5
2	4	2
3	3	1.5
4	10	5
5	2	1
6	35	17.5
7	70	35
8	50	25
9	23	11.5
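
The whole column can also be computed in one step. A minimal Python sketch, using the counts from the exercise table (the variable names are chosen for this example):

```python
# Number of capacitors in classes 1 to 9, from the exercise table.
counts = [3, 4, 3, 10, 2, 35, 70, 50, 23]

total = sum(counts)  # size of the sample

# Relative frequency of each class in percent.
relative_percent = [100 * c / total for c in counts]

for class_number, h in enumerate(relative_percent, start=1):
    print(f"class {class_number}: {h} %")
```

The counts add up to 200, and the relative frequencies add up to 100%.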

Solution to Exercise #2.2

To calculate the relative cumulative frequency $ H_n $, sum all relative frequencies $ h_i $ up to the $n$-th class. \[ H_n ~=~ h_1 ~+~ h_2 ~+~...~+~ h_n \]

For example, the relative cumulative frequency up to the 3rd class is: \begin{align} H_3 &~=~ h_1 + h_2 + h_3 \\ &~=~ 1.5\% + 2\% + 1.5\% \\ &~=~ 5\% \end{align}

Class	Number of Capacitors	Relative Cumulative Frequency $ H_n $ in %
1	3	1.5
2	4	3.5
3	3	5
4	10	10
5	2	11
6	35	28.5
7	70	63.5
8	50	88.5
9	23	100
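
The running sums $ H_n = h_1 + h_2 + ... + h_n $ can be sketched in Python with `itertools.accumulate`. The list of relative frequencies is taken from Solution #2.1; the variable names are chosen for this example.

```python
from itertools import accumulate

# Relative frequencies h_i in percent for classes 1 to 9 (Solution #2.1).
relative_percent = [1.5, 2.0, 1.5, 5.0, 1.0, 17.5, 35.0, 25.0, 11.5]

# accumulate yields the running sums h_1, h_1 + h_2, h_1 + h_2 + h_3, ...
cumulative_percent = list(accumulate(relative_percent))

for class_number, H in enumerate(cumulative_percent, start=1):
    print(f"class {class_number}: {H} %")
```

The last running sum must be 100%, which is a quick check that no class was left out.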