Death Statistics For England And Wales

metadata
  • keywords:
    • health
    • statistics
    • economics
    • coronavirus
    • python
    • lxml
    • matplotlib
    • numpy

With all of the events that are currently taking place around the world I thought that it would be timely to study some very basic death statistics for England and Wales. From my own observations, I feel that most people do not have even a rough idea of how many people normally die each year in England and Wales, or of the average age at which people die. I believe that this is a problem because people cannot comprehend an increase in deaths if they do not already know how many deaths normally occur.

This blog post has been a long time coming. The following articles are interesting food for thought from across the political spectrum:

The statistics that I am using in the following discussion have all come from the Office for National Statistics (ONS). The two specific data sources (fetched at approximately 10AM on 8th May 2020) that I have used are:

A word of caution: just because I can find an historical event that aligns with a peak or a trough on a curve does not necessarily mean that the historical event caused the peak or trough. Correlation ≠ Causation; see Spurious Correlations for some hilarious examples that underscore this statement. The rest of this blog post is arranged as follows:

  1. Total Deaths
    1. Total Deaths By Age
  2. Death Rates
    1. Death Rates By Age
    2. Life Expectancy
  3. Total Deaths By Week
  4. Observations

§1 Total Deaths

In the UK, when someone dies a Death Certificate is created. This usually contains some basic information about the person who has died, such as their name, their date of birth, and where and when they died. For more information see the ONS’ Guide To Death Certificates. As a consequence, it is straightforward for the UK government to collate information such as how many people die in the UK and when they die. The ONS provides the Deaths registered in England and Wales dataset, which contains lots of information on deaths in England and Wales (unfortunately, due to devolution, it only covers England and Wales; deaths in Scotland and Northern Ireland are published separately).

The following plot uses data from “Table 1” within “referencetablesfinalv22.xlsx” (source: Office for National Statistics licensed under the Open Government Licence).

The most basic thing to observe is that 541,589 people died in England and Wales in 2018. That is about 1,484 people per day, or just over 1 death every minute.
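
As a quick sanity check of that arithmetic, here is a minimal Python sketch. The only input is the 2018 total quoted above; the 365-day year and the variable names are my own assumptions for illustration.

```python
# Figure quoted in the text: deaths registered in England and Wales in 2018.
deaths_2018 = 541_589

# Assumption: a 365-day year (2018 was not a leap year).
days_per_year = 365
minutes_per_day = 24 * 60

deaths_per_day = deaths_2018 / days_per_year
deaths_per_minute = deaths_per_day / minutes_per_day

print(f"{deaths_per_day:,.0f} deaths per day")        # 1,484 deaths per day
print(f"{deaths_per_minute:.2f} deaths per minute")   # 1.03 deaths per minute
```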

In the above plot I have taken the liberty of annotating some historic events. Please use caution here though and heed the warning that I included at the top of this blog post: correlation ≠ causation.

Finally, I will point out that these are simply “total deaths” and that the numbers may be changing simply because the population of England and Wales changed too. Later on I will show some “total death rate” statistics which normalise out the fluctuating population.

§1.1 Total Deaths By Age

Enough people die in England and Wales each year that the distribution of ages at death is smooth rather than dominated by statistical noise. This is not the same thing as life expectancy. Life expectancy is how long a person is expected to live; the distribution of people’s ages at death only matches life expectancy if life expectancy never changes due to societal changes and if at the start of the year there were the same number of people of all ages in your population (i.e., even if life expectancy is 80 years, most people who die in a year might be 20 years old if most of the people in your population at the beginning of the year were 20 years old).

The following plot uses data from “Table 4” and “Table 5” within “referencetablesfinalv22.xlsx” (source: Office for National Statistics licensed under the Open Government Licence).

The “modal male age at death” is 86 years old and the “modal female age at death” is 88 years old. This tells you in simple terms that the most common age for a person to die (averaged over 2016, 2017 and 2018) was 86 years old for a man and 88 years old for a woman.

The modal average, in my opinion, is not the most useful parameter for describing this particular distribution. The distribution can be re-plotted as a cumulative distribution, which allows the median average to be obtained easily.

The “median male age at death” is 78.3 years old and the “median female age at death” is 83.5 years old (as indicated by the dashed green lines which intersect 50% in the above plot). This tells you in simple terms that (averaged over 2016, 2017 and 2018) half of the men who died were older than 78.3 years old and half of the women who died were older than 83.5 years old.
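
Reading the median off a cumulative distribution can be sketched in a few lines of Python. This is not the code behind the plot: the function name `median_age`, the one-year bins and the numbers are all made up for illustration, and I assume deaths are spread evenly within each bin.

```python
def median_age(ages, deaths):
    """Return the age at which the cumulative share of deaths reaches 50%.

    `ages` are the lower edges of one-year age bins and `deaths` are the
    number of deaths in each bin. Deaths are assumed to be spread evenly
    within a bin, so the crossing point is found by linear interpolation.
    """
    half = sum(deaths) / 2
    running = 0.0  # cumulative deaths below the current bin
    for age, count in zip(ages, deaths):
        if count > 0 and running + count >= half:
            return age + (half - running) / count
        running += count
    raise ValueError("no deaths supplied")


# Made-up numbers purely for illustration (NOT the ONS data).
ages = [0, 1, 2, 3, 4]
deaths = [10, 0, 20, 50, 20]
print(median_age(ages, deaths))  # 3.4
```

With the real dataset the same idea would be applied separately to the male and female columns.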

Finally, there is a rather sad aspect of these statistics that you may have missed in the basic age distribution plot earlier: infant deaths. Re-plotting the data on a logarithmic y-axis clearly shows that just as many babies died in their first year as 50 year olds died (averaged over 2016, 2017 and 2018).

§2 Death Rates

As mentioned earlier, whilst the plots of total deaths are interesting they may give incorrect impressions of what is actually happening in society because the numbers are sensitive to general changes in population, such as emigration and immigration. To solve this issue the ONS produces “total death rate” numbers, which unfortunately only go back to 1953 (presumably because they did not accurately estimate the total England and Wales population before then?).

The following plot uses data from “Table 1” within “referencetablesfinalv22.xlsx” (source: Office for National Statistics licensed under the Open Government Licence).

The increase in deaths during the period of austerity that was observed earlier appears to have been, in part, due to an increasing population because the upward gradient in the above plot is less pronounced than in the earlier plot (of total deaths). In both plots the turning point is 2011; from then to 2018 the total number of deaths increased by 11.81% whereas the total death rate increased by only 6.98% - thus telling you that a 4.51% rise in the underlying population accounts for the rest of that 11.81% increase in deaths (1.0698 × 1.0451 ≈ 1.118).
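
The multiplicative split can be checked with a short Python sketch. The two percentages are the ones quoted above; the variable names are mine, and the only assumption is that total deaths = death rate × population.

```python
# Percentage increases quoted in the text (2011 to 2018).
increase_in_deaths = 0.1181      # total number of deaths
increase_in_death_rate = 0.0698  # total death rate

# Assumption: deaths = rate x population, so the growth factors multiply:
#   (1 + deaths) = (1 + rate) x (1 + population)
implied_population_growth = (
    (1 + increase_in_deaths) / (1 + increase_in_death_rate) - 1
)

print(f"{implied_population_growth:.2%}")  # 4.51%
```

Note that the percentages do not simply add up (6.98% + 4.51% ≠ 11.81%) because growth factors compound.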

The above plot is concerning though because it confirms what was observed in the earlier one: death rates were in steady decline from approximately 1980 onwards and something occurred between 2010 and 2012 that caused this downward trend to reverse and actually increase for the first time since records began (for the “total death rate” parameter). However, this might be a little misleading: if your population contained a lot of 40 year olds in 1970 then it would contain a lot of 80 year olds in 2010 and a lot of them would begin to die around 2010. The uptick in death rate from 2011 might just be a consequence of the UK having an ageing population - our population distribution is top-heavy (see the excellent ONS article on “Living longer: how our population is changing and why it matters”).

§2.1 Death Rates By Age

To dig further into whether the 6.98% increase in total death rate from 2011 to 2018 was due to an ageing population or to an actual increase in the death rate at each age, the death rates by age must be studied. These are plotted below on linear and on logarithmic axes for both men and women.

The following plots use data from “Table 3” within “referencetablesfinalv22.xlsx” (source: Office for National Statistics licensed under the Open Government Licence).

Personally, I believe that there is a lot of noise in these numbers and that broadly speaking there does not appear to be any general trend (either up or down) in death rates by age from 2009 to 2018. Therefore, it is my interpretation that the unexplained 6.98% increase in total death rate is consistent with an ageing population and that there does not appear to have been an event (or series of events) between 2009 and 2018 that caused any significant change to the death rate for different age groups. In short: the political policy of “austerity” did not cause more people to die - it is more likely that it was an ageing population coinciding with net immigration during the period of austerity. More simply: the death rate increased not because life got worse but because people grew up and moved from one age bracket to another one.

Given that the above data is very noisy and shows no overall trend, it seems fair to average the different years together to obtain an “average death rate by age” curve, which is shown below on both linear and logarithmic axes.
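
The averaging itself is nothing more than an unweighted mean per age group across the years. A minimal Python sketch follows; the years, the three age groups and all of the rates are hypothetical placeholders, not values from “Table 3”.

```python
# Hypothetical death rates by age group for three years (NOT ONS data).
# Each list holds one rate per age group, in the same order for every year.
rates_by_year = {
    2016: [4.0, 0.5, 10.0],
    2017: [3.0, 0.25, 12.0],
    2018: [5.0, 0.75, 11.0],
}

# Average each age group across the years (an unweighted mean).
n_years = len(rates_by_year)
average_rates = [
    sum(column) / n_years for column in zip(*rates_by_year.values())
]

print(average_rates)  # [4.0, 0.5, 11.0]
```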

These curves are very interesting because they allow an estimate of life expectancy to be made ...

§2.2 Life Expectancy (i.e. Most Likely Age At Death)

The death rates discussed above are, in effect, “the probability of a 20 year old person dying whilst they are 20 years old”. If you want to know “what is the probability of a person dying when they are 20 years old” then it is simply “the probability of a person not dying from 0 years old to 19 years old” × “the probability of a 20 year old person dying whilst they are 20 years old”. This logic allows a simple numerical series to be created and a distribution of “most likely age at death”, i.e. “life expectancy”, to be estimated. This method assumes that the “death rate by age” does not change during a person’s life.
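
The numerical series described above can be sketched in Python as follows. This is an illustration rather than the exact code behind the plots: the function name is mine, the rates are hypothetical, and I assume the published rates have already been converted to a probability of dying within one year of age (e.g. by dividing a "per 1,000" rate by 1,000).

```python
def age_at_death_distribution(death_rates):
    """Turn per-age death probabilities into P(dying at exactly age a).

    death_rates[a] is the probability that someone who is alive at the
    start of age a dies before reaching age a + 1.
    """
    distribution = []
    alive = 1.0  # probability of still being alive at the start of age a
    for q in death_rates:
        distribution.append(alive * q)  # survived so far, then died at this age
        alive *= 1.0 - q                # survived this age as well
    return distribution


# Hypothetical rates for ages 0..3 (NOT ONS data). The last age is set to
# certain death so that the distribution sums to 1.
rates = [0.5, 0.5, 0.5, 1.0]
dist = age_at_death_distribution(rates)
print(dist)       # [0.5, 0.25, 0.125, 0.125]
print(sum(dist))  # 1.0
```

The cumulative sum of this distribution is what the median age at death is then read from.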

The estimated “median male age at death” is 81.3 years old and the “median female age at death” is 84.8 years old (as indicated by the dashed green lines which intersect 50% in the above plot). If you wanted to really depress yourself, the above plot can trivially be used to answer questions like:

Unfortunately, due to the coarseness of the original dataset, the probability density function has lots of saw-teeth and therefore looks awful. Perhaps you can find the death rate dataset broken down by individual years (rather than 5 year groups) somewhere else on the ONS web site and let me know?

A word of caution: these plots obviously assume that the death rate as a function of age is constant - that your probability of death at 60 years old is not affected by what the probability of death at 20 years old was 40 years ago. If I could find more historic “death rate by age” data from the ONS then I could improve this approximation and make a plot of “most likely age at death of a person born in 1950” (for example) instead.

§3 Total Deaths By Week

The final parameter that I wish to discuss is the most controversial and the most relevant to us at this time: the “weekly deaths”. The ONS provides the Deaths registered weekly in England and Wales dataset, which contains a running total of the number of deaths in England and Wales for each week in a year.

The following plot uses data from “Weekly figures 2017” within “publishedweek522017.xlsx”, “Weekly figures 2018” within “publishedweek522018withupdatedrespiratoryrow.xlsx”, “Weekly figures 2019” within “publishedweek522019.xlsx” and “Weekly figures 2020” within “publishedweek172020.xlsx” (source: Office for National Statistics licensed under the Open Government Licence).

In the above plot I have taken the liberty of highlighting the week when the UK entered lockdown. Again, please use caution here and heed the warning that I included at the top of this blog post: correlation ≠ causation. There are a few things to observe from the above plot:

§4 Observations

If you have got this far reading my blog post, then: thank you. Overall, I have a few summarising observations to make:

Finally, I thought that I would end on a modern classic web comic:

An xkcd comic about epidemiology
© xkcd