The family got hit pretty hard with the flu right as the Christmas festivities started and we were all pretty much bed-ridden zombies up until today (2017-01-02). When in the throes of a very bad ILI it’s easy to imagine that you’re a victim of a severe outbreak, especially with ancillary data from others that they, too, either just had/have the flu or know others who do. Thankfully, I didn’t have to accept this emotional opine and could turn to the cdcfluview package to see just how this year is measuring up.
Influenza cases are cyclical, and that’s super-easy to see with a longer-view of the CDC national data:
library(cdcfluview)library(tidyverse)
library(stringi)
flu <- get_flu_data("national", years=2010:2016)
mutate(flu, week=as.Date(sprintf("%s %s 1", YEAR, WEEK), format="%Y %U %u")) %>%
select(-`AGE 25-64`) %>%
gather(age_group, count, starts_with("AGE")) %>%
mutate(age_group=stri_replace_all_fixed(age_group, "AGE", "Ages")) %>%
mutate(age_group=factor(age_group, levels=c("Ages 0-4", "Ages 5-24", "Ages 25-49", "Ages 50-64", "Ages 65"))) %>%
ggplot(aes(week, count, group=age_group)) +
geom_line(aes(color=age_group)) +
scale_y_continuous(label=scales::comma, limits=c(0,20000)) +
facet_wrap(~age_group, scales="free") +
labs(x=NULL, y="Count of reported ILI cases",
title="U.S. National ILI Case Counts by Age Group (2010:2011 flu season through 2016:2017)",
caption="Source: CDC ILInet via CRAN cdcfluview pacakge") +
ggthemes::scale_color_tableau(name=NULL) +
hrbrmisc::theme_hrbrmstr(grid="XY") +
theme(legend.position="none")
We can use the same data to zoom in on this season:
mutate(flu, week=as.Date(sprintf("%s %s 1", YEAR, WEEK), format="%Y %U %u")) %>%select(-`AGE 25-64`) %>%
gather(age_group, count, starts_with("AGE")) %>%
mutate(age_group=stri_replace_all_fixed(age_group, "AGE", "Ages")) %>%
mutate(age_group=factor(age_group, levels=c("Ages 0-4", "Ages 5-24", "Ages 25-49", "Ages 50-64", "Ages 65"))) %>%
filter(week >= as.Date("2016-07-01")) %>%
ggplot(aes(week, count, group=age_group)) +
geom_line(aes(color=age_group)) +
scale_y_continuous(label=scales::comma, limits=c(0,20000)) +
facet_wrap(~age_group, scales="free") +
labs(x=NULL, y="Count of reported ILI cases",
title="U.S. National ILI Case Counts by Age Group (2016:2017 flu season)",
caption="Source: CDC ILInet via CRAN cdcfluview pacakge") +
ggthemes::scale_color_tableau(name=NULL) +
hrbrmisc::theme_hrbrmstr(grid="XY") +
theme(legend.position="none")
So, things are trending up, but how severe is this year compared to others? While looking at the number/percentage of ILI cases is one way to understand severity, another is to look at the mortality rate. The cdcfluview package has a get_mortality_surveillance_data() function, but it’s region-based and I’m really only looking at national data in this post. A helpful individual pointed out a new CSV file at https://www.cdc.gov/flu/weekly/index.htm#MS which we can reproducibly programmatically target (so we don’t have to track filename changes by hand) with:
library(rvest)pg <- read_html("https://www.cdc.gov/flu/weekly/index.htm#MS")
html_nodes(pg, xpath=".//a[contains(@href, 'csv') and contains(@href, 'NCHS')]") %>%
html_attr("href") -> mort_ref
mort_url <- sprintf("https://www.cdc.gov%s", mort_ref)
df <- readr::read_csv(mort_url)
We can, then, take a look at the current “outbreak” status (when real-world mortality events exceed the model threshold):
mutate(df, week=as.Date(sprintf("%s %s 1", Year, Week), format="%Y %U %u")) %>%select(week, Expected, Threshold, `Percent of Deaths Due to Pneumonia and Influenza`) %>%
gather(category, percent, -week) %>%
mutate(percent=percent/100) %>%
ggplot() +
geom_line(aes(week, percent, group=category, color=category)) +
scale_x_date(date_labels="%Y-%U") +
scale_y_continuous(label=scales::percent) +
ggthemes::scale_color_tableau(name=NULL) +
labs(x=NULL, y=NULL, title="U.S. Pneumonia & Influenza Mortality",
subtitle="Data through week ending December 10, 2016 as of December 28, 2016",
caption="Source: National Center for Health Statistics Mortality Surveillance System") +
hrbrmisc::theme_hrbrmstr(grid="XY") +
theme(legend.position="bottom")
That view is for all mortality events from both influenza and pneumonia. We can look at the counts for just influenza as well:
mutate(df, week=as.Date(sprintf("%s %s 1", Year, Week), format="%Y %U %u")) %>%select(week, `Influenza Deaths`) %>%
ggplot() +
geom_line(aes(week, `Influenza Deaths`), color=ggthemes::tableau_color_pal()(1)) +
scale_x_date(date_labels="%Y-%U") +
scale_y_continuous(label=scales::comma) +
ggthemes::scale_color_tableau(name=NULL) +
labs(x=NULL, y=NULL, title="U.S. Influenza Mortality (count of mortality events)",
subtitle="Data through week ending December 10, 2016 as of December 28, 2016",
caption="Source: National Center for Health Statistics Mortality Surveillance System") +
hrbrmisc::theme_hrbrmstr(grid="XY") +
theme(legend.position="bottom")
It’s encouraging that the overall combined mortality rate is trending downwards and that the mortality rate for influenza is very low. Go. Science.
I’ll be adding a function to cdcfluview to retrieve this new data set a bit later in the year.
Hopefully you’ll avoid the flu and enjoy a healthy and prosperous 2017.