Putting It All Together


The kind folks over at @RStudio gave a nod to my recently CRAN-released epidata package in their January data package roundup, and I thought it might be useful to give it one more showcase using the recent CRAN update to ggalt and the new hrbrthemes package (GitHub-only for now).

Labor force participation rate

The U.S. labor force participation rate (LFPR) is an oft-overlooked and under- or mis-reported economic indicator. I’ll borrow the definition from Investopedia:

The participation rate is a measure of the active portion of an economy’s labor force. It refers to the number of people who are either employed or are actively looking for work. During an economic recession, many workers often get discouraged and stop looking for employment, resulting in a decrease in the participation rate.

Population age distributions and other factors are necessary to honestly interpret this statistic. Parties in power usually dismiss/ignore this statistic outright and their opponents tend to wholly embrace it for criticism (it’s an easy target if you’re naive). “Yay” partisan democracy.

Since the LFPR has nuances when looked at categorically, let’s take a look at it by attained education level to see how that particular view has changed over time (at least since the gov-quants have been tracking it).

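We can easily grab this data with epidata::get_labor_force_participation_rate() (and we’ll set up some library() calls while we’re at it):

library(epidata)
library(tidyverse) # dplyr, tidyr and ggplot2 are all used below
library(stringi)
library(ggalt)
library(hrbrthemes) # devtools::install_github("hrbrmstr/hrbrthemes")

part_rate <- get_labor_force_participation_rate("e")

glimpse(part_rate)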
## Observations: 457
## Variables: 7
## $ date <date> 1978-12-01, 1979-01-01, 1979-02-01, 1979-03-01, 1979-04-01, 1979-05-01...
## $ all <dbl> 0.634, 0.634, 0.635, 0.636, 0.636, 0.637, 0.637, 0.637, 0.638, 0.638, 0...
## $ less_than_hs <dbl> 0.474, 0.475, 0.475, 0.475, 0.475, 0.474, 0.474, 0.473, 0.473, 0.473, 0...
## $ high_school <dbl> 0.690, 0.691, 0.692, 0.692, 0.693, 0.693, 0.694, 0.694, 0.695, 0.696, 0...
## $ some_college <dbl> 0.709, 0.710, 0.711, 0.712, 0.712, 0.713, 0.712, 0.712, 0.712, 0.712, 0...
## $ bachelor's_degree <dbl> 0.771, 0.772, 0.772, 0.773, 0.772, 0.772, 0.772, 0.772, 0.772, 0.773, 0...
## $ advanced_degree <dbl> 0.847, 0.847, 0.848, 0.848, 0.848, 0.848, 0.847, 0.847, 0.848, 0.848, 0...

One of the easiest things to do is to use ggplot2 to make a faceted line chart by attained education level. But, let’s change the labels so they are a bit easier on the eyes in the facets and switch the facet order from alphabetical to something more useful:

gather(part_rate, category, rate, -date) %>%
  mutate(category=stri_replace_all_fixed(category, "_", " "),
         category=stri_trans_totitle(category), # title-case so "Hs$" and the factor levels below match
         category=stri_replace_last_regex(category, "Hs$", "High School"),
         category=factor(category, levels=c("Advanced Degree", "Bachelor's Degree", "Some College",
                                            "High School", "Less Than High School", "All"))) -> part_rate

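To make the relabelling step concrete, here is a minimal sketch on a toy data frame. The `toy` and `toy_long` names are made up for illustration, and the values are the first two rows of three columns copied from the glimpse output above. It assumes the column names are title-cased with stringi::stri_trans_totitle() before the "Hs$" replacement, since that is what lets the regex and the factor levels match.

```r
library(dplyr)
library(tidyr)
library(stringi)

# Toy stand-in for part_rate: two dates, three of the seven columns
toy <- data.frame(
  date = as.Date(c("1978-12-01", "1979-01-01")),
  all = c(0.634, 0.634),
  less_than_hs = c(0.474, 0.475),
  advanced_degree = c(0.847, 0.847)
)

# Wide -> long, then clean up the category labels and fix their order
toy_long <- gather(toy, category, rate, -date) %>%
  mutate(
    category = stri_replace_all_fixed(category, "_", " "),
    category = stri_trans_totitle(category),
    category = stri_replace_last_regex(category, "Hs$", "High School"),
    category = factor(category,
                      levels = c("Advanced Degree", "Less Than High School", "All"))
  )

# The factor levels, not alphabetical order, now drive the facet order
print(levels(toy_long$category))
```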
Now, we’ll make a simple line chart, tweaking the aesthetics just a bit:

ggplot(part_rate) +
  geom_line(aes(date, rate, group=category)) +
  scale_y_percent(limits=c(0.3, 0.9)) +
  facet_wrap(~category, scales="free") +
  labs(x=paste(format(range(part_rate$date), "%Y-%b"), collapse=" to "),
       y="Participation rate (%)",
       title="U.S. Labor Force Participation Rate",
       caption="Source: EPI analysis of basic monthly Current Population Survey microdata.") +
  theme_ipsum_rc(grid="XY", axis="XY")

The “All” view is interesting in that the LFPR has held fairly “steady” between 60% & 70%. Those individual and fractional percentage points actually translate to real humans, so the “minor” fluctuations do matter.

It’s also interesting to see the direct contrast between the starting historical rate and the current rate (you could also do the same with min/max rates, etc.). We can use a “dumbbell” chart to compare the 1978 value to today’s value, but we’ll need to reshape the data a bit first:

group_by(part_rate, category) %>%
  arrange(date) %>%
  slice(c(1, n())) %>%
  spread(date, rate) %>%
  ungroup() %>%
  filter(category != "All") %>%
  mutate(category=factor(category, levels=rev(levels(category)))) -> rate_range

filter(part_rate, category=="Advanced Degree") %>%
  arrange(date) %>%
  slice(c(1, n())) %>%
  mutate(lab=lubridate::year(date)) -> lab_df

(We’ll be using the extra data frame to add labels to the chart.)
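
For a sense of what the reshape produces, here is a small sketch on toy data. The `toy`, `first_last` and `min_max` names are hypothetical, and the values are the first three months of two columns from the glimpse output earlier in the post. It keeps the first and last observation per category and spreads them into one column per date, then shows the min/max variant mentioned above as an alternative.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: three months for two categories
toy <- data.frame(
  date = as.Date(rep(c("1978-12-01", "1979-01-01", "1979-02-01"), times = 2)),
  category = rep(c("All", "High School"), each = 3),
  rate = c(0.634, 0.634, 0.635, 0.690, 0.691, 0.692)
)

# First and last observation per category, one column per date.
# The date values become the column names, which is why the chart
# code has to refer to them with backticks.
first_last <- toy %>%
  group_by(category) %>%
  arrange(date) %>%
  slice(c(1, n())) %>%
  ungroup() %>%
  spread(date, rate)

# Min/max variant: one row per category with the extremes instead
min_max <- toy %>%
  group_by(category) %>%
  summarise(min_rate = min(rate), max_rate = max(rate))

print(first_last)
print(min_max)
```

The `min_max` frame could feed geom_dumbbell() the same way, with x=min_rate and xend=max_rate.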

Now, we can compare the various ranges, once again tweaking aesthetics a bit:

ggplot(rate_range) +
  geom_dumbbell(aes(y=category, x=`1978-12-01`, xend=`2016-12-01`),
                size=3, color="#e3e2e1",
                colour_x = "#5b8124", colour_xend = "#bad744",
                dot_guide=TRUE, dot_guide_size=0.25) +
  geom_text(data=lab_df, aes(x=rate, y=5.25, label=lab), vjust=0) +
  scale_x_percent(limits=c(0.375, 0.9)) +
  labs(x=NULL, y=NULL,
       title=sprintf("U.S. Labor Force Participation Rate %s-Present", lab_df$lab[1]),
       caption="Source: EPI analysis of basic monthly Current Population Survey microdata.") +
  theme_ipsum_rc(grid="X") # assumed: the closing layer was lost in extraction; mirrors the theme used in the first chart

One takeaway from both these charts is that it’s probably important to take education level into account when talking about the labor force participation rate. The get_labor_force_participation_rate() function, along with most other functions in the epidata package, also has options to factor the data by sex, race and age, so you can pull in all those views to get a more nuanced & informed understanding of this economic health indicator.
