One More Thing on NBER Report: Where did pre-2011 data come from?

Earlier today I critiqued the new National Bureau of Economic Research (NBER) report on “The Returns to Online Postsecondary Education”, calling it “a hot mess that conflates online students, enrollments, programs, institutions and uses a bizarre and misleading data set for its analysis”. That post looked mostly at the interpretation of Integrated Postsecondary Education Data System (IPEDS) data to define which institutions were exclusively online or substantially online based on the author’s definition. But this report has a bigger flaw that I initially missed – it shows historical data on online enrollments from the late 1990s and claims to analyze in detail online enrollment data from 2009 – 2014, yet IPEDS did not collect distance education data until 2011.

In the data section, the author states:

For data on the share of an institution’s courses that are taken online, I rely on the National Center for Education Statistics’ Integrated Postsecondary Education Data System (IPEDS). This is a data system to which nearly all postsecondary institutions must mandatorily report. IPEDS is also the source of numerous other institution-level, as opposed to student-level, variables: Pell Grant revenue, total undergraduate student loans, total enrollment, and so on. IPEDS asks postsecondary schools the following:

(1) Are all programs at your institution offered exclusively via distance education? (2) How many degree/certificate-seeking undergraduates are (a) enrolled exclusively in distance education courses, (b) enrolled in some but not all distance education courses, (c) not enrolled in any distance education course? (3) Repeat question (2) for non-degree/certificate-seeking undergraduates and for graduate students.

Figures 1 and 2 show historical online enrollment data.

Figure 1

Figure 2

The problem is that IPEDS did not add the distance education data referenced in the data description above until the Fall 2012 term. Prior to that term, there were only two other sources of data from the National Center for Education Statistics (NCES), the host organization for IPEDS and part of the Education Department.

  • Starting in 2011, IPEDS collected data on institutions for which “All programs offered completely via distance education”. This was augmented with the broader distance education data starting Fall 2012.
  • There were two reports – one in 2008 and one in 2011 – done by NCES that looked at survey data from the National Postsecondary Student Aid Study, which was conducted during the 1999–2000, 2003–04, and 2007–08 academic years. These were not part of IPEDS, and there were data for only those three specific years.

Yet figures 1 and 2 show data back to the late 1990s with annual updates. Figure 3 from the NBER report makes it even clearer that the report claims to have online enrollment data for every year within the study.

Figure 3

The only annual data source that I know of with online education enrollments is the Babson Survey Research Group, which tracked the number of students taking at least one class online since 2003. In recent years, BSRG has changed to use the IPEDS data set instead of its survey. From the BSRG Grade Change report from January 2014:


Yet this could not have been the source of NBER data as it did not differentiate between exclusively online and substantially online, and the data shows far higher enrollments than does the NBER historical data.

Where did the historical data for the NBER report come from? The only reference to different data is this one comment on p. 8 [emphasis added]:

The descriptive statistics shown in the next section focus on students who were enrolled in 2013 so as to represent online education as its most current. (Descriptive statistics based on earlier years are available from the author.)

Checking the references in the NBER paper turns up a set of three papers authored by Deming et al. that might give a clue.

Reference section

The first reference is for a forthcoming section to be published in a volume edited by Caroline Hoxby, author of this NBER report. The second reference (2015) does not share historical data but does have this footnote:

1 IPEDS has collected online enrollment data at the campus level since 2012

Good to see that acknowledgement. Initially I discounted the third reference (2012) as it was on for-profit institutions and not online education more broadly. But maybe we should look there. In this paper we find the following chart.

Deming For-profit

What the 2012 paper did was to look at IPEDS data for for-profit institutions, then it applied the heuristic filter:

A for-profit institution is classified as “online” if it has the word online in its name or if not more than 33 percent of the school’s students are from one U.S. state.

The “online Appendix” with further details mostly shares regression models and student loan default data while confirming the interpretation of what is an online institution.
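To make the quoted heuristic concrete, here is a minimal Python sketch of how such a filter might be applied. This is not the Deming et al. code; the function name, the input format (a mapping from state to student count), and the example institutions are all hypothetical, and treating a school with no enrollment data as "not online" is my own assumption.

```python
def is_online_for_profit(name: str, students_by_state: dict[str, int]) -> bool:
    """Sketch of the quoted heuristic: a for-profit is "online" if its name
    contains the word online, or if no single U.S. state accounts for more
    than 33 percent of its students."""
    if "online" in name.lower():
        return True
    total = sum(students_by_state.values())
    if total == 0:
        # Assumption (not in the source): no enrollment data, so not classified as online.
        return False
    top_state_share = max(students_by_state.values()) / total
    return top_state_share <= 0.33


# Hypothetical institutions, for illustration only.
by_name = is_online_for_profit("Example Online University", {"AZ": 900, "CA": 100})
one_state = is_online_for_profit("Example Career College", {"OH": 800, "KY": 200})
spread_out = is_online_for_profit(
    "Example National College", {"TX": 300, "CA": 300, "FL": 200, "NY": 200}
)
```

Note what the filter uses: the institution's name and the geographic spread of its students. Nothing in it measures how many students actually take courses online, which is why it is at best a proxy.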

This is the closest I could find to historical data based on IPEDS that might support 1999/2000 through 2011 enrollment data for online education. But it is only for the for-profit sector, and the data does not match that shown in figure 3.

This means there are three timeframes of data:

  • 1999 / 2000 thru 2008 – used only in figures 1 – 3 to set the historical context for the study and claims of explosive growth after 2005 (despite the graphs not supporting this claim directly; figure 1 growth really starts in 2008, and figure 2 growth is fairly consistent through 2008). There is no apparent source for this data, but the closest guess is some variation of the Deming heuristics.
  • 2009 thru 2011 – used directly in the analysis that claims to analyze “longitudinal data on nearly every person who engaged in postsecondary education that was wholly or substantially online between 1999 and 2014”. There is no apparent source for the enrollment part of this data.
  • 2012 – 2014 – used directly in the analysis and for which IPEDS data existed. 2013 data analyzed in detail in the data tables.
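The coverage gap can be sketched as a simple lookup of which NCES sources, as described in this post, exist for a given year. The function and label names are hypothetical, and mapping each National Postsecondary Student Aid Study academic year to its starting calendar year is a simplifying assumption.

```python
# Academic years 1999-2000, 2003-04, 2007-08, keyed by starting year (assumption).
NPSAS_YEARS = {1999, 2003, 2007}

FLAG = "IPEDS all-programs-via-distance-education flag"
ENROLLMENT = "IPEDS distance education enrollment counts"
NPSAS = "NPSAS survey reports (not part of IPEDS)"


def nces_sources(year: int) -> list[str]:
    """Return the NCES sources of online enrollment data for a year,
    following the dates given in this post."""
    sources = []
    if year in NPSAS_YEARS:
        sources.append(NPSAS)
    if year >= 2011:
        sources.append(FLAG)
    if year >= 2012:
        sources.append(ENROLLMENT)
    return sources


# Years in the 2009-2014 analysis window without IPEDS enrollment counts.
analysis_gap = [y for y in range(2009, 2015) if ENROLLMENT not in nces_sources(y)]

# Years in the 1999-2008 historical window with no NCES source at all.
historical_gap = [y for y in range(1999, 2009) if not nces_sources(y)]
```

Under these assumptions, `analysis_gap` covers 2009 through 2011 and `historical_gap` covers every year from 1999 through 2008 except the three survey years, which matches the first two timeframes listed above.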

If I am missing something and there is a source of reliable IPEDS data on online education pre-2011, I would love to see it. And I will update this post accordingly. But until then, this is troublesome.

About Phil Hill

Phil is a consultant and industry analyst covering the educational technology market primarily for higher education. He has written for e-Literate since Aug 2011. For a more complete biography, view his profile page.
This entry was posted in Big Picture, Business & Economics, Policy, Research.
