Efficacy, Adaptive Learning, and the Flipped Classroom, Part II

In my last post, I described positive but mixed results of an effort by MSU’s psychology department to flip and blend their classroom:

On the 30-item comprehensive exam, students in the redesigned sections performed significantly better (84% improvement) compared to the traditional comparison group (54% improvement).

Students in the redesigned course demonstrated significantly more improvement from pre to post on the 50-item comprehensive exam (62% improvement) compared to the traditional sections (37% improvement).

Attendance improved substantially in the redesigned section. (Fall 2011 traditional mean percent attendance = 75% versus fall 2012 redesign mean percent attendance = 83%)

They did not get a statistically significant improvement in the number of failures and withdrawals, which was one of the main goals of the redesign, although they note that “it does appear that the distribution of A’s, B’s, and C’s shifted such that in the redesign, there were more A’s and B’s and fewer C’s compared to the traditional course.”

In terms of cost reduction, while they fell short of their 17.8% goal, they did achieve a 10% drop in the cost of the course….

It’s also worth noting that MSU expected to increase enrollment by 72 students annually but actually saw a decline of enrollment by 126 students, which impacted their ability to deliver decreased costs to the institution.

Those numbers were based on the NCAT report that was written up after the first semester of the redesigned course. But that wasn’t the whole story. It turns out that, after several semesters of offering the course, MSU was able to improve their DFW numbers after all:

That’s a fairly substantial reduction. In addition, their enrollment numbers have returned to roughly what they were pre-redesign (although they haven’t yet achieved the enrollment increases they originally hoped for).

When I asked Danae Hudson, one of the leads on the project, why she thought it took time to see these results, here’s what she had to say:

I do think there is a period of time (about a full year) where students (and other faculty) are getting used to a redesigned course. In that first year, there are a few things going on 1) students/and other faculty are hearing about “a fancy new course” – this makes some people skeptical, especially if that message is coming from administration; 2) students realize that there are now a much higher set of expectations and requirements, and have all of their friends saying “I didn’t have to do any of that!” — this makes them bitter; 3) during that first year, you are still working out some technological glitches and fine tuning the course. We have always been very open with our students about the process of redesign and letting them know we value their feedback. There is a risk to that approach though, in that it gives students a license to really complain, with the assumption that the faculty team “doesn’t know what they are doing”. So, we dealt with that, and I would probably do it again, because I do really value the input from students.

I feel that we have now reached a point (2 years in) where most students at MSU don’t remember the course taught any other way and now the conversations are more about “what a cool course it is etc”.

Finally, one other thought regarding the slight drop in enrollment we had. While I certainly think a “new blended course” may have scared some students away that first year, the other thing that happened was there were some scheduling issues that I didn’t initially think about. For example, in the Fall of 2012 we had 5 sections and in an attempt to make them very consistent and minimize missed classes due to holidays, we scheduled all sections on either a Tuesday or a Wednesday. I didn’t think about how that lack of flexibility could impact enrollment (which I think it did). So now, we are careful to offer sections (Monday through Thursday) and in morning and afternoon.

To sum up, she thinks there were three main factors: (1) it took time to get the design right and the technology working optimally; (2) there was a shift in cultural expectations on campus that took several semesters; and (3) there was some noise in the data due to scheduling glitches.

There are a number of lessons one could draw from this story, but from the perspective of educational efficacy, I think it underlines how little the headlines (or advertisements) we get really tell us, particularly about components of a larger educational intervention. We could have read, “Pearson’s MyPsychLabs Course Substantially Increased Students’ Knowledge, Study Shows.” That would have been true, but we have little idea how much improvement there would have been had the course not been fairly radically redesigned at the same time. We also could have read, “Pearson’s MyPsychLabs Course Did Not Improve Pass and Completion Rates, Study Shows.” That would have been true, but it would have told us nothing about the substantial gains over the semesters following the study. We want talking about educational efficacy to be like talking about the efficacy of Advil for treating arthritis. But it’s closer to talking about the efficacy of various chemotherapy drugs for treating a particular cancer. And we’re really, really bad at talking about that kind of efficacy. I think we have our work cut out for us if we really want to be able to talk intelligently and intelligibly about the effectiveness of any particular educational intervention.

By Michael Feldstein

Comments

mikecaulfield says

April 11, 2014 at 12:58 PM

I’d actually say it’s closer to talking about efficacy for lifestyle interventions like diets and physical therapy paired with behavior changes. But point taken.

Couple things about these numbers. First, this pattern, where the C’s push into B’s but the F’s remain stable, is one of the more common patterns you see in these interventions. My sense is this is because the problems F students are suffering from are often different in nature (not just amount) from what C students are struggling with. And the fascinating thing to me here is that while there is a bit of a ding into the F’s, it doesn’t look terribly big to me.

From 2011 to 2013 the big story looks like withdrawals. If that’s robust, it’s interesting, because there’s possibly something really simple going on here: a lot of students withdraw because they hit max absences, to the point where attendance policy impacts their grade. Another bunch of students withdraw because they miss enough classes that their grade is impacted for non-policy reasons.

And what Danae says about enrollment is actually interesting when you think about absence-based withdrawal. Imagine two classes where absence is completely random. One has 30 meetings, and one has 27. In a purely random process, students in the 30-meeting class are more likely to have three absences than students in the 27-meeting class, because there are three more chances to be absent. Modelling the difference with a Tue/Thu vs. MWF split doesn’t really change it: the Tue/Thu class has fewer chances to miss, but each miss counts more, so there’s no free lunch. Only the class with Monday and Friday sessions gets free lunch.

But conversely, in a blended course, there are fewer chances to be absent. So you’d predict less absence-based withdrawal in a blended course.
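The arithmetic behind this point can be sketched roughly as follows. This is only an illustration: it assumes absences are independent with the same probability at every meeting, the 10% per-meeting absence rate is made up, and the 15-meeting "blended" schedule is a hypothetical figure, not MSU's actual schedule.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more absences in n meetings."""
    # Subtract the probability of having fewer than k absences from 1.
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


P_ABSENT = 0.10  # assumed per-meeting absence probability (illustrative only)
THRESHOLD = 3    # absences at which attendance starts to hurt the grade (assumed)

# 30 and 27 meetings come from the example above; 15 is a hypothetical blended schedule.
for meetings in (30, 27, 15):
    chance = prob_at_least(THRESHOLD, meetings, P_ABSENT)
    print(f"{meetings} meetings: P(at least {THRESHOLD} absences) = {chance:.3f}")
```

Under these assumptions the probability of reaching the threshold falls as the number of meetings falls, which is the direction the argument predicts. Real absences are unlikely to be independent, so the sizes of the gaps should not be read as estimates.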

This isn’t to say the figures are wrong; quite the opposite. It’s to say that the pattern of DFWs could indicate that blended’s capability to reduce frequency of absence is one of the prime drivers of the DFW success.

So it makes me wonder how much of that withdrawal bump in 2012 dealt purely with scheduling issues (putting the classes on Tuesday/Thursday), and how much of the drop in W’s is also essentially due to scheduling benefits of blended? It’d be interesting to look at a metric like “number of students with three absences or more three quarters into the semester” and see how that tracks with withdrawal rates.

I meant to leave a small comment, but this has turned into a stream of consciousness meditation. In any case, I’d be fascinated to see some absence data from this intervention.
mikecaulfield says

April 11, 2014 at 3:11 PM

Ah, sorry, here’s the metric I’d like to see comparing control to intervention: percentage of W’s with more than one week’s worth of absences. If it stays stable or increases, something else is driving the drop in W’s; if it decreases, then the effect is possibly driven by the reduced impact/occurrence of absenteeism on student performance.
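One way that metric might be computed, as a minimal sketch. The `Student` record, its field names, and the sample rows are all hypothetical and are not data from the MSU course; "one week's worth" is taken here to mean the number of meetings per week, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Student:
    absences: int   # class meetings missed (hypothetical field)
    withdrew: bool  # True if the student ended the course with a W


def share_of_ws_with_heavy_absence(
    students: Sequence[Student], meetings_per_week: int
) -> Optional[float]:
    """Fraction of withdrawing students who missed more than one week's worth of meetings."""
    withdrawals = [s for s in students if s.withdrew]
    if not withdrawals:
        return None  # metric is undefined when nobody withdrew
    heavy = sum(1 for s in withdrawals if s.absences > meetings_per_week)
    return heavy / len(withdrawals)


# Made-up records for illustration only.
control = [Student(5, True), Student(4, True), Student(0, True), Student(1, False)]
redesign = [Student(3, True), Student(0, True), Student(2, False)]

print(share_of_ws_with_heavy_absence(control, meetings_per_week=2))
print(share_of_ws_with_heavy_absence(redesign, meetings_per_week=2))
```

Comparing the two shares across the traditional and redesigned sections would be the test described above: a lower share in the redesign would be consistent with absences driving fewer withdrawals.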

These are really exciting results, by the way. Thanks Danae for sharing them.
Michael Feldstein says

April 11, 2014 at 4:04 PM

As usual, Mike, you have a great eye for these things. The point about the F’s not moving has come up a lot in the Purdue Signals data too, for example, but the withdrawal question you raise is super-interesting. It’s become a truism that hybrid courses are “more effective” than either F2F or fully online (which usually means better DFW scores), but I’ve yet to hear a convincing data-driven argument for why hybrid could get better results across a wide range of class implementations. It suggests something structural, and you’ve just pointed to a good candidate.
