There’s something that drives me a little crazy when I hear that someone has learned from Netflix, Amazon, Google, etc., and that they’re going to be the Netflix, Amazon, or Google of education. Actually, there are a few things.
There is a natural tendency to want to leverage the work of others. In the burgeoning space of learning analytics many look to those who have large data sets and algorithms to extract some kind of meaning from those massive stores of information. We should always be looking to leverage when we can! This time, though, our takeaways are limited.
What are Our Outcomes?
Amazon, Netflix, etc., have an outcome. Let’s call it “profit” for lack of a better term. They run data analysis to make suggestions, the way Netflix recommends movies to its users. Why? The goal is to show you something you didn’t think of that you’d like, which then increases interaction and loyalty, leading to profit. So long as they get that right even 1 in 100 times, it’s worth their while to run their analysis and make the suggestion. For someone doing data analysis, this is a gift. Any data mining algorithm wants something to optimize for. Let’s say in this case, it’s “$”.
Amazon can ask their algorithm “I did X, did they give us more $?” Netflix can ask their data “I did Y, did I get more $?” Google can ask “I did Z, did I get more $?” They are measuring not only the outcomes directly (more $) but also the inputs (X, Y, Z: whatever changes they introduce to their products).
We don’t have it that easy in education if we want to do it right (assuming we’re not looking just for more $). Measuring learning is hard. As Michael Feldstein discusses in his post A Taxonomy of Adaptive Analytics Strategies, “Since doing good learning analytics is hard, we often do easy learning analytics and pretend that they are good instead.”
I’ll be the first to say there is good work going on in early warning systems based on click data. The outcome is yes/no – does the student stay? They’re not measuring if students understand the material, just “do they stick it through.” Don’t get me wrong – every student kept in school via an early warning system is an achievement. But it’s not the big promise of learning analytics. It’s an achievement of website analytics very similar to those studied by the commerce sites. The more a user stays engaged with their sites, the more profit they generate. The comparisons to those kinds of analytics pretty much end there. Unfortunately for those looking for the easy path, our outcomes are complex and the inputs aren't actually that obvious either. Let's talk about the real promise of learning analytics.
We need to measure learning the entire time students are engaged in learning activities, and track how we believe those activities should be shaping the experience.
Explanatory vs. Predictive
How does a good teacher know if a student is struggling with a concept in the classroom? We hope that they recognize signs of difficulty while reviewing practice work or are asked for assistance by the student (feedback, hints). If learning analytics are going to provide useful feedback, then we should be capturing that feedback and those requests for help. A click stream tells me if a student is using material but not why, or what that interaction ought to achieve. A student might skip problems they already know – their lack of answers in a particular part of the course is not itself evidence of a lack of understanding. Similarly, a student can struggle while working very hard to understand a concept. Their mere frequency of interaction does not in any way imply instructional success. Knowing only their clicks, or visits, tells us nothing about their intent, when they wanted help, if they got that help, or what feedback they were given (or should have been given). Consider the following questions one might want to ask:
- How often is the student getting questions right on the first try?
- Do they eventually get them correct?
- How often are they asking for help?
- Do expert teachers rate this skill as generally difficult?
Answering these (and many more such) questions requires semantic data that we need to be collecting and cannot collect with a mere click stream. When Feldstein refers to Semantic Analytics, he points the finger at the algorithms. The problem is also the lack of semantic data for those algorithms to take advantage of. What does the difference in that data look like? This is an example of what I mean:
- Student x clicked y at time z
- With semantic data, we can store:
  - Student x has asked for his second hint on part three of the question “What are the five steps of this program?” and he was told “Recall that you need to identify a base case for your function.” The correct answer will be “line 5”. The question is related to the skill “recursive base case” and is often mistakenly answered as “line 4” due to a common misconception.
- Student x' clicked y' at time z'
- With semantic data, we can store:
  - Student x' has now selected “line 5”, which is correct. The student was given the feedback “You are correct, line 5 is the base case”. This was her third try, though it is the first question about the skill “recursive base case” she attempted to answer, even though it's the third related question in the material.
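To make the contrast concrete, here is a minimal sketch of the two kinds of record as Python data classes. Every name here (`ClickEvent`, `SemanticEvent`, the field names, the question identifier, the timestamps) is my own illustrative assumption and not the schema of any particular system; the point is only how much more the second record lets you ask.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClickEvent:
    """All a plain click stream knows: who clicked what, and when."""
    student_id: str
    target: str
    timestamp: float


@dataclass(frozen=True)
class SemanticEvent:
    """Hypothetical record that also carries the meaning of the interaction."""
    student_id: str
    timestamp: float
    question_id: str
    part: int
    skill: str
    action: str                      # "hint_request" or "answer"
    attempt_number: int
    hint_number: Optional[int] = None
    response: Optional[str] = None
    correct: Optional[bool] = None
    feedback_shown: Optional[str] = None
    expected_answer: Optional[str] = None
    common_wrong_answer: Optional[str] = None


# "Student x clicked y at time z"
click = ClickEvent(student_id="x", target="y", timestamp=1000.0)

# The same moment, recorded semantically: a second hint request.
hint = SemanticEvent(
    student_id="x",
    timestamp=1000.0,
    question_id="base-case-question",   # made-up identifier
    part=3,
    skill="recursive base case",
    action="hint_request",
    attempt_number=1,
    hint_number=2,
    feedback_shown="Recall that you need to identify a base case for your function",
    expected_answer="line 5",
    common_wrong_answer="line 4",
)

# A correct answer on the third try.
answer = SemanticEvent(
    student_id="x_prime",
    timestamp=2000.0,
    question_id="base-case-question",
    part=3,
    skill="recursive base case",
    action="answer",
    attempt_number=3,
    response="line 5",
    correct=True,
    feedback_shown="You are correct, line 5 is the base case",
    expected_answer="line 5",
)
```

Note that most of the semantic fields (the skill, the hint text, the common wrong answer) cannot be observed from the click; someone has to author them alongside the content.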
Not only do we need data about interaction, but we need the content itself to identify these in meaningful ways. If nobody tells the system what the hints and feedback mean, what skills the targeted interaction is meant to address, etc., the algorithms can’t make any reasonable estimations of learning. It’s beyond the capacity of a simple algorithm to identify what these clicks mean without guidance of better data. We can go further.
- When do students ask for a specific hint and what do we know about the misconception they’re exhibiting?
- Does one hint for a particular question provide enough guidance or are more hints needed?
- After selecting an answer and being told “That’s not quite right, because…” do they then answer correctly?
To answer these kinds of questions you have to have a design process that not only creates these targeted hints and feedback, but allows the system to semantically record each and every selection of students interacting with the material as described. Only then can you ask the questions above of the data you’re collecting with any hope of a meaningful result. If there are no targeted hints that students can ask for, if there is no targeted feedback, if there is no well-designed question, there is no semantic data.
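As a rough illustration of asking such questions of semantic data, the sketch below computes three of the simpler measures per student: how often they are right on the first try, whether they eventually get a question right, and how many hints they request. The record layout (`student`, `question`, `action`, `correct`) and the sample events are hypothetical, and events are assumed to arrive in chronological order.

```python
from collections import defaultdict

# Hypothetical semantic events, in chronological order.
events = [
    {"student": "s1", "question": "q1", "action": "answer", "correct": False},
    {"student": "s1", "question": "q1", "action": "hint"},
    {"student": "s1", "question": "q1", "action": "answer", "correct": True},
    {"student": "s1", "question": "q2", "action": "answer", "correct": True},
    {"student": "s2", "question": "q1", "action": "hint"},
    {"student": "s2", "question": "q1", "action": "hint"},
    {"student": "s2", "question": "q1", "action": "answer", "correct": False},
]


def summarize(events):
    """Per-student first-try rate, eventually-correct rate, and hint count.

    Only students with at least one answer appear in the result.
    """
    attempts = defaultdict(list)   # (student, question) -> ordered correct flags
    hints = defaultdict(int)       # student -> number of hint requests
    for event in events:
        if event["action"] == "answer":
            key = (event["student"], event["question"])
            attempts[key].append(event["correct"])
        elif event["action"] == "hint":
            hints[event["student"]] += 1

    counts = defaultdict(lambda: {"questions": 0, "first_try": 0, "eventual": 0})
    for (student, _question), flags in attempts.items():
        row = counts[student]
        row["questions"] += 1
        row["first_try"] += int(flags[0])    # right on the first attempt?
        row["eventual"] += int(any(flags))   # right on any attempt?

    return {
        student: {
            "first_try_rate": row["first_try"] / row["questions"],
            "eventually_correct_rate": row["eventual"] / row["questions"],
            "hint_requests": hints[student],
        }
        for student, row in counts.items()
    }


summary = summarize(events)
```

None of these numbers can be recovered from “student x clicked y at time z”; they exist only because each event says what kind of interaction it was and whether the response was correct.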
What can we do when we are empowered with this sort of semantic data and analysis? Here are just some examples:
- Provide real-time feedback to teachers about how groups of students and individuals are performing while they learn, before summative exams or projects arrive
- Provide guidance on where specifically in the course students are struggling
- Use semantic analyses like learning curve analysis to identify areas where content needs to be improved
This last one in particular is important. A lot of times we are tempted to assume that whatever content we create simply works. This is true even in the traditional classroom where a teacher prepares and delivers a lecture. Was it an effective lecture? We might be able to decide if students liked the lecture, but we have the same data collection problem as the click stream. All we can reliably know is whether the student received the lecture. In reality, we want to know in what areas we are succeeding at imparting knowledge, and to accept the fact that not everything is great right at the start. Without semantic data and algorithms you’re forced to assume effective content and that when the student receives it, it did what it was supposed to (whatever that was) effectively (whatever that means). With so many variables, we must make assertions that can then be verified or debunked by analyses.
It’s tempting to try to work around this metric problem by using a summative evaluation as the metric, i.e., if they pass this test, then clearly all the stuff before must have done what we wanted. (And even then, if they don't, there's no good information on why they didn't.) This is not much better than saying the SAT is an accurate reflection of an individual’s skills as a whole and their educational experience up to exam time. We want an approach that uses the formative work of students to give us insights. If our materials are working, then the formative ought to simply give the same results and the summative ought to be perfunctory.
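One simple form of learning curve analysis, sketched here under my own assumptions, is to compute the error rate on a skill at each student's first, second, third (and so on) opportunity to practice it. If the content is working, the curve should fall as opportunities accumulate; a flat curve suggests the material for that skill needs attention. The data layout and sample observations below are hypothetical, and each observation is assumed to be a first attempt at one practice opportunity, listed in chronological order.

```python
from collections import defaultdict

# Hypothetical observations: (student, skill, first attempt correct?)
observations = [
    ("s1", "recursive base case", False),
    ("s2", "recursive base case", False),
    ("s1", "recursive base case", False),
    ("s2", "recursive base case", True),
    ("s1", "recursive base case", True),
    ("s2", "recursive base case", True),
]


def learning_curve(observations, skill):
    """Return the error rate at the 1st, 2nd, 3rd, ... opportunity for a skill."""
    seen = defaultdict(int)     # student -> opportunities so far on this skill
    errors = defaultdict(int)   # opportunity number -> incorrect first attempts
    totals = defaultdict(int)   # opportunity number -> students observed
    for student, observed_skill, correct in observations:
        if observed_skill != skill:
            continue
        seen[student] += 1
        opportunity = seen[student]
        totals[opportunity] += 1
        if not correct:
            errors[opportunity] += 1
    return [errors[n] / totals[n] for n in sorted(totals)]


curve = learning_curve(observations, "recursive base case")
# curve == [1.0, 0.5, 0.0]: error rate falls with each opportunity
```

This only works because each observation is tagged with a skill; a click stream has no notion of which opportunities belong together.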
Impact of Errors in Analytics
The ultimate goal of learning analytics must be providing actionable feedback that can be given to students and instructors during the run of a course. This is already possible technologically and pedagogically. As this methodology becomes more readily available it will become expected. This isn’t a pipe dream. It mandates moving beyond the goal-post metrics of summative exams and click-stream data, which allow us, at best, only to look back and say “we weren't entirely successful in supporting students last semester, so we’ll take a guess this semester at improving things for students next semester and see what happens.” We can and ought to do better.
But this has to be done with a high degree of accuracy. If Amazon gives me a recommendation I don’t agree with (which it actually does fairly often for me), is there any real harm? If I buy some gifts for a friend and Amazon uses that to suggest other purchases for me that I don't want, I don’t leave Amazon because I see a poor recommendation. When they make a particularly good one, their profit metric goes up a tick. No harm, no foul, and so it is worth Amazon making the occasional poor recommendation to capitalize on the good ones.
What if we’re not good at these predictive models? It’s not as neat and tidy as recommendations. If we misjudge whether a student is at risk of dropping out, the two possible negative effects are that we intervene with someone who is not actually at risk (not a terribly bad thing) or we miss an at-risk student we would otherwise hope to identify (not optimal, but no different from if we weren’t trying).
Now what happens if we tell a student they aren’t achieving learning outcomes when in fact we are wrong about that? The potential for demotivating the student comes at a high cost. This could happen with errors in reporting the other way, as well. If learning analytics inform a student they are succeeding but in fact they are not prepared for their next exam or job, the disservice is just as bad. Getting learning analytics wrong on the learning dimension is a recipe for disaster; the work must be done carefully and with understanding. Without that semantic ability to understand what is happening, we won’t even know if we’re doing harm to our students by using algorithms to optimize for things we don’t understand.
The way to take advantage of online learning technologies has to include timely, reliable, prescriptive information for those engaged in the learning experience (students and their instructors) as well as a rich semantic data set that lets learning engineers improve those resources continually. The only way to do this is to have algorithms and data with explanatory capability, so that they can give guidance to each user group on what to do next. This means developing rich models that encapsulate the intent of online learning content, and well-instrumented learning environments that provide large sets of meaningful data to feed these analyses. Click streams provide retention data, and this is being used successfully. Now we need to recognize that the next step is in fact a harder one to take, but well worth it for everyone involved.