Whose Data Is It Anyway?

I admit: I don’t read Terms of Service agreements before hitting the “Accept” button. I doubt many folks do, save the lawyers who actually write them. As such, it’s hard for me to write an article wagging my finger at those of us who adopt software only to realize later that it has quite onerous terms – that we’ve granted someone an irrevocable license to our content, that we’ve agreed to have our data mined and sold to third parties.

Ignorantia juris non excusat. Caveat emptor. Et cetera. Et cetera. Et cetera.

Moreover, I write this article on the heels of several rants penned last week about my inability to access the Twitter archives for the ISTE12 hashtag. “I can’t even export my own Twitter archives!” I tweeted in frustration. (Yes, I realize the irony.) But I do love Twitter; it’s one of my favorite tools. Nonetheless, I’ve started keeping my own personal Twitter archive (thanks ifttt!), and more broadly, I’m trying to make sure all the other services I rely on let me export and/or host my data, allow me to opt out of data-mining, and offer me user-friendly terms.

Easier said than done – in consumer as in ed-tech.

I have been writing a lot lately about the lack of data portability in educational software, and I do wonder how much of our data frustrations here are technological problems (i.e., no APIs, no exports), legal/TOS problems (i.e., restrictions or the lack thereof on data usage), and how much is simply ignorance or indifference on the part of consumers (and/or deviousness on the part of companies, I suppose) about both the technological and the legal ramifications of poor data interoperability.

Many of the discussions surrounding improved data portability in educational software – such as the efforts by startups like Clever and LearnSprout, both of which are building APIs to connect SISes to third-party applications, and by initiatives like the Shared Learning Collaborative – focus on what I’d call administrative data: student records, grades, demographics, enrollment, test scores. But by encouraging interoperability here, I should note, the data flows from system to system, not necessarily from system to user (to their own data locker perhaps).

Left out of the discussion of data portability, by and large: user-generated data. (Students’ and teachers’) Blog posts, videos, comments, discussions, assignments, papers, projects, experiments, and all the analytics therein.

In other words, while LMSes, SISes, app developers, game-makers, gradebooks, and the like might enable the import and export, the GET and PUT, of student data by schools, teachers, APIs, and apps, few offer the download (or API access) of student data by students.

Who speaks for user control of user data in education software?
How do we do a better job of protecting user-generated content, particularly student-generated content – so that students can control their own data, their own projects (and, in the end, their own learning)?
How should user-generated edu content be licensed? (By extension, how do we help people understand copyrights and Creative Commons?)
How do “work-for-hire” and other copyright controls impact how professor/teacher/grad-student-generated content is treated (shared, stored, monetized, etc)?
What can we glean about sites’ missions, monetization plans, and more based on their TOS statements about user-generated content? And what can we glean about employment contracts based on their statements about employee-generated content? And then what?
How do we get folks to read the “fine print”?

A recent case study: Eduwonk’s Andrew Rotherham highlighted the different language in the TOS for the American Federation of Teachers’ newly launched Sharemylesson.com and some of the other lesson-sharing websites (namely BetterLesson). He writes,

SML and AFT President Randi Weingarten are correct that teachers retain ownership of content they put up on SML. That’s Randi’s main talking point. But it’s only half the story. The other half is the part where a participant gives up their rights to the content and SML can use it however it wants in perpetuity. The terms of use are unambiguous about that. So teachers sharing lessons and content on SML do not retain exclusive ownership. That’s a big deal. Weingarten says SML won’t use teacher generated content to make money. But the terms don’t explicitly say that, don’t preclude it, and have strong language that SML can do what it wants with the content.

As the NEA itself has noted, IP issues surrounding teacher-generated content are “complicated.” Add to that the length and legalese of many online TOS agreements, and it’s not at all easy, I’d argue, for the layman to ascertain who owns what or what the implications are for uploading one’s data or accessing others’.

Alongside the calls in general for better data portability in education software, I’d add that we need to look critically at data ownership. That means cracking open the TOS agreement; it means re-evaluating intellectual property rights. And hopefully it means that when we talk about extracting data, we’re not simply talking about extracting value from our students’ and faculty’s user-generated content.

By Audrey Watters
