When It Comes to Content, Say “Yes” to Wrappers But “No” to Containers

By Michael Feldstein. Posted on February 4, 2012

Scott Leslie has a good post up ruminating on the moving target of open textbooks, which reminded me that I have long intended to write a follow-up to an exchange that he, Rob Abel, and I had in the comments section of a post I wrote a while back. Scott lamented that the Washington State Board for Community and Technical Colleges was releasing its open course content in IMS Common Cartridge format, which seemed to him to be not as easily accessible or universally usable as one might like. I wrote in response,

Fundamentally, I don’t believe in cartridges. I don’t believe in forking a copy of a digital resource and stuffing it into another system. It’s bad for a variety of reasons, including but not limited to the implementation challenges that Scott ran into with Moodle (although it’s fair to say that some LMSs handle CC import better than others). Common Cartridge made more sense 5 or 10 years ago, but it’s late to the game and is ultimately destined to be eclipsed by in-place APIs, including but not limited to IMS LTI. (By the way, I’m not so sure it’s such a good idea to let Google own our integration API either.)

Unsurprisingly, Rob Abel, as CEO of the IMS, took issue:

If there is agreement that CC helps with the issue of content in an LMS then, well in your scenario the content is inside the publisher “LMS” (or equivalent).

Can I tailor it? Can I put things in there – like a syllabus – and get it out? If I’m the student and I create something in there can I get it out? Can I mix and match with other publisher materials? Can I archive that mixing for next term? Can I share what I did with my faculty peers who might want to learn from it? Can I create assessments in there and then use them somewhere else or just put them somewhere so that I can use them in the future?

Common Cartridge – or something like it – helps solve those issues. Fits right into the topic of openness. But, most importantly, in the digital education age we need to make digital education easy for the faculty and the students. Otherwise there won’t be a digital education age.

Perhaps a mixture of OER and publisher proprietary stuff might be a solution. IMHO, some stuff needs to be tailored, remixed, moved in, and moved out. Doesn’t matter whether it’s a publisher platform or an LMS. Faculty want their stuff. Students want their stuff. Publishers need to help them, not thwart them.

I said that the binary choice Rob was offering up wasn’t the right one and promised to elaborate in a future post. Here, at last, is that response.

Let me start by reviewing an argument that I have made here before, which is that there should only ever be one copy of a learning resource except under very limited and specific circumstances. In this era of iframes, you can embed content pretty much wherever you want. By keeping the single canonical copy at one URL and surfacing it where it is needed (as opposed to copying it), you both maintain access to the most updated version from the authoritative source and preserve the ability to do in-depth usage and learning analytics. Who is using this content to learn what in which contexts? If you have a thousand copies of the same resource floating around, you can’t effectively aggregate this data (especially if you don’t know whether or how the content has been altered in those copies). There are only two circumstances under which it makes sense to make a second copy of a web-based learning resource: (1) you want to cache it locally for access in offline or bandwidth-constrained environments, or (2) you deliberately intend to fork the content and create a new version of it. And the first case should be addressed as a caching problem rather than a copying problem.
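
To make the analytics point concrete, here is a minimal Python sketch. The URLs, course names, and the `usage_by_resource` helper are all hypothetical, invented for illustration; no real LMS or analytics API is implied. It only shows why counts keyed on one canonical URL aggregate cleanly while counts keyed on local copies do not.

```python
from collections import Counter

# Hypothetical usage events: (URL the content was served from, course context).
CANONICAL = "https://oer.example.edu/algebra/unit1"

# Case 1: three courses embed the one canonical copy (e.g. via an iframe).
embedded_events = [
    (CANONICAL, "MATH 101, College A"),
    (CANONICAL, "MATH 099, College B"),
    (CANONICAL, "MATH 101, College C"),
]

# Case 2: each course imported its own copy, so each copy has its own URL.
copied_events = [
    ("https://lms-a.example.edu/files/8841", "MATH 101, College A"),
    ("https://lms-b.example.edu/files/17", "MATH 099, College B"),
    ("https://lms-c.example.edu/files/302", "MATH 101, College C"),
]


def usage_by_resource(events):
    """Count usage events per URL, the only identifier an analyst can see."""
    return Counter(url for url, _context in events)


print(usage_by_resource(embedded_events))  # one key holding all three uses
print(usage_by_resource(copied_events))    # three unrelated keys, one use each
```

In the second case nothing in the data says the three URLs hold the same resource, or whether any of the copies has been edited.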

We have a number of formats today that are designed to take web-based resources and organize them for a particular type of consumption. Common Cartridge is one such format. It provides the content wrapped in metadata so the LMS knows where to put it. EPUB and the .ibooks derivative are other examples; they pull together disparate web-native resources into a book-like sequence and user experience. That’s fine. I have no problem with it. My problem is when those resources are copied and stored locally for no good reason. If you want to use one of these formats as a metadata wrapper to surface the remotely stored content within a context and user experience that makes it most useful, then yay. Use iframes or some similar technology and wrap them in the metadata you need. But don’t make local copies of the resources unless you have good reason to do so.
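
As a rough illustration of the wrapper idea, the Python sketch below models a package that carries only metadata and pointers. The `WrappedItem` class, its fields, and the example URLs are assumptions made for this sketch; they are not the Common Cartridge or EPUB schema.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class WrappedItem:
    """Metadata about a resource plus a pointer to it, not the resource itself."""
    title: str
    canonical_url: str
    position: int  # where the item belongs in the course sequence


def render_embed(item: WrappedItem) -> str:
    """Surface the remote resource in place with an iframe; nothing is copied."""
    return (
        f'<iframe src="{escape(item.canonical_url, quote=True)}" '
        f'title="{escape(item.title, quote=True)}"></iframe>'
    )


def render_sequence(items: list[WrappedItem]) -> str:
    """Order items by their metadata and emit one embed per line."""
    ordered = sorted(items, key=lambda item: item.position)
    return "\n".join(render_embed(item) for item in ordered)


units = [
    WrappedItem("Unit 2: Linear Equations", "https://oer.example.edu/algebra/unit2", 2),
    WrappedItem("Unit 1: Variables", "https://oer.example.edu/algebra/unit1", 1),
]
print(render_sequence(units))
```

The wrapper decides order and presentation, while the content itself stays at its original URL.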

I would argue that efforts like the one by the Washington State Board for Community and Technical Colleges should make the OER content available in canonical copies on their servers as plain old web pages and then provide cartridges that include pointers to those copies. Since one of the values of OERs is the ability to remix, maybe Common Cartridge should be extended to include an option to pull down the remote resource for local editing, constrained by the particular machine-readable license of that remote content. (I actually have an idea that would allow remixing but still maintain the “chain of custody” to the original resource for the purpose of learning analytics, but that’s another post for another time.) But the decision to download should be a deliberate one, not a default one, and all resources should be available on the naked web and not locked up by default in some metadata container that you have to crack open if you want access to the content.
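
One way to picture "deliberate, not default" is the Python sketch below. It is only a sketch under stated assumptions: the `Pointer` class and `resolve` function are hypothetical, the license identifiers are assumed to be SPDX-style strings, and nothing here is part of Common Cartridge as it exists.

```python
from dataclasses import dataclass

# Creative Commons licenses whose terms permit derivative works.
# The "ND" (NoDerivatives) variants are deliberately absent.
ALLOWS_DERIVATIVES = {
    "CC0-1.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "CC-BY-NC-4.0",
    "CC-BY-NC-SA-4.0",
}


@dataclass(frozen=True)
class Pointer:
    canonical_url: str
    license_id: str  # machine-readable license identifier (SPDX-style, assumed)


def resolve(pointer: Pointer, fork_requested: bool = False) -> dict:
    """Return how the resource should be delivered.

    Embedding is the default. A local, editable copy is produced only when
    the user explicitly asks for one and the license permits derivatives.
    """
    if not fork_requested:
        return {"mode": "embed", "url": pointer.canonical_url}
    if pointer.license_id not in ALLOWS_DERIVATIVES:
        raise PermissionError(f"{pointer.license_id} does not permit derivative works")
    # A real system would download the content here; this sketch only records
    # the intent and where the copy came from.
    return {"mode": "fork", "source": pointer.canonical_url}


unit = Pointer("https://oer.example.edu/algebra/unit1", "CC-BY-4.0")
print(resolve(unit))                       # default: embed in place
print(resolve(unit, fork_requested=True))  # explicit request: fork allowed
```

A resource under a NoDerivatives license would still embed normally but would raise `PermissionError` if a fork were requested.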

Comments

Wilbert Kraan says

February 4, 2012 at 6:15 PM

I’m guessing that the next blog post you mention will be all about version control systems, possibly distributed ones such as git.

It certainly went through my head after that brilliant use case for one in Rob’s quote.
Michael Feldstein says

February 4, 2012 at 11:08 PM

Yup, that’s the right general idea. It needs to be simplified and adapted to be friendly to learning analytics, but yes, OERs should be backed by some sort of collaborative source management system.
Bill Fitzgerald says

February 11, 2012 at 2:50 PM

RE: “there should only ever be one copy of a learning resource except under very limited and specific circumstances.”

This approach/perspective puts more control than is needed into the hands of the content author (or content distributor).

The purpose of the content is to support interactions around it/with it that lead to learning, and these interactions take place within a different context than the point where the resource is authored/distributed.

Learning analytics around a resource, a canonical version of that resource, and the format used to distribute a resource are related, but technically separate issues. They can be joined to serve the needs of distributors, but learning can take place very well without these elements being addressed in lockstep.

So, with all that said, why aren’t we focusing our efforts on eliminating barriers to reuse and remixing? Forking is good; it’s where new varieties, each modified for their specific environment, can go to meet the specific needs of that environment. Many of these localized changes would never have a place in the original, “canonical” version of the resource.

But, given that the idea of a “canonical” version is arguably a dated term of more convenience to publishers/distributors than users/learners, why mandate that as a requirement that limits the ability to reuse this material elsewhere?

The business case around content shouldn’t get in the way of the actual usefulness (remixing/reusing/redistributing) of that content. In reading through the SCORM and Common Cartridge specs, there are elements of those specs that have more to do with the business case of distribution than the actual process of learning.
Michael Feldstein says

February 11, 2012 at 3:35 PM

So, with all that said, why aren’t we focusing our efforts on eliminating barriers to reuse and remixing? Forking is good; it’s where new varieties, each modified for their specific environment, can go to meet the specific needs of that environment.

No, forking can be good, when it is deliberate and for a specific purpose. Forking for its own sake has no particular value, and can be harmful if information is lost.

But, given that the idea of a “canonical” version is arguably a dated term of more convenience to publishers/distributors than users/learners….

I do not accept that given. We should be able to use analytics to discover which learning resources are useful to achieve which learning objectives for which types of learner. Making sure that learners can find the most valuable and impactful educational content for their purpose is more than a convenience to publishers and distributors. But you can only achieve that if you are able to collect data at scale, which you can’t do if your analytics are fragmented across a million billion mostly identical copies.

I think I made it pretty clear that I think the system should enable the forking of open content when it’s useful to do so. But making copies by default in the era of the internet is pointless and counterproductive, particularly when forking is not the intention of the user in the vast majority of cases.
Bill Fitzgerald says

February 11, 2012 at 4:02 PM

RE:

Making sure that learners can find the most valuable and impactful educational content for their purpose is more than a convenience to publishers and distributors.

Search also does that pretty well, and creating/sharing content in more open formats would have the additional benefit of improving discoverability via search. The notion of automated discovery works best in closed systems, and/or when we know that any analytics are actually aligned with what learners need. Given how easy it is to learn from a large number of decentralized sources, are those safe assumptions to make?

RE:

But you can only achieve that if you are able to collect data at scale, which you can’t do if your analytics are fragmented across a million billion mostly identical copies.

Most resources would not be used anywhere near that number of times. It would be an awesome problem to have, but not one we’re likely to encounter anytime soon, except as – maybe, if a course gets incredibly popular – a fringe case. I’d rather design for the likely and learn from the fringe cases if and when they arise.

RE:

But making copies by default in the era of the internet is pointless and counterproductive, particularly when forking is not the intention of the user in the vast majority of cases.

The fork/don’t fork question isn’t really of much interest to end users – they just want something to work. As to the default (make a copy or use a reference) the “best” answer here is also largely one of context. For a resource being reused within a closed system, creating a reference will often work perfectly well. For reuse outside a closed system (ie, moving outside the silo) a copy often meets the needs of end users better. But labelling anything potentially helpful to end users as useless and counterproductive is, well, useless and counterproductive.

It comes down to priorities – are we more concerned about analytics and tracking, maintaining control of content, or streamlining and simplifying the reuse and redistribution of content? This isn’t a binary choice, but a spectrum from which we can choose, and our priorities help shape our choices wrt implementation.

We should also be clear that forking happens just about *every single time* an instructor uses a text. However, due to the medium of the text, the instructor/course specific improvements/shifts cannot be incorporated into a localized version of the text, and the instructor ends up maintaining their notes (or rather, their fork) on their own over the lifetime of the course.
Audrey Watters says

February 11, 2012 at 5:07 PM

At edu-related hackathons, I often hear the engineers among us say “let’s build a GitHub for education content.” I do like the idea of repositories, versioning, tracking committers and projects and the like (although I tend to get a little weary of “we’re the X for Y” startup metaphors… that’s a different issue perhaps. Perhaps). Would such a thing make it easier for folks to use, remix and commit content? Maybe. Ideally.

Have you looked at the Dept of Ed’s Learning Registry? (I haven’t looked at it closely yet)
Michael Feldstein says

February 11, 2012 at 5:18 PM

Search helps learners find the most relevant and impactful educational content pretty well? Really? Which search engine are you using? Maybe I need to try Bing, because I’m just not getting the same results that you are with Google. I submit that one reason that we’re not getting a lot of content re-use is because people can’t find it and find out how good it is.

Furthermore, if forking is of little interest to end users, then why fork by default? The whole notion of forking comes from the world of open source software, where you only fork code when you have a good and thoughtful reason to do so. Git and Subversion exist for a reason.

It is also specious to claim that content gets forked “every single time an instructor uses a text.” Notes are not the same as forking. Annotation is not the same as editing. Might instructors want to change the content itself? Yes. But can you claim that, given the option, instructors would prefer to alter the text itself over adding a separate note? No. And when instructors do want to fork the content in order to make specific improvements, those changes are isolated to a local copy and lost to the world unless there is some sort of version tracking telling us that the change happened and meta- or para-data telling us why the change happened and what benefit the change created.

You keep wanting to construct a straw man in which I am arguing that all forking is bad. I never wrote that. My point is that mindless forking is bad and invisible forking is bad. If the vast majority of users don’t care about forking or not forking, then we shouldn’t fork without them asking to do it or even knowing they are doing it in the vast majority of cases. We know how to do distributed source control in an open way. Those principles just need to be applied to open content.

You are taking a relic of an aging architecture—the local file system—and arguing as if it was designed to solve the particular problems we face. You are essentially claiming that somehow wanting to use the modern web the way it is designed to be used is about asserting the control of the content publisher. That’s nonsense. If there are affordances we want to support, then let’s support them. But let’s not pretend that making local copies is an inherent good.
Michael Feldstein says

February 11, 2012 at 5:22 PM

Audrey, on the question of Git, my answer is an emphatic “yes.” (Our comments crossed in the ether.) I have thoughts about how to adapt the concept of distributed version control for this kind of use, and I will eventually write about those concepts.

On the subject of the Learning Registry, I will be posting an interview with Steve Midgley about it in the near future. That’s exactly the kind of project that makes me believe in the value of a canonical copy to which we can attach meta-data that will aid in discoverability and analytics.
Bill Fitzgerald says

February 11, 2012 at 7:41 PM

Hello, Michael,

A couple clarifications here.

RE:

Search helps learners find the most relevant and impactful educational content pretty well

Remove the word “educational” from this sentence, and it’s close to accurate. When we add “educational” back in, we become mired in the reality that much educational content is tied down in formats that do not get discovered as well via search (Flash, PDFs, proprietary formats, behind firewalls, etc.). That is why I said, “creating/sharing content in more open formats would have the additional benefit of improving discoverability via search.”

RE:

Furthermore, if forking is of little interest to end users, then why fork by default?

Please re-read my comment. I never said that forking by default is the best choice. I said that both methods have their use: “For a resource being reused within a closed system, creating a reference will often work perfectly well. For reuse outside a closed system (ie, moving outside the silo) a copy often meets the needs of end users better.” To extend this, if a person is outside a system, and they are using an embedded version, if the original changes, they have no recourse to revert that change.

The thing about a default is that it’s a choice that can be changed. My position on this is that we need to preserve the ability for the end user to choose this for themselves.

RE:

Git and Subversion exist for a reason

Git and svn are both version control systems, but they operate under very different paradigms for code management. They are different tools for managing projects, so I’d hesitate to use them as interchangeable examples of managing the work of distributed teams.

RE:

when instructors do want to fork the content in order to make specific improvements, those changes are isolated to a local copy and lost to the world unless there is some sort of version tracking telling us that the change happened and meta- or para-data telling us why the change happened and what benefit the change created.

No, those changes are only lost to the world if they can’t be shared out in a format that is easily discoverable and reusable. They will be lost to upstream content repositories attempting to maintain a canonical version if the meta and para data is not included. But even then (to use a git analogy) if two people were working downstream on separate branches and they made edits to the same section, those changes would need to be reviewed manually, which runs the risk of at least one version of changes being lost within the merge.

RE:

You keep wanting to construct a straw man in which I am arguing that all forking is bad. I never wrote that. My point is that mindless forking is bad and invisible forking is bad.

No, that is not what I am doing in any way, shape or form. You clearly define when you think it’s okay to fork (in your original post: “There are only two circumstances under which it makes sense to make a second copy of a web-based learning resource…”). The fork/not fork question is completely irrelevant to most end users. To be clear, my position is that anything that diminishes someone’s ability to reuse and redistribute the materials they interact with should be examined very carefully. Making assumptions that restrict that ability should be done with trepidation, if at all.

RE:

You are taking a relic of an aging architecture—the local file system—and arguing as if it was designed to solve the particular problems we face.

Where is this coming from? Where do I discuss the file system as a solution to anything?

RE:

You are essentially claiming that somehow wanting to use the modern web the way it is designed to be used is about asserting the control of the content publisher.

No. The modern web was designed to be used as a tool for exchanging information. I favor a web that empowers users to interact with information how, where, and with whom they choose. There’s nothing wrong with content publishers asserting control over their work. It’s their right. Personally, I prefer licensing that allows for a broad range of use/reuse/redistribution; this is clearly supported by the current and evolving web technologies.

Trackbacks

A Michael Feldstein Sampler says:

February 24, 2012 at 10:21 AM

[…] When It Comes to Content, Say “Yes” to Wrappers But “No” to Containers […]
Software Carpentry » GitHub for Education says:

April 17, 2012 at 2:01 PM

[…] Michael Feldstein: When It Comes to Content, Say “Yes” to Wrappers But “No” to Containers […]
Software Carpentry » GitHub for Education | Carpentry says:

April 18, 2012 at 4:02 AM

[…] Michael Feldstein: When It Comes to Content, Say “Yes” to Wrappers But “No” to Containers […]
