OER File Formats: Tomorrow’s problem?

Repositories are living archives. In terms of the support it must provide for stored files, it must take into account two important functions of the files it holds:

  1. Access: The files are held so that users can access them. This means that they must be stored in formats that can be used by today’s intended audience
  2. Preservation: The files are held so that users in 5, 10, 50, or more years can still access them. This means that they must be stored in formats that can be used by future audiences, or in formats that can easily be migrated

These two considerations are not always complementary. A file format that is good for access today may not be a format that is easy to migrate, but a format that is easy to migrate may not be easy to read.

The text above is taken from the JISC infoNet Digital Repositories infoKit. An added complication when depositing OER is that, unless you are using a ‘No Derivatives’ licence, you also need to think about how to support remixing and editing. Here’s a scenario taken from WikiEducator:

A teacher wants to make a collage. She imports several PNG photos into Photoshop and creates the collage. She saves the file as a PSD and exports a copy as a PNG to post on the web. While others can edit the PNG, it would be a lot easier to edit the PSD file. However, in order to use PSD files, the person has to have a copy of Photoshop.

Already it’s getting trickier. PSD is a proprietary file format developed and owned by Adobe and used in Photoshop. You can actually open and edit PSD files in open source tools like GIMP (I wasn’t sure how legally GIMP can do this, so I asked OSS Watch. Update: I’ve had a response, and the upshot is ‘it can be awkward on all levels’. The related post is by Scott Wilson at OSS Watch, on using proprietary file formats in open source projects). Similarly, you can use open source alternatives to Microsoft Office like LibreOffice to open and edit DOC/XLS/PPT etc., but in this case Microsoft’s proprietary file formats are covered by their Open Specification Promise (OSP), which, if you read this page on Wikipedia, itself has a number of issues and limitations.
The next issue is, as highlighted by Chris Rusbridge in his open letter to Microsoft on specs for obsolete file formats, that the OSP doesn’t cover older file formats. So if you were an early adopter publishing OER in editable formats, there is a danger that the format you used won’t be usable down the line.
I’m mindful of the Digital Repositories infoKit’s last point of guidance:

Be practical: Being overly-strict about file formats may mean collecting no files leading to an empty repository! A sensible approach must be used that weighs up the cost and benefits of different file formats and the effort required to convert between them.
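
As a rough illustration of that ‘be practical’ advice, here is a minimal Python sketch of a deposit check that accepts every file but attaches advisory notes about formats that may be hard to edit or migrate later. The format table, the `review_deposit` function and the wording of the notes are all hypothetical, chosen for illustration only; they are not taken from the infoKit or from any real repository platform.

```python
from pathlib import PurePath

# Illustrative, hand-picked table (an assumption, not an authoritative registry):
# extension -> (is it an open standard?, is it a format meant for editing?)
FORMAT_NOTES = {
    ".png": (True, False),   # open standard, but usually a flattened export
    ".psd": (False, True),   # Adobe's proprietary, layered source format
    ".odt": (True, True),    # OpenDocument text
    ".doc": (False, True),   # legacy Microsoft Word binary format
}


def review_deposit(filenames):
    """Accept every file, returning advisory notes per file (never rejects)."""
    report = {}
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        notes = []
        if ext not in FORMAT_NOTES:
            notes.append("unknown format: record what software created it")
        else:
            open_standard, editable = FORMAT_NOTES[ext]
            if not open_standard:
                notes.append("proprietary format: consider also depositing an open export")
            if not editable:
                notes.append("export only: consider also depositing the editable source")
        report[name] = notes
    return report


if __name__ == "__main__":
    # The collage scenario: the PSD source plus its PNG export.
    for name, notes in review_deposit(["collage.psd", "collage.png"]).items():
        print(name, "->", notes or ["no concerns"])
```

In the collage scenario both files are accepted, and each note points the depositor towards the other file: the PSD is flagged as proprietary and the PNG as an export without its editable source. Nothing is turned away, so the repository doesn’t end up empty.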

Should OER file formats be tomorrow’s problem?


Join the conversation

5 comments
  • dkernohan

    Great post Martin, and yes – I worry about this too. We’ve always encouraged projects to use openly documented formats where possible – but for ease of use many have used MS Office or PDF.
    But this is a much bigger problem than OER, as the old computer wing of the national archive testifies…

  • Sheila MacNeill

    Hi Martin
    aren’t they already today’s (and yesterday’s)? Good to highlight again tho! Re-use, re-mix not always straight forward

  • Pat

    I think it depends on how long something is supported and if migration exists.
    Format X might be open, but that doesn’t mean it can be converted into something else. Flash is an open format, but so much elearning content is still trapped in it.
    I’d also wonder how long is the lifespan of an LO – five years? ten?

    • Martin Hawksey

      I’m aware of a number of general Flash to HTML5 projects, but you’ll always have problems mapping format specifications. Nice to see projects like Xerte (via Xenith) are already looking at this area. I fear, though, that there will be many more Chris Rusbridges in this world with content trapped in a format.

  • Format? Y/N at OSS Watch team blog

    […] Hawksey of CETIS published an interesting blog post on the ugly problem of openly-licensed content wrapped in closed file formats. In that post Martin […]
