By @mhawksey

OER File Formats: Tomorrow’s problem?

Repositories are living archives. In terms of the support it must provide for stored files, it must take into account two important functions of the files it holds:

  1. Access: The files are held so that users can access them. This means that they must be stored in formats that can be used by today’s intended audience
  2. Preservation: The files are held so that users in 5, 10, 50, or more years can still access them. This means that they must be stored in formats that can be used by future audiences, or in formats that can easily be migrated

These two considerations are not always complementary. A file format that is good for access today may not be a format that is easy to migrate, but a format that is easy to migrate may not be easy to read.

The text above is taken from the JISC infoNet Digital Repositories infoKit. Depositing OER adds a further complication: if you are not using a ‘No Derivatives’ licence, how do you support remixing and editing? Here’s a scenario taken from WikiEducator:

A teacher wants to make a collage. She imports several PNG photos into Photoshop and creates the collage. She saves the file as a PSD and exports a copy as a PNG to post on the web. While others can edit the PNG, it would be a lot easier to edit the PSD file. However, in order to use PSD files, the person has to have a copy of Photoshop.

Already it’s starting to get trickier. PSD is a proprietary file format developed and owned by Adobe and used in Photoshop. You can actually open and edit PSD files in open source tools like GIMP (I wasn’t sure how legally GIMP can do this and asked OSS Watch. Update: I’ve had a response, and the upshot is that ‘it can be awkward on all levels’. See the related post by Scott Wilson at OSS Watch on using proprietary file formats in open source projects). Similarly, you can use open source alternatives to Microsoft Office, like LibreOffice, to open and edit DOC/XLS/PPT etc., but in this case Microsoft’s proprietary file formats are covered by its Open Specification Promise (OSP), which, if you read this page on Wikipedia, itself has a number of issues and limitations.
The next issue, as highlighted by Chris Rusbridge in his ‘Open letter to Microsoft on specs for obsolete file formats’, is that the OSP doesn’t cover older file formats. So if you were an early adopter publishing OER in editable formats, there is a danger that the format you used won’t be suitable down the line.
I’m mindful of the Digital Repositories infoKit’s last point of guidance:

Be practical: Being overly-strict about file formats may mean collecting no files leading to an empty repository! A sensible approach must be used that weighs up the cost and benefits of different file formats and the effort required to convert between them.

Should OER file formats be tomorrow’s problem?
