Today Tony Hirst posted Data Shaping in Google Refine – Generating New Rows from Multiple Values in a Single Column. Coincidently I faced a similar problem today working with a Google Spreadsheet so thought I’d share my solution.
Here’s a Google Spreadsheet where I’ve been capturing some MOOC infrastructure (contract me if you’d like to contribute/edit). For the technology column I’ve just been dumping in comma separated terms.
Having entered some data I started thinking about how it could be presented. One idea was to use the Smile timeline widget, which I’ve experimented with before here. To get the data in the right shape I need to split the technologies to have one per row. To do this I wrote the custom function below in Google Apps Script which splits the technology into one per row (I’m undecided now if I’ll still use this for a timeline – it’s another case of it’s taken longer to write this up than actually do it)
// splitColumnAndRepeatRows iterates row by row in the input range and returns an array spliting // by a comma separated column // Arguments: // - anArray: values[][] // - splitColumnIndex: specifies the column number where the cell values need to be comma split // 0 indexed // Returns an [][] Array of values. function splitColumnAndRepeatRows(anArray, splitColumnIndex) { var output = []; for (i in anArray){ // for each row var splitArray = anArray[i][splitColumnIndex].split(","); // split values in specified column for (j in splitArray){ // for each split cell value var row = anArray[i].slice(0); // take a copy of source row row[splitColumnIndex] = alltrim(splitArray[j]); // replace comma separated value with current split value output.push(row); // push new row to output } } return output; } function alltrim(str) { return str.replace(/^\s+|\s+$/g, ''); }
Update: added an alltrim function to remove whitespaces