By @mhawksey

Hacking Google Slides as a voice enabled presentation tool using Google Apps Script

Using Google Slide speaker notes to prepare tweet text

On Saturday I returned to #DevFest London to share some of my work on developing conversational interfaces in the G Suite editors (Docs, Sheets, Forms and Slides). I always enjoy presenting at DevFest as it is usually a reason to explore something new. In the case of DevFest London it was an opportunity to think about how voice interfaces could be used as part of a Google Slides presentation. In this post I’ll share some of the ideas I explored and how they did/didn’t work out.

If you are interested, you can make a copy of my presentation, which includes all the code mentioned in this post, and the slides/audio of this talk are on YouTube.

Presentation hacks

So first thing to say: these are hacks. I would love to be able to interact with Google Slides with Google Apps Script whilst in presentation mode, but it just isn’t currently possible. I’ve previously written about my Domains talk and my experiences of using video production software (OBS Studio) for live mixing. As I was using the Google Slides editor with an add-on sidebar open, I revisited the use of OBS to allow me to mix between a cropped and a full screen view. This aspect I thought worked incredibly well: OBS allowed me to have two scenes set up and to toggle between them using a hotkey. One thing I would change is the choice of hotkey. Because I didn’t want to lose focus in a sidebar text input I chose the Ctrl key. This was okay until I needed to adjust the browser zoom level using Ctrl and +/-, which, given the way OBS captures keystrokes, also toggled between scenes. If you’d like to see how this worked out, my full presentation (slides and audio only) is on YouTube.

Below are some of the editor hacks I prepared for my talk designed to be used with a Dialogflow virtual agent I created:

Next slide…

So I did make a Dialogflow agent that could be used for slide navigation so that I could use phrases like ‘next slide’. Getting this working in Google Slides is a bit of a hack because, with Google Apps Script anyway, there isn’t a way to programmatically interact with slides once you are in presentation mode. Instead I created an add-on to enable slide interaction and then used a cropped version of my screen on the data projector.

In the end it felt like advancing the slides in this way wasn’t going to work: it quickly became clear that saying ‘next slide’ rather than clicking a button was going to break the flow, so this particular hack was quickly kicked to the sideline. Next time I plan to just capture the slide advance from my clicker and move the current slide.

In terms of code for moving around the Google Slides editor with the methods available, the route I used was to get the array of slides (Slide[]), work out the index of the current slide in that array, then either increment or decrement it to get the Slide object of the required slide:

// work out current slide and slide ids 
var prez = SlidesApp.getActivePresentation();
var slides = prez.getSlides();
var objIdx = slides.map(function (s, idx){
  return s.getObjectId();
});
var currentSlide = prez.getSelection().getCurrentPage().getObjectId();
var idx = objIdx.indexOf(currentSlide);
// …
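if (param.action === "next slide" || param.action === "previous slide"){
  if (param.action === "next slide"){
    idx++;
  } else {
    idx--;
  }
  // only move if the target exists (e.g. no 'next slide' on the last slide)
  if (idx >= 0 && idx < slides.length){
    slides[idx].selectAsCurrentPage();
  }
} else if (param.number > 0 && param.number <= slides.length){
  // spoken slide numbers are 1-based, the slides array is 0-based
  slides[param.number-1].selectAsCurrentPage();
}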
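
One way to keep the navigation logic easy to test outside the editor is to pull the index arithmetic into a small pure function. This is only a sketch under my own assumptions: the helper name targetSlideIndex and its signature are hypothetical and not part of the add-on code above; it simply mirrors the ‘next slide’, ‘previous slide’ and slide number cases and returns -1 when the requested slide does not exist.

```javascript
/**
 * Work out which slide index to select, or -1 if the request is out of range.
 * Hypothetical helper: name and signature are illustrative only.
 * @param {number} currentIdx 0-based index of the current slide
 * @param {number} slideCount total number of slides
 * @param {string} action e.g. "next slide" or "previous slide"
 * @param {number} [number] 1-based slide number for 'go to slide N'
 * @return {number} 0-based target index, or -1 if there is no valid target
 */
function targetSlideIndex(currentIdx, slideCount, action, number) {
  var target = -1;
  if (action === "next slide") {
    target = currentIdx + 1;
  } else if (action === "previous slide") {
    target = currentIdx - 1;
  } else if (number > 0) {
    target = number - 1; // spoken slide numbers are 1-based
  }
  // reject anything that falls off either end of the deck
  return (target >= 0 && target < slideCount) ? target : -1;
}
```

In the add-on this could be called with idx, slides.length, param.action and param.number, selecting slides[i] only when the returned index i is not -1.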

Tweet this…

The second intent I implemented was a ‘tweet this’ agent which would enable me to live tweet a slide image and any tagged text I’d included in the speaker notes for that slide. Again this could only be used when in the editor. In the end I didn’t use this in my presentation for a similar reason: it would break the flow.

In terms of the code used to tweet a slide, I created a function that takes a Slide object, parses any text between <tw> and </tw> tags in the speaker notes, and then tweets the text along with the slide saved as an image (to handle the Twitter API call I used my own TwtrService library; other Twitter libraries for Apps Script are out there and might be easier to set up).

/**
 * Tweet slide as image and with speaker note text.
 * @param {Slide} slide to tweet 
 * @return {String} url of tweet
 */
// H/T https://stackoverflow.com/a/46711331/1027723
function tweetSlide(slide) {
  var pageObjectId = slide.getObjectId();
  // get tweet text from the speaker notes
  var note = slide.getNotesPage().getSpeakerNotesShape().getText().asString();
  var regex = /<tw>(.*?)<\/tw>/g;
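  // exec() returns null when no <tw> block is present, so guard before indexing
  var match = regex.exec(note);
  var tweet = match ? match[1] : null;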
  console.log("Tweet text: "+tweet);
  
  // only prepare the tweet if I've remembered to include it in the notes
  if (tweet){
    var presentationId = SlidesApp.getActivePresentation().getId();
    // to send a media tweet twitter requires sending the image first
    // get slide image as a thumbnail (only PNG currently supported)
    var thumbnail = Slides.Presentations.Pages.getThumbnail(
      presentationId,
      pageObjectId,
      {"thumbnailProperties.thumbnailSize": "MEDIUM"});
    var blob = UrlFetchApp.fetch(thumbnail.contentUrl).getBlob();
    
    // send media to twitter as base64Encoded 
    var parameters = { "media_data" :  Utilities.base64Encode(blob.getBytes()) };
    var twResp = TwtrService.upload("media/upload", parameters);
    var media_id= twResp.media_id_string;
    
    // post the tweet and image
    var tweetResp = TwtrService.post("statuses/update", {status: tweet, media_ids: media_id});
    console.log(JSON.stringify(tweetResp));
    // build the url of the new tweet from the response
    var url = "https://twitter.com/"+tweetResp.user.screen_name+"/status/"+tweetResp.id_str;
    return url;
  }
}
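
The speaker note parsing step can also be tried on its own without touching Slides or Twitter. The following is a minimal sketch, assuming the <tw>…</tw> tagging convention used in the regex above; the function name extractTweetText is hypothetical, and unlike the original pattern it uses [\s\S] so that tagged text spanning several lines is still picked up and then collapsed onto one line.

```javascript
/**
 * Pull the tagged tweet text out of a speaker note string.
 * Hypothetical helper: name and whitespace handling are illustrative only.
 * @param {string} note speaker notes as plain text
 * @return {?string} the text between <tw> and </tw>, or null if none is found
 */
function extractTweetText(note) {
  // [\s\S] also matches line breaks, so the tagged text can span lines
  var match = /<tw>([\s\S]*?)<\/tw>/.exec(note || "");
  if (!match) {
    return null; // no tagged text in these notes
  }
  // collapse runs of whitespace (including line breaks) into single spaces
  var text = match[1].replace(/\s+/g, " ").trim();
  return text.length > 0 ? text : null;
}
```

Returning null for missing or empty tags means the caller can keep the same ‘only tweet if I’ve remembered to include it’ check that tweetSlide uses.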

Lessons learned

Solo presentations are probably not the best platform for experimenting with conversational interfaces: who wants the presenter talking to their laptop rather than to them? One afterthought was whether, with continuous audio capture, an agent could be trained to pick up audio cues. For example, if I said something like “blah, blah, blah, I’ll tweet a copy of this slide, blah, blah, blah”, my intent could be used to tweet the slide. One problem I’d anticipate is that, a bit like your Google Home or Google Assistant getting overexcited when not required, you could end up randomly tweeting slides. Anyway, I’m not giving up on this and look forward to sharing the next version.
