Analytics Reconnoitre – AaaS/DaaS: Analytics as a service, data as a service

For the Analytics Reconnoitre I’ve been trying to get my head around ‘Analytics as a service’ asking myself what new “as-a-service” offerings are emerging. Let start by defining what ‘as-a-service’ is before looking at some of the analytics offering. For this I’m going to use the five key characteristics used in the JISC CETIS Cloud Computing in Institutions briefing paper:

As a service: Key characteristics

Rapid elasticity: Capabilities can be rapidly and elastically provisioned to quickly scale up and rapidly released to quickly scale down.
Ubiquitous network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones and laptops, etc.).
Pay per use: Capabilities are charged using a metered, fee-for-service, or advertising based billing model to promote optimisation of resource use.
On-demand self-service: A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed without requiring human interaction with each service’s provider.
Location independent data centres: The provider’s computing resources are usually pooled to serve all consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.

Delivery Levels

The JISC CETIS briefing then goes on to name three delivery levels for as a service offerings in software (SaaS), platform (PaaS) and infrastructure (IaaS). Here are my suggestions for Analytics as a Service and Data as a Service:
Analytics as a Service (AaaS): The capability provided to the consumer is to use the providers applications running on a cloud infrastructure to extract “actionable insights through problem definition and the application of statistical models and analysis against existing and/or simulated future data”* Examples include application specific solutions like Google Analytics and more general offering like Amazon AWS.
*definition proposed by Adam Cooper in Analytics and Big Data. In many instances AaaS is a subset of SaaS
Data as a Service (DaaS): The capability provided to the consumer is to use the provider’s data on demand regardless of location and affiliation. Education specific data services are provided by HESA, UCAS and others (more examples in the JISC infoNet BI infoKit). Cost models include subscription and volume based. As well as DaaS option there are a growing number of Open Data providers including Government initiatives like These fall outside the definition used here of ‘as-a-service’ offering.


Web Analytics: Web analytics as a service is not a new phenomenon and the current market leader Google Analytics has been around since 2005. Google’s ‘as a service’ offering is available for free or as a paid for premium service. The service provides a number of standard web-based dashboards which allow administrators to analyses website traffic. Recently Google have also start recording and reporting social referrals from networks like Facebook, Twitter and their own Google+. Detailed social activity streams are also available from Google’s Social Data Hub partners. These streams extract conversations and social actions like bookmarking around website resources. As well as the web interface Google have options for downloading processed data and API access for use in other applications and services.
Customer Relationship Management: As part of the CETIS Cloud Computing briefing Enrollment Rx was used to illustrate how their CRM solution offered as Software as a Service in turn build upon the Platform as a Service offered by Salesforce. As part of this Enrollment Rx integrate Salesforce’s analytics tools and dashboards within their own product. Within Salesforce’s appexchange there are over 100 other applications tagged with ‘analytics’, including SvimEdu which is a complete enterprise resource planning package targeted at the education sector.
BenchmarkinginHE:  Benchmarking In HE is a HEFCE funded project which aims to offer benchmarking tools and data for universities and colleges. Many of the data sources (listed here) are Open Data provided by organisations like HESA but some are only available on a subscription basis. For example, the Higher Education Information Database for Institutions (heidi) which is managed by HESA is operated on a subscription basis and operated on a not-for-profit basis. The current tool available to institutions via BenchmarkinginHE is BenchmarkerHE, an online database of shared financial data with reporting options.
Big Data Analytics: Similar to the CRM illustration there are other examples of raw analytics services that are also relayered with 3rd party applications. An example of this is Amazon’s Elastic MapReduce (EMR). MapReduce is a programming framework for processing large datasets using multiple computers originally developed by Google and now features in open source frameworks like Apache Hadoop. Elastic MapReduce was developed as part of one of the offering in Amazon Web Services (AWS) based on Hadoop and is ‘elastic’ because it can easily scale.  Karmasphere Analytics for Amazon EMR is a service which provides a graphical layer to interface Amazon EMR providing tools to create queries to generate reduced datasets which can be visually viewed or exported into other tools like MS Excel.

Spare notes

There is one more illustration I have in mind but doesn’t entirely fit with the ‘as-a-service’ ethos. There are a growing number of sites that let you publish datasets for analysis. These services don’t include tools to process the data, instead they provide an infrastructure to set bounties. Examples include Kaggle and Amazon Mechanical Turk, the later being a component of UC Berkeley’s AMPLab, which I’ve written about here.

Risk and Opportunities

A number of risks and opportunities are identified in the JISC CETIS Cloud Computing in Institutions briefing paper. One additional opportunity offered by analytics as a service is the argument that ‘as-a-service’ offering can, to a degree, remove the reliance on the need to have a dedicated data scientist. For example, a recent NY Times article asked ‘Will Amazon Offer Analytics as a Service?’, in which they speculate if Amazon will make and sell pattern-finding algorithms, removing the burden from the customer to develop their own.

Available Products and Services

A range of analytics and data services are available. Here are a couple I’ve mentioned in this post topped up with some more.
Google Analytics: A free Google product that provides website analytics. Standard reporting includes analysis of: audience; advertising; traffic sources; content; and conversions. Data can be analysed via the Google Analytics web interface or downloaded/emailed to users. Analytics also has a Data API allowing which can be used by 3rd party web services or in desktop applications. Website visitors are tracked in Google Analytics using a combination of cookies (rather than server logs) and most recently social activity. Google market share is reported to be around 50% but in a recent survey of 134 Universities UK websites 88% (n.118) were using Google Analytics.
Enrollment Rx (text from CETIS briefing): Is a relatively small company in the US that offers a Customer Relationship Management solution as Software as a Service. The service allows institutions to track prospective students through the application and enrollment process. The system is not free, but the combination of web delivery on the user end, and Platform as a Service at the backend, are intended to keep prices competitive.
Salesforce for Higher Education: Higher education institutions are using the platform for its instant scalability, ease of configuration, and support for multiple functional roles. Imagine a unified view of every interaction prospects, students, alumni, donors and affiliates have with your department or institution. Combine this with all of the tools you need to drive growth and success – campaign management, real-time analytics, web portals, and the ability to build custom applications without having to code – and you’re well on your way to getting your school to work smarter.
Karmasphere Analytics for Amazon EMR: Karmasphere provides a graphical, high productivity solution for working with large structured and unstructured data sets on Amazon Elastic MapReduce. By combining the scalability and flexibility of Amazon Elastic MapReduce with the ease-of-use and graphical interface of Karmasphere desktop tools, you can quickly and cost-effectively build powerful Apache Hadoop-based applications to generate insights from your data. Launch new or access existing Amazon Elastic MapReduce job flows directly from the Karmasphere Analyst or Karmasphere Studio desktop tools, all with hourly pricing and no upfront fees or long-term commitments.
Kaggle: Kaggle is an innovative solution for statistical/analytics outsourcing. We are the leading platform for predictive modeling competitions. Companies, governments and researchers present datasets and problems – the world’s best data scientists then compete to produce the best solutions. At the end of a competition, the competition host pays prize money in exchange for the intellectual property behind the winning model.