Big Data: Will Open Source Software Challenge BI & Analytics Software Vendors?

Predictive Analytics has been billed as the next big thing for almost fifteen years, but it has not yet gained mass acceptance the way ERP and CRM solutions have. One of the main reasons is the high upfront investment in software, hardware and talent required to implement a Predictive Analytics solution.

As a result, only a handful of very large enterprises, such as mega banks and top telecom companies, have made the required investments and benefited from the power of predictive modeling and advanced statistical techniques that have existed for well over five decades. Most other companies have not been able to leverage the power of business analytics because they cannot afford the specialized hardware, databases and BI/Analytics software marketed by enterprise software vendors such as SAS and Teradata.

Well, this is about to change, thanks to technologies such as Apache Hadoop (which supports distributed Big Data applications under a free license), HBase (an open source, non-relational, distributed database) and the freely available R programming language (which is part of the GNU project). Using R, HBase and Hadoop, it is possible to build cost-effective and scalable Big Data Analytics solutions that match or even exceed the functionality of costly proprietary solutions from leading BI/Analytics software vendors, at a fraction of the cost. And since R is freely available open source software, users can leverage work others have already done for specific analytics functionality and don’t have to reinvent the wheel by rewriting the code. This significantly reduces the cost of developing an analytics solution.

Established BI and Analytics software vendors have no option other than to offer their solutions under a SaaS (Software as a Service) model, so that customers can implement an analytics solution cost-effectively without a large upfront investment. This is all the more important for Big Data, as the field is evolving rapidly. Any BI or Analytics software vendor that fails to adapt to this changing technological environment risks losing market share.

 

  • Neil Raden

    I think this is oversimplified. First, most of the “community” developing these open source tools are employed by the commercial open source vendors, like Cloudera, who offer significantly improved, enterprise-ready versions of Hadoop, R, etc. As they gain acceptance, expect the prices to rise.

    Second, BI was developed and grew as essentially a reporting tool, not a tool for quantitative methods. The success of companies like QlikTech and Tableau, and continuing growth of IBI and BIRT from Actuate, prove that reporting is still valid, even if it has changed its use and presentation.

    Most companies have relatively small data warehouses and modest BI needs. The world can’t be full of leaders. 

    On predictive analytics: it’s not a good term. Most useful analytics, even statistical/quantitative ones, are not predictive. For example, there is pretty wide agreement that when it comes to customers, behavior itself is not perfectly indicative of true underlying propensities. People’s behavior is ineffably random and it can’t be figured out by sifting through hundreds of attributes about the customer, which is what “predictive” analytics purports to do.

    Hadoop/MapReduce is a functional programming framework for processing large amounts of weird, distributed data, and it is overkill for BI, not suited to it, or both. Someone still has to count the beans.

     

  • http://twitter.com/leodataminer Massih Mayeli

    Harish, thanks for the interesting post.

    I believe open source is definitely going to challenge commercial BI software and this is going to be healthy for the market.

    The world in general is asking for more and more freedom, and open source is democratic. Look how Android’s market share is on the rise. People are less and less tolerant of being locked in to proprietary technologies, and this trend will grow as more options come to the table.

    I agree with Neil that most companies have small data warehouses and can go very far with simple reporting and OLAP drill-downs. That means Big Data is going to stay with the giant leaders and maybe some challengers. But when it comes to predictive modeling, for instance, many average companies need to build segmentation/loyalty models, and it is really hard to convince management in analytically not-so-mature companies to pay the bill for software such as SAS and SPSS. Just as SAS promoted its packages a decade ago by giving away student licenses to statistics departments at universities, R is becoming very popular today. People enjoy the freedom, the large number of packages, and not having to deal with software licenses or sweat to convince management to pay the bill.

    This will certainly push big vendors to open up more and reconsider their pricing and strategies if they want to remain competitive.

    Just have a look at how Microsoft has now realized the power of open source, and at its big Hadoop projects: http://www.wired.com/wiredenterprise/2012/01/meet-bill-gates/all/1

  • http://hkotadia.com/ Harish Kotadia, Ph.D.

    Thanks @NeilRaden for visiting my blog and sharing your thoughts, greatly appreciated!

    Here’s my take on the points you have mentioned:

    * I agree that most of the “community” developing these tools works for commercial open source vendors that offer significantly improved, enterprise-ready versions of Hadoop, R, etc., and that “as they gain acceptance, expect the prices to rise”, but not by very much. They will still be a great bargain compared to what some of the analytics/BI vendors are charging, especially once the cost of professional services is added.

    * I agree 100% that “BI was developed and grew as essentially a reporting tool, not a tool for quantitative methods” and that reporting “is still valid, even if it has changed its use and presentation”. My take is that, given the volume, velocity and variety of Big Data, the focus is less on “historic” reporting and more on “predictive modeling” such as causal path analysis, for example churn/customer attrition forecasting in telecom. There is great value in using predictive analytics and taking corrective action, rather than just doing historic reporting as in BI.

    * Where we disagree most is on what you have written about Predictive Analytics. You said that “behavior itself is not perfectly indicative of true underlying propensities. People’s behavior is ineffably random and it can’t be figured out by sifting through hundreds of attribute about the customer”. Techniques like multiple regression coupled with factor analysis, cluster analysis and causal path analysis can be used very effectively with Big Data, now that we have plenty of data in terms of both columns and rows (variables and the number of observations for each).

    Talking specifically about CRM, social media data can be used effectively for churn forecasting/attrition management. And since this information is publicly available for free, the cost of such a solution, for both data and analytics (R, HBase, Hadoop), is much lower than that of “traditional” solutions from Analytics and BI vendors. This is what I have tried to highlight in my post.

    Thanks again for sharing your thoughts, much appreciated!

    Harish Kotadia, Ph.D.

  • http://hkotadia.com/ Harish Kotadia, Ph.D.

    Thanks @leodataminer for visiting my blog and for sharing your thoughts, greatly appreciated!

    I agree that “World in general is asking for more and more freedom and open source is democratic” and yes, the Android market is a very good example.

    You are also correct that it is tough to convince management in analytically not-so-mature companies to pay the bill for software such as SAS and SPSS. In fact, that is where Open Source alternatives will come into play, as I have highlighted in my post. And thanks for the link to the Microsoft example, that is a great case study.

    Thanks again for your comments,

    Harish Kotadia, Ph.D.


  • About Dr. Harish Kotadia



  • Dr. Harish Kotadia is an industry-recognized thought leader on Big Data and Analytics with more than fifteen years' experience as a hands-on Big Data, Analytics and BI Program/Project Manager implementing Enterprise Solutions for Fortune 500 clients in the US.

    He also has five years' work experience as a Research Executive in the Marketing Research and Consulting industry, working for leading MR organizations such as Gallup.

    Dr. Harish Kotadia's educational qualifications include a Ph.D. in Marketing Management. The subject of his doctoral thesis was Customer Satisfaction, and it involved building a statistical model for predicting the satisfaction of clients with the services of their ad agency.

    His educational qualifications also include an M.B.A. and a B.B.A. with specialization in Marketing Management, and a Diploma in Computer Applications.

    Dr. Harish Kotadia currently works as Principal Data Scientist and Client Partner, Big Data and Analytics at a Global Consulting Company. Views and opinions expressed in this blog are his own.


