From Data to Information to Insights: Changing Role of #CIO

There was a time not too long ago (the late 1970s, to be precise) when companies used to have an Electronic Data Processing (EDP) department and EDP Managers to manage the data processing function. The EDP department evolved into the Management Information Systems (MIS) department in the early 1980s, and the MIS Manager ran the show when it came to ‘computerization’ initiatives, as they used to be called back then.

As technology evolved from mainframes to client-server computing, and with the rise of personal computers (PCs) or desktops, adoption of information technology across the organization picked up pace and IT was no longer limited to key business processes such as accounting and inventory management.

With the expansion of computer networks and the growth of the internet in the 1990s, an increasing number of business processes started getting ‘computerized’. This resulted in decentralization of information technology, and companies started moving away from having a ‘centralized’ EDP or MIS department for managing their IT functions. This decentralization of MIS (or IT) received a boost with the development of ‘business friendly’ software applications that just needed to be customized to meet business requirements rather than developed from scratch. Case in point – ERP or CRM systems.

But although key functions of the MIS department were decentralized due to the rapid expansion of information technology within the organization, there was a need for a role to monitor and guide the adoption and use of information technology tools across the organization, in order to make sure that the multitude of systems being developed ‘talked’ to each other and followed consistent organizational standards for development and usage. The role of Chief Information Officer (or CIO) evolved to meet this need, and most large and medium-sized companies have had a CIO function or role since the middle of the 1990s.

The role of CIOs gained importance with the expansion of the internet and the growth of web-enabled business applications during the dot com boom. The CIO’s function became critical during the outsourcing boom that followed the dot com bust in the early 2000s, as a growing number of CEOs relied upon their CIOs (and their ‘twin’ brothers, the CTOs) not only to manage the growing complexity of enterprise IT but also to manage it in a cost-effective manner through outsourcing. As a result, the CIO’s role became critical within most large and medium-sized companies in the last ten years.

But as the ‘short’ history of information technology tells us, the only thing that is constant in IT is change, and change at an even faster pace with each passing year. To confirm, just consider the changes that have taken place in IT over the past few years. We have seen IT evolve at an even faster pace, thanks to the rapid growth and adoption of cloud computing, mobile devices and social media, and an explosion in the rate at which data is being generated by end users, resulting in the phenomenon now known as ‘Big Data’.

Add to this the fact that technology is becoming even more business and end-user friendly. As an example, because of cloud computing and the software as a service (SaaS) model, some of the work that was traditionally done by the CIO’s or CTO’s organization is now being done by outside service providers with help from functional executives in user departments. To cite another example, CMOs or key executives in the marketing departments of large and medium-sized companies are working directly with vendors, with minimal input from the CIO or CTO of their organization, to leverage social media for multi-channel marketing or customer engagement. This growing ‘consumerization’ of technology has eroded the clout or influence CIOs had in their respective organizations just a few years back.

So the question is: how can CIOs regain the past glory of their role? In my opinion, the best way for Chief Information Officers (CIOs) to do so is to adapt to and grow with changing technology and evolve to become Chief Insights Officers – still CIOs! As more and more applications are migrated to the cloud, with enterprise apps and data residing in the cloud and managed mostly by third-party service providers, the ‘traditional’ functions of CIOs are being performed by third-party vendors. But what has not changed is the need for ‘quality’ information in a timely manner to aid decision making across the organization. With more data, such as social media data, being generated outside the organization than within it, there is a need for someone at a senior level not only to monitor but also to guide how data are collected, stored and, most importantly, analyzed in a timely manner to aid decision making. Given the volume, velocity and variety of data being generated, it is no longer enough just to prepare ‘simple’ reports; critical insights must be derived in real time from available data, from both internal and external sources. And in my opinion, no one is better prepared to take on this challenge than the good old Chief Information Officer.

So Dear Chief Information Officer, are you ready to take on the role of Chief Insights Officer?

Big Data in Retail Industry

Here’s a great data visualization on Big Data in the retail industry, along with an FT video on the subject:

The Retailer’s Guide to Big Data

Source: Monetate Marketing Infographics

Source: http://video.ft.com/v/2172704889001/Big-data-bring-attention-to-retail

Where The Big Data Jobs Are and How Much They Pay

I am often asked, especially by those aspiring to a career in Big Data, how to find a suitable job in the field and how much such jobs pay.

Given below is an excellent visualization by Chris Dannen that answers some of these questions, such as where the jobs are and how much they pay. Hope you find this data visualization chart useful.

In my future posts, I will highlight some of the skills necessary to get one of these jobs and how to get the necessary training on a low budget or even for free. So you may want to bookmark my blog site URL address: http://hkotadia.com/ or sign up for email updates using this link: http://feedburner.google.com/fb/a/mailverify?uri=hkotadia/qkGh

Source: Big Data Jobs Around The Nation (And What They Pay)

Key Big Data Terms You Should Know

Given below is a listing of key Big Data terms that you should know, with a very brief explanation of each in simple language. Hope you find it useful.

1. Hadoop: System for processing very large data sets
2. HDFS or Hadoop Distributed File System: For storage of large volumes of data (key elements – DataNodes, NameNode and Secondary NameNode; the TaskTracker belongs to MapReduce, not HDFS)
3. MapReduce: Think of it as Assembly level language for distributed computing. Used for computation in Hadoop
4. Pig: Developed by Yahoo. It is a higher level language than MapReduce
5. Hive: Higher level language developed by Facebook with SQL like syntax
6. Apache HBase: For real-time access to Hadoop data
7. Accumulo: Data store similar to HBase, with additional features like cell level security
8. AVRO: Data serialization format (comparable to protocol buffers etc.)
9. Apache ZooKeeper: Distributed co-ordination system
10. HCatalog: For sharing the meta store of Hive with other tools such as Pig
11. Oozie: Scheduling system developed by Yahoo
12. Flume: Log aggregation system
13. Whirr: For automating deployment of Hadoop clusters on cloud services
14. Sqoop: For transferring structured data (for example, from relational databases) to Hadoop
15. Mahout: Machine learning on top of MapReduce
16. Bigtop: Integrate multiple Hadoop sub-systems into one that works as a whole
17. Crunch: Runs on top of MapReduce, Java API for tedious tasks like joining and data aggregation
18. Giraph: Used for large scale distributed graph processing
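To make the MapReduce entry a little more concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps in plain Python. It is only an illustration of the programming model, not Hadoop's API; the function names and the sample sentences are my own assumptions.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce step: combine the values for each key (here, by summing)."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical sample input, for illustration only.
counts = word_count(["big data is big", "data velocity"])
print(counts)
```

In a real Hadoop cluster the map and reduce steps run in parallel on many machines and the framework handles the shuffle. Higher level tools like Pig, Hive and Crunch exist so that you rarely have to write these steps by hand.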

Also, embedded below is an excellent TechTalk by Jakob Homan of LinkedIn on the subject explaining these tech terms.

Master Data Management (MDM): Key to Big Data Success

With the hype surrounding Big Data and current focus on tools and technology such as Hadoop, it is easy to forget that success of any technology project rests more on strategy and less on technology/tools. That’s true even in the case of Big Data solutions.

Architects and managers implementing Big Data solutions would do well to remember that in order to truly leverage and derive insights from Big Data, it is important to have a Master Data Management (MDM) solution in place with a repository of relevant non-transactional data entities (also known as master data).

For example, if an organization wants to leverage social media data for better sales, marketing or customer support, it is important that a master database of all customers and prospects is in place with information on social media profiles/handles for each customer. Master Data Management (MDM) “comprises a set of processes, governance, policies, standards and tools that consistently defines and manages the master data of an organization” (for more, see this).
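The social media example can be sketched in a few lines of Python. Everything here is hypothetical (the customer IDs, names, handles and the `link_mentions` helper are assumptions for illustration); the point is simply that a mention can only be tied back to a customer if the master record carries the handle.

```python
# Hypothetical master data: one record per customer, including social handle.
customer_master = {
    "C001": {"name": "Asha Rao", "twitter": "@asharao"},
    "C002": {"name": "Ben Ortiz", "twitter": "@benortiz"},
}

# Hypothetical social media mentions collected from outside the organization.
mentions = [
    {"handle": "@AshaRao", "text": "Delivery was late again."},
    {"handle": "@unknown_user", "text": "Love the new store!"},
]

def link_mentions(master, mentions):
    """Attach a customer ID to each mention whose handle is in the master data."""
    handle_to_id = {rec["twitter"].lower(): cid for cid, rec in master.items()}
    matched, unmatched = [], []
    for mention in mentions:
        customer_id = handle_to_id.get(mention["handle"].lower())
        if customer_id is None:
            unmatched.append(mention)  # no master record, so no customer context
        else:
            matched.append({"customer_id": customer_id, **mention})
    return matched, unmatched

matched, unmatched = link_mentions(customer_master, mentions)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

Without the master data, every mention would end up in the unmatched pile, no matter how sophisticated the Big Data tooling downstream.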

Trying to implement a Big Data solution without a repository of relevant master data is a recipe for disaster in my opinion. What do you think? Do you agree that MDM is key to Big Data success? Please share your thoughts:

 

Predictive Analytics: A Force Multiplier for Big Data

Force Multiplier, a noun, means something that increases the effect of a force. In military usage, force multiplication refers to “an attribute or a combination of attributes which make a given force more effective than that same force would be without it” (for more, see this).

Big Data, which is characterized by the three Vs, namely Volume, Variety and Velocity, can be a major force in the running of any large or medium-sized business, as it adds tremendous value by improving the quality of decision making. Thanks to the Big Data revolution, it is possible to process large volumes of structured and unstructured data in real time and derive insights from large data sets. This by itself is a huge improvement over the pre-Big Data era.

What’s even better is that predictive analytics makes it possible not only to analyze the past, but also to predict the future with a high degree of confidence. For example, social media data can enrich risk modeling and help auto insurance companies prepare a much better risk profile of an individual. Car sensor data can help auto insurance companies better assess the risk posed by a driver’s habits (like speeding, fast acceleration or braking) and come up with an auto insurance policy tailored to that specific individual, with an individual-level premium (not one set at a zip code or city level).

Another good example is the assessment of customer lifetime value (CLV). Using big data, companies can come up with a much better assessment of CLV. What makes it even better is that predictive modeling can be used on social media or sensor data to arrive at a much better estimate of CLV, so that companies can better target customers with high CLV. This has been used very effectively in the Travel and Hospitality industry.

The point I want to highlight here is that Big Data is a revolution in itself, as it enables organizations to identify, store, process and analyze data sets from outside the organization in a way that was not possible thus far. Add predictive analytics to this mix and it pushes Big Data capability to a whole new level – a true force multiplier. Don’t you agree?

The question is how many large and medium-sized companies are in a position not only to take advantage of the Big Data revolution, but also to effectively leverage Predictive Analytics for driving better insights and decision making. Not too many, in my opinion. What do you think? Please do share your opinion:

Infographic: Big Data and Predictive Analytics

I published a blog post titled Big Data and Rise of Predictive Analytics a couple of days back, in which I said that Predictive Analytics (or Advanced Analytics, as some would prefer to say) is going mainstream in 2013, thanks to Big Data. It is not too difficult to understand why, given the three Vs of Big Data, namely Volume, Variety and Velocity. The only way one can derive the full benefits of Big Data is by using predictive analytics, and this is forcing large and medium companies to make the necessary investments in building analytics infrastructure and reporting capabilities. And this is excellent news for those in the Analytics profession and for technology companies/service providers who help clients derive insights from mountains of (big) data.

Good to see that major enterprise software vendors have started focusing their attention on predictive analytics. Here’s a very good infographic on Predictive Analytics published by SAP Blog (infographic embedded below):

 

Big Data and Rise of Predictive Analytics

Back in the mid 1990s, when I was a Ph.D. student, one of the professors asked me what my career goal was and I said: “To help clients serve their customers better through use of Information Technology and Analytics”.

After completing my Ph.D. in November of 1998, when the IT revolution in Enterprise Solutions was about to take off and large and medium companies had started investing in ERP or CRM systems in a major way, I thought it was just a matter of time before Predictive Analytics went mainstream.

Back then, Siebel ’98 and Vantive were the ‘hottest’ new tools in the market and the Dot Com boom was on its way. I expected that in a year or two (or maybe three), predictive analytics would become part and parcel of all enterprise information systems, and that use of statistical techniques such as Regression Analysis, Factor Analysis and Multi-Dimensional Scaling would be common in analyzing and reporting information collected using ERP or CRM systems.

Looking back, I think I was overoptimistic, as this did not happen in the 2001-2002 time frame as I expected. Most ERP and CRM applications had bare-bones reporting functionality with just frequencies (%), and advanced analytics was not leveraged.

If an application manager wanted anything more than frequency or % information, he/she had to invest in Business Intelligence (BI) or Data Warehouse (DW) solution. But again, BI or DW solutions analyze past information and are not “predictive” in nature the way high end statistical tools can be.
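The difference between a frequency report and a “predictive” technique can be shown with a small Python sketch. The data below is made up for illustration, and a one-variable least-squares line stands in for the far richer statistical models mentioned above; treat it as a sketch of the idea, not of how any particular ERP, CRM or BI product works.

```python
from collections import Counter

def frequency_report(values):
    """Descriptive reporting: percentage of records in each category."""
    total = len(values)
    return {value: 100.0 * count / total for value, count in Counter(values).items()}

def fit_line(xs, ys):
    """Simple linear regression (ordinary least squares) with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical CRM data: status of four support tickets.
report = frequency_report(["open", "closed", "closed", "closed"])

# Hypothetical ERP data: orders in months 1 to 4.
months = [1, 2, 3, 4]
orders = [10, 12, 14, 16]
slope, intercept = fit_line(months, orders)
forecast_month_5 = intercept + slope * 5  # looks forward, not just backward

print(report)
print(round(forecast_month_5, 2))
```

The frequency report only summarizes what has already happened, while the fitted line gives an estimate for a month that has not happened yet.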

Yes, one could invest in a SAS-based analytics solution, but that was expensive, time consuming and out of reach for most companies – even Fortune 500 ones. As a result, use of Predictive Analytics was limited to a handful of use cases, such as fraud detection in banks or customer churn management in telecom companies, where one could justify the investment in terms of time, effort and costs. But for a majority of ERP or CRM applications, the data collected was never analyzed using Predictive Analytics, and as a result investments in ERP or CRM systems did not deliver the expected return on investment (ROI).

Things started to change around 2008-2009 with the advent of social media tools. Again, I thought that it was just a matter of 12-18 months before Predictive Analytics went mainstream, as large organizations would be required to use analytics tools to engage their customers on social media channels. In a blog post titled Social Media: The New Front End of CRM System published three years ago, I wrote that “the best any marketer can do is to Listen and Learn from what customers are saying and Engage them in meaningful conversations. In other words, treat Social Media channels as the front-end of CRM system, capture all relevant information from Social Media channels in the database and use Predictive Analytics and Knowledge Management tools to derive insights and help in decision making”. Again it turned out to be a case of overoptimism, as advanced analytics did not go mainstream in 2009-2010 as I expected.

Finally, yes, I say FINALLY, after waiting for fifteen years, I am happy to say that Predictive Analytics (or Advanced Analytics, as some would prefer to say) is going mainstream in 2013, thanks to Big Data. It is not too difficult to understand why, given the three Vs of Big Data, namely Volume, Variety and Velocity. The only way one can derive the full benefits of Big Data is by using predictive analytics, and this is forcing large and medium companies to make the necessary investments in building analytics infrastructure and reporting capabilities. And this is excellent news for those in the Analytics profession and for technology companies/service providers who help clients derive insights from mountains of (big) data.

What do you think? Do you agree that Predictive Analytics is going mainstream in 2013, or is it a case of overoptimism? Would love to hear your thoughts:

Organizational Challenge: Where to Start in Big Data?

Have you ever wondered what a good starting point is for an organization as it embarks upon its Big Data voyage? If yes, here’s a good YouTube video on the subject by Stacy Leidwinger, Product Manager, IBM Big Data.

Hope you find this video not only interesting but also educative:

About the author:

View Harish Kotadia, Ph.D.’s LinkedIn profile

5 Ways Big Data Are Fundamentally Changing Information Systems

A lot has been said and written lately about whether the Big Data revolution is for real or is one more hype that will die down soon as the tech world moves on to the next fad.

In my opinion, Big Data is a game-changing revolution that will fundamentally change how information is collected, stored, managed and consumed, thereby transforming the way we work, live and play.

Given below are five reasons why Big Data will change information systems and corporate IT:

1. Move away from traditional RDBMS:

Ever since electronic storage and processing of data began as a centralized corporate function (remember the good old EDP or Electronic Data Processing Department!), the Relational Database Management System, or RDBMS for short, has been fundamental to most computerized corporate information systems. Even today, most information systems such as ERP or CRM are supported by an RDBMS.

This is about to change in a big way, thanks to the three Vs of Big Data, namely Data Volume, Data Variety and Data Velocity. Traditional data storage and retrieval methods such as RDBMS are no longer going to work and will necessitate NoSQL (short for “Not only SQL”) databases instead of RDBMS. Unlike an RDBMS, which places data inside well-defined structures or tables using metadata, NoSQL is designed to capture all data without categorizing and parsing it upon entry into the system. This will fundamentally change the architecture of corporate information systems.
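The contrast between a predefined table structure and capturing data as it arrives can be sketched as follows. SQLite (from Python's standard library) plays the role of the RDBMS, and a plain Python list of JSON strings stands in for a document-style NoSQL store; that stand-in, the table name and the sample records are all assumptions for illustration, not the API of any actual NoSQL product.

```python
import json
import sqlite3

# RDBMS side: the table structure must be defined before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Asha"))

# A record with a field the schema does not know about is rejected.
try:
    conn.execute(
        "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
        (2, "Ben", "@ben"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Document-store side (hypothetical stand-in): records are stored as they come.
document_store = []

def store_document(doc):
    """Store a record without checking it against any predefined structure."""
    document_store.append(json.dumps(doc))

store_document({"id": 1, "name": "Asha"})
store_document({"id": 2, "name": "Ben", "twitter": "@ben", "tags": ["vip"]})

# Structure is interpreted at read time; missing fields simply come back empty.
handles = [json.loads(doc).get("twitter") for doc in document_store]
print(rejected, handles)
```

The trade-off is that the work of making sense of the data moves from the time of writing to the time of reading.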

2. Unstructured data handling capability:

The capability to handle both structured and unstructured data is another important way in which information systems are going to change fundamentally, thanks to Big Data. As noted above, Big Data has three defining attributes – Data Volume, Data Variety and Data Velocity – and together they constitute a comprehensive definition of Big Data.

Data Variety implies that Big Data is not just about text or numbers (alphanumeric fields), but also unstructured data. Information systems in future will have to be designed with capability of handling both structured and unstructured data.

3. Real Time Data Processing:

Given the Velocity or speed with which Big Data is being generated, information systems in future will require capability of processing massive volume of data in real time. Even “near real time”, a phrase often used with current generation of information systems, is not good enough.

A good example of real time data processing is the ability to process social media or sensor data as they are being generated and take necessary action immediately, such as responding to a tweet or Facebook posting. Batch processing, nightly or weekly updates and even near real time data processing are not good enough because of high Data Velocity as is the case with Big Data.
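Here is a minimal Python sketch of the difference between the two styles. The events, the `needs_reply` rule and the function names are hypothetical; a real system would read from a live feed rather than a list, but the shape of the logic is the same.

```python
def batch_process(events, needs_reply):
    """Batch style: wait for the whole batch, then act on it in one pass."""
    return [event["id"] for event in events if needs_reply(event)]

def stream_process(event_source, needs_reply):
    """Streaming style: act on each event as soon as it arrives."""
    for event in event_source:
        if needs_reply(event):
            yield event["id"]

def event_source(events, consumed):
    """Simulated live feed that records which events have arrived so far."""
    for event in events:
        consumed.append(event["id"])
        yield event

def needs_reply(event):
    """Hypothetical rule: reply to any post that mentions a late delivery."""
    return "late" in event["text"]

# Hypothetical social media posts, for illustration only.
events = [
    {"id": 1, "text": "my order is late"},
    {"id": 2, "text": "nice shop"},
    {"id": 3, "text": "late again"},
]

consumed = []
replies = stream_process(event_source(events, consumed), needs_reply)
first_reply = next(replies)              # available after the first event
consumed_at_first_reply = list(consumed)

batch_replies = batch_process(events, needs_reply)  # needs all events first
print(first_reply, consumed_at_first_reply, batch_replies)
```

In the streaming version the first reply is ready after only one event has arrived, while the batch version cannot respond until the whole batch has been collected.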

4. Predictive analytics and in memory analytics:

If data is being generated in a variety of formats (structured and unstructured), in high volume and at a high velocity, the only way it can be used effectively for decision making is through the use of Predictive Analytics and in-memory data analytics. Information systems in future will have to be designed keeping this aspect in mind.

5. Most data are either user or machine generated:

Last but not least, most Big Data are generated either by end users/customers (such as social media data) or by machines/sensors outside the confines or firewall of a company. This is unlike in the past, when most of the data were generated within the firewall of a corporation (such as transaction data, inventory data or factory production data), with very little coming from outside. This will fundamentally transform the architecture of information systems in future.

What do you think? Do you agree that Big Data will fundamentally change information systems and corporate IT? Please do share your thoughts:
