5 Tips for Buying Big Data Tools

Drew Robb

Updated · Oct 10, 2014

Not everyone can agree on a definition for Big Data. Though many experts, including Gartner, cite high volume, variety and velocity as key characteristics of Big Data, there is more to it than that.

“Big Data is much more than volume, variety or velocity of data,” said Irfan Khan, CTO, Global Customer Operations, SAP. “It is about being able to consolidate a variety of data from disparate sources and get a holistic picture.”

You can split the subject into several sub-categories. One is machine-generated or machine-to-machine data: the output of sensors and machine logs that feeds what is being called the Internet of Things. Another broad category is human-generated data: not just the social data that people think of when Big Data is mentioned, but also information from point-of-sale and other transactional systems, as well as ERP, emails, documents and other elements that make up enterprise data.

The point of Big Data is to tie all this together to gain a holistic picture. That picture can then serve a multitude of use cases, from understanding and predicting customer behavior, loyalty and churn, to customer segmentation and campaign performance, to preventative maintenance based on warranty and call center data.

“The ultimate goal is to be able to glean insight from a variety of data sources and types integrated and connected in real time with enterprise data to be able to predict, forecast and optimize for greater results and performance,” said Khan.

Big Data's Barrier to Entry Getting Lower

Just a few years ago, implementing Big Data meant some kind of rip-and-replace strategy that often involved assembling technically demanding Hadoop-based data stores. That kind of complexity has given way to a more user-friendly entry into the Big Data universe. If you want to add a Hadoop element, you can plug in a pre-assembled system rather than having to cobble together one of your own from various open source and proprietary software elements.

Newer Big Data systems coming on the market integrate more easily with existing enterprise systems for maintenance, inventory management and supply chain, while adding data from other sources into the mix to provide keener insight.

Khan gave the example of adding sensor data to an existing maintenance system to help move from a “fix it after it breaks” mentality to a proactive “anticipate and service before it breaks” process. “There is no need to rewrite and redo the entire system, then have to re-train all the staff,” said Khan.
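As a rough illustration of the “anticipate and service before it breaks” idea, the minimal Python sketch below flags machines whose recent sensor readings run high. Everything in it is an assumption made for illustration: the `Reading` type, the vibration metric, the `machines_needing_service` function, the threshold of 7.0 and the three-reading window are hypothetical and do not come from SAP or any particular product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    machine_id: str
    vibration_mm_s: float  # hypothetical sensor metric, for illustration only


def machines_needing_service(readings, threshold_mm_s=7.0, window=3):
    """Return sorted IDs of machines whose last `window` readings average above the threshold."""
    by_machine = {}
    for r in readings:
        by_machine.setdefault(r.machine_id, []).append(r.vibration_mm_s)

    flagged = []
    for machine_id, values in by_machine.items():
        recent = values[-window:]
        # Only judge machines that have a full window of readings.
        if len(recent) == window and mean(recent) > threshold_mm_s:
            flagged.append(machine_id)
    return sorted(flagged)


# Made-up sample data: pump-2 is trending upward, pump-1 is steady.
readings = [
    Reading("pump-1", 2.0), Reading("pump-1", 2.5), Reading("pump-1", 2.2),
    Reading("pump-2", 6.5), Reading("pump-2", 7.5), Reading("pump-2", 8.5),
]
print(machines_needing_service(readings))
```

In a setup like this, the list of flagged machines could be passed to the existing maintenance system as ordinary work orders, which is the sense in which nothing has to be rewritten.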

Big Data Features

As for specific features, bells and whistles, it’s almost a case of “how big is infinity?” Survey the vendors that offer Big Data solutions and you may get lost among all the potential features that are available. But common desired characteristics include:

  • Back-end and front-end simplicity
  • Ease of use
  • Speed of response
  • Analysis and prediction capabilities
  • Visualization tools
  • Support for mobile devices
  • Being data agnostic

Here are five tips for selecting the right Big Data solution:

Identify Big Data Goals

Elad Israeli, co-founder and CPO of SiSense, advised those considering Big Data analytics to look before they leap. This means taking the actual problems they are trying to address as the starting point for any Big Data strategy and working back from there.

Are you trying to upgrade from Excel and need a simple visualization tool? Do you need to analyze gigabytes, terabytes or petabytes of data? Is your data structured or unstructured? Do you have a large data science team or are you seeking to empower business users to analyze data and create reports and dashboards? Are you a CIO seeking to reduce the wait time for query returns?

Knowing the answers to these questions goes a long way toward defining an appropriate toolset.

Stick with Web-based Software

“You can’t really call yourself Big Data if you’re bound by a non-Big Data environment like your desktop or server,” said Joel Horwitz, director of Products and Marketing, Alpine Data Labs.

In addition, today's Big Data tools need to be nimble, easy to update, and quick to revise or customize.

“Big Data technology is rapidly evolving, so it’s important to stick to agile solutions that don’t quickly become obsolete,” said Elad Israeli, co-founder and CPO, SiSense. “Avoid solutions that take weeks or months to implement or require expensive consultants.”

Look for Leaders in Lines of Business

Who will lead a Big Data initiative also needs to be worked out. Some projects begin in IT, others at the top management level and others in lines of business.

“Big Data projects do best in the product and marketing groups. Nine out of 10 times that’s the case if you want to use it strategically,” said Horwitz. “It starts in the line of business because that is where you can do the full end-to-end process.”

If you’re in marketing, for example, you can add sensors or tags to particular events, pull data all the way through your analysis and use it to act on that information. The key is that the analysis is only as good as your ability to act on it.

Match Tools to Use Case

Horwitz suggested reviewing tools against the individual steps in the process: tagging events, collecting and aggregating the data, manipulating the data, creating models and running the analysis. The manipulation step is often a tricky area.

“There are a number of data transformation tools that help people wrangle their data,” said Horwitz. “You need a Swiss army knife that can deal and manipulate any data type because there are hundreds if not thousands of data types and variables.”

There are different ways of doing model building and analysis, Horwitz said. The tool you use will depend upon your use case. For instance, if you are a marketing analyst and want to analyze a website or Twitter campaign, you would need a different tool than someone in finance who is working on fraud detection.
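To make the steps Horwitz describes more concrete, here is a minimal Python sketch of the marketing case: tagged events are collected, aggregated by tag and reduced to a simple measure that can be acted on. The event data, the tag names and the `conversion_rates` and `best_campaign` functions are all hypothetical, and the "model" is nothing more than a conversion rate per tag. A real tool would replace each step with something far more capable.

```python
from collections import defaultdict

# Hypothetical tagged events: (campaign_tag, converted). Made-up data.
events = [
    ("email", True), ("email", False), ("email", False), ("email", True),
    ("twitter", True), ("twitter", False), ("twitter", False), ("twitter", False),
]


def conversion_rates(events):
    """Aggregate events by tag and return the share that converted for each tag."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for tag, converted in events:
        totals[tag] += 1
        if converted:
            wins[tag] += 1
    return {tag: wins[tag] / totals[tag] for tag in totals}


def best_campaign(rates):
    """Pick the tag with the highest conversion rate."""
    return max(rates, key=rates.get)


rates = conversion_rates(events)
print(rates)
print(best_campaign(rates))
```

The same skeleton would look quite different for fraud detection, where the events, the aggregation and the model all change, which is one way to see why the choice of tool follows from the use case.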

Consider Data Integrity

Data integrity, too, must be achieved and then maintained. If you go all out on Hadoop, you may struggle to maintain data integrity for your core business apps.

“While stores like Hadoop are a great place to store contextual data that can be used to enrich corporate data, it is not suited for storing mission-critical data at this time,” said Khan.

Short List of Big Data Vendors

There are just too many Big Data products out there to list them all. But here is a healthy sampling, ranging from basic analytics to full-scale enterprise-class suites.

Amazon Web Services

Microsoft Big Data

SiSense

Pentaho

Oracle

IBM

SAS Big Data

SAP

Birst

Terracotta

DataTorrent

Tibco Jaspersoft

Kapow Software

Drew Robb is a freelance writer specializing in technology and engineering. Currently living in Florida, he is originally from Scotland, where he received a degree in geology and geography from the University of Strathclyde. He is the author of Server Disk Management in a Windows Environment (CRC Press).

