Tech Thursday: Big Data

Story by Hal Goodtree, first published on Cary Innovation Center.

Cary, NC – Big Data is the flood of information that surrounds every company in 2012. Creating meaningful information from that data is a challenge to companies large and small. Here’s what you need to know about Big Data for 2013.

Big Data

Wikipedia refers to Big Data as “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.”

Disparate Sources

Like every business, we at CaryCitizen get our data from many sources:

  • QuickBooks (financial stats)
  • Facebook (social stats)
  • Google Analytics (web stats)
  • FeedBurner (email stats)
  • OpenX (advertising stats)
  • CRM (client stats)
  • MailChimp (marketing stats)

The Challenge of Big Data

The challenge of Big Data for small and medium businesses (and municipalities) is to collect and visualize the massive amount of useful information scattered across disconnected data sources.

A Horde of Solutions

Big companies can turn to SAS, the Cary technology giant and leader in the field of statistics and analysis. SAP, Omniture, NetLedger and IBM also provide big data, statistics and business analytics software and solutions.

Small and medium businesses can be harder pressed to find a comprehensive solution, but not from lack of offerings.

Salesforce, HootSuite and Google App Insightly all promise to handle various aspects of Big Data management. Companies with strange names like Cloudera and Clearstory are attracting millions in funding. But none of these solutions covers everything, and “free” or low-cost apps can get pricey as you add customized or upgraded services.

There isn’t really a good solution to the Big Data challenge for small and medium businesses.

A Challenge, Not a Problem

Big Data is an opportunity, not a problem. At least not yet.

Businesses get more data than ever about customers. There is no doubt that unifying this data into meaningful information can improve the bottom line for almost any business.

Right now, not having a handle on your big data is, at worst, a missed opportunity. But in the future, it might be a competitive problem.

Other Names for Big Data

Companies have been awakening to the possibilities of Big Data for over a decade. In the recent past, it has gone by other names, including:

  • Data Mining
  • Business Intelligence
  • Decision Support
  • Analytics

Hadoop and NoSQL

As you begin to familiarize yourself with Big Data, you may hear about Hadoop, an open-source project of the Apache Software Foundation. Apache is best known for producing software for servers.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
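The best-known of those “simple programming models” is MapReduce. Here is a minimal sketch of the idea in plain Python, not Hadoop's actual API: the page-view log and the function names are made up for illustration, and everything runs on one machine. On a real cluster, each chunk would be handled by a different computer.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (page, 1) pair for every page view in one chunk of the log."""
    return [(page, 1) for page in chunk]

def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Add up the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical page-view log, already split into chunks.
# In a real cluster, each chunk would live on a different machine.
chunks = [
    ["/home", "/news", "/home"],
    ["/news", "/events", "/home"],
]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
totals = reduce_phase(shuffle(mapped))
print(totals)
```

Because each chunk is mapped independently, adding more machines lets the same small program chew through far more data.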

NoSQL, also frequently mentioned in the Big Data discussion, is a broad class of database management systems. These systems are designed to handle huge flows of data, trading many of the features of traditional SQL systems for vast scalability.
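To make that trade-off concrete, here is a toy sketch in Python of one common NoSQL pattern, a key-value store spread across several “nodes.” It is an illustration under simplifying assumptions, not how any particular product works: the class name, the keys and the three in-memory dictionaries standing in for servers are all hypothetical.

```python
import hashlib

class ShardedStore:
    """Toy key-value store split across several in-memory 'nodes'."""

    def __init__(self, num_nodes):
        # Each dict stands in for a separate server.
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # A stable hash of the key decides which node holds it.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

store = ShardedStore(num_nodes=3)
store.put("visitor:1001", {"page_views": 12})
store.put("visitor:1002", {"page_views": 3})
print(store.get("visitor:1001"))
```

Looking up one key only touches one node, which is what makes this design easy to scale. The price is that there are no joins or cross-table queries of the kind SQL offers; you can only put and get by key.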


Klout

Klout is an interesting Big Data app. It attempts to measure influence by using stats from Facebook, Twitter, LinkedIn, Google and other sources.

You sign up for an account and “authorize” Klout to access your social media accounts. Klout measures friends, fans, connections and interactions and comes up with an aggregate score between 1 and 100.

Klout is interesting, but neither actionable nor necessarily accurate. However, services like Klout are the wave of the future. Sign up for Klout and see their measure of your influence.

Big Data Column

ZDNet began publishing a column called Big Data Week back in February of 2012.

It’s pretty nerdy stuff, but its author, Andrew Brust, has his finger on the pulse of this rapidly emerging industry.

What Should You Do About Big Data?

Small businesses can take steps to wrangle their own Big Data into shape with a few free, openly available programs.

A. Where’s Your Data

First, make sure you are capturing data for key areas of your business. Some of the basics include:

    • Google Analytics – your web stats
    • Financial Data – from QuickBooks or other
    • Customer Data – from your CRM (customer relationship manager)

B. What Do You Need to Know?

This is the big question. What information would help you run your business more profitably and efficiently? What would give you the best “visibility” into the state of your enterprise?

Some examples of useful data from CaryCitizen:

    • Booked Revenue
    • Billed Revenue
    • Accounts Receivable
    • Contacts in the CRM
    • Monthly Page Views
    • Click-Through Rate of our ad platform
    • Email Opens

C. Extract and Visualize

Once you’ve decided on the most important performance indicators for your business, extract (copy and paste) the relevant data from where it lives to a Google spreadsheet.

It’s important to identify and isolate what’s important to you. Not everything is useful. Our Facebook stats report is about 75 spreadsheets of data – a gargantuan 1.1 MB. We only use two metrics from that data dump.

Now, paste the data into the Google spreadsheet. Build the spreadsheet so it is easy to update – that is, with dates running down the left column and categories running across the top row.
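If copying and pasting by hand gets tedious, the same layout can be built with a few lines of Python. This is only a sketch: the monthly figures are invented, and the two dictionaries stand in for whatever you export from your own tools. It prints CSV text, with dates down the left and one column per category, that can be saved to a file and imported into a spreadsheet.

```python
import csv
import io

# Hypothetical monthly figures copied out of two separate tools.
page_views = {"2012-10": 41000, "2012-11": 45500, "2012-12": 43200}
email_opens = {"2012-10": 1800, "2012-11": 2100}

def build_rows(sources):
    """Return rows with dates down the left and one column per source."""
    header = ["Date"] + list(sources)
    # Collect every date that appears in any source, in sorted order.
    dates = sorted(set().union(*(data.keys() for data in sources.values())))
    rows = [header]
    for date in dates:
        # Leave the cell blank when a source has no figure for that date.
        rows.append([date] + [data.get(date, "") for data in sources.values()])
    return rows

rows = build_rows({"Page Views": page_views, "Email Opens": email_opens})

# Write the rows as CSV text, ready to save and import.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Adding a new month means adding one entry per dictionary, and adding a new metric means adding one more dictionary, so the sheet stays easy to update.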

From that data, you can create charts that visualize the information so it is easy to understand.

One of our charts, for example, is a mashup of WordPress web stats and Google FeedBurner email analytics.

Voilà! Your own homemade Big Data solution.

Looking Ahead to Big Data for Small Business

Your business will grow. New solutions will come out. In three years, we’ll all be using a spiffy new BusinessDataIntelligenceMiningCloud program.

In the meantime, you’ll be building a pool of useful information to leverage in the present and the future.