Small Data – Big Data – The Relevance of Real Time

“Big Data” is a big term; probably one that is used too much. Wikipedia defines “Big Data” as “data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time”. Tolerable elapsed time – what does that mean? When I was a kid I would wait upwards of 15 minutes for my Commodore 64 to load an event in Summer Games. Today that kind of latency just doesn’t cut it. We move fast, demand immediate gratification, and expect access to information derived from data at all times. Not only do we demand access to information, we are becoming more obsessed with creating and curating our own data to derive custom information.

My wife and I just welcomed our first child, a little girl named Olivia, into the world. Aside from just being plain obsessed with her, we became obsessed with monitoring her every movement – mostly bowel movements, actually. Thankfully there was an app for that. When did she eat, when did she sleep, and for how long? All of this small data could be captured and analyzed to ensure that the new addition to the family was getting enough of everything, and hopefully patterns in her daily life would start to appear. The capture of this type of data is termed “Small Data”. Babies are one example, but society is becoming obsessed with all kinds of “Small Data”. It is, in my view, the “Small Data” of people’s lives that is exceptional and interesting. The marketing power of Social Media is actually based on “Small Data”. Where did a person check in, what types of food, movies, and music do they like, which companies are they engaging with, what is their search history, who are they connected to? This type of data, when considered on an individual level, can be handled with ease. Capture this data at an enterprise level with 10 million customers, however, and associating “Small Data” with existing customers, finding new patterns, creating segments, and producing analytics become increasingly difficult.

Banks can, and do, churn through large amounts of data, making that data somewhat useful in exception reporting, market analysis, regulatory and compliance reporting, and a host of other data-supported processes. There are two challenges that banks need to overcome:

1. Ability to answer the unknown

Traditionally, the ability to answer questions with analytics quickly in banking required data-marts specific to a question or a set of questions. These data-marts normally house aggregates just sufficient to answer the predefined questions. Aggregation facilitates speed by limiting the amount of source data, but it also limits the number of questions that can be asked against a specific data-mart, because the granularity of the data supports nothing beyond the predefined analytics. Today, businesses, customers, and regulators are pivoting at an alarming rate. The analytics and data required yesterday may no longer be sufficient to answer the questions asked today.
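To make the aggregation trade-off concrete, here is a minimal sketch in Python. The customer names, categories, and amounts are entirely hypothetical; the point is only that once a mart is rolled up for one predefined question, a new question needs the granular source data the mart no longer holds.

```python
# Hypothetical granular source: one row per transaction
# (customer, spend category, amount). All values are illustrative.
transactions = [
    ("cust1", "groceries", 40.0),
    ("cust1", "travel",    300.0),
    ("cust2", "groceries", 25.0),
    ("cust2", "dining",    60.0),
]

# Data-mart built for the predefined question "total spend per customer":
# the category dimension is aggregated away for speed.
mart = {}
for cust, _category, amount in transactions:
    mart[cust] = mart.get(cust, 0.0) + amount

# The predefined question is answered instantly from the mart:
print(mart["cust1"])  # 340.0

# A new question -- "spend per customer per category" -- cannot be
# answered from the mart at all; it requires the granular source:
by_category = {}
for cust, category, amount in transactions:
    key = (cust, category)
    by_category[key] = by_category.get(key, 0.0) + amount
print(by_category[("cust1", "travel")])  # 300.0
```

The mart is fast precisely because it threw information away; every dimension dropped during aggregation is a family of questions that can never be asked against it.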

2. Tolerable elapsed time

The second challenge is the tolerable elapsed time when utilizing analytics in business processes. This challenge is amplified as existing and new processes are pushed out to new channels, including mobile, where expectations for immediate response are high. There are also expectations that new visual analytics tools designed for business analysts have access to more granular data, putting the art of analytics and data discovery in the hands of the business. In banking, new revenue-generating processes demand real-time or near real-time response: inputs to a process, such as purchase transaction amount, item identification, and location of purchase, are integrated with previously captured customer transaction detail, legacy customer information, and new social media metadata in a real-time segmentation process that correlates offers pushed back to a mobile device before the transaction is completed at the point of sale. All of this has to happen in real time, with sub-second response. Tolerable elapsed time gets shorter and shorter as banks try to take advantage of real-time processes with real-time data.
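The flow above can be sketched as a few lines of Python. The segment rules, field names, offer text, and latency budget here are all assumptions for illustration, not a description of any real banking system; the one real constraint being modeled is that an offer arriving after the sale completes is worthless.

```python
import time

# Hypothetical previously captured customer data; in a real system this
# would live in a fast operational store, not a Python dict.
CUSTOMER_PROFILE = {
    "cust1": {"avg_monthly_travel_spend": 800.0},
}

# Illustrative segment-to-offer mapping.
OFFERS = {
    "frequent-traveler": "10% off airport lounge access",
    "general": "standard loyalty points",
}

def segment(profile, txn):
    """Combine the live transaction with stored profile data."""
    if txn["category"] == "travel" and profile["avg_monthly_travel_spend"] > 500:
        return "frequent-traveler"
    return "general"

def real_time_offer(txn, budget_seconds=0.5):
    """Return an offer within a sub-second budget, or nothing at all:
    a late offer cannot reach the customer before the point of sale."""
    start = time.monotonic()
    profile = CUSTOMER_PROFILE.get(txn["customer_id"])
    if profile is None:
        return None
    offer = OFFERS[segment(profile, txn)]
    if time.monotonic() - start > budget_seconds:
        return None  # missed the window; do not push a stale offer
    return offer

txn = {"customer_id": "cust1", "category": "travel",
       "amount": 420.0, "location": "JFK"}
print(real_time_offer(txn))  # 10% off airport lounge access
```

The design choice worth noting is the explicit latency budget: in a real-time process, "no answer in time" must be a first-class outcome, which is exactly why batch-era analytics pipelines cannot simply be bolted onto these channels.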

Embracing real-time systems, providing real-time data to a real-time data platform as an ultimate end-state vision for your enterprise, and making the decisions required to progressively move towards that vision will position your bank for success. The bank of the future is quickly becoming the bank of the now. For a large part of the future of banking, whether it is addressing customer demands, internal user and line-of-business requirements, or regulator requirements, only real-time big data solutions will prove successful. The problem is that most banks are still thinking about data management and data usage with outdated paradigms. The volume, velocity, and variety of data are rapidly increasing. Is your bank ready for it?
