One of the best ways to improve the power of your analytics is to
include some totally new information. New information can enable huge
leaps in the effectiveness, predictive power, and accuracy of your
analytics. Most of the time, effort is spent trying to incrementally
improve results by using existing data and information more
effectively. This isn’t because analytic professionals don’t realize
that new data can be powerful; it’s because new data only becomes
available occasionally. As soon as a new and different data source does
become available, however, you’ll be much better off shifting your
focus to it immediately.
To me, this gets to the heart of why big data is so powerful and is
getting so much attention. I believe that the volume, variety, and
velocity aspects of big data, which get most of the attention, are
secondary. As I have discussed in prior blogs and articles, the most important ‘V’ associated with big data is value. The other Vs are only relevant in the presence of value. So what drives that value for big data? Keep reading.
The fact is that many big data sources contain information that either
was not available in the past or was available only to a much lesser
extent, through means requiring far more effort. For example,
information from your web browsing activity is easy to capture and
analyze today. In the past, the only way to get similar data was through
very expensive research projects executed on a very small scale. In
practice, the information just wasn’t available because it was too
expensive.
Let’s fast-forward to an analytic professional attempting to address a
common business problem today, such as churn or next best offer. When
the data sources available are fixed, most effort goes into trying new
modeling methods, new variable definitions, and new ways to handle
sparse or missing data. These efforts can result in increased power, but
typically only provide small, incremental gains. In cases with a lot of
money on the line, such gains aren’t anything to sneeze at. However,
the fact is that the likelihood of blowing your last results out of the
water is pretty low.
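To make the contrast concrete, here is a minimal sketch of what squeezing incremental gains out of fixed data often looks like: cross-validated hyperparameter tuning on the features you already have. The dataset is synthetic and the parameter grid is purely illustrative, not a recommended recipe.

```python
# A minimal sketch of tuning with fixed data: cross-validated grid
# search over model hyperparameters. Synthetic data; illustrative grid.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in for a churn dataset built from existing, traditional features.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=8, random_state=0)

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=5,
)
search.fit(X, y)
print(f"Best cross-validated AUC after tuning: {search.best_score_:.3f}")
```

Tuning like this usually moves the score only modestly, which is exactly the limitation described above.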
Now let’s imagine that the same analytic professional uses the exact
same modeling methods, variable definitions, and data preparation today
as he or she used yesterday. This time, however, the analysis also
includes new variables from a new data source that contains totally new
information.
Let’s assume that browsing history is now available to help identify
customers’ next best offer, for example. Given that browsing history
provides information on preferences and future purchase intent that
isn’t available with traditional data sources, the analytic professional
can achieve tremendous gains in analytic power. This is true even when
using the same old methods, but with new data.
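As a hedged sketch of that scenario, consider training the same model twice: once on the existing variables alone, and once with the new variables appended. Everything here is synthetic; in the scenario above, the extra columns would come from browsing history.

```python
# Same model, same settings, with and without variables from a new
# data source. Synthetic data; the split into "existing" vs. "new"
# columns is an assumption for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, n_features=12, n_informative=12,
                           n_redundant=0, shuffle=False, random_state=1)
X_existing = X[:, :8]  # stand-in for the traditional data sources
X_combined = X         # traditional variables plus four genuinely new ones

model = LogisticRegression(max_iter=1000)
auc_existing = cross_val_score(model, X_existing, y,
                               scoring="roc_auc", cv=5).mean()
auc_combined = cross_val_score(model, X_combined, y,
                               scoring="roc_auc", cv=5).mean()
print(f"AUC, existing data only: {auc_existing:.3f}")
print(f"AUC, with new data:      {auc_combined:.3f}")
```

Nothing about the method changes between the two runs; only the information available to it does.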
My point is that for all the fuss about what the best analysis
methods are and how to best handle missing and dirty data, the really
big gains come from finding new information to include. Think back to
statistics 101 and the idea of Principal Components Analysis and
orthogonal vectors. While dozens of variables may be available to an
analysis, the variables often contain widely overlapping information. A
new variable with substantially the same information as is already known
won’t add much value. However, anytime you can add variables that are
completely or mostly distinct in terms of the information contained,
there is the potential for a lot of value.
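A quick numerical sketch of that idea, on synthetic data: a dozen variables that are all noisy copies of one underlying signal collapse into a single dominant principal component, while one genuinely distinct variable shows up as a substantial new component of its own.

```python
# Overlapping variables add little; a distinct one adds a new dimension.
# All data is synthetic and purely illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 1))

# Twelve variables carrying essentially the same information.
overlapping = signal + 0.1 * rng.normal(size=(1000, 12))
pca = PCA().fit(overlapping)
print("Share of variance in first component:",
      round(pca.explained_variance_ratio_[0], 3))

# Add one variable with distinct (orthogonal) information.
distinct = rng.normal(size=(1000, 1))
pca2 = PCA().fit(np.hstack([overlapping, distinct]))
print("Top two components:",
      np.round(pca2.explained_variance_ratio_[:2], 3))
```

In terms of new information, the single distinct variable contributes about as much as the twelve overlapping ones combined.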
The action I recommend for readers is to constantly seek out new data
sources. Instead of putting all your effort into tuning your existing
modeling methods with existing data, focus effort on a new data source
every chance you get. That’s where you’ll find the big gains. After you
realize your initial gains from the new data, you can go back to tuning,
but I believe that makes sense only when you’ve exhausted your ability
to include additional data sources.
This is the core of the value proposition for big data. Many
organizations suddenly have multiple new, untested sets of data
available for incorporation into their analytic processes. Used
correctly, this data can provide a huge competitive advantage and a
veritable gold mine of value. Don’t miss your chance to get ahead.
Let’s close with a thought experiment. Assume I offer you a world-class
analytic professional with access to every tool available, but who
will be limited to using only existing data. Your other option is a
solid, but not world-class, analytic professional with access to just
standard tools. This person, however, will be allowed to incorporate
some new data sources that appear to hold value.
I hope you’ll take the second option over the first.
Ideally, you’ll have a world-class analytic professional working with
the new data, of course, but the thought experiment illustrates the
point. No matter how good an analytic professional is or how fancy the
tools are, the inherent value in new and different data will win in most
cases.
Original source: Driving Analytic Value From New Data