What Is Insight? Is It Visual?

Shall we play a game?
What word can connect all three of these words?
Pine
Crab
Sauce
Here's a picture to give you time to think. Be warned, though: looking at the picture may stop you from finding the answer (more on this below).
[Image: data analytics]

Got it? That’s right; ‘apple’ can go with them all: pineapple; crab apple; apple sauce.
There are two ways of coming to the answer:
  • Analytic logic: did you run through a series of possible matching words until you found the right association? For example, saying: “Does ‘cake’ work? No. Does ‘cone’ work? No. Does ‘tree’ work? No. Does ‘apple’ work? Yes.”
  • Unconscious Insight: did you have a moment of pure insight, where your brain leapt to the right answer? You somehow just knew it, with no conscious thought process?
Humans do both, but the neurological process that drives insight, those amazing a-ha moments we all have, has been little understood until recently.
Neuroscientist Dr. Mark Beeman at Northwestern University is using puzzles and brain imaging to understand how insight works. His team have discovered that when an insight occurs different areas of our brains are active than when we reason analytically. The research has identified that a part of the brain above our right ear (specifically the anterior superior temporal gyrus) emits an intense burst of gamma brain waves when an insight happens. As Dr. Beeman says, “The dendrites – the pieces of the neurons that collect information - actually branch differently on the left and right side, characteristically having broader branching in the right hemisphere, so that each neuron is collecting information from a broader source of inputs and this allows them to find connections that might not be evident otherwise.”
So, here’s objective evidence of association occurring naturally in the brain, making connections between distant concepts, in a flash of insight. It seems that associative technology really does reflect the way that we think when we gain insight.
Interestingly, given all the attention on visualization at the moment, neuroscience research has found that although insights can be prompted by visual cues, the brain activity that generates insight is explicitly non-visual. As Professor John Kounios at Drexel University explains: “At the a-ha moment there’s a burst in the right temporal lobe… but if you go about a second before that there’s a burst of alpha waves in the back of the head on the right side. Now strangely enough the back of the brain accomplishes visual processing and alpha is known to reflect brain areas shutting down.”
In other words, just before an insight, the brain closes down part of the visual cortex.
“You have all this visual information flooding in; your brain momentarily shuts down some of that visual information – sort of like closing your eyes… so the brain does its own ‘blinking’ and that allows very faint ideas to bubble up to the surface as an insight”. Prof Kounios continues: “Think of it this way – when you ask somebody a difficult question, you’ll often notice that they’ll look away or they might close their eyes or look down. They’ll look anywhere but at a face which is very distracting. If your attention is directed inwardly then you’re more likely to solve the problem with a flash of insight.”
The key point here is that while visualization is very useful and compelling, used in isolation (or too extensively) it’s not the most powerful driver of insightful thinking.
Time for one final game: what word can link these four words?
Class
Casual
Risky
Discovery
Got it? I’m sure you have. So what was it for you, analytic logic or pure insight? If it was insight did you catch yourself looking away so your brain could blink?
Notes: 1) The subject of this blog and the quotes in it came from a fantastic BBC Horizon documentary. 2) I’m aware that the word puzzles in this blog may not be as effective for readers whose first language is not English - I hope that doesn’t undermine its interest for those readers. 3) Distracting image source (creative commons sharealike license).

Original article

Big Data Meets Walt Disney's Magical Approach

Walt Disney is one of the most admired companies in the world, and every year approximately 100 million visitors spend time in Walt Disney parks around the world. Those visitors generate a lot of data, and capturing it is exactly what Walt Disney aims to do. Recently, the company announced the introduction of the wireless-tracking wristband ‘MagicBand’. The aim is to make a visit to Walt Disney World in Orlando a more magical experience while, in return, recording the complete data trail of each visitor.
The MagicBands are linked to a credit card and function as a park entry pass as well as a room key. They are part of the new MyMagic+ system, and joining is still completely voluntary. However, visitors who do join get many advantages, such as jumping the queues, pre-booking rides, changing reservations on the go via their smartphones, being addressed by name by the characters, and many more.
In the meantime, the MyMagic+ system allows Walt Disney to collect massive amounts of sensitive and valuable data, such as real-time location data, purchase history, information about the visitors, riding patterns and more. In effect, Walt Disney is building a gigantic database of every move visitors make in the park. All this data is waiting to be analysed and used by Walt Disney to make better decisions, improve its offerings and tailor its marketing messages.
Although they are collecting massive amounts of data, Walt Disney does respect the privacy of their visitors. They allow visitors to control exactly how much and what sort of data is collected, stored and shared, and with whom, or to opt out completely. Via a special menu, visitors can choose whether or not Disney may send them personalized offers during their stay or after they get back. Parents have to opt in before the characters in the park can use the personal information stored in the MagicBand. However, even with the most restrictive settings, the MagicBand still records general information about how visitors use the park, as noted by the NYTimes.
In order to achieve a magical experience with the MyMagic+ system, Disney had to go to great lengths: 60,000 employees needed to be trained to use the system, and free WiFi had to be installed across the 40-square-mile park in Orlando. The free WiFi lets visitors use their smartphones more often while in the park, adding to the amount of data that will be collected. Analysts estimate that the entire program cost around $800 million.
In order to store, process, analyse and visualize all the data generated through the MyMagic+ system, Disney created a big data platform based on Hadoop, Cassandra and MongoDB, complemented by a suite of other tools for particular use cases, as mentioned by GigaOM. Disney moved from an RDBMS to its first Hadoop cluster in 2009, and on to a complete Data Management Platform in 2011.
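To make the shape of such a platform a little more concrete, here is a minimal, hypothetical sketch in Python of recording one MagicBand event in a document store like MongoDB; the database, collection and field names are illustrative assumptions, not Disney's actual schema.

    # Hypothetical sketch: recording one MagicBand "tap" event in MongoDB.
    # Database, collection and field names are assumptions for illustration.
    from datetime import datetime, timezone
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    events = client["park_analytics"]["magicband_events"]

    # One pseudonymous visitor interaction stored as a flexible JSON-style document.
    events.insert_one({
        "band_id": "band-0042",              # wristband identifier (made up)
        "event_type": "ride_entry",          # e.g. ride_entry, purchase, room_unlock
        "location": "ride_42_gate_3",
        "timestamp": datetime.now(timezone.utc),
    })

    # A simple aggregation over such events: taps per location (in-park traffic flow).
    for row in events.aggregate([
        {"$group": {"_id": "$location", "taps": {"$sum": 1}}},
        {"$sort": {"taps": -1}},
    ]):
        print(row["_id"], row["taps"])

A document store suits this kind of workload because each event type can carry different fields without a schema change.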
Today, all the collected data allows Disney to gain additional insights via, among other things, audience analysis and segmentation, a recommendation engine and analysis of in-park traffic flow. They did start small, however, and built the big data platform the way a startup builds a company: with a small and flexible team, failing fast and early but improving along the way. They started with open-source tools to keep costs down, but as the amount of data grew, those tools fell short and Disney opted for paid tools that cope more reliably with the volumes of data Disney processes.
The opportunities for Walt Disney and Big Data are enormous. They already see great results from the MyMagic+ system at Walt Disney World in Orlando, so we can expect them to expand the project to other parks around the world. The data they collect with the system is gigantic, and it will provide Disney with extremely valuable insights that they can use to give visitors an even more magical experience.
Picture: DisneyParks blogs
Original article

Data Visualization: Making Big Data Dance

Fifteen years ago, the presentation of data typically fell under the purview of analysts and IT professionals. Quarterly or annual meetings entailed rolling data up into now quaint diagrams, graphs, and charts.
My, how times have changed. Today, data is everywhere. We have entered the era of Big Data and, as I write in Too Big to Ignore, many things are changing.

Big Data: Enterprise Shifts

In the workplace, let’s focus on two major shifts. First, today it’s becoming incumbent upon just about every member of a team, group, department, and organization to effectively present data in a compelling manner. Hidden in the petabytes of structured and unstructured data are key consumer, employee, and organizational insights that, if unleashed, would invariably move the needle.
Second, data no longer needs to be presented on an occasional or periodic basis. Many employees are routinely looking at data of all types, a trend that will only intensify in the coming years.
The proliferation of effective data visualization tools like Ease.ly and Tableau provides tremendous opportunity. (The latter just went public with the übercool stock symbol $DATA.) Sadly, though, not enough employees—and, by extension, organizations—maximize the massive opportunity presented by data visualization. Of course, notable exceptions exist, but far too many professionals ignore DV tools. The result: they fail to present data in visually compelling ways. Far too many of us rely upon old standbys: bar charts, simple graphs, and the ubiquitous Excel spreadsheet. One of the biggest challenges to date with Big Data is getting more people to actually use the data, and the tools that make that data dance.
This raises the question: why the lack of adoption? I’d posit that two factors are at play here:
  • Lack of knowledge that such tools exist among end users.
  • Many end users who know of these tools are unwilling to use them.

Simon Says: Make the Data Dance

Big Data in and of itself guarantees nothing. Presenting findings to senior management should involve more than poring over thousands of records. Yes, the ability to drill down is essential. But starting with a compelling visual is a strong way to gain their attention.
Big Data is impossible to leverage with traditional tools (read: relational databases, SQL statements, Excel spreadsheets, and the like). Fortunately, increasingly powerful tools allow us to interpret and act upon previously unimaginable amounts of data. But we have to decide to use them.
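To show how low the barrier to entry has become, here is a minimal sketch using the open-source plotly library (not one of the tools named above); the CSV file and its column names are invented for the example.

    # Minimal sketch: an interactive chart instead of a static spreadsheet table.
    # "sales.csv" and its columns (region, revenue, margin, units) are hypothetical.
    import pandas as pd
    import plotly.express as px

    df = pd.read_csv("sales.csv")

    # Encode several dimensions at once: position, colour and point size.
    fig = px.scatter(
        df, x="revenue", y="margin",
        color="region", size="units",
        hover_name="region",
        title="Revenue vs. margin by region",
    )
    fig.show()  # opens an interactive, zoomable chart in the browser

A handful of lines like these already goes beyond the default bar chart, and the interactivity (hover, zoom, filter) is what lets an audience drill down on the spot.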

Original article

Seven Steps to Rejuvenate Your Marketing Database


A hard look at digital marketing programs shows an increase in marketing activity by businesses and a proportionate level of consumer disengagement. This happens for a variety of reasons. The consumer is barraged by messaging across channels, from your business and others. Other reasons include the consumer having real-time access to messaging through mobile, lots of social discussions, and the consumer’s ability to turn you off at any time.
Your messaging needs to engage. You need to make sure that you are communicating with relevance. Your marketing database is what will help you create this relevance. Here are seven key considerations to rejuvenate your marketing database.
1 - Create a Roadmap - You know where you are and what you can do today. Start by listing where you would like to be and start coming up with small steps in between. Use cases will go a long way in helping you come up with tactical steps.
2 - Focus on the Business Needs - Put forth a list of those who use your marketing database and ask them what information they need and how they want to use it. In the world of use cases, this means you are listing the actors (users) and asking them to list their use cases (tasks) describing how they would use the marketing database.
3 - Find the Gaps - You will find that you are missing information, have duplicate information, take too long to update information, or find it hard to locate the information you need. Additionally, information held in different places may not be compatible. You consequently need to highlight these gaps and start improving the quality of the data (or put it on the roadmap); a small data-quality sketch follows this list of steps.
4 - Design Your Reports - Do this first, as people will want to see the information in a particular format. Once you get people sold on the look and feel of the reports, you will find them much more amenable to changes that might be required later.
5 - Educate Your Users - Teach them how to use the database. I usually recommend a three-step approach. I ask the users to write down specific use cases for the information they need from the system. Next, I ask them to point it out in the reports (step 4). Once they have done that, I encourage them to click through to find the appropriate data. These three steps can then be captured as part of the user manual and should also be kept available for online support.
6 - Make the Database Accessible - If people can access your database to get the reports they want, they are more likely to use it. From a performance perspective, limit the number of power users (users who can run any query) so as not to degrade the performance of your system. Pre-run popular reports so that the information is available to those who need it. Also, work with the power users to add to the repository of reports, and keep evaluating (and removing) reports that are not used.
7 - Make Reports Mobile - Allow your key users to have access to summary reports on their mobile devices. This is power messaging at its best and it really helps to rejuvenate the utility of information and your marketing database.
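Picking up on step 3, here is a minimal sketch of an initial gap check in Python; the file name and the columns (email, segment, last_updated) are assumptions made for illustration, not features of any particular marketing database.

    # Quick data-quality check on a hypothetical marketing contact extract.
    import pandas as pd

    contacts = pd.read_csv("contacts.csv")   # assumed columns: email, segment, last_updated

    # Missing values per field: where is information incomplete?
    print(contacts.isna().sum())

    # Duplicate records keyed on email: where is information repeated?
    dupes = contacts[contacts.duplicated(subset="email", keep=False)]
    print(len(dupes), "rows share an email address with another row")

    # Staleness: how long since each record was last updated?
    contacts["last_updated"] = pd.to_datetime(contacts["last_updated"])
    age_days = (pd.Timestamp.now() - contacts["last_updated"]).dt.days
    print(age_days.describe())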
(image: database / shutterstock)

Original article

How Much Data Does the World Create Every Day?

I can still remember the first time I saw a 1 GB flash drive—it blew my mind. This device, the size of my thumb, could hold the information of 711 3.5-inch floppy disks. Manufacturers had become bona fide magicians, capable of shrinking data right before my eyes.
But it’s a darn good thing that data is occupying less space these days, since we’re seeing it increase astronomically in volume. The most critical challenge we face is transforming all this “big” data into “this-actually-makes-my-business-better” data. And when you’re dealing with enough data to reach the moon, that’s no easy task.


Original article

The Journey from Big Data to Big Promise

It’s well known that big data is usually described in terms of the three Vs: Volume, Variety and Velocity. The three Vs appropriately sum up the characteristics of big data and convey that big data is heterogeneous, noisy, dynamic, inter-related and not always trustworthy. Companies now strive to convert the three Vs into a Big Promise, and Big Data’s promise can be summarized by three new descriptive terms: Veracity, Value and Victory.
 1.      Three Vs of Big Promise - Veracity, Value and Victory
Just as the three Vs of big data are internally related, with volume resting on both variety and velocity, the three Vs of Big Promise also have an internal relationship. The Veracity mined from big data, built on volume and variety, determines the Value of big data, and that value determines the Victory when a business applies it appropriately and in a timely manner. The higher the veracity mined from raw data, the more valuable the result, the smarter the decisions a business can make, and the more successful the business will become. All of this leads to big Victory for the business.
While much around big data remains hype, many companies are in the fledgling stages of drawing value from their big data corpus, and amid an army of discussions and opinions around the topic, it is still hard to find a clear roadmap to the Big Promise.
 2.      The Journey from Big Data to Big Promise
Here I share my thoughts on that roadmap, taking a big-picture view of big data regardless of the type of business. Basically, there are three big steps:
 Step 1: Big Data Collection – Gathering Organic Material
Regardless of where you are in the journey, it has to start with understanding the nature of big data as defined by the three Vs, even though some voices add more dimensions to big data, such as value and veracity. I do not think those are characteristics of raw big data; instead, I treat them as two characteristics of the big data promise.
 Step 2: Big Data Analytics – Gleaning Big Insight
The core technologies are the big data platform and big data analytics. The big data platform provides the power of speedy processing, handling millions of records per second. It harnesses an integrated set of technologies for transforming organic/raw content into designed content, such as natural language processing (NLP), data cleansing, transformation (ETL) and filtering methods. The goal is to transform semi-structured or unstructured data into a structured format for easier understanding, analysis and visualization.
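As a small illustration of that transformation step, the sketch below turns a couple of free-text feedback lines into a structured table using plain Python and pandas; the input strings and the extracted fields are invented for the example.

    # Sketch: turning semi-structured text into structured, analysable rows.
    import re
    import pandas as pd

    raw_feedback = [
        "2023-05-01 | store:BERLIN | 'checkout was slow but staff were friendly'",
        "2023-05-02 | store:PARIS  | 'loved the new app, very fast'",
    ]

    rows = []
    for line in raw_feedback:
        date, store, text = [part.strip() for part in line.split("|")]
        rows.append({
            "date": date,
            "store": store.split(":", 1)[1],
            "text": text.strip("'"),
            "word_count": len(re.findall(r"\w+", text)),  # a trivial text feature
        })

    structured = pd.DataFrame(rows)
    print(structured)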
In the world of analytics, many different terms are used and referenced, such as text analytics, social media analytics, customer analytics, social network analytics, business analytics and sentiment analytics. Given some thought, however, analytics can basically be grouped into three functional categories: Descriptive Analytics, Relationship Analytics and Prescriptive Analytics. Each is explained in detail below.
 1.      Descriptive Analytics
Once organic data has been transformed into designed data in the data processing phase, the first kind of analytics is descriptive or exploratory. This phase uses simple statistics to get a general understanding of the data: data properties such as dimensions and field types, and statistical profiles or summaries such as the number of records, missing values, field maximum, minimum and median values, field value distributions, and so on. The exploratory analysis provides initial knowledge about the raw content without digging into deeper internal relationships, and it can suggest the right strategies for deeper analysis. This phase can be done on a randomly sampled dataset with simple tools such as an Excel sheet, and visualized with basic chart types such as bar charts, pie charts and scatter plots. The characteristics of descriptive analytics are:
  • Autonomy - the analysis is based on individual fields and their values; it is self-contained and independent of other fields, without considering any connections between different fields and contents.
  • Shallow and straightforward - the result of the analysis is usually shallow, basic statistics, such as word-count frequencies or the number and percentage of employees earning around 5k within a certain geographic area.
  • Simple and easy to understand - since the method is basic statistical profiling with no extra effort involved, the result is also simple and easy to understand and visualize.
With descriptive analytics, we can reach a general understanding of what happened. It is like a doctor first establishing what happened to a patient, the facts, before digging into why the patient got the disease.
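A minimal sketch of this exploratory pass, using pandas on a hypothetical sampled dataset (the file and column names are assumptions):

    # Descriptive / exploratory profiling of a randomly sampled dataset.
    import pandas as pd

    df = pd.read_csv("sample.csv")            # hypothetical extract, e.g. employee records

    print(df.shape)                           # dimensions: rows x columns
    print(df.dtypes)                          # field types
    print(df.isna().sum())                    # missing values per field
    print(df.describe())                      # min, max, mean, median (50%), quartiles

    # Value distribution for one categorical field (assumed to exist in the sample).
    print(df["region"].value_counts(normalize=True))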
  2.      Relationship Analytics
Analytics at this level aims to dig out the valuable insight embedded in big data. Compared with descriptive analytics, the analysis is deeper: to succeed at this level requires richer mining algorithms and methods, such as advanced statistics, sophisticated machine learning, interdisciplinary studies, and meta or scalable algorithms; the process involved is usually also complicated and demanding in both speed and volume.
I call analytics at this level relationship analytics because its primary goal is to find connections among data elements. The connection may be time-based, such as a sequential dependency; location-based; functional-category-based, such as the relationship between a product and customer purchasing patterns; or transaction-based, such as market basket analysis.
The methods used at this level of analysis include:
  • Inferential or association analysis draws insight from data through random processes developed with statistical methods. Inference depends on the right population and on random sampling. For example, the children of shorter-than-average parents tend to end up taller than their parents. In basket analysis, mining millions of transactions shows that some items have a higher probability of being bought together, such as coffee and coffee creamer. Some of these conclusions are easy to understand and match common sense; however, the high value comes from the conclusions that run against people’s common sense or wrong assumptions.
  • Model-based analysis uses a model built on known, observed data to infer or predict what will happen in the future. Two sub-categories are commonly known: classification and predictive modeling. When the target variable falls into discrete categories, the method is called classification; when it is a numerical or continuous variable, it is called a predictive method. Both methods need a training dataset that is well labeled and a test dataset drawn from the same population as the training dataset. The analysis has two phases: first a model is built on the training dataset, then it is evaluated or tested on the test dataset to measure its performance. Once the model is developed, it is used to predict future events or target variables from the independent variables. For example, a linear regression model can be built on the factors that affected sales in the last three months and used to predict next month’s sales, or a decision tree model can be built to predict whether a specific Twitter message is positive or negative. Sometimes classification and predictive methods overlap, depending on the business application.
  • Segmentation dynamically groups data into different clusters based on a predefined measurement such as a distance metric. Unlike classification and predictive methods, it needs no training or test data. For example, an algorithm can be used to dynamically group similar Twitter messages into different clusters; a minimal sketch of this follows the list.
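As a minimal sketch of that segmentation idea, the snippet below clusters a handful of short messages using TF-IDF features and k-means from scikit-learn; the messages and the choice of two clusters are arbitrary assumptions.

    # Sketch: dynamically grouping similar short messages into clusters.
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    messages = [
        "love the new phone, the battery lasts all day",
        "battery died after two hours, really disappointing",
        "the camera on this phone is excellent",
        "photos look terrible in low light, poor camera",
    ]

    features = TfidfVectorizer().fit_transform(messages)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

    for label, message in zip(labels, messages):
        print(label, message)

No labeled training data is needed; the algorithm groups the messages purely by how similar their word usage is.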
 3.      Prescriptive Analytics
Prescriptive analysis is essentially a business decision based on the conclusions or results drawn from relationship analysis: for a given situation, what is the best action to take so that we gain the expected result in the future? Suppose a patient goes to see a doctor. First the doctor performs descriptive analysis, the fact-finding phase, to understand what has happened to the patient and some related factors such as daily activities, workload and nutrition. Next, the doctor performs relationship analysis to find out which factors may have made the patient sick. Finally, the doctor gives the patient a prescription, such as medicines to take, so that the patient can get well.
 Step 3: Reap Big Promise
In order to fully empower a business with insight drawn from analytics, the veracity of the results has to be verified before they are deployed into a business application to generate value. The main measures used to evaluate the veracity of analytics results or models are precision, recall and accuracy, and we also need to consider the business cost, in dollars, of each kind of error. Basically, there are three phases in evaluating performance:
  1) Once the model or algorithm is developed, its performance is evaluated on a validation dataset drawn from the same population as the training data. If the result is not good enough, the model is redeveloped by adding more data, tuning parameters or exploring other methods.
  2) The model is evaluated against a test dataset drawn from a different source than the training data. This dataset is more representative of the real world at the point the model is developed, and the associated error cost should also be measured against the business objectives.
  3) The model is evaluated on an ongoing basis. Because the world changes so fast, new data keeps arriving and may differ considerably from the data used to develop the model. This phase should run on a regular schedule so that predictions do not drift too far from what is expected and harm the business. Once the model is found to no longer perform well enough, the process goes back to 2).
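A minimal sketch of that first evaluation phase, with invented labels and predictions, just to show how precision, recall, accuracy and a simple dollar cost of errors can be computed (using scikit-learn):

    # Sketch: measuring the veracity of a classifier on a held-out validation set.
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    # Hypothetical validation labels (1 = positive class) and model predictions.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))

    # Business cost of each error type, in assumed dollar figures.
    cost_fp, cost_fn = 5.0, 20.0
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    print("estimated error cost: $", fp * cost_fp + fn * cost_fn)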
The value derived from the veracity established in 2) also depends on how well the business takes full advantage of it: how many opportunities it finds to use the results to provide business intelligence to customers. Exploring the right business opportunities and defining the right objectives are the key factors for generating business value. If a company can generate higher revenue from them, victory will shine out brightly tomorrow.

Original Source : http://smartdatacollective.com/ling-zhang/123661/journey-big-data-big-promise