Stamford, CT-based analyst and market research firm Gartner released its annual data warehouse Magic Quadrant report Monday.
On the one hand, data warehousing (DW) and Big Data can be seen as
different worlds. But there's an encroachment of SQL in the Hadoop
world, and Massively Parallel Processing (MPP) data warehouse appliances
can now take on serious Big Data workloads. Add to that the number of
DW products that can integrate with Hadoop, and it's getting harder and
harder to talk about DW without discussing Big Data as well. So, the release of the Gartner data warehouse report is germane to the Big Data scene overall, and some analysis of it here seems sensible.

The horse race

First,
allow me to answer the burning question: who "won?" Or put another way,
which vendor had, in Gartner's inimitable vernacular, the greatest
"ability to execute" and "completeness of vision?" The answer: Teradata.
Simply put, the company's 3-decade history; the great number of
industry verticals with which it has experience; the number and
diversity of its customers (in terms of revenue and geography); and the
contribution of the Aster Data acquisition to product diversity really impressed Gartner.

But Teradata came out on top last
year as well, and its price points mean it's not the DW solution for
everyone (in fact, Gartner mentions cost as a concern overall for
Teradata). So it's important to consider what else the report had to
say. I won't rehash the report
itself, as you can read it for yourself, but I
will endeavor to point out some overall trends in the report, along
with the market trends it identifies.

Logical data warehouse

If there is any megatrend
in the DW Magic Quadrant (MQ) report, it's the emergence of the logical
data warehouse. Essentially, this concept refers to the federation of
various physical DW assets into one logical whole, but there are a few
distinct vectors here. Logical data warehouse functionality can refer
to load balancing, dispersed data (wherein different data sets are stored in
disparate physical data warehouses and data marts, but are bundled into a
logically unified virtual DW), and multiple workloads (where
relational/structured, NoSQL/semi-structured and unstructured data are
integrated logically). This multiple workload vector is a
Big Data integration point too, with 10 of the 14 vendors in the report
offering Hadoop connectors for their DW products.

In-memory is hot

In-memory technology, be it
column store-based, row store-based, or both, and whether used
exclusively or in a hybrid configuration with disk-based storage, is
prevalent in the DW space now. Gartner sees this as a competitive
necessity, and gives IBM demerits for being behind the in-memory curve.
On the other hand, it refers three times to the "hype" surrounding
in-memory technology, and generally attributes the hype to SAP's
marketing of HANA. Meanwhile,
Gartner notes that HANA's customer base doubled from about 500 customers
at the end of June 2012 to 1,000 at the end of the year.

Support for R

Support for the open source R
programming language seems to be accelerating in mainstream DW
acceptance and recognition. Support for the language, used for
statistics and analytics applications, is provided by 2013 DW MQ vendors
Exasol, Oracle and SAP. Oracle offers a data connector for R, whereas
Exasol and SAP integrate R into their programming and query frameworks.
I think it's likely we'll see adoption of R gain even more momentum in 2013, in the DW, Business Intelligence and Hadoop arenas.

Several players with customer counts at 300 or fewer

Not
everything in the Gartner DW MQ report focuses on big, mainstream
forces. Alongside mega-vendors like IBM, Oracle, SAP and Microsoft, or
veteran DW-focused vendors like Teradata, the report includes several
vendors with relatively small customer counts. The report says that 1010Data has "over 250" customers and Infobright "claims to have 300 customers." And those numbers are on the high side of small, with Actian (formerly Ingres) weighing in at "over 65" customers, ParAccel claiming "over 60," Calpont at "about 50 named customers" and the report explaining that Exasol "reports 38 customers in production and expects to have 50 customers by January 2013."

I'm not saying this to be snarky, but this is an important
reality check. Many of us in the press/blogger/analyst community,
myself included, sometimes assign big-vendor gravitas to companies that
actually have very few customers. Sometimes the tail wags the dog in
this startup-laden industry, and readers should be aware of this.
That said, while ParAccel only claims "over 60" customers, one of its
investors is Amazon, which licensed ParAccel's technology for its new Redshift cloud-based data warehouse service.

Multiple "form factors"

Another trend
pointed out by Gartner is the variety of deployment/procurement
configurations (or -- to use Gartner's term -- "form factors") that DW
products are available in. The options offered by vendors include straight software licenses, reference architectures, appliances, Platform as a Service (PaaS) cloud offerings, and full-blown managed
services, where vendors provision, monitor and administer the DW
infrastructure. And, in the case of non-cloud options, vendors may base
their pricing on number of servers, processor cores or units of data
(typically terabytes). Sometimes they even let customers decide which
model works best. Many vendors offer several of these form factor and licensing
options, and Gartner implies that the more such options a vendor offers,
the better. Those that offer only one option may disqualify themselves
from consideration by customers. Those that offer several, and
especially those that allow customers the agility to move between
deployment and pricing models, tend to score higher in customer
satisfaction.

Data models

Speaking of models, Gartner
makes special mention that HP and Oracle offer industry-specific DW data
models and that Microsoft, through certain partners, does as well.
Gartner sees this as an important feature in vendors' data warehouse
offerings. I would agree: data models can quickly convey best
practices and serve, at the very least, as useful points of departure
for accelerating DW implementations.

HCatalog for metadata management

HCatalog, originally introduced by Yahoo/Hortonworks
and now an Apache incubator project in its own right, acts as a
metadata repository designed to unify storage and data management for
Hadoop stack components like Hive, Pig and
the Hadoop MapReduce engine itself. On the DW side of the
world, ParAccel and Teradata are each integrating with HCatalog as a way
to integrate Hadoop data into the DW design, rather than merely
connecting to and importing that data. This would seem to indicate good
traction for HCatalog, and perhaps we will see such support spread more
widely next year.

Microsoft on the upswing

I
think it's important to point out Gartner's coverage of Microsoft in
this year's DW MQ report. Microsoft was in the Leaders Quadrant last
year, but at its very lower-left corner, whereas this year it's smack in
the center of that quadrant. Last year, the Redmond-based software
giant led with its Fast Track data warehouse, based on its SQL Server Enterprise product. Its MPP data warehouse appliance, SQL Server Parallel Data Warehouse (PDW), had little momentum and few customers. I once served on Microsoft's Business Intelligence Partner
Advisory Council, and was initially unimpressed with the PDW product.
It struck me at the time as a product created to give Microsoft
credibility in the Enterprise-grade database game and provide peace of
mind for customers, and less of a product that was actually designed to
generate significant unit sales.

But things have turned around. A year later, the product is up
to its third "appliance update" (and much better aligned with non-PDW
editions of SQL Server) and a bona fide version 2.0 of the product is
due later this year. Gartner says PDW has been adopted by 100 new
customers over the last 18 months, and that adoption is likely to accelerate further
as Dell's PDW-based appliance gains momentum. Gartner also cites the xVelocity in-memory technology, present in PowerPivot, the 2012 release of SQL Server Enterprise, and the tabular mode of SQL Server Analysis Services, as an important advance for the company, and even mentions StreamInsight, Microsoft's little-known Complex Event Processing (CEP) engine.
The next version of PDW will include the new PolyBase component,
which integrates PDW's MPP engine with data nodes in the Hadoop
Distributed File System (HDFS) to provide true parallelized, non-batch,
SQL query capability over Hadoop data.
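
To make the idea of a parallelized query over distributed data a little more concrete, here is a minimal sketch in Python. It is not PolyBase and calls no Microsoft or Hadoop API; the in-memory `PARTITIONS` list and the functions `scan_partition` and `parallel_sum_by_region` are hypothetical names of my own, standing in for data blocks on separate nodes and for an engine that scans them concurrently before merging the partial results, roughly what a `SELECT region, SUM(amount) ... GROUP BY region` would need.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for data blocks spread across storage nodes.
# Each row is a (region, amount) pair.
PARTITIONS = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 1), ("north", 4)],
]


def scan_partition(rows):
    """Compute a partial SUM(amount) GROUP BY region for one partition."""
    partial = Counter()
    for region, amount in rows:
        partial[region] += amount
    return partial


def parallel_sum_by_region(partitions):
    """Scan all partitions concurrently, then merge the partial sums."""
    workers = max(1, len(partitions))  # the executor needs at least one worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(scan_partition, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return dict(total)


if __name__ == "__main__":
    print(parallel_sum_by_region(PARTITIONS))
```

The point of the sketch is only the shape of the work: each partition is aggregated where it lives, and just the small partial results are combined, rather than importing all the raw data into the warehouse first.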
And the next major version of SQL Server Enterprise will include an in-memory transactional database engine, code-named Hekaton. Add to that the ability to license SQL Server outright, obtain DW reference architectures for it, buy various SQL Server-based appliances,
and use SQL Server in the Amazon and Microsoft clouds (in
Infrastructure as a Service or PaaS configurations) and the product's
trajectory would seem to be upward.

What's it all mean?

No matter what you may think
of the merits of Gartner's influence in the technology market, there's
no denying that influence exists. The DW MQ report is extremely
important and seems especially methodical, well thought out, and
insightful this year. Analysts Mark A. Beyer, Donald Feinberg, Roxane Edjlali and Merv Adrian have produced a report that everyone in the field should read.

Original article: http://www.zdnet.com/gartner-releases-2013-data-warehouse-magic-quadrant-7000010796/