Top 5 storage vendors show massive shift to the cloud

There’s a changing of the guard afoot in the storage industry, and it’s getting cloudy.

Each quarter, 451 Research Group surveys its members in its Voice of the Enterprise series. Late last year, the company’s research revealed a dramatic reshaping of the storage market, both in terms of which vendors enterprises consider strategic storage partners and where their future storage will be housed.



Cloudera is building a new open-source storage engine called Kudu, sources say

EXCLUSIVE:

Big data company Cloudera is preparing to launch major new open-source software for storing and serving lots of different kinds of unstructured data, with an eye toward challenging heavyweights in the database business, VentureBeat has learned.

The storage engine, Kudu, is meant as an alternative to the widely used Hadoop Distributed File System and the Hadoop-oriented HBase NoSQL database, borrowing characteristics from both, according to a copy of a slide deck on Kudu’s design goals that VentureBeat has obtained. The technology will be released as Apache-licensed open-source software, the slides show.

Cloudera has had one of its early employees leading a small team to work on Kudu for the past two years, and the company has begun pitching the software to customers before an open-source release at the end of this month, a source familiar with the matter told VentureBeat.


That source and others believe Kudu could present a new threat to data warehouses from Teradata and IBM’s PureData (formerly Netezza), and other vendors. It may also be used as a highly scalable in-memory database that can handle massively parallel processing (MPP) workloads, not unlike HP’s Vertica and VoltDB, the sources say. And one day Kudu — which works across multiple data centers with RAM and fast solid-state drives (SSDs) — could even play a part in backup and disaster recovery.

Cloudera declined to comment.

However Cloudera chooses to market Kudu, it’s clear that the software is a big step forward for the company, not only in the company’s efforts to outdo other Hadoop vendors, but also in its quest to become a prominent player in enterprise software.

Not that Cloudera is a nobody. It’s worth almost $5 billion, according to one recent estimate; it has considerable backing from Intel; and it’s been positioning itself as a competitor to much larger database companies, like IBM and Oracle. But the fact is, fellow Hadoop vendor Hortonworks has gained credibility since it went public last year, and Hadoop company MapR is still around, too.

Cloudera recently doubled down on the rising Apache Spark open-source big data processing framework, but Spark is something Cloudera has been working on for years. And a few months ago, Cloudera brought new Python capability to Hadoop, following its acquisition of DataPad last year. Those are important efforts, but Kudu is something entirely new, something that can give the company freshness as it grows toward an initial public offering.

So what is Kudu, then?

It’s “nearly as fast as raw HDFS for scans” and, at the same time, “nearly as fast as HBase for random access,” according to one slide from a presentation on Kudu’s design goals. But Kudu is not meant to be a drop-in substitute for HDFS or HBase. “There are still places where these systems will be optimal, and Cloudera will continue to support and invest in them,” a slide said.

Kudu could be used for time-series data, or real-time reporting, or model building, according to another slide.

And it’s important to note that Kudu isn’t a SQL query engine for pulling up specific data. Cloudera has Impala for that, and others have Hive for that. Kudu has an “early integration” with Impala, and Spark support is coming, according to a slide.

The Kudu application programming interface (API) works with Java — the common language of Hadoop — as well as C++. Kudu’s architecture allows for operation across sites, according to one slide. That makes it comparable to Google’s Spanner and the Spanner-inspired CockroachDB. That could make Kudu a great choice for big companies looking to store their big data around the world.

Is Kudu well adopted, though? No, not yet.

“Looking for beta customers,” a slide said.





Amazon Web Services to offer new hierarchical storage options after customer feedback

Amazon Web Services (AWS) is adding a new storage class to speed up the retrieval of frequently accessed information.

The announcement was made by AWS chief evangelist Jeff Barr on his company blog. Customer feedback prompted AWS to analyse usage patterns, Barr said. AWS’s analytical team discovered that many customers store rarely read backup and log files, which compete for resources with shared documents or raw data that need immediate analysis. Most users work with their files frequently shortly after uploading them, after which activity drops off significantly with age. Information that’s important but not immediately urgent needs to be addressed through a new storage model, said Barr.

In response, AWS has unveiled a new S3 Standard, within which there is a hierarchy of pricing options based on the frequency of access. Customers now have the choice of three S3 storage classes: Standard, Standard – IA (infrequent access) and Glacier. All still offer the same level of 99.999999999 per cent durability. The Standard – IA class has a service level agreement (SLA) of 99 per cent availability and is priced accordingly. Prices start at $0.0125 per gigabyte per month, with a 30-day minimum storage duration for billing and a $0.01 per gigabyte charge for retrieval. The usual data transfer and request charges apply.

For billing purposes, objects that are smaller than 128 kilobytes are charged for 128 kilobytes of storage. AWS says this new pricing model will make its storage class more economical for long-term storage, backups and disaster recovery.
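The billing rules quoted above can be sketched as a small calculation. This is a rough illustration rather than AWS's actual billing logic: only the four prices and minimums come from the article, while the function names, the 30-day month, the day-based proration and the binary kilobyte-to-gigabyte conversion are assumptions made for the example. Data transfer and request charges are left out.

```python
# Figures quoted in the article (US dollars).
IA_STORAGE_PER_GB_MONTH = 0.0125   # Standard - IA storage price
IA_RETRIEVAL_PER_GB = 0.01         # retrieval charge
MIN_BILLABLE_OBJECT_KB = 128       # smaller objects are billed as 128 KB
MIN_STORAGE_DAYS = 30              # minimum storage duration for billing

# Assumptions for this sketch only: binary units and a 30-day month.
KB_PER_GB = 1024 * 1024
DAYS_PER_MONTH = 30


def billable_gb(object_sizes_kb):
    """Total billable size in GB, rounding each small object up to 128 KB."""
    total_kb = sum(max(size, MIN_BILLABLE_OBJECT_KB) for size in object_sizes_kb)
    return total_kb / KB_PER_GB


def ia_charge(object_sizes_kb, days_stored, retrieved_gb=0.0):
    """Estimated Standard - IA charge: storage plus retrieval."""
    # Objects removed early are still billed for the minimum duration.
    billed_days = max(days_stored, MIN_STORAGE_DAYS)
    storage = (billable_gb(object_sizes_kb) * IA_STORAGE_PER_GB_MONTH
               * billed_days / DAYS_PER_MONTH)
    retrieval = retrieved_gb * IA_RETRIEVAL_PER_GB
    return storage + retrieval


if __name__ == "__main__":
    hundred_objects_of_1gb = [KB_PER_GB] * 100
    print(ia_charge(hundred_objects_of_1gb, days_stored=10, retrieved_gb=50))
```

Under these assumptions, deleting data after 10 days costs the same in storage as keeping it for 30, which is why the class suits data that is kept for a long time and read rarely.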

AWS has also introduced a lifecycle policy option, in a system that emulates the hierarchical storage model of centralised computing. Users can now create policies that automate the movement of data between Amazon S3 storage classes over time. Typically, according to Barr, customers will move data uploaded to the Standard storage class to the Standard – IA class when it’s 30 days old, and on to the Amazon Glacier class after another 60 days, where data storage will cost $0.01 per gigabyte per month.
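The typical policy Barr describes amounts to a mapping from an object's age to a storage class. The sketch below models only that mapping. It does not call any AWS API, and the class labels, the `LIFECYCLE_STEPS` table and the function name are illustrative choices for this example.

```python
# Age thresholds from the article: move to Standard - IA at 30 days,
# then to Glacier after another 60 days (90 days after upload).
LIFECYCLE_STEPS = [
    (30, "STANDARD_IA"),
    (30 + 60, "GLACIER"),
]


def storage_class_for_age(age_days, steps=LIFECYCLE_STEPS):
    """Return the class an object of the given age would be in.

    Steps must be sorted by ascending age; the last threshold
    that has been reached wins.
    """
    current = "STANDARD"  # objects start in the Standard class
    for min_age_days, storage_class in steps:
        if age_days >= min_age_days:
            current = storage_class
    return current


if __name__ == "__main__":
    for age in (0, 29, 30, 89, 90, 365):
        print(age, storage_class_for_age(age))
```

Keeping the thresholds in a table rather than hard-coding them in the function makes it easy to try other schedules, since the 30 and 60 day figures are described as typical rather than fixed.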



The cloud is commoditising storage for enterprises – report

Little-known unbranded manufacturers are making inroads into the storage market as the cloud commoditises the storage industry, according to a new report by market researcher IDC. Meanwhile, the market for traditional external storage systems is shrinking, it warns.

Big cloud companies like Google and Facebook are now much more likely to buy storage for their data centres from smaller, lesser-known vendors, as they are no longer compelled to commit themselves to specialised storage platforms, said IDC in its latest Enterprise Storage report.

Revenue for original design manufacturers (ODMs) that sell directly to hyperscale data centre operators grew 25.8 per cent in the second quarter of 2015, in a period when overall industry revenue rose just 2.1 per cent. These direct data centre purchases accounted for US$1 billion in the second quarter, while overall industry revenue is still larger, for now, at $8.8 billion. The growth trends nonetheless indicate that a shift in buying power will take place, according to IDC analyst Eric Sheppard. Increasingly, the platform of choice for storage is a standard x86 server dedicated to storing data, said Sheppard.

ODMs such as Quanta Computer and Wistron are becoming increasingly influential, said Sheppard. Like many low-profile vendors based in Taiwan, they provide hardware that is sold under the badges of better-known brand names. Sales of server-based storage rose 10 per cent in the second quarter to reach $2.1 billion.

Traditional external systems like SANs (storage area networks) are still the bulk of the enterprise storage business, worth $5.7 billion in revenue for the quarter. But sales in this segment are declining, down 3.9 per cent in that period.
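To put the quarterly figures in proportion, the short calculation below expresses each quoted segment as a share of the $8.8 billion total. The dollar amounts are those reported above. Treating the three segments as a complete breakdown of the total is an assumption: they add up to $8.8 billion, but the coverage does not say explicitly that they partition the market. The segment labels and function name are illustrative.

```python
# Second-quarter 2015 revenue in billions of US dollars, as quoted above.
TOTAL_BN = 8.8
SEGMENTS_BN = {
    "ODM direct to hyperscale": 1.0,
    "server-based storage": 2.1,
    "traditional external systems": 5.7,
}


def share_of_total(segment_bn, total_bn=TOTAL_BN):
    """Segment revenue as a percentage of overall revenue."""
    return 100 * segment_bn / total_bn


# Percentage share per segment, rounded to one decimal place.
shares = {name: round(share_of_total(value), 1)
          for name, value in SEGMENTS_BN.items()}

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct} per cent")
```

On these numbers, direct ODM sales are still roughly a ninth of the market, while traditional external systems account for nearly two-thirds, which fits the report's framing of a shift that is under way rather than complete.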

With the cloud transferring the burden of processing to data centres, the biggest purchasers of storage are now Internet giants and cloud service providers. Typically their hyperscale data centres are software-controlled and no longer need the more expensive proprietary systems that individual companies were persuaded to buy, according to the report. Generic, unbranded hardware is sufficient, provided that it is software-defined, the report said.

“The software, not the hardware, defines the storage architecture,” said Sheppard. The cloud has made it possible to define the management of storage in more detail, so that resources can be matched more evenly to each virtual machine. This has cut long-term operating costs. These changes will intensify in the next five years, the analyst predicted.

EMC remained the biggest vendor by revenue with just over 19 per cent of the market, followed by Hewlett-Packard with just over 16 per cent.

