Saturday, January 31, 2015

big data: Microsoft "Cosmos"

Could Cosmos be Microsoft's next commercial big-data service? | ZDNet
"Microsoft is launching more and more internal and external services on Azure. Given CEO Satya Nadella's focus on creating a "data culture," it's not too surprising that big data services are a top priority. HDInsight, Microsoft's Hadoop on Azure service, was Microsoft's first commercially available big-data service. I'm wondering if Cosmos might be its next.

Currently, Cosmos is an internal-facing Microsoft service. It's Microsoft's massively parallel storage and computation service that handles data from Azure, Bing, AdCenter, MSN, Skype and Windows Live. According to a recent Microsoft job posting, there are 5,000 developers and "thousands" of users inside Microsoft using Cosmos. Cosmos was built using Microsoft's Dryad distributed-processing technology."

Report: Microsoft to revive Hadoop killer in cloud form | SiliconANGLE
Microsoft’s analytics framework has apparently evolved a great deal since the paper came out in 2011.

Cosmos, Big Data and Big Challenges, Pat Helland, July 2011

"Cosmos is composed of a distributed compute component (somewhat comparable to Hadoop's Map-Reduce), using the Microsoft Dryad solution, which (unlike Map-Reduce) allows an arbitrary DAG of computation. Cosmos supports a SQL-like syntax (similarly to HIVE/PIG) and includes a distributed storage component (comparable to HDFS). Overall, Cosmos provides highly scalable, reliable, fault-tolerant and automatically scaled compute operations on huge data sets."

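To make the "arbitrary DAG of computation" point concrete, here is a minimal Python sketch. It is not Cosmos or Dryad code: `run_dag`, `stages`, `deps` and the sample data are hypothetical names invented for illustration, and everything runs in a single process. It only shows the shape of a job where one input feeds two independent branches that are later joined, something a single map-then-reduce pass cannot express directly.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(stages, deps, source):
    """Run every stage after all of its upstream stages have finished.

    stages: stage name -> function taking a list of upstream outputs
    deps:   stage name -> list of upstream stage names
            (an empty list means the stage reads the source data)
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[d] for d in deps[name]] or [source]
        results[name] = stages[name](inputs)
    return results


# Hypothetical job: two independent branches over the same input, then a join.
words = ["cosmos", "dryad", "cosmos", "hadoop", "dryad", "cosmos"]

stages = {
    "count": lambda ins: {w: ins[0].count(w) for w in set(ins[0])},
    "length": lambda ins: {w: len(w) for w in set(ins[0])},
    "join": lambda ins: {w: (ins[0][w], ins[1][w]) for w in ins[0]},
}
deps = {"count": [], "length": [], "join": ["count", "length"]}

results = run_dag(stages, deps, words)
print(results["join"])
```

In a real distributed engine each stage would run as many parallel tasks across machines; the sketch keeps only the dependency ordering, which is the part the quote contrasts with Map-Reduce.
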

Note: there is another, unrelated "Cosmos": an open-source operating system written in C#.

There is a similar service from Google:
Google Cloud Dataflow — Google Cloud Platform
