Tuesday, October 13, 2015

U-SQL language for Azure Data Lake

Introducing U-SQL – A Language that makes Big Data Processing Easy - The Visual Studio Blog - MSDN Blogs
"Microsoft announced the new Azure Data Lake services for analytics in the cloud that includes a hyper-scale repository, a new analytics service built on YARN that allows data developers and data scientists to analyze all data, and HDInsight, a fully managed Hadoop, Spark, Storm and HBase service. Azure Data Lake Analytics includes U-SQL, a language that unifies the benefits of SQL with the expressive power of your own code. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and across relational stores such as Azure SQL Database."
...
"U-SQL is built on the learnings from Microsoft’s internal experience with SCOPE and existing languages such as T-SQL, ANSI SQL, and Hive."

@t = EXTRACT date string
           , time string
           , author string
           , tweet string
     FROM "/input/MyTwitterHistory.csv"
     USING Extractors.Csv();

@res = SELECT author
            , COUNT(*) AS tweetcount
       FROM @t
       GROUP BY author;

OUTPUT @res TO "/output/MyTwitterAnalysis.csv"
ORDER BY tweetcount DESC
USING Outputters.Csv();
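
For readers without an Azure Data Lake account, the aggregation the script performs can be approximated locally. The following is only a rough Python sketch of the same logic, not how U-SQL executes it: the function name `count_tweets_by_author`, the sample rows, and the tie-break by author name are all made up for illustration, and the column order is assumed from the EXTRACT clause above.

```python
import csv
import io
from collections import Counter


def count_tweets_by_author(csv_file):
    """Return (author, tweetcount) pairs, highest count first."""
    # Assumed column order, taken from the EXTRACT clause:
    # date, time, author, tweet (no header row).
    reader = csv.reader(csv_file)
    counts = Counter(row[2] for row in reader if row)
    # Mirrors ORDER BY tweetcount DESC; ties are broken by author
    # name here only to keep the output deterministic (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample rows standing in for /input/MyTwitterHistory.csv.
sample = io.StringIO(
    "2015-10-01,09:00,alice,hello world\n"
    "2015-10-01,09:05,bob,trying U-SQL\n"
    "2015-10-02,10:30,alice,another tweet\n"
)

result = count_tweets_by_author(sample)
for author, tweetcount in result:
    print(author, tweetcount)
```

The three stages map loosely onto the script: reading rows corresponds to EXTRACT, the `Counter` to the GROUP BY with COUNT(*), and the sort to the ORDER BY on OUTPUT.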
