Tuesday, October 13, 2015

big data tool: Kudu

Resolving transactional access and analytic performance trade-offs - O'Reilly Radar
"HDFS is terrific at scans, can’t do random access at all. The idea with Kudu is, we’re building a data store that [is] pretty darn good at both. … If you’re 70-80% of the way there on both axis, then the convenience you get out of having a single system, for most people, will win out because engineering time is expensive and computers are cheap.

… In the IoT use case, you’re probably less interested in updates, but one thing that is popular is random access in that workload. You may have a bunch of time series … You do some big analytics to do some modeling..."
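
To make the two access patterns from the quote concrete, here is a toy Python sketch of a time-series store that handles both a random-access lookup and a range scan. This is not Kudu's API or design; the class `ToySeriesStore` and its methods are made up purely for illustration, and everything lives in memory.

```python
import bisect


class ToySeriesStore:
    """Hypothetical in-memory store: rows are kept sorted by timestamp."""

    def __init__(self):
        self._keys = []    # sorted timestamps
        self._values = []  # values aligned with self._keys

    def upsert(self, ts, value):
        """Insert a new row, or overwrite the row with the same timestamp."""
        i = bisect.bisect_left(self._keys, ts)
        if i < len(self._keys) and self._keys[i] == ts:
            self._values[i] = value
        else:
            self._keys.insert(i, ts)
            self._values.insert(i, value)

    def get(self, ts):
        """Random access: binary search for a single timestamp."""
        i = bisect.bisect_left(self._keys, ts)
        if i < len(self._keys) and self._keys[i] == ts:
            return self._values[i]
        return None

    def scan(self, start, end):
        """Scan: return all rows with start <= timestamp < end, in order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


store = ToySeriesStore()
for ts, value in [(3, 30), (1, 10), (2, 20)]:
    store.upsert(ts, value)
store.upsert(2, 25)            # update an existing row in place

point = store.get(2)           # single-row lookup
window = store.scan(1, 3)      # range scan over a time window
```

Keeping rows sorted by key is what lets one structure serve both a cheap single-row lookup and an ordered scan, which is the kind of convenience the quote argues for.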



Kudu - Fast Analytics on Fast Data
"A new addition to the open source Apache Hadoop ecosystem, Kudu completes Hadoop's storage layer to enable fast analytics on fast data.
Currently, a limited-functionality version of Kudu is available as a Beta."

Kudu is written in C++, not Java, to get maximum speed and avoid garbage collection (GC) issues.

