Sunday, March 31, 2024

performance: One Billion Row Challenge

Strange thing: the data is stored as text where a binary format would be much more appropriate...
Anyway, an interesting coding challenge.


The One Billion Row Challenge (1BRC) is intended to be a fun exploration of how far modern Java can be pushed for aggregating one billion rows from a text file.

What is the 1BRC challenge?

Input: A text file containing temperature values for a range of weather stations. Each row is one measurement in the format <string: station name>;<float: measurement>.

Output: For each unique station, find the minimum, average and maximum temperature recorded and emit the final result on STDOUT, sorted alphabetically by station name, in the format {<station name>:<min>/<average>/<max>;<station name>:<min>/<average>/<max>}.


This article describes nine solutions in Go, each faster than the previous. The first, a simple and idiomatic solution, runs in 1 minute 45 seconds on my machine, while the last one runs in 3.4 seconds.

