![](https://i2.wp.com/softwareengineeringdaily.com/wp-content/uploads/2017/02/go-logo.jpg?resize=405%2C226&ssl=1)
GitHub - pachyderm/pachyderm: Reproducible Data Science at Scale!
"Pachyderm is a tool for production data pipelines. If you need to chain together data scraping, ingestion, cleaning, munging, wrangling, processing, modeling, and analysis in a sane way, then Pachyderm is for you. If you have an existing set of scripts which do this in an ad-hoc fashion and you're looking for a way to "productionize" them, Pachyderm can make this easy for you."