Monthly Archives: October 2018

Recommended: 1.1 Billion Taxi Rides with Spark 2.2 & 3 Raspberry Pi 3 Model Bs

Mark Litwintschik has taken a large open source data set (1.1 billion taxi rides with data storage on the order of hundreds of gigabytes) and ran some benchmark queries on a variety of different systems. Perhaps the most humble of these systems is a cluster of three Raspberry Pi computers. This webpage talks about how he set up the software on this cluster. Continue reading