Data is read from these log files using Ptail, an internally built tool to aggregate data from multiple Scribe stores. It tails the log files and pulls data out. Ptail data is separated out into three streams so they can eventually be sent to their own clusters in different data centers Puma is used to manage periods of high data flow (Input/Output or IO). Data is processed in batches to lessen the number of times needed to read and write under high demand periods