Data
Ground Truth Data Collection
We employ crowd sourcing to collect labels we take as ground truth data on the presence of MDD. Crowd sourcing is an efficient mechanism to gain access to behavioral data from a diverse population, is less time consuming, and is inexpensive (Snow et al., 2008). Using Amazon’s Mechanical Turk interface, we designed human intelligence tasks (HITs) wherein crowd workers were asked to take a standardized clinical depression survey, followed by several questions on their depression history and demographics. The crowd workers could also opt in to share their Twitter usernames if they had a public Twitter profile, with an agreement that their data could be mined and analyzed anonymously using a computer program. We sought responses from crowd workers who were located in the Unit-ed States, and had an approval rating on Amazon Mechanical Turk (AMT) greater than or equal to 90%. Each crowd worker was restricted to take the HIT exactly once, and was paid 90 cents for completing the task.