The ICA staff applied for and obtained an ECSS-supported allocation through XSEDE (Charge No. TG-HUM130001) and, with the help of HPC experts at TACC, developed familiarity with the HPC environment. Together, they developed a metadata extraction workflow that data curators can use on HPC resources with minimal Linux training. The workflow is interactive, so data curators can run and verify its individual steps as needed. Developing the workflow involved: (1) porting the software to an open science HPC platform; (2) transferring and synchronizing terabytes of data over the network to HPC resources at TACC for analysis and long-term storage; and (3) developing scripts that run multiple copies of the software simultaneously on different portions of the collection, an approach known as parallelization. In this paper, we present an overview of the test collection, provide an introduction to HPC, and discuss the methods used in parallelizing the metadata extraction.
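To make the idea of parallelization concrete, the following is a minimal sketch in Python, not the scripts developed for this project. It splits a list of file paths into portions and runs one copy of an extraction routine on each portion at the same time. The names `partition`, `extract_metadata`, and `run_parallel` are hypothetical, the extraction routine is a placeholder that records only the file extension, and a local thread pool stands in for the separate tasks that would normally be launched on HPC compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def partition(items: Sequence[str], n_parts: int) -> List[List[str]]:
    """Split items into n_parts round-robin portions of near-equal size."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    return [list(items[i::n_parts]) for i in range(n_parts)]


def extract_metadata(paths: List[str]) -> List[Dict[str, str]]:
    """Hypothetical stand-in for one copy of the extraction software.

    It only records each path and its file extension (empty if none).
    """
    records = []
    for path in paths:
        extension = path.rsplit(".", 1)[-1] if "." in path else ""
        records.append({"path": path, "extension": extension})
    return records


def run_parallel(
    items: Sequence[str],
    n_workers: int,
    worker: Callable[[List[str]], List[Dict[str, str]]] = extract_metadata,
) -> List[Dict[str, str]]:
    """Run one worker per portion of the collection and merge the results."""
    portions = partition(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each portion is processed independently of the others.
        per_portion = list(pool.map(worker, portions))
    # Flatten the per-portion results into a single list of records.
    return [record for portion in per_portion for record in portion]


if __name__ == "__main__":
    files = ["a.txt", "b.pdf", "c.doc", "d.txt", "e"]
    for record in run_parallel(files, n_workers=2):
        print(record)
```

Because each portion is processed independently, the same pattern carries over when the portions are handed to separate processes or batch jobs rather than threads; only the mechanism for launching the copies changes.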