Pre-processing: Lemmatization and string sanitizing. In this step, the tags associated with a given artwork are filtered to eliminate flaws such as spelling mistakes and badly accented characters. The sanitized tags are then converted into lemmas by a lemmatization algorithm that builds upon Morph-It!, a corpus-based morphological resource for the Italian language.
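
The text does not spell out the sanitizing rules or the lemmatization algorithm, so the following is only a minimal sketch of how such a step might look. It assumes that the morphological resource can be read as a lookup table from inflected forms to lemmas. The small `FORM_TO_LEMMA` table, the `ACCENT_FIXES` rules, the `sanitize` and `lemmatize` helpers, and the similarity cutoff are all hypothetical names and values chosen for illustration; fuzzy matching with `difflib` stands in for whatever spelling correction is actually used.

```python
import difflib
import unicodedata

# Hypothetical excerpt of a form -> lemma table. In practice this would be
# loaded from the morphological resource; here it is inlined for illustration.
FORM_TO_LEMMA = {
    "dipinti": "dipinto",
    "dipinto": "dipinto",
    "città": "città",
    "angeli": "angelo",
    "angelo": "angelo",
    "madonne": "madonna",
    "madonna": "madonna",
}

# Assumption: a trailing apostrophe is sometimes typed in place of an accent.
# This mapping is deliberately simplistic (it always picks the grave accent).
ACCENT_FIXES = {"a'": "à", "e'": "è", "i'": "ì", "o'": "ò", "u'": "ù"}


def sanitize(tag):
    """Normalize Unicode, case, and whitespace, and repair typed accents."""
    text = unicodedata.normalize("NFC", tag).strip().lower()
    text = " ".join(text.split())   # collapse internal runs of whitespace
    text = text.replace("`", "'")   # treat a backtick like an apostrophe
    if text[-2:] in ACCENT_FIXES:
        text = text[:-2] + ACCENT_FIXES[text[-2:]]
    return text


def lemmatize(tag, table=FORM_TO_LEMMA, cutoff=0.8):
    """Return the lemma of a tag, tolerating small spelling mistakes."""
    form = sanitize(tag)
    if form in table:
        return table[form]
    # Fall back to the closest known form, if one is similar enough.
    close = difflib.get_close_matches(form, list(table), n=1, cutoff=cutoff)
    return table[close[0]] if close else form


if __name__ == "__main__":
    for raw in ["  Dipinti ", "citta'", "angelli", "xyz"]:
        print(repr(raw), "->", lemmatize(raw))
```

Tags that match no known form, even approximately, are returned in their sanitized form rather than discarded; whether such tags should be kept or dropped is a design choice that the description above leaves open.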