Text analytics helps analysts extract meanings, patterns, and structure hidden in unstructured textual data. The
information age has led to the development of a wide variety of tools and infrastructure to capture and store
massive amounts of textual data. In a 2009 report, the International Data Corporation (IDC) estimated that
approximately 80% percent of the data in an organization is text based. It is not practical for any individual (or
group of individuals) to process huge textual data and extract meanings, sentiments, or patterns out of the data.
A paper written by Hans Peter Luhn, titled “The Automatic Creation of Literature Abstracts,” is perhaps one of
the earliest research projects conducted on text analytics. Luhn writes about applying machine methods to
automatically generate an abstract for a document. In the traditional sense, the term “text mining” refers to
automated machine learning and statistical methods built on a bag-of-words approach. This approach is
typically used to examine collections of documents rather than to assess individual documents. Over time, the term “text
analytics” has evolved to encompass a loosely integrated framework by borrowing techniques from data
mining, machine learning, natural language processing (NLP), information retrieval (IR), and knowledge
management.
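To make the bag-of-words idea concrete, here is a minimal sketch (not from the original text) showing how a document can be reduced to term counts, discarding word order entirely; the function name and tokenization rule are illustrative assumptions:

```python
from collections import Counter
import re

def bag_of_words(text):
    # Tokenize by lowercasing and extracting alphabetic runs
    # (a simplistic rule chosen for illustration)
    tokens = re.findall(r"[a-z]+", text.lower())
    # A bag-of-words keeps only term frequencies, not word order
    return Counter(tokens)

doc = "Text analytics helps analysts extract patterns from text."
counts = bag_of_words(doc)
print(counts.most_common(3))
```

Because word order is discarded, two documents with the same words in different orders produce identical bags, which is why this representation suits collection-level analysis more than fine-grained assessment of a single document.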