ISSUES AND CHALLENGES OF TEXT MINING
A. High share of unstructured information. This is the most important challenge of text mining. The larger the share of unstructured information held by an organisation or group, the more challenging its text mining operations become and the more steps they involve.
B. Intermediate form (IF). Preparing the intermediate form is an important step in text mining, because all further text mining processes are carried out on the IF. Different types of IF are used for text mining.
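As an illustration only, one simple kind of intermediate form is a bag-of-words representation, in which each document is reduced to its word counts. The text above does not prescribe this form. The function name `to_bag_of_words`, the sample documents, and the letters-only tokenisation rule are assumptions made for this sketch.

```python
import re
from collections import Counter


def to_bag_of_words(text):
    """Reduce raw text to a bag-of-words: lowercase word -> count.

    Hypothetical helper; the tokenisation (runs of ASCII letters) is a
    simplifying assumption and would not suit every language.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Illustrative documents, not taken from any real collection.
documents = [
    "Text mining extracts patterns from text.",
    "Data mining extracts patterns from structured data.",
]

# Each document is converted to its intermediate form once ...
intermediate_forms = [to_bag_of_words(d) for d in documents]

# ... and later steps operate on the IF instead of the raw text,
# for example finding the terms two documents have in common.
shared_terms = set(intermediate_forms[0]) & set(intermediate_forms[1])
print(sorted(shared_terms))
```

Once the documents are in this form, later steps such as comparing or grouping them no longer need to touch the raw text.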
C. Semantic analysis. Semantic analysis is performed to obtain fine-grained, domain-specific information. Most semantic analysis tools are quite expensive and slow. Making semantic analysis more efficient and scalable is a challenging task.
D. Very high number of possible “dimensions”. Natural languages contain a very large number of words and phrases, each of which is a possible dimension. Grouping them as required is a very challenging task.
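The growth in dimensions can be made concrete with a small sketch. It counts every distinct word, and then every distinct word or two-word phrase, in a toy corpus, treating each as one dimension. The helper `terms`, the corpus, and the choice of two-word phrases are assumptions made for illustration.

```python
import re


def terms(text, max_n=2):
    """Return the distinct words and phrases (up to max_n words) in text.

    Hypothetical helper; each returned term stands for one "dimension".
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    found = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            found.add(" ".join(tokens[i:i + n]))
    return found


# Toy corpus, invented for this example.
corpus = [
    "text mining finds patterns in text",
    "data mining finds patterns in tables",
    "libraries hold large text collections",
]

# Dimensions when only single words are counted.
words_only = set().union(*(terms(d, max_n=1) for d in corpus))

# Dimensions when two-word phrases are counted as well.
words_and_phrases = set().union(*(terms(d, max_n=2) for d in corpus))

print(len(words_only), len(words_and_phrases))
```

Even on three short sentences, allowing phrases increases the number of dimensions noticeably. On a real collection the count is far larger, which is why grouping terms is hard.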
E. Multilingual text refining. Whereas data mining is largely language independent, text mining involves a significant language component. Most text mining tools focus on processing English documents, so mining documents in other languages is a challenging and cumbersome task.
F. Text context. The meaning and context of a text differ from one language to another. Even within the same language, a particular word can be used in different contexts. Differentiating among these uses is a great challenge.
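A minimal sketch of the problem collects the words surrounding each occurrence of a keyword, so that different uses of the same word can be compared. The helper `keyword_in_context`, the window size, and the example sentences are assumptions made for illustration. The sketch only exposes the context and does not decide which meaning is intended.

```python
import re


def keyword_in_context(text, keyword, window=2):
    """Return (left, right) neighbouring words for each keyword occurrence.

    Hypothetical helper; `window` is the number of words kept on each side.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    contexts = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            contexts.append((left, right))
    return contexts


# Invented sentences in which "bank" carries two different meanings.
sentences = [
    "She deposited the cheque at the bank yesterday",
    "They walked along the bank of the river",
]

for sentence in sentences:
    print(keyword_in_context(sentence, "bank"))
```

The keyword is identical in both sentences and only the neighbouring words differ. A text mining tool has to use those neighbours to tell the two uses apart.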
G. Trained manpower. Finding specialised, trained manpower to work with highly sophisticated and technical text mining tools is a challenge. Most conventional library and information professionals are not adequately trained to operate text mining tools.
H. Jargon. Text mining is mainly centred on natural language processing. In natural language processing, various kinds of jargon are encountered. It is a challenging task for text mining tools to cope with this jargon.