A. Data Format

Unlike commercial tools, open source tools usually do not provide good support for a wide variety of data formats. They impose certain format restrictions, or even require their own proprietary data formats. When selecting a tool, one should first consider whether the data meets the tool's requirements, either directly or after conversion. If the results of the analysis will be processed further, the output format should also be taken into account: it should be a common format, or one that can be converted to a common format, so that it supports later work.

Weka's input formats include ARFF, CSV, XRFF, and C4.5; its output formats include ARFF and CSV, and results can also be stored in a database via JDBC. LingPipe's input formats include XML, HTML, and plain text; its output format is XML. Each of the four open source tools has its own fixed format requirements, so the collected data must be formatted accordingly. Although Weka supports the common CSV format, ARFF works better for later analysis, so files are generally converted to ARFF with Weka's own tools. Weka does not support documents in txt format; the user must use another tool or write custom code to convert them. LIBSVM's output format requires a special tool to open and view, which makes it difficult to integrate with other applications; the output formats of the other three open source tools are easier to extend [14][15].
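As an illustration of the "write custom code" route mentioned above, the following is a minimal sketch of a CSV-to-ARFF conversion in Python. It is not Weka's own converter. The function name `csv_to_arff` and the default relation name are hypothetical, and the sketch assumes a header row, no missing values, and attribute names and values that need no quoting. Columns in which every value parses as a number are declared numeric; all other columns are declared nominal.

```python
import csv
import io


def csv_to_arff(csv_text, relation="data"):
    """Convert CSV text (first row is the header) to a minimal ARFF string.

    Illustrative assumptions: no missing values, and no names or values
    that would require quoting in ARFF.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    lines = ["@relation " + relation, ""]
    for i, name in enumerate(header):
        column = [record[i] for record in records]
        if all(is_number(v) for v in column):
            lines.append("@attribute " + name + " numeric")
        else:
            # Nominal attribute: list the distinct values in first-seen order.
            values = list(dict.fromkeys(column))
            lines.append("@attribute " + name + " {" + ",".join(values) + "}")
    lines += ["", "@data"]
    lines += [",".join(record) for record in records]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(csv_to_arff("temp,play\n85,no\n70,yes\n", relation="weather"))
```

In practice, real data would also need handling for missing values (written as `?` in ARFF), quoted strings, and date attributes, which this sketch leaves out.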