Microblogging has become one of the major types
of communication. Recent research has identified it
as online word-of-mouth branding (Jansen et al., 2009).
The large amount of information contained in microblogging websites makes them an attractive source of data for opinion mining and sentiment analysis.
In this paper, we presented a method for automatically collecting a corpus that can be used to train a sentiment classifier. We used TreeTagger for POS-tagging and observed differences in the tag distributions among the positive, negative and neutral sets. From these observations we conclude that authors use syntactic structures to describe emotions or state facts. Some POS-tags may be strong indicators
of emotional text. We used the collected corpus to train a sentiment classifier. Our classifier is able to determine positive, negative and
neutral sentiments of documents. The classifier is based on the multinomial Naïve Bayes classifier that uses N-grams
and POS-tags as features. As future work, we plan to collect a multilingual corpus of Twitter data and compare the characteristics of the corpus
across different languages. We plan to use the obtained data to build a multilingual sentiment classifier.
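
As an illustration of the classifier summarized above, the following is a minimal sketch of a multinomial Naïve Bayes classifier over N-gram features. It is not the implementation used in this work: the class name `MultinomialNB`, the helper `ngrams`, whitespace tokenization, add-one smoothing, and the six hand-written training sentences are all assumptions made for the example, and POS-tag features are omitted for brevity.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return unigram and bigram features of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


class MultinomialNB:
    """Hypothetical minimal multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed; add-one by default)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.feat_counts = defaultdict(Counter)  # N-gram counts per class
        for doc, label in zip(docs, labels):
            self.feat_counts[label].update(ngrams(doc))
        self.vocab = {f for counts in self.feat_counts.values() for f in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior plus sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for feat in ngrams(doc):
                if feat in self.vocab:  # N-grams unseen in training are ignored
                    score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example only.
train_docs = [
    "i love this phone",
    "what a great day",
    "i hate this phone",
    "what a terrible day",
    "the phone was released today",
    "the meeting is on monday",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

clf = MultinomialNB().fit(train_docs, train_labels)
print(clf.predict("i love it"))
```

POS-tag features could be added in the same way by appending tag strings to the list returned by `ngrams`, given the output of a tagger such as TreeTagger.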