This paper describes our solution for mining and analyzing data from social media. Our data processing approach consists of three stages and combines two concepts: big data and cloud computing. The first and second stages are mining data from social media and filtering or aggregating it, which yields relatively small datasets containing only the data relevant to the task at hand. Data mining is performed by a crawler that is based on the MapReduce model for distributed computation and that we implemented using the Hadoop framework [2]. In the last stage, the resulting small datasets are analyzed using sophisticated models. The whole data analysis process is formalized as a composite application that runs in our environment for distributed computing, which is based on the cloud platform CLAVIRE (CLoud Applications VIRtual Environment).
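The filtering and aggregation stages can be illustrated with a minimal MapReduce-style sketch in plain Python. It is not the Hadoop implementation described here: the record fields (`user`, `text`), the placeholder keyword set, and the function names are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Placeholder filter terms (assumption); the actual terms are not given in the text.
KEYWORDS = {"alpha", "beta"}


def map_post(post: Dict[str, str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit (user, 1) for every post that mentions a keyword."""
    words = set(post["text"].lower().split())
    if words & KEYWORDS:
        yield post["user"], 1


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the emitted counts per user."""
    totals: Dict[str, int] = defaultdict(int)
    for user, count in pairs:
        totals[user] += count
    return dict(totals)


def run(posts: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Filter and aggregate posts into a small per-user dataset."""
    pairs = (pair for post in posts for pair in map_post(post))
    return reduce_counts(pairs)
```

In this sketch the map step does the filtering (irrelevant posts emit nothing) and the reduce step does the aggregation, so the output is a small per-user table that a later analysis stage could consume.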
The second part of the paper describes an analysis of people who write about drugs in social media. We present the idea of using social media as an additional data source for analyzing and modeling illegal activities in society. The developed mining and analysis technologies are applied to characterize users who write about drugs. These characteristics reveal the users' additional interests and make up their psychological portrait. The paper also describes a model for predicting the level of drug use in a population, which takes into account various factors such as the macro-state of the population and the individual characteristics of residents. Results of social media analysis, such as the level of interest in the topic of drugs or the characteristics of users who use drugs, can be used to increase the accuracy of this model.