Privacy-Preserving Data Mining as a Service in the Cloud
The discovery of frequent patterns, association rules, and correlation relationships among huge amounts of data is useful to business intelligence.
A typical example of frequent itemset mining is market basket analysis.
This process analyzes customer buying habits by finding associations between the different items that customers place in their shopping baskets.
The discovery of such associations can help retailers develop marketing strategies by gaining insight into which items customers frequently purchase together.
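As a rough illustration of the market basket idea, the sketch below counts how often pairs of items occur together and keeps the pairs that meet a minimum support threshold. The function name `frequent_pairs`, the `min_support` parameter, and the sample baskets are made up for this example and are not taken from the article. Real miners such as Apriori or FP-growth handle itemsets of any size far more efficiently.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per basket
        # and always in the same (alphabetical) order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical shopping baskets, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "bread"],
    ["butter", "milk"],
]

result = frequent_pairs(baskets, min_support=2)
print(result)
```

With these sample baskets, only the pairs bought together at least twice survive the threshold, which is the kind of association a retailer could act on.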
Over the past decade, interest in data mining as a service has grown.
In this paradigm, a company (data owner) that lacks data storage, computational resources, and expertise stores its data in the cloud and outsources the mining tasks to the cloud service provider (server).
Without doubt, data mining as a service offers valuable benefits to business intelligence.
However, it also presents a serious privacy problem; that is, the server has access to company data and could learn business secrets from it.
To protect a company’s data privacy and yet enable the server to perform association rule mining on the data in the cloud, a naïve solution is for the data owner to hide the meanings of items in its transaction database by substituting items with unique numbers (where the same item is substituted by the same number and different items are substituted by different numbers).
This one-to-one substitution approach doesn’t hide the frequencies of items. If the server has some background knowledge (for example, information on the frequencies of some items), it can reidentify those items, particularly the most frequent ones.
For example, if bread is the most frequent item in retail transaction databases, the server can conclude that the most frequently occurring number refers to bread in the transformed database.
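The weakness of one-to-one substitution can be sketched in a few lines. In this minimal example, the data owner replaces each item with a number, and a server that knows only that bread is the most frequent item recovers bread's code by counting. The names `substitute` and `most_frequent_code`, the sequential numbering, and the sample baskets are assumptions made for illustration. A real data owner would more likely assign random numbers, but that would not change the outcome.

```python
from collections import Counter


def substitute(transactions):
    """One-to-one substitution: same item -> same number, different items -> different numbers."""
    mapping = {}
    encoded = []
    for basket in transactions:
        row = []
        for item in basket:
            if item not in mapping:
                # Sequential codes keep the sketch simple (an assumption);
                # random unique codes would leak frequencies equally.
                mapping[item] = len(mapping) + 1
            row.append(mapping[item])
        encoded.append(row)
    return encoded, mapping


def most_frequent_code(encoded):
    """Server side: find the code that occurs in the most transactions."""
    counts = Counter(code for row in encoded for code in set(row))
    return counts.most_common(1)[0][0]


# Hypothetical transaction database in which bread is the most frequent item.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["beer", "bread"],
    ["butter", "milk"],
]

encoded, secret_mapping = substitute(baskets)

# The server sees only `encoded`, but it knows bread is usually the top item.
guess = most_frequent_code(encoded)
print(guess == secret_mapping["bread"])
```

The server never sees `secret_mapping`, yet its guess matches bread's code because substitution leaves every item's frequency unchanged.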
