Special Issue Article
Privacy Preserved Data Scheduling For Cloud Data Services
Cloud computing provides massive computation power and storage capacity, enabling users to deploy computation- and data-intensive applications without infrastructure investment. Such applications generate intermediate data sets, which are stored to avoid the cost of recomputing them. Adversaries may recover privacy-sensitive information by analyzing multiple intermediate data sets together. Encrypted storage is commonly used to secure cloud data, but encrypting all intermediate data sets is neither efficient nor cost-effective: in data-intensive applications, encryption and decryption operations incur high time and cost overheads. A partial encryption mechanism is therefore used to provide data privacy with minimal resource usage. The original data and intermediate data are protected using encryption and anonymization techniques. Intermediate data sets in the cloud are accessed and processed by multiple parties but are rarely controlled by the original data set holders. A single intermediate data privacy model protects intermediate data on only one node, whereas a joint privacy leakage model protects multiple intermediate data sets. An upper-bound privacy leakage constraint-based approach is used to identify which intermediate data sets need to be encrypted. The sensitivity relationships among multiple data sets are represented by a Sensitive Intermediate data set Graph (SIG). A Privacy-Preserving Cost Reducing Heuristic algorithm is used to control privacy leakage across multiple data sets. The multiple intermediate data set privacy model is integrated with a data scheduling mechanism. Privacy preservation is ensured under dynamic data size and access frequency values. Storage space and computational requirements are optimally utilized in the privacy preservation process, and data distribution complexity is handled in the scheduling process.
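As a rough illustration of the upper-bound constraint idea, the following sketch greedily chooses which intermediate data sets to encrypt until the leakage of the remaining unencrypted sets falls within a bound. It is not the Privacy-Preserving Cost Reducing Heuristic itself. The names `IntermediateDataSet` and `select_for_encryption` are hypothetical, the encryption cost is assumed to be size times access frequency, and the joint leakage is assumed to be the simple sum of per-set leakage values. The SIG structure is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntermediateDataSet:
    """Hypothetical record for one intermediate data set."""
    name: str
    size_gb: float      # storage size
    access_freq: float  # accesses per unit time
    leakage: float      # estimated privacy leakage if stored unencrypted

    @property
    def encryption_cost(self) -> float:
        # Assumption: cost grows with both size and access frequency,
        # since each access to an encrypted set needs a decrypt step.
        return self.size_gb * self.access_freq


def joint_leakage(data_sets):
    # Assumption: joint leakage is approximated by the sum of
    # per-set leakage values.
    return sum(d.leakage for d in data_sets)


def select_for_encryption(data_sets, leakage_bound):
    """Greedily encrypt sets until the unencrypted leakage <= bound.

    Returns (encrypted, unencrypted) lists of data sets.
    """
    unencrypted = list(data_sets)
    encrypted = []

    def leakage_per_cost(d):
        # Prefer sets that remove the most leakage per unit of cost.
        if d.encryption_cost == 0:
            return float("inf")
        return d.leakage / d.encryption_cost

    while unencrypted and joint_leakage(unencrypted) > leakage_bound:
        best = max(unencrypted, key=leakage_per_cost)
        unencrypted.remove(best)
        encrypted.append(best)
    return encrypted, unencrypted


if __name__ == "__main__":
    sets = [
        IntermediateDataSet("A", size_gb=10, access_freq=1, leakage=0.5),
        IntermediateDataSet("B", size_gb=1, access_freq=1, leakage=0.3),
        IntermediateDataSet("C", size_gb=5, access_freq=2, leakage=0.2),
    ]
    enc, plain = select_for_encryption(sets, leakage_bound=0.4)
    print("encrypt:", [d.name for d in enc])
    print("leave unencrypted:", [d.name for d in plain])
```

Because the cost term depends on size and access frequency, rerunning the selection whenever those values change gives a simple way to picture how privacy decisions could track dynamic workloads. A greedy choice of this kind does not guarantee minimum total cost.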