A Simplified Framework for Data Cleaning and Information Retrieval in Multiple Data Source Problems
Agusthiyar.R1, Dr. K. Narashiman2
Data cleaning solutions are now essential for users in industry and elsewhere who handle large amounts of data. Data cleaning deals with detecting and removing errors and inconsistencies from data in order to improve its quality. A number of frameworks for handling noisy and inconsistent data are available in the market. Traditional data integration approaches can deal with single data sources at the instance level. Data cleaning, however, is especially required when integrating heterogeneous data sources, and it should be addressed together with schema-related data transformations. This paper proposes a framework to handle errors in heterogeneous data sources at the schema level. The framework detects and removes errors and inconsistencies in a simplified manner and improves the quality of the data held in the multiple data sources of a company whose sources are spread across different locations.
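To make the idea of combining schema-related transformations with error removal concrete, the following is a minimal Python sketch and not the framework proposed in this paper. The source names (`branch_a`, `branch_b`), the field names, the sample records, and the helper functions `to_unified` and `clean_and_merge` are all hypothetical and chosen only for illustration. It assumes two sources that store the same customer under different column names and formats.

```python
# Hypothetical field mappings: source-specific column names -> unified schema.
SCHEMA_MAPS = {
    "branch_a": {"cust_name": "name", "tel": "phone", "town": "city"},
    "branch_b": {"customer": "name", "phone_no": "phone", "city": "city"},
}


def to_unified(record, source):
    """Rename fields to the unified schema and normalize their values."""
    mapping = SCHEMA_MAPS[source]
    unified = {}
    for src_field, value in record.items():
        if src_field not in mapping:
            continue  # drop fields that are not part of the unified schema
        # Collapse runs of whitespace and trim the ends.
        unified[mapping[src_field]] = " ".join(str(value).split())
    unified["name"] = unified.get("name", "").title()
    unified["city"] = unified.get("city", "").title()
    # Keep only digits so differently formatted phone numbers compare equal.
    unified["phone"] = "".join(
        ch for ch in unified.get("phone", "") if ch.isdigit()
    )
    return unified


def clean_and_merge(sources):
    """Merge records from all sources, rejecting incomplete and duplicate rows."""
    seen = set()
    merged = []
    for source, records in sources.items():
        for record in records:
            row = to_unified(record, source)
            if not row["name"] or not row["phone"]:
                continue  # reject records missing a key field
            key = (row["name"].lower(), row["phone"])
            if key in seen:
                continue  # same entity already taken from another source
            seen.add(key)
            merged.append(row)
    return merged


# Illustrative sample data (invented for this sketch).
sources = {
    "branch_a": [
        {"cust_name": "  ravi  kumar", "tel": "044-1234 567", "town": "chennai"},
    ],
    "branch_b": [
        {"customer": "Ravi Kumar", "phone_no": "0441234567", "city": "Chennai"},
        {"customer": "", "phone_no": "0449999999", "city": "Madurai"},
    ],
}
result = clean_and_merge(sources)
print(result)
```

In this sketch the schema-level step is the field mapping, while whitespace, case, and phone normalization together with duplicate and missing-value checks stand in for the error-removal step. A real deployment would need source-specific rules that this example does not attempt to cover.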