ICARUS: Interactive Cleaning and Rule System
User-friendly database validation software
Data sets are growing larger and more complex, and big data techniques only work when the data is stored in an organized manner. Data cleaning is the process of finding corrupted records in a data set and updating or removing them. Data validation is the process of confirming that the entered data is accurate. Done by hand, these processes are exhausting and tedious, and the difficulty grows with the size of the data set. Software that reduces this burden would be a significant step forward for big data.
Dr. Arnab Nandi and his colleagues at The Ohio State University have developed a platform that works with a user to create a system of rules that automates the data validation process. Dr. Nandi’s proposed solution is a software system named ICARUS: Interactive Cleaning And RUles System. ICARUS shows potentially problematic data in the form of a matrix. After the user makes a change, ICARUS proposes a rule for the user to approve, which is then applied to the entire data set. The user can see what the outcome of a rule will be by hovering the mouse over the field to be manipulated. Once the user is finished, ICARUS stores the set of rules for future use by other users. This software will lead to less time spent preparing data and more time spent analyzing it and making better business decisions.
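
The edit, propose, preview, apply cycle described above can be illustrated with a minimal Python sketch. This is not ICARUS's actual implementation or rule language, which this summary does not specify. It assumes the simplest possible rule form (replace one exact value with another in one column), and the names `Rule`, `propose_rule`, `preview`, and `apply_rule`, along with the sample rows, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: in `column`, replace the value `old` with `new`."""
    column: str
    old: str
    new: str


def propose_rule(column, old, new):
    """Generalize a single manual edit into a candidate rule for approval."""
    return Rule(column, old, new)


def preview(rows, rule):
    """Return the indices of rows the rule would change, without changing them."""
    return [i for i, row in enumerate(rows) if row.get(rule.column) == rule.old]


def apply_rule(rows, rule):
    """Return a new list of rows with the rule applied to every matching row."""
    return [
        {**row, rule.column: rule.new} if row.get(rule.column) == rule.old else row
        for row in rows
    ]


# Made-up sample data with inconsistent state names.
rows = [
    {"id": 1, "state": "Ohio"},
    {"id": 2, "state": "OH"},
    {"id": 3, "state": "Ohio"},
    {"id": 4, "state": "MI"},
]

# The user corrects one cell by hand: "Ohio" -> "OH" in the "state" column.
rule = propose_rule("state", "Ohio", "OH")
affected = preview(rows, rule)    # what a hover preview could highlight
cleaned = apply_rule(rows, rule)  # run only after the user approves the rule
saved_rules = [rule]              # kept so later users can reuse the rule
```

A real system would need a richer rule language and a way to rank candidate rules. The sketch only shows how a single manual correction can be generalized to the whole data set and previewed before it is applied.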