Running Data-Validation with Excel
We have two set of data came from different system, S-A and S-B, with the same format and contents. How shall we run data-validation using Excel? The Order Number from each data set is the primer key for each table.
In order to compare the data easily, we appended the data set next to the other one and created a new field to mark the data source, S-A and S-B. The idea of the data validation is to compare information, Model Name and Updated Date, with the same Order Number.
The validation on primer key could be easy by counting each order number in the field of the appended table. Order number counting as two means it appeared in both data source. For those counting as one, they should be the orders different from other data set. The formula we are using is:
=COUNTIF([Order Number],[@[Order Number]])
Secondly, we are going to build flags for each field to show if they matched the contents with the same order number. Before comparing the contents, we have to locate the same order number in the table.
We may mark the location of order number by function =([Order Number]=[@[Order Number]]), which returns an array containing 0 and 1. Further more, by multiplying the array of row number, ROW([Order Number]), we may convert the location to row number. For every matched order, it will return two number, the small one refers to the order from S-A and large one refers to the one from S-B. The function AGGREGATION can help us choosing the one we need. The completed formula for the flag should be:
=IF([@[Order_CHK]]=2,INDEX([Updated Date],IF(ROW()=AGGREGATE(14,4,([Order Number]=[@[Order Number]])*ROW([Order Number]),1),AGGREGATE(14,4,([Order Number]=[@[Order Number]])*ROW([Order Number]),2),AGGREGATE(14,4,([Order Number]=[@[Order Number]])*ROW([Order Number]),1))-1)=[@[Updated Date]],"")
There is another way to find the row number by using MATCH function. The key of it is using OFFSET function to exclude data from S-A. The formula is showing enclosed:
=IF([@[Order_CHK]]=1,"",INDEX([Model_Name],IF(ROW()-1=MATCH([@[Order Number]],[Order Number],0),MATCH([@[Order Number]],OFFSET([Order Number],ROW()-1,,),0)+ROW()-1,MATCH([@[Order Number]],[Order Number],0)))=[@[Model_Name]])
Now we have had the dataset marked properly and the next step will be analyze the data by setting filter as we want.