The case of the missing data: Spreadsheet applications
From time to time I get asked:
What makes PROLab (our software for interlaboratory studies) so special?
People often say that they already evaluate their data with a basic spreadsheet application, like Excel. So why do we believe in PROLab so much?
One of my first arguments for the need for specialized software is the flawed formatting behavior of programs like Excel. Spreadsheet applications often silently discard important details in your data.
What do I mean?
When we handle data, we place great emphasis on using it as-is. That means our aim is to work with exact values, including all decimal places and formatting. We know how important the difference between various significant figures can be.
However, most spreadsheet applications disregard significant figures. If you copy the value 0.500 into Excel, it will store 0.5 instead. Although this is the same numerical value, information is lost, and that information can matter. Significant figures indicate how precise a measurement is with respect to the method used and how precisely the data was collected. This can make a substantial difference when assessing the uncertainty of a measured value.
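To illustrate the point, here is a small sketch (Python chosen purely for illustration, not because any tool mentioned here uses it) showing how converting "0.500" to a binary float collapses the trailing zeros, while keeping the value in a decimal type built from the original string, here via Python's `decimal.Decimal`, preserves the recorded precision:

```python
from decimal import Decimal

# A float collapses trailing zeros: the reported precision is lost.
as_float = float("0.500")
print(as_float)                        # 0.5

# Decimal, built from the original string, keeps the trailing zeros
# and can report how many decimal places were actually entered.
as_decimal = Decimal("0.500")
print(as_decimal)                      # 0.500
print(as_decimal.as_tuple().exponent)  # -3: three decimal places recorded
```

The key design point is that the conversion to a numeric type happens from the original string, so the number of entered decimal places is still recoverable.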
You can format cells in Excel to change the number of displayed decimal places, but only if all values are formatted the same way. If there are differences, which are inevitable when laboratories using different methods are involved, those differences are lost. And even if you are aware of this when importing values, are all the other participating laboratories also aware?
Equally important is how your data is saved. Can you guarantee that your spreadsheet saves your values exactly as written, and that it does not condense them without regard for the original significant figures?
Consider the European Commission’s Joint Research Centre as an example. When participating National Reference Laboratories (NRLs) join a proficiency test (PT), it is important that all labs submit their results as originally recorded, so that measurement precision can be assessed.
It is for all these reasons that we offer PROLab together with a data collection tool similar to an Excel import. PROLab is designed to handle data from different sources and with different significant figures, while maintaining accuracy and precision in the final result.
I agree completely. As a great believer in database technology, the whole concept of importing and copy/pasting data between systems is anathema to me: it is both time-consuming and error-prone. If you have web-based results entry, you can use validation (e.g. JavaScript) to some extent, but it is generally best to store EXACTLY what the participant entered in a TEXT field in your database. That way you can handle issues such as different decimal separators in a multi-national system, or greater-than/less-than values, which are perfectly acceptable for many PT schemes. You can then code rules that take the participants' results and convert them for the appropriate statistical manipulation. VBA within Excel may indeed be slow, but using stored procedures and functions within the database itself, along with a client-based C++ or similar application, gives the PT organiser an efficient and consistent set of figures. The final report can then be generated, once again directly from the database, showing exactly what the participant entered (including typos etc.!) against their performance.
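The store-as-text approach described above can be sketched in a few lines. The function name `parse_entry` and the Python implementation are illustrative assumptions, not part of any particular PT system: the raw string stays in the database untouched, while a rule converts it into a value (plus an optional censoring qualifier) for statistics.

```python
from decimal import Decimal
from typing import Optional, Tuple

def parse_entry(raw: str) -> Tuple[Optional[str], Decimal]:
    """Convert a participant's result, stored exactly as entered,
    into a value suitable for statistical processing.

    Returns (qualifier, value): qualifier is '<' or '>' for
    censored (greater-than/less-than) results, None otherwise.
    A comma decimal separator, common in multi-national schemes,
    is normalised to a dot; the raw string itself is never altered
    in the database, only this parsed copy.
    """
    text = raw.strip()
    qualifier = None
    if text[:1] in ("<", ">"):
        qualifier, text = text[0], text[1:].strip()
    # Normalise the decimal separator; trailing zeros stay intact.
    text = text.replace(",", ".")
    return qualifier, Decimal(text)

print(parse_entry("  <0,050 "))   # ('<', Decimal('0.050'))
print(parse_entry("12.30"))       # (None, Decimal('12.30'))
```

Because the conversion is a pure function of the stored text, the final report can show the participant's entry verbatim next to the derived figure.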
For example? What would make you turn away from Excel?
But you can format each row, right? There are actually many better reasons not to use Excel!