Query error detection: using base rates to improve end user query performance

Robb, David A. (2004). Query error detection: using base rates to improve end user query performance. PhD Thesis, School of Business, The University of Queensland.

End users with extensive experience of an organisation's data can often detect query errors when query results do not correspond to their ex ante expectations. Many end users, however, such as newly hired business analysts, compose queries on data with which they are unfamiliar. Their lack of familiarity with the data means that these end users are less able to evaluate the accuracy of their query results. Although additional query experience will eventually give them the familiarity with the data that they need, in the interim the results of incorrect queries may lead to poor decisions and substantial losses.

Judgment tasks have been the focus of considerable prior research. In particular, many studies have examined how base rate information affects judgment performance. Further investigation of whether base rate information can be used to improve query accuracy, thereby enhancing the quality of the data used for decision making, is worthwhile for at least two reasons. First, such research will contribute to the overall body of knowledge of base rate use. Second, if end users can begin to use base rates more effectively, their query accuracy will improve, their information retrieval will become more accurate, and losses due to poor decisions can be significantly reduced.

Five experiments were conducted to test whether base rate information and various inducements to use that information improve end users' query performance. Performance was measured on two primary dimensions: query accuracy and the alignment of end users' assessments of the correctness of their queries with actual query accuracy.
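
As an illustration of the second dimension, the following minimal sketch computes one common alignment measure: mean stated confidence minus the proportion of queries that were actually correct. The abstract does not specify the measure used in the thesis, so the function name `calibration_gap` and the 0-to-1 confidence scale are assumptions made for this example only.

```python
def calibration_gap(confidences, correct):
    """Mean stated confidence minus observed accuracy.

    confidences: per-query confidence on a 0-1 scale (assumed scale).
    correct: per-query booleans, True where the query was actually correct.
    A positive value suggests overconfidence, a negative value underconfidence.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need two non-empty sequences of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for c in correct if c) / len(correct)
    return mean_confidence - accuracy


# Hypothetical end user: fairly confident, but only half the queries are right.
gap = calibration_gap([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
print(round(gap, 2))
```

A gap near zero would indicate that an end user's confidence tracks actual query correctness closely.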

Experiment 1 tested whether the presence or absence of base rate information affected end users' query performance. Experiment 2 tested whether providing base rate information and admonishing the end users to use the base rate information affected their query performance. Experiment 3 tested whether varying the level of base rate information granularity in conjunction with admonishments to use the base rate information affected end users' query performance. In experiments 1 to 3, the base rate information consisted of the number of records found for prior, similar queries.
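
The idea of checking a query result against record counts from prior, similar queries can be sketched as follows. This is an illustrative check under stated assumptions, not the procedure used in the experiments: the function name `flag_suspect_result`, the use of the median as the base rate, and the 50% tolerance are all hypothetical choices.

```python
from statistics import median


def flag_suspect_result(record_count, prior_counts, tolerance=0.5):
    """Return True when a query's record count is far from the base rate.

    record_count: number of records the new query returned.
    prior_counts: record counts from prior, similar queries (the base rate data).
    tolerance: allowed relative deviation from the base rate (assumed value).
    """
    if not prior_counts:
        raise ValueError("at least one prior record count is required")
    base_rate = median(prior_counts)  # assumed summary of the prior counts
    if base_rate == 0:
        # With a zero base rate, any non-empty result is unexpected.
        return record_count != 0
    relative_deviation = abs(record_count - base_rate) / base_rate
    return relative_deviation > tolerance


# Hypothetical prior queries returned roughly 100 records each.
print(flag_suspect_result(1200, [100, 110, 95]))  # far above the base rate
print(flag_suspect_result(105, [100, 110, 95]))   # close to the base rate
```

A flagged result does not prove the query is wrong; it only signals that the result departs from expectation and the query deserves a second look.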

In experiments 4 and 5, participants received more realistic base rate information in the form of management reports. These reports were provided to both treatment and control group participants. In experiment 4, treatment group participants were sensitised by specifically admonishing them to consider and use the base rate information. In experiment 5, treatment group participants were sensitised by asking them to provide a priori estimates of each query result based on the base rate information.

The results of the series of experiments indicate that merely providing base rate information did not significantly affect end users' query performance. Admonishing end users to heed base rate information was, however, an effective means of enhancing query performance. Admonitions to use base rate information in conjunction with precise base rate information also positively affected end users' query performance. Requesting a priori estimates of expected query results had a positive effect on end users' query performance. End users' alignment of their levels of confidence in the correctness of their queries with actual query correctness was not significantly changed by use of base rate information.

The results of the series of experiments have at least the following implications for business. First, organisations need to regularly remind their managers and other decision makers of the value of base rate information as a diagnostic tool. Second, organisations should educate people about how to calculate and apply base rate information. Third, organisations should foster production of more flexible routine reports that allow inclusion of prior period data.
