3 Approach and 3.1 Differential Testing for XML Processors
3.2 XPath Expression Generation
4.3 Comparison to the State of the Art
4.4 Analysis of BaseX Historical Bug Reports
6 Conclusion, Acknowledgments, and References
4.1 EffectivenessIn this section, we show XPress’ effectiveness through the number of bugs found, developer feedback, and illustrative examples.
\
\
\ Methodology. We implemented the tool while intermittently testing the systems over a period of 3 months. For every found discrepancy, we reduced the test case. If the test case exhibited an unreported pattern, we considered it likely to be an unknown bug and reported it to the developers. Note that this was a best-effort approach, and that it is an unsolved problem of how to identify duplicate bugs effectively. Whether we considered a bug as unique was based on the developers’ verdict; we considered a bug only as unique if an issue was addressed through an independent bug fix. Unfixed bugs hinder testing, as the duplicates tend to be repeatedly triggered. To tackle this, we attempted to disable the construction of bug-inducing elements, and also ignored known discrepancy patterns before the reported bug was resolved.
\ Found bugs overview. As shown in Table 1, we successfully found 27 unique bugs in total, 15 in BaseX, 6 in eXist-DB, 4 in Saxon, and 2 in the commercial DBMS. As detailed subsequently, we could have reported additional bugs for eXist-DB and the commercial DBMS, but refrained from doing so due to the high number of unfixed bugs for eXist-DB, and lack of developer feedback for the commercial DBMS. The bug-inducing test cases we found were not covered by the W3C qt3 test suite [19], which contains around 30,000 tests for XPath and XQuery—Saxon 11.1 passes all applicable tests in the W3C qt3 test suite [17]. Out of the 27 bugs found, the majority, 19 bugs, were logic bugs. Based on developer feedback, we learned that among the 20 fixed bugs, at least 8 bugs were due to incorrect optimizations. We detected the remaining bugs through unexpected errors. All systems we tested were implemented in Java, so we did not observe any crash bugs. We did not find any bugs in PostgreSQL and libXML2, both of which are known to be robust systems. For example, previous bug-finding efforts on testing DBMSs using SQL queries also found no logic bug in PostgreSQL [34, 35].
\ Small-scope hypothesis. We observed that the reported bugs are mainly reproducible by short test cases. 70% of all the reported cases can be reproduced with an XML document that consists of only one node and 91% of XPath expression consists of only one section. The average length of the XML documents in the reported test cases was 12 characters and XPath expressions 30 characters. This
\
\
\ phenomenon is known as the small-scope hypothesis [21], and this observation has been exploited in test.
\ Developer reception. Developer feedback is an important indicator of the bugs’ importance. A core developer of BaseX stated *"Thanks for sharing the bug reports with us. I appreciate that, they’re definitely helpful."[*3] All 15 bugs reported to BaseX were resolved within one month—10 bugs were resolved even within 24 hours. This indicates not only that the team was fast in resolving bugs, but also that the bug reports were considered valuable. Due to the timely fixes of the BaseX team, we invested most time and effort in testing BaseX. After encouragement from the developers of BaseX, we contributed the bug-inducing test cases to the W3C XQuery and XPath test suite [19]. Most bugs submitted to eXist-DB have not yet been fixed, which is likely the result of the many open issues (over 400). Nevertheless, the developers from eXist-DB confirmed the bugs quickly and also expressed appreciation towards the bug reports "thank you for finding and reporting."[4][5] Because the reported bugs remained unfixed for over two months, we stopped testing and reporting to eXist-DB after reporting the first few found inconsistencies due to the difficulties of filtering out duplicate bugs. We believe that XPress has the ability to find more bugs in eXist-DB after the known bugs are resolved. Similarly, for the commercial DBMS, since the developers did not follow up on the bugs that we reported, we stopped testing this DBMS. For Saxon, all four bugs reported were resolved quickly within one week’s time.
\ Selected bugs. Below, we give a few selected examples of bugs found by XPress to illustrate its bug-finding capability.
\ Incorrect optimization of comparison conditions. Figure 4 shows a fixed bug that we reported to BaseX.[6] The XPath expression selects all T nodes with attribute @t that satisfies @t >= 0 or @t <= 1. When @t exists and is a numeric value, this is a condition that always evaluates to true. Therefore, an optimization in BaseX rewrote the predicate to true. However, when @t does not exist for node T, @t evaluates to an empty sequence and returns false for both @t >= 0 and @t <= 1. Before we reported this bug, this case was overlooked and resulted in an incorrect optimization.
\
\
\ Arithmetic overflow in pre-check conditions. Figure 5 shows a fixed bug that we reported to BaseX.[7] last() and position() returns the context size and the context position from the dynamic context respectively. In the context XML document, the prefix //S selects only one node, and therefore both last() and position() return 1. Therefore, the condition is true and node S should be selected. In BaseX, an empty result set was returned. The problem was related to optimization for positional arguments in conditional comparisons. BaseX substituted last() with the greatest theoretical last() value and checked if the condition could evaluate to true. If not, the condition could not be satisfied regardless of the actual context and could be rewritten to false to reduce context analysis. When calculating the multiplication, as the theoretical maximum value for last() is a big integer, calculating the expression with long instead of double caused an overflow and produced the incorrect result.
\ Result of tail after subsequence off by one. Figure 6 shows a fixed bug that we reported to eXist-DB.[8] 1 to 2 creates an integer sequence consisting of 1 and 2. The subsequence() function in this example selects two elements starting from index 1, and the tail() function returns a new sequence excluding the first element of the input sequence. The correct result is to return 2. Unexpectedly, eXist-DB returned an empty result set. This was caused by a mistake when processing a call to tail that has a call to subsequence as an argument, which incorrectly reduced the ending index by 1.
\ Incorrect reduce in positional expressions. Figure 7 shows a fixed bug that we reported to Saxon.[9] The dot (.) stands for the current context in XPath expressions. For node B, (., .)/parent::* selects the single node A as the parent. Therefore, last() = 1 and the condition evaluates to false. Saxon unexpectedly returned the node B. The = operator is considered to be an unordered operator, which does not require operands to be sorted. In Saxon, an optimization was applied to eschew removing duplicate nodes when evaluating the sub-expression, which resulted in A being selected twice and last() evaluated to 2. After we found and reported the bug, a patch was applied by the developers to remove the duplicates, when the left operand of = is positional sensitive.
\
:::info This paper is available on arxiv under CC BY 4.0 DEED license.
:::
[3] https://www.mail-archive.com/[email protected]/msg15173. html
\ [4] https://www.mail-archive.com/[email protected]/msg15204. html
\ [5] https://github.com/eXist-db/exist/issues/4830
\ [6] https://github.com/BaseXdb/basex/issues/2190
\ [7] https://github.com/BaseXdb/basex/issues/2220
\ [8] https://github.com/eXist-db/exist/issues/4830
\ [9] https://saxonica.plan.io/issues/6093?pn=1#change-24136
:::info Authors:
(1) Shuxin Li, Southern University of Science and Technology China and Work done during an internship at the National University of Singapore ([email protected]);
(2) Manuel Rigger, National University of Singapore Singapore ([email protected]).
:::
\
All Rights Reserved. Copyright , Central Coast Communications, Inc.