Your resource for web content, online publishing
and the distribution of digital products.
S M T W T F S
 
 
 
 
 
 
1
 
2
 
3
 
4
 
5
 
6
 
7
 
8
 
9
 
 
 
 
 
 
 
16
 
17
 
18
 
19
 
20
 
21
 
22
 
23
 
24
 
25
 
26
 
27
 
28
 
 

An Efficient One-Round Adaptive Algorithm In Equivalence Testing

Tags: testing
DATE POSTED:December 9, 2024

:::info Authors:

(1) Diptarka Chakraborty, National University of Singapore, Singapore

(2) Sourav Chakraborty, Indian Statistical Institute, Kolkata;

(3) Gunjan Kumar, National University of Singapore, Singapore;

(4) Kuldeep S. Meel, University of Toronto, Toronto.

:::

Table of Links

Abstract and 1 Introduction

2 Notations and Preliminaries

3 Related Work

4 An Efficient One-Round Adaptive Algorithm and 4.1 High-Level Overview

4.2 Algorithm Description

4.3 Technical Analysis

5 Conclusion

6 Acknowledgements and References

A Missing Proofs

B An O(log log n)-query fully adaptive algorithm

4 An Efficient One-Round Adaptive Algorithm

Our main contribution is a one-round adaptive tester for the equivalence testing problem in the COND model that makes at most O˜(log n) queries.

\

\ In the remaining part of this paper, we will prove the above theorem. We start with a high-level idea behind our algorithm, and then we provide a formal description of the algorithm with a detailed analysis.

4.1 High-Level Overview

The first attempt to design an equivalence tester is to pick a sample, say i, from P and compare the probability mass P(i) and Q(i) in P and Q respectively. If P and Q are far (in total variation distance), then with enough probability, the probability mass of i in both distributions is significantly different. However, the issue is that estimating the probability mass of the element i can be very expensive. To bypass this issue, the idea is to sample a subset S of the domain, hoping the following:

\ • the probability mass of S in P is comparable to that of the mass of S in Q, and

\ • the probability mass of i (in P) is similar to the probability mass of S.

\ Assuming that all the above two statements hold, one can use conditional sampling (conditioned on S ∪i) to compare P(i)/P(S ∪ i) and Q(i)/Q(S ∪ i). Since, we assumed P(S) is similar to Q(S) so with high enough probability P(i)/P(S ∪ i) and Q(i)/Q(S ∪ i) will be different enough. And since we assumed that P(i) is comparable to P(S), one can estimate P(i)/P(S ∪ i) and can upper bound Q(i)/Q(S ∪ i) using only a few samples, which should be sufficient to distinguish P from Q if the the two distributions are far. But the issue is how to take care of the two assumptions.

\

\ • (Case 2) There is large “tail probability,” because of which concentration is not possible. (The notion of tail probability is formalized in Section 4.3).

\

\

:::info This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.

:::

\

Tags: testing