CS代考计算机代写 decision tree Prof. Mark Bun

Prof. Mark Bun
CAS CS 591 B: Communication Complexity
Lecture Notes 18:
Pattern Matrix Method
Fall 2019
Reading.
• Sherstov, The Pattern Matrix Method
We’ll continue our discussion of lifting theorems by studying Sherstov’s Pattern Matrix Method, which is a technique for lifting lower bounds on the approximability of boolean functions by poly- nomials to communication lower bounds.
De􏰣nition 1. Let ε > 0 and let f : {−1,1}n → {−1,1}. A real polynomial p : {−1,1}n → R ε-approximates f if |p(x) − f (x)| ≤ ε for all x ∈ {−1, 1}n . The ε-approximate degree of f is the least degree of a polynomial p which approximates f, and is denoted adegε(f).
Approximate degree itself is versatile for proving lower bounds in query complexity. For example, it lower bounds randomized query (decision tree) complexity.
Proposition 2. Let f : {−1,1}n → {−1,1}. Then for every ε > 0, BPPdtε(f) ≥ adeg2ε(f).
Proof. Let T be a randomized decision tree of depth d computing f to error ε. We will use T to construct a (2ε)-approximating polynomial for f of degree d.
First observe that if T is a deterministic decision tree of depth d, then T(x) can be written exactly as a degree-d real polynomial pT . Now de􏰣ne
p(x) = ET ←T [pT (x)]
which is also a degree-d-polynomial and satis􏰣es p(x) ∈ [−1,1] for all x. Moreover, p(x) has the
property that if f(x) = 1 then
p(x) = Pr[pT (x) = 1] − Pr[pT (x) = −1] ≥ 1 − 2ε,
and similarly if f(x) = −1 then p(x) ≤ −1 + 2ε. Hence p is a 2ε-approximating polynomial for f.
Although we won’t really talk about it, much of the power of approximate degree comes from the fact that it also lower bounds quantum query complexity.
Example 3. Let XORn be the parity function on n bits. Then for every ε ∈ (0,1) we have adegε(XORn) = n.
Proof. Every boolean function is exactly computed by a polynomial of degree n. One can see this by considering the polynomial obtained from a depth-n deterministic decision tree. To see that degree n is necessary, suppose p : {−1, 1}n → R is any polynomial which agrees with XORn in sign.
1

(An approximating polynomial with ε < 1 certainly does this). Then consider the symmetrization P : {0, . . . , n} → R of p de􏰣ned by P(k) = E|x|=kp(x). One can show that deg P ≤ deg p. Moreover, P (k) < 0 whenever k is even and P (k) > 0 whenever
k is odd. So P has at least n sign changes, and therefore degp ≥ degP ≥ n.
Example 4. adeg1/3(ORn) = Θ(√n). For any ε ∈ (0, 1), we have adegε(ANDn1/3 ◦ ORn2/3 ) ≥
Ω(n1/3 ).
The Pattern Matrix Method applies to composed functions of the form F = f ◦ gmn where
gm : {−1, 1}m × ([m] × {−1, 1}) is a variant of the indexing gadget de􏰣ned by g(x, (i, w)) = xi ⊕ w.
It is quite a versatile technique with several di􏰢erent speci􏰣c formulations (lifting di􏰢erent variants of approximate degree to di􏰢erent measures of communication complexity). Here we will only state two.
Theorem 5. Let F = f ◦ gmn . Then For every δ > 0,
disc(F ) ≤ δ + m− adeg1−δ (f )/2.
To see how such a theorem can be used to prove communication lower bounds, suppose we were able to show that
adeg1−2−c (f ) ≥ c
for some function f. Then taking m = 4 and δ = 2−c we would have
disc(F ) ≤ 2−c + 4−c/2 = 2−c+1.
Hence PPcc(F ) ≥ Ω(log(1/disc(F ))) ≥ Ω(c).
Here is another formulation. For a sign matrix S and ε > 0, let arankε(S) be the minimum rank
of a real matrix R such that |R[x, y] − S[x, y]| ≤ ε for every x, y. By similar arguments to what we used to show that rank lower bounds Pcc and sign rank lower bounds UPPcc, approximate rank gives a lower bound on randomized communication complexity:
BPPcc1/3(F ) ≥ Ω(log arank1/3(SF )) − O(log n).
(The additive loss of O(log n) comes from using Newman’s theorem to convert a protocol for F to a private-coin protocol.) Actually, approximate rank gives the same kind of lower bound for quantum communication complexity as well, and the Pattern Matrix Method is the only known tool for lifting a generic query complexity measure (such as approximate degree) to quantum communication complexity.
Theorem 6. Let 0 < δ < ε and let F = f ◦ gmn . Then 􏰈ε−δ􏰉2 arankδ(F) ≥ 1 + δ · madegε(f) 2 1 Dual Formulation of Approximate Degree The proof strategy underlying Theorems 5 and 6 is totally di􏰢erent from what we used to prove the deterministic lifting theorem. Recall that in the deterministic lifting theorem, we took a protocol Π for the composed function F and used it to construct a decision tree for f. Here, we take the opposite approach. We will argue that if f is hard to approximate by low degree polynomials, there is a certain object ψ which serves as a witness to this fact. We will then use ψ to construct a distribution under which F has high communication complexity. The following theorem produces the required ψ through an equivalent characterization of ap- proximate degree. Theorem 7. Let f : {−1,1}n → {−1,1} be a boolean function. Then adegε(f) > d if and only if there exists a function ψ : {−1, 1}n → R such that
1. ⟨f, ψ⟩ := 􏰂x∈{−1,1}n f (x)ψ(x) > ε 2. ∥ψ∥1 := 􏰂x∈{−1,1}n |ψ(x)| = 1
3. ψˆ(S) = 2−n 􏰂x∈{−1,1}n ψ(x)χS(x) = 0 for every S ⊆ [n] with |S| ≤ d. Here χS(x) = 􏰹i∈S xi. One way to prove this theorem is using linear programming duality. The question 􏰠what is the
least ε for which there exists a degree-d approximating polynomial for f􏰡 is captured by the LP min ε
p,ε
s.t. |p(x)−f(x)|≤ε ∀x∈{−1,1}n
This is a linear program in the variables cS in the representation p(x) = 􏰂|S|≤d cSχS(x) and the parameter ε. One can take the dual of this LP and obtain
max
ψ
s.t.
ψ(x)f (x)
􏰕 |ψ(x)| = 1
x∈{−1,1}n 􏰕
x∈{−1,1}n
ψ(x)χS(x) = 0
∀|S| ≤ d.
By strong LP duality, the dual has value greater than ε i􏰢 the primal has value greater than ε. So there exists a function ψ with the properties listed in Theorem 7 i􏰢 f cannot be approximated to error ε by degree-d polynomials.
To gain intuition for this characterization, it is helpful to work out by hand the 􏰠easy􏰡 direction of the equivalence, which is that the existence of such a function ψ implies that adegε(f) > d. To see this, suppose for the sake of contradiction that such a ψ existed as well as an approximating polynomial p for f of degree d. Then we would have
⟨f − p, ψ⟩ = ⟨f, ψ⟩ − ⟨p, ψ⟩ > ε 3

using properties 1 and 3 of ψ together with the fact that p is a linear combination of degree-d monomials. On the other hand,
|⟨f−p,ψ⟩|≤ 􏰕 |f(x)−p(x)|·|ψ(x)|≤max|f(x)−p(x)|·∥ψ∥1 ≤ε
x∈{−1,1}n
x
using the fact that p is an approximating polynomial together with property 2 of ψ. This is a contradiction.
4

Leave a Reply

Your email address will not be published. Required fields are marked *