Notes from the 'Hardness of Approximation: From the PCP Theorem to the 2-to-2 Games Theorem'
The information in this post consists of information provided by Subhash Khot ↗ in the ‘Prague Summer School on Discrete Mathematics 2020’.
We will focus on Hardness of Approximation, meaning that approximation algorithm of factor $\alpha$ is still NP-hard.
Known results
- Bin packing $1+\epsilon$ PTAS
- 3-SAT 7/8, hard for $7/8+\epsilon$
- set cover $\ln n$, hard for $(1-\epsilon) \ln n$
- clique $n$, hard for $n^{1-\epsilon}$
Plan:
- pre PCP Theorem – Edge disjoint Paths
- post PCP Theorem – 3 LIN oevr $F_2$
- Via Unique Games conjecture – Max Cut
- Proof of 2 to 2 games conjecture – clique($n/4$, $\epsilon n$)
Gap reductions
Poly time reduction 3SAT -> P with instance Phi -> I, where Yes (completeness) instance has $OPT(I) \ge h(n)$ and no instance (soundness) has $OPT(I) \le \alpha h(n)$.
Example: Hamiltonian cycle to TSP reduction.
Let us reduce graph from Ham cycle to TSP by setting edges for TSP to 1 iff they are in G for Ham.cycle, otherwise their weight is $\infty$.
Edge disjoint paths
- Input: $(s_i,t_i)$
- Goal: find maximum amount of disjoint paths from the sources to the sinks.
- is NP-hard to approximate EDP with factor $|E|^{1/2-\epsilon}$
Reduction from 2EDP (edge disjoint paths, same but with only two sources and sinks) which is known to be NP-hard. Create paths for each $i$, and cross all pairs. Substitute the intersection with $G$ from 2EDP. Yes instance will have $OPT(H)=n$ because EDP is solvable, otherwise we can only satisfy one, since every pair is crossing.
PCP (Probabilisticaly Checkable Proofs) Theorem
There is a reduction from 3SAT to Gap-3SAT (either we have satisfiable or at most $\alpha$ satisfiable 3SAT formulae). So if $\Phi$ is safisfiable $OPT(\Phi)=1$, but if $\Phi$ is not satisfisable, then $OPT(\Phi) \le \alpha$, so a big fraction of clauses are not satisfied.
NP class
Languages L that have membership proof that can be checked in deterministic polytime.
- Completness $x \in L \Rightarrow \exists \pi \text{ s.t. } V(\pi) \text{ accepts}$
- Soudness $x \not\in L \Rightarrow \forall \pi \text{ s.t. } V(\pi) \text{ rejects}$
PCP Theorem
Every $L \in NP$ has probabilistic polytime verifier that uses $O(\log n)$ random bits and makes $O(1)$ queries to proof, and has completness $1$ and soudness $\le 1/2$ (probability of wrong accept is at most $1/2$ or any constant).
Idea: For 3SAT the ‘supposed’ proof is the assignment to variables. (‘Supposed’ because we verify probabistically, and we could be fooled by a wrong proof by an adversary.) PCP proof creates a new proof $\pi$ which is an encoding of the original NP proof $\sigma$ via error correcting code s.t. $|\pi|=|\sigma|^{O(1)}$ (new proof has poly-length in the old proof).
We will show that two version of PCP theorem are equivalent:
- PCP Theorem version 1 (yes=1, no $\le \alpha$)
- PCP Theorem version 2 (yes accept with prob=1, no acecpts with prob \le 1/2)
1 $\Rightarrow$ 2
- Ask proved for the proof reduced via reduction to the reduced proof (Gap 3SAT).
- Pick random clause and check if it is satisfied. Read it and check those 3 bits.
- If yes we accept always; if no then we know that we accept with probability at most $\alpha$.
2 $\Rightarrow$ 1
We will design the reduction
- We have the probabilistic verifier and are given the input x; the $\pi$ is an unknown proof.
- Unknown proof $\pi = y_1y_2y_3\dots y_{|\pi|}$, but we know that if we read any constant number of random bits then we get constant that it must be satisfied (assume x is ok)
- We built polynomial number of constrants and get CSP (constrant satisfiability problem)
- Completness is 1 because every constraint is satisfied if x is yes; soudness is 1/2 (idk why 1:00:00 in the video)
- Reduce CSP to 3SAT
- yes => yes
- no (wrong $\le 1/2$) => no (wrong $\le 1-1/{2^2}$)
Hardness
Hardness results for approximation algorithms overwheamingly build upon PCP theorem (aside few exceptions).
The gap reductions can squeeze or widen the gap for the approximation.
3bit PCP
NP has PCP verifier that uses $O(\log n)$ random bits, $3$ queries, linear predicate (over $\mathbb{F}_2$); accepts correctly with $1-\epsilon$ and rejects incorrectly with $1/2+\epsilon$.
Given 3LIN instance S where each equation has $x + y + z = 0$ or $=1$ it is NP hard to diestinguish bet${}^n$. We can always find assignment which satisfies at least $1/2$ off all equations; but we cannot do better. (Loss of perfect completness is inherent, otherwise we could solve the problem faster.)
High level: The result comes from combination of two following problems.
- Dictatorship testing, Linearity testing, $A:{-1,1}^n \rightarrow {-1,1}$.
- PCP Theorem, build Label Cover, and use Parallel Repetition Theorem
Label cover problem
(Description is just cumbersome.) Labelling / constraint satisfaction problem. We have bipartite graph; vertices are variables, task is to assign labels to vertices. Labels come from an (constant-sized) alphabet and there is different alphabet for each side of the graph partity. Left-hand side will be assigned labels from $N$ and right-hand side from $M$. Each edge gives a constraint $\pi_{vw} : [M] \rightarrow [N]$ ($|M| » |N|$). Edge is satisfied if $\pi_{vw}(l(w))=l(v)$. Goal is to find labeling with maximal number of satisfied edges OPT.
It is NP-h to distinguish of $\mathbb{L}(G(V,W,E),[2],[7],{\pi_{vw}}) has yes=1, no$\le \alpha$.
Pf: Reduction from ${\rm Gap3SAT}_{1,\beta}$ (PCP Theorem) Left-side are variables $x_i$, right-hand side are clauses;
Each failed clause will have at least one failed edge, this is where $/3$ comes from. After reduction no is wrong in $\le 1-(1-\beta)/3 = \alpha$.
Label color problem has gap amplification property. There is a product construction, where the result has a boosted gap, i.e., using $k$ we have wrong no in $\le \epsilon = 2^{-k}$. So, this problem is very hard to approximate.
HW1
- max 2-SAT for $0.943\dots+\epsilon$ is hard assuming UGC; there exists a polynomial-time $0.943\dots$ approximation algorithm
Lecture 2
Basics of Fourier Analysis
$2^n$-dimensional space of $A:{-1,1}^n \rightarrow \mathbb{R}$ (but we are interested in $\rightarrow {-1,1}$)
Character functions – for every $\alpha \subseteq [n]$ character $X_\alpha(x)=\Pi_{i\in\alpha} x_i$ (these are in fact linear functions)
Every function on hypercube can be written as a combination of character functions $\sum_{\alpha\subseteq[n]} A_\alpha X_\alpha(x)$.
Linearity Testing
We will use multiplicative notation where we use ${-1,1}$ and use multiplication instead of ${0,1}$ and xor.
Theorem: 3 query test, perfect completeness, soudness $Pr[{\rm accept}] = \mathbb{E}{x,y}[(1+A(x)A(y)A(xy))/2]$, (note nominator is 1 iff A is linear). Magic happens and probability of acceptance results to $1/2 + 1/2 \sum\alpha \hat A_\alpha^3$.
Dictatorship Testing
Dict. functions: ${\rm dict}_i(x_1,\dots,x_n)=x_i$.
Test if fun is dictator.
Hastad: there is a 3-query linear test s.t. if a is a distator it accepts with $\ge 1-\epsilon$, if $Pr[{\rm accept}] \ge 1/2 + \eta$ then $A$ is $(2\eta,O(1/\epsilon \cdot \log (1/\eta))$-junta.
TODO
Dictatorship as Encoding (Long Code)
Encoding ${1,\dots,n} \rightarrow {-1,1}^{2^n}$, i.e., table of values of ${\rm dict}i (=X{{i}})$
Label cover (again)
Achieving Boolean queries, idea: we ask the prover to give us encoding by LongCode (from dictatorship), we can then check that the encoding is valid: 2 bits from encoding of w (codework test), and 1 bit from encoding of v (consistency test)