Algorithms for Hard Problems

Lecturers

Karolina Sołtys, Magnus Wahlström

Description

Note: the workshop will be in English with the possibility to ask questions in Polish.

In computer science, there is a division of problems into "hard" and "easy". For the "easy" problems, we know algorithms to solve them perfectly, in reasonable time, for all inputs — technically speaking, these problems have algorithms which solve the problem in an amount of time that grows polynomially with the size of the input, and these problems make up the complexity class P. For example, the problem of finding a short path through a road network belongs here, as solved by GPS navigation systems.

Still, many problems, important problems that we need to solve, belong to the class of hard problems — technically, they are NP-complete. This implies that an algorithm which solves them completely, for all possible inputs, will for some inputs need an amount of time which is exponential in the input size. At least, this is true for all algorithms we know today — to really prove that this is necessary is a huge open problem; see the Clay institute's P?=NP entry or the Wikipedia "P vs NP" page for more information.

In this workshop, we review three ways to deal with such hard problems, while retaining a solid theoretical footing, and providing proven guarantees on the algorithms' behaviour. First, we focus on faster exact algorithms: algorithms that have to solve all instances perfectly, and therefore need exponential time, but which are still much faster than a naive or "first attempt" at an implementation would be.

Second, we look at so-called parameterized algorithms. For many problems, only some input types really need exponential time, while other inputs are relatively simple to solve. Often, we can extract this difference formally into a parameter, and prove that problems with a small parameter value (for example, problems with a simple structure or a small solution) can be
solved efficiently.

Finally, we give a quick look into the world of approximation, where the goal is to find a solution in polynomial time, which is not too much worse than the best possible solution (for example, if looking for the cheapest solution for a problem, one may be able to find a solution that is guaranteed to be at most twice as costly as it needs to be).

We will show useful techniques for creating all three kinds of solutions and give you some nice exercises to practice them.

Programme

  • Exact (exponential) algorithms
  • Approximation algorithms
  • Parameterized algorithms
    • kernelization
    • bounded search trees
    • color coding
    • iterative compression
    • treewidth

Prerequisites

You should have some basic algorithmic know-how and understand the notion of time complexity of an algorithm (big-Oh notation etc.). It would be good if you were aware of the existence of computationally hard problems: it will be enough if you read the chapters "What is the P versus NP Problem?" and "Dealing with Hardness" from this article.

Course Materials

Here are the sketchy lecture notes + exercises we were handing out in class:

  • Day 1 — Approximation Algorithms
  • Day 2, part 1 — Exact Exponential Algorithms
  • Day 2, part 2 — Introduction to Fixed Parameter Tractability: Branching (Bounded Search Trees), Kernelization
  • Day 3 — More FPT: Sunflower Lemma, Color Coding, Iterative Compression

Qualifying exercises

Please send your solutions to our emails before July 15. The solution to at least one of the four tasks should be written in English. The quality of your English won't influence the scoring as long as you're able to get your point across. Feel free to send in partial solutions and if you have any questions just ask via email. Good luck!

  1. Assume that you are given $n$ points in the plane, and are asked to draw at most $k$ straight lines, such that every point has at least one line going through it. Expect that $n$ is much bigger than $k$ (say, $n=k^3$ or bigger).
    1. Find simplification rules, for example, points you do not have to care about, or lines that have to be used in the solution, or even conditions that prove that the task is impossible. Make sure that the simplifications are correct. Can you find rules which can always be applied for instances with $n>p(k)$, for some polynomial $p(k)$?
    2. Sketch how such a set of simplification rules gives an algorithm with running time better than brute force.
  2. Given a tree with nodes labeled by non-negative integers, find the largest possible sum of label values for a set of nodes which does not include any adjacent pair of nodes. The algorithm should run in polynomial time. (Clarification: we mean "tree" - type of graph and not "tree" - data structure.)
  3. A group of $n$ people want access to $m$ locked boxes. Each box has two identical keys, owned by different people. Nobody will give their keys away to anyone else. We want to find a set of as few people as possible such that all boxes can be unlocked.
    1. Find in polynomial time a solution which uses at most twice as many people as necessary.
    2. Give an algorithm that decides in time $O(f(k) \cdot p(n+m))$ whether there is a solution that only uses $k$ people, where $p(\cdot)$ is some polynomial and $f(k)$ is a function that only depends on $k$. Any such function $f(k)$ is accepted, although $f(k)=2^k$ or better is a good reference level to aim for.
  4. One of the most studied problems in theoretical computer science is Satisfiability: determine whether a given Boolean formula over $n$ variables is satisfiable. This is a canonical NP-complete problem. In this exercise we will be looking at several variants of this problem:
    1. In the problem 2-SAT we restrict our attention to conjunctions of binary clauses, that is formulas of the following form called 2-CNF (Conjunction Normal Form): $C_1 \wedge C_2 \wedge \dots \wedge C_m$, where $C_i = l_1^i \vee l_2^i$, $l_j^i \in \{x_k, \neg x_k | k \in 1 \dots n\}$. For example this is a 2-CNF formula: $(x_1 \vee \neg x_3) \wedge (\neg x_1 \vee x_2) \wedge (x_1 \vee x_3) \wedge (\neg x_2 \vee \neg x_1)$ and it is not satisfiable, that is no assignment of truth values to the variables $x_1, x_2, x_3$ satisfies all the four clauses. Give a polynomial time algorithm testing if a given 2-CNF formula is satisfiable.
    2. If in the definition of our previous problem we allow the clauses to be ternary (ie. now $C_i = l_1^i \vee l_2^i \vee l_3^i$, where $l_j^i \in \{x_k, \neg x_k | k \in 1 \dots n\}$ for $j = 1, 2, 3$) the problem becomes NP-hard, so we can't really expect a polynomial algorithm. We can obviously test all possible assignments, which will give us an $O(2^n \cdot \rm{poly}(n))$ algorithm. Find an $O(c^n)$ algorithm for some constant $c < 2$.
    3. Another thing we can do with the hard 3-SAT problem is to try to find an assignment satisfying as many clauses as possible. In the optimization problem MAX-3-CNF we ask about the maximum number $OPT$ of clauses that we can satisfy at the same time with any assignment. Describe a polynomial 2-approximation algorithm for this problem, that is an algorithm always returning an assignment satisfying at least $OPT/2$ clauses.
O ile nie zaznaczono inaczej, treść tej strony objęta jest licencją Creative Commons Attribution-ShareAlike 3.0 License