TFNP, or I know it’s there, somewhere

Continuing on the PPAD train of thought, I want to examine other problems in the TFNP zoo. As I’m personally trying to make sense of this whole thing, we’re first going to define TFNP formally. I will try to read along the definitions provided in [1,2], and write down some examples to illustrate the definitions.

Introduction to TFNP

TFNP is the class of problems where a solution is mathematically guaranteed to exist for every instance, and when given one, we can quickly verify it. The entire challenge is to actually find said solution. An easy example is the problem of Factorisation:

Input: An integer $n \in \mathbb{N}$.
Output: A list of numbers $\{ n_1, \ldots, n_k : n_i \in \mathbb{N} \}$ such that $\prod_{i \leq k} n_i = n$.

Notice that

Every integer has a factorisation in primes (and it’s unique up to re-ordering, but that’s not important for us).
Given a list of numbers, we can verify quickly if it is indeed a factorisation. Just multiply all the numbers to check $\prod_i n_i = n$, and check that $n_i$ is prime (Since Primality Testing is in P, this is a poly-time procedure).
However, we cannot easily find the prime factorisation of a number; indeed, the hardness of this problem is the core hypothesis of cryptography.

Why is this important ?

TFNP’s importance comes from two things. The first part of this importance comes from the problems in TFNP. Among others,

Our first example, Factorisation, is in TFNP, as we’ve already shown.
Finding $\varepsilon$-approximate Nash Equilibria in Bimatrix games is a PPAD-Complete problem, and PPAD is a subclass of TFNP [3].
Finding a Pure Nash Equilibrium of a potential game is PLS-complete, and PLS is also a subclass of TFNP [4].
Performing Gradient Descent on a continuous function over a bounded domain is CLS-Complete, and CLS is yet another subclass of TFNP [5].

Before we get too acronym-happy,

PLS, or Polynomial Local Search, captures the complexity of finding locally optimal solutions to NP-Hard optimisation problems.
CLS, or Continuous Local Search, captures the complexity of finding locally optimal solutions to NP-Hard optimisation problems, when the function is continuous. It is a set of problems in the intersection of PLS and PPAD, and is therefore also a subset of TFNP. ([5] proved that CLS is exactly the set of problems in both PPAD and PLS, but we’ll skip that for now).

The other half of the story is that these problems are computationally hard to solve. As discussed in a previous post, PPAD-complete problems are highly unlikely to admit poly-time algorithms. A future post will discuss approximability in more detail for the other classes in this zoo. As established before, Factorisation is widely believed to be difficult, a hypothesis central to cryptography.

To conclude, some important problems are in this class, and they’re all computationally difficult to solve, in general. This motivates us to formally characterise these types of problems, which is the subject of this post.

Relations and Strings

We first pass through the definitions of (binary) relations defined on finite-length strings. This will help us precisely formulate the classes FNP and TFNP shortly.

Let $\Sigma$ be some alphabet - this is a finite set of symbols. For convenience, we can set it to be $\{ 0, 1 \}$. Then, let $\Sigma^\ast$ be the set of all finite length strings created using this alphabet. For example, $001100 \in \Sigma^\ast$ (This is the Kleene star operator on this alphabet).

A Relation $R$ is a subset of the possible pairs of strings i.e. $R \subseteq \Sigma^\ast \times \Sigma^\ast$. $R$ states that for two strings $x, y \in \Sigma^\ast$, $x$ is related to $y$ if and only if $(x,y) \in R$. As an example, take $R_{div}$ as the relation “divisible by” i.e. $(x,y) \in R_{div}$ if and only if $x$ is divisible by $y$. For example, $(100, 010) \in R_{div}$.

Polynomially Balanced/Computable Relations

We want to consider relations that are at least easy to evaluate. Clearly, we cannot consider every possible relation over strings. Indeed, [1,2] consider polynomially balanced and polynomial-time computable/recognisable relations. First, we define the polynomially balanced relations.

A relation $R$ is polynomially balanced $\iff$ $(x,y) \in R$ implies that $|y| \leq p(|x|)$, for some polynomial $p$.

The implications of this are quite direct; the “solution” $y$ cannot be exponentially bigger than the “input” $x$ for a polynomially balanced relation $R$. This also means that reading $y$ takes time at most polynomial in the size of $x$. Of course, $R_{div}$ is polynomially balanced, for the following reason:

Suppose $(x,y) \in R$ i.e. $y$ divides $x$. Then, it is necessarily true, since these represent integers, that $y \leq x$. This implies that $y$ requires at most $|x|$ bits to be represented, and this is a polynomial dependence. Yes, the polynomial indicating the size is the identity polynomial, but shush!

Indeed, if $y$ takes more bits than $x$ to write, we can already decide the problem for $R_{div}$ as No.

Note that the notion of polynomially balanced does not say how much time it takes to verify $(x,y) \in R$. Even for polynomially sized $|y| \leq p(|x|)$, some relations $R$ might need to read $y$ exponentially many times during their computation; this is not efficient. This is where the polynomial time computability / recognisability kicks in.

A relation $R$ is polynomial-time computable $\iff$ there exists some Turing machine that can decide in polynomial-time if $(x,y) \in R$ or not.

For $R_{div}$ we can implement the integer division algorithm in a Turing machine. For $n$-bit numbers, this takes, say, $\mathcal{O}(n^2)$ time using Schoolbook Long Division, which is a polynomial time procedure. Therefore, “divisible by” is a polynomial-time computable relation on $\Sigma^\ast$.

On the other hand, for some relations, if the output $y$ takes exponentially many bits to write, then the algorithm is forced to take exponential time to compute a response. For example, let $R_{un}$ be the relation between the binary string $x$ and its representation in unary $y$. Concretely, $3$ ($x=011$) becomes $y=111$, $4$ ($x=100$) becomes $y=1111$ and so on. Verifying this takes polynomial time in $|x|$ and $|y|$, but since the unary representation $y$ is exponentially larger than $x$, this entire operation will take exponential time.

As you can see, this breaks any notion of efficiency that we may have in mind, so we want to exclude this case. This is why we need to consider the polynomially-balanced assumption as well.

Overall, the upshot with this restricted class of relations is that we can check a solution in polynomial time.

Search Problems and Functional NP

We’re almost there. We just need to define FNP; but this is straightforward now with the vocabulary we have. FNP is defined as the class of all problems of the following type.

Input: Let $R$ be a polynomially balanced and polynomial-time computable relation, and $x \in \Sigma^\ast$ be some string.
Output: Find any $y \in \Sigma^\ast$ such that $(x,y) \in R$, otherwise reply No.

There are some comparisons to make with NP, the more famous class of decision problems. FNP and NP both admit polynomial time verification of candidate solutions - we formalise it here via our polynomially balanced/computable relations. On the other hand, FNP explicitly requires us to output a solution. It is not sufficient to simply assert that a solution exists. This latter point is why NP is a class of decision problems, while FNP is a class of search problems.

While FNP is of independent interest, we will skim over this for a first pass.

TFNP : Total FNP

Finally, we can define what TFNP is.

TFNP is the subset of FNP problems which are guaranteed to have a solution. In other words, for an FNP problem of the form we discussed, it will never output No. Formally, we specify that the relation used in the canonical FNP problem is total; this is defined as follows.

A relation $R$ is said to be total if for every $x \in \Sigma^\ast$, there is always a $y \in \Sigma^\ast$ such that $(x,y) \in R$.

This gives us the definition of TFNP as “the class FNP restricted to total relations R”[1]. More precisely, TFNP is the set of all problems of the following type.

Input: Let $R$ be a total, polynomially balanced and polynomial-time computable relation, and $x \in \Sigma^\ast$ be some string.
Output: Find any $y \in \Sigma^\ast$ such that $(x,y) \in R$.

Conclusion

We went through the definition of TFNP via relations and FNP. In summary, a lot of computationally hard problems are in TFNP and this makes it an interesting class to study. Interesting exercise, and since I haven’t seen relations and their ilk since my Masters’, it’s good to revise things once in a while.

Otherwise, this sets up the stage nicely for the hardness of TFNP problems, which will be the subject of a future post.

[EDIT, 1 May 2026] Just to write things out explicitly

By efficient, we usually mean that it takes a polynomial time in the representation size of the input $x$. For example, $R_{un}$ is considered inefficient since it takes time exponential in $|x|$ to verify.
We defer the problem of membership in TFNP for another day. This is tricky since TFNP is a semantic class defined by the formal properties of its problems, and not those of its solutions. Compare this for example, to NP: For each problem in NP, a candidate solution is verifiable in polytime. On the other hand, each problem in TFNP is guaranteed to have a solution. See the difference?

References

Megiddo, Nimrod, and Christos H. Papadimitriou. “A note on total functions, existence theorems, and computational complexity.” Tech. report, IBM, Tech. Rep. (1989).
Papadimitriou, Christos H. “On the complexity of the parity argument and other inefficient proofs of existence.” Journal of Computer and system Sciences 48.3 (1994): 498-532.
Chen, Xi, Xiaotie Deng, and Shang-Hua Teng. “Settling the complexity of computing two-player Nash equilibria.” Journal of the ACM (JACM) 56.3 (2009): 1-57.
Fabrikant, Alex, Christos Papadimitriou, and Kunal Talwar. “The complexity of pure Nash equilibria.” Proceedings of the thirty-sixth annual ACM symposium on Theory of computing. 2004.
Fearnley, John, et al. “The complexity of gradient descent: CLS= PPAD∩ PLS.” Journal of the ACM 70.1 (2022): 1-74.