The Mathematics behind PQC: Multivariate Polynomials
Introduction
Multivariate cryptography, along with lattice-based, code-based and hash-based cryptosystems, is one of the primary classes of cryptographic schemes resistant to quantum attacks. These schemes are based on the difficulty of solving systems of multivariate equations over a finite field.
This branch of asymmetric cryptography emerged in the late 1980s with the scheme devised by Matsumoto and Imai (1988). Subsequent cryptanalysis of the scheme, proposed by Patarin in 1995, led to the development of new paradigms that today form the basis of numerous digital signature schemes.
Although none of the schemes standardized by NIST[1] are based on multivariate systems, recent years have seen significant advancements in this field due to the work of the cryptographic community. Indeed, multivariate cryptography now represents the majority of schemes proposed in NIST’s recent additional call for the standardization of new digital signatures.
Multivariate systems
A multivariate polynomial is a sum of monomials in multiple variables. Formally, it can be described as follows
p(x_1,x_2,\dots,x_n) = \sum_{i=1}^s c_i \cdot x_1^{\alpha_{i,1}} x_2^{\alpha_{i,2}} \cdots x_n^{\alpha_{i,n}}
where s is a positive integer, the coefficients c_i are elements of a field \mathbb{K} and the exponents \alpha_{i,j} are natural numbers.
The degree of the polynomial is defined as the maximum degree of the monomials it consists of. Remembering that the degree of a monomial is the sum of the exponents of the variables it comprises, we have
degree(p) = \max \limits_{i=1,\dots,s} \sum_{j=1}^n \alpha_{i,j}
Finally, note that a multivariate polynomial in n variables with coefficients in a field \mathbb{K} is a map p : \mathbb{K}^n \longrightarrow \mathbb{K} .
Multivariate systems are systems composed of m multivariate polynomials in n variables, which can be represented as follows \begin{cases} p^{(1)}(x_1,x_2,\dots,x_n) = \sum_{i=1}^{s_1} c_i^{(1)} \cdot x_1^{\alpha_{i,1}} x_2^{\alpha_{i,2}} \cdots x_n^{\alpha_{i,n}}\\ p^{(2)}(x_1,x_2,\dots,x_n) = \sum_{i=1}^{s_2} c_i^{(2)} \cdot x_1^{\alpha_{i,1}} x_2^{\alpha_{i,2}} \cdots x_n^{\alpha_{i,n}} \\ \vdots \\ p^{(m)}(x_1,x_2,\dots,x_n) = \sum_{i=1}^{s_m} c_i^{(m)} \cdot x_1^{\alpha_{i,1}} x_2^{\alpha_{i,2}} \cdots x_n^{\alpha_{i,n}} \end{cases} Consequently, these structures are maps \mathcal{P}=(p^{(1)},p^{(2)},\dots, p^{(m)}):\mathbb{K}^n \longrightarrow \mathbb{K}^m of m equations p^{(i)}:\mathbb{K}^n \longrightarrow \mathbb{K} in n unknowns.
Computational problems
The primary mathematical problem underlying the security of multivariate cryptographic schemes is the problem of solving polynomial systems, which involves the difficulty of solving a system of multivariate equations over a finite field. More formally, given a multivariate system \mathcal{P}:\mathbb{F}_q^n \longrightarrow \mathbb{F}_q^m with m equations and n unknowns defined over the finite field \mathbb{F}_q with q elements, the problem consists of finding an element \bar{\bm{x}}=(\bar{x}_1,\bar{x}_2,\dots,\bar{x}_n) in \mathbb{F}_q^n such that \mathcal{P}(\bar{\bm{x}}) = \mathbf{0} . Equivalently, such that
\begin{cases}p^{(1)}(\bar{\bm{x}}) = 0 \\
p^{(2)}(\bar{\bm{x}}) = 0\\
\vdots \\
p^{(m)}(\bar{\bm{x}}) = 0
\end{cases}
In cryptography, this problem is typically restricted to solving quadratic polynomial systems, that are, those of degree 2, and is referred to as the Multivariate Quadratic (MQ) Problem. Contrary to what one might think, the general version of the problem, for higher-degree polynomials, has not proven to be more complex than the MQ problem. In fact, the MQ problem, like its general version, belongs to the class of very difficult problems known as NP-Complete[2].
Very important parameters within this problem are m (the number of equations) and n (the number of variables). The most difficult instances are obtained when m abd n have similar values. If the difference in value between m and n is very large, there are algorithms that can efficiently solve the problem.
The MQ problem is often used in cryptography alongside a second problem called the Isomorphism of Polynomials (IP) Problem.
The latter has multiple variants, but in practice, it is typically reduced to its extended (EIP) form. Intuitively, this problem asks to find two affine maps that transform one multivariate system into another. More formally, given two multivariate systems \mathcal{A}, \mathcal{B} : \mathbb{F}_q^n \longrightarrow \mathbb{F}_q^m , the EIP problem consists of finding two affine maps S and T such that \mathcal{A} can be decomposed as follows
\mathcal{A} = S \circ \mathcal{B} \circ T
Unlike the MQ problem, not much is known about the difficulty of the IP problem and its variants.
Currently, the most effective algorithms for solving the MQ problem are based on algebraic techniques that utilize Gröbner bases, which are special algebraic structures introduced by Buchberger in 1965 that allow for finding the solutions of a system by providing a simpler representation. The computation of a Gröbner basis can be intuitively seen as an extension of Gaussian elimination to the multivariate case.
However, the greatest complexity of these algorithms lies precisely in the computation of the Gröbner basis, whose computational cost increases rapidly with the size of the problem.
This makes such approaches impractical for instances of sizes suitable for cryptographic applications. Among the algorithms that follow this approach, there are Buchberger’s algorithm and its improvements, such as F4 and F5.
Another well-known algorithm for solving the MQ problem is the XL (eXtended Linearization) technique, which, along with its more efficient variants, allows for transforming a quadratic multivariate system into a linear one with a larger number of variables.
Finally, there are hybrid approaches that maximize the effectiveness of a particular combination of algorithms on specific instances, for example, by combining Gröbner basis calculation techniques with the XL algorithm, or by applying one of the two after the exhaustive search technique, which randomly fixes the value of some variables.
History of Multivariate Cryptography
The development of multivariate cryptography dates back to the late 1980s, in the context of the rapid advances in public-key cryptography that followed the Diffie-Hellman and RSA constructions.
A fundamental contribution to this branch was made in 1988 by Matsumoto and Imai (MI), with the introduction of the use of multivariate schemes in public key cryptography.
The MI scheme is based on an innovative technique, still today the basis of many multivariate systems and more generally traceable to the class of bipolar constructions.
This approach proposes the use of a quadratic map, called central map, belonging to particular classes of maps that are easy to invert. This map is then suitably masked in order to obtain another quadratic map that appears to be random, and for which there are no more efficient approaches than those generally applicable to the MQ problem.
More precisely, in these constructions, the private key consists of the central map \mathcal{F}\colon \mathbb{F}_q^n \to \mathbb{F}_q^m and two affine maps T\colon\mathbb{F}_q^n \to \mathbb{F}_q^n and S \colon\mathbb{F}_q^m \to \mathbb{F}_q^m . The public key is instead the quadratic map \mathcal{P} = S \circ \mathcal{F} \circ T obtained by combining the secret maps.
Sometimes the second affine map S is not necessary and the public key is simply obtained as \mathcal{F} \circ T .
Bipolar constructions are potentially adaptable in the context of both public key encryption and digital signatures.
In the case of encryption, given a message \bm{v} \in \mathbb{F}_q^n , the sender will compute the ciphertext \bm{t} = \mathcal{P}(\bm{v}) . To decrypt, the receiver will use the public key decomposition via the private one to find a pre-image, calculating \bm{v} = T^{-1}(\mathcal{F}^{-1}(S^{-1}(\bm{t}))) .
In this case, by choosing m \ge n (where m is the dimension of the codomain and n that of the domain), it is guaranteed that the value \bm{v} calculated by the receiver is unique and coincides with the original message. The computation of the inverse via T and S is straightforward since it uses linear maps, while the inversione of the quadratic map \mathcal{F} is due to the particular structure of the central map.
In the case of digital signatures, the process is similar: to sign a message m , the signer uses a hash function \mathsf{H}\colon \{0,1\}^* \to \mathbb{F}_q^m to compute the value \bm{t} = \mathsf{H}(m) and the signature \bm{\sigma} = T^{-1}(\mathcal{F}^{-1}(S^{-1}(\bm{t}))) . To verify the signature, it suffices to check that \mathcal{P}(\bm{\sigma}) = \mathsf{H}(m) . In this case, it is appropriate to choose n \ge m , so that a preimage for \bm{t} exists with high probability.
Bipolar constructions are intuitive and effective, and are preponderant in the design of multivariate schemes. On the other hand, such constructions are based on the use of trapdoor functions and their security is therefore not entirely attributable to the MQ problem. An attacker could in fact either attempt to invert the public map \mathcal{P} , treating it as a random map, or exploit the particular shape of the central map, seeking a solution to the extended IP problem.
The absence of definitive results on the complexity of the IP problem obstructs the security of these schemes from being completely reduced to hard problems. Historically, there are in fact numerous examples in which the decomposition of the public map \mathcal{P} was particularly easy for certain classes of central maps.
The first major attack concerned the scheme proposed by Matsumoto and Imai, which was compromised in the late 1990s by Jacques Patarin. The latter subsequently proposed several schemes and improvements of the MI construction, in particular the Hidden Field Equation (HFE) and Oil and Vinegar (OV) schemes. Shortly after the presentation of OV, Kipnis and Shamir identified a vulnerability in the scheme and proposed, together with Patarin, a revised version known as Unbalanced Oil and Vinegar (UOV). UOV is also the basis of the Rainbow signature scheme, proposed in 2005 and known for reaching the third round of the NIST competition only to be completely broken in 2022 by Ward Beullens.
In the context of digital signatures, it is possible to adopt an alternative construction that allows the security of the scheme to be reduced entirely to the MQ problem. This approach requires the definition of an identification protocol, from which a signature scheme can be derived by applying the Fiat-Shamir transform. This technique is not exclusive to the multivariate domain and is widely used in the construction of Post-Quantum signatures (see for example the Dilithium scheme).
The first identification protocol based on the MQ problem was proposed in a 2011 paper by Sakumoto et al. Subsequently, all multivariate Fiat-Shamir signature schemes are based on this construction.
Multivariate systems in PQC
Although multivariate systems lend themselves to the construction of both digital signatures and encryption schemes, interest in the latter has diminished over time. The difficulty in constructing multivariate encryption schemes and their lack of competitiveness within this class of schemes have led to their gradual abandonment by the scientific community.
On the other hand, several multivariate digital signature schemes have been the subject of study and analysis during the standardization process led by NIST. In the third round of the competition, two multivariate schemes were included: Rainbow, a signature scheme based on UOV and a candidate for standardization (later discarded due to an attack), and GeMSS, an alternative candidate based on HFE (which was excluded in favor of the hash-based scheme SPHINCS+). Additionally, the MQDSS signature scheme, based on the identification protocol by Sakumoto et al., reached the second round of the competition, during which it suffered an attack that rendered it less competitive compared to other candidates.
Despite the limited success in the first NIST competition, recent years have seen significant efforts by the cryptographic community, leading to important developments in the design and security analysis of multivariate schemes. This type of scheme is the most represented in NIST’s recent additional call dedicated exclusively to digital signatures, with 11 submissions in the first round, 7 of which are based on UOV, the subject of the next article in this series.
In conclusion, multivariate signatures already appear to be a viable choice in numerous scenarios, with several promising proposals currently under evaluation for general use.
[1]CRYSTALS-Kyber, CRYSTALS-Dilithium e SPHINCS+.
[2]The NP class includes problems for which finding a solution is very difficult, but verifying a proposed solution is easy. NP-Complete problems are the most general and difficult among NP problems. Indeed, an algorithm that solves an NP-Complete problem could be adapted to solve any other NP problem.
This article belongs to a series of contributions, edited by the Telsy Cryptography Research Group, devoted to quantum computing and its implications on Cryptography. For reference to other articles, please refer to the index.
For other articles related to Quantum and Cryptography topics, please refer to the related categories in the blog.
The authors
Veronica Cristiano, a bachelor’s degree in Mathematics from the University of Pisa and a master’s degree in Mathematics with a specialization in Cryptography at the University of Trento, joined the Telsy Cryptography research group in mid-2021.
Federica Zanetti, a bachelor’s degree in Mathematics from the University of Trento, where she is currently pursuing a Master’s degree with a specialization in Cryptography. Since June 2024, she has been an intern at Telsy in the Cryptography Research Group.
Edoardo Signorini, a graduate in Mathematics with a curriculum in cryptography from the University of Trento, did his Master’s Thesis in Post-Quantum Cryptography (PQC) at Telsy in 2020. He is currently a cryptographer within the research group at Telsy and holds an Industrial PhD in Pure and Applied Mathematics at the Polytechnic University of Turin. His work focuses on research in PQC and the analysis and development of cryptographic protocols.