% pour n'avoir que les transparents en landscape
\documentclass[12pt,a4,slidesonly]{seminar}

% pour avoir les transparents et les commentaires
%\documentclass[12pt,a4,article]{seminar}

% pour pouvoir choisir la forme du cadre
\usepackage{fancybox,amsmath,amssymb,amsfonts}


\newtheorem{definition}{Definition}
\newtheorem{proposition}[definition]{Proposition}
\newtheorem{corollary}[definition]{Corollary}
\newtheorem{theorem}[definition]{Theorem}
\newtheorem{lemma}[definition]{Lemma}
\newtheorem{example}{Example}
\newenvironment{proof}{\begin{trivlist}\item[]{\em Proof.\\ }}%
{\samepage \hfill$\diamond$\end{trivlist}}
\renewcommand{\a}{\alpha}
\newcommand{\m}{\mu}
\renewcommand{\l}{\lambda}
\newcommand{\pf}{{\bf Proof : }}

\newcommand{\comb}[2] {\mbox{$\left( { #1 \atop #2 } \right)$}}
\newcommand{\G}{\Gamma}
\newcommand{\om}{\omega}
\newcommand{\ot}{\otimes}
\newcommand{\h}{\eta}
\newcommand{\card}[1]{\mbox{$\mid #1 \mid$}}
\newcommand{\e}{\epsilon}
\newcommand{\expo}{\mbox{$m+2+\lfloor\frac{n-m-2}{d}\rfloor$}}
\renewcommand{\b}{\beta}
\newcommand{\un}{\underline}

\slideframe{Oval}


\title{Shorter keys for code based cryptosystems} 
\author{Philippe Gaborit (University of Limoges, France)}

\date{}

\begin{document}

\begin{slide}
\maketitle


\end{slide}
\begin{slide}
{\bf Summary}

$\bullet$ Code based cryptosystems

$\bullet$ Usual attacks

$\bullet$ Quasi-Cyclic McEliece scheme

$\bullet$ A family of decodable quasi-cyclic codes 

$\bullet$ Security of the new scheme 

$\bullet$ Conclusion
\end{slide}
\begin{slide}
{\bf Notation}

$\bullet$ $F_q$ a field with $q$ \'el\'ements, $C$ a linear $[n,k,d]$ code
$w_H(x)$: Hamming weight of a word $x$

$\bullet$ usual scalar product 
$x.y=\sum_1^n x_iy_i$, 
dual of a code $C$: $C^{\perp}=\{y \in F_q^n | x.y=, \forall y \in C\}$.

$C$ a $[n,k,d]$ code  $\rightarrow$ $k\times n$ generator matrix

$C^{\perp}$ $\rightarrow$ $(n-k) \times n$, parity check matrix.

\end{slide}
\begin{slide}
{\bf Code based cryptosystems}

Interest: not based on number theory, does not need cryptoprocessors
 
Hard problem: {\bf Syndrom decoding problem}

Let $H$ be a binary $k\times n$ random matrix, let $s \in F_2^k$ and let
 $w$ be a positive integer. Does it exist a word $x$ in $F_2^n$ with weight at most $w$
such that $Hx^t=s$ ?

Proven NP-complete '78 BMT

Related to the decoding problem: if one knows $y=x+e$
how recover $x$ ?

\end{slide}
\begin{slide}

{\bf McEliece cryptosystem '78}

$\bullet$ Private key: $C$ a $[n,k,d]$ linear code which corrects $t$ errors,
a  $k \times n$ matrix $G$, a $k \times k$ invertible matrix $S$, and a 
$n \times n$ permutation matrix $P$.

$\bullet$ Public key: $G'=SGP$

$\bullet$ Encryption: $x \rightarrow xG'+e$ with $w_H(e) \le t$.

$\bullet$ Decryption:  $y \rightarrow \Phi_C(yP^{-1})S^{-1}$, 
where $\Phi_C(z)$ is a decoding algorithm up to the distance $t$.\\


Codes proposed : Goppa codes with parameters [1024,524,101], t=50.

\end{slide}
\begin{slide}

Complexity:

Encryption : $O(n^2)$

Decryption : $O(n^2)$ 

Size of key : $kn$ 

Transmission rate: $k/n$

$\rightarrow$ very fast system but public key very big: about
 500000 bits for the original system!

\end{slide}
\begin{slide}

{\bf Niederreiter scheme} '86

\medskip
\noindent {\bf Private key:} 

$\bullet$ Private key: $C$ a $[n,k,d]$ code which corrects $t$ errors,
$H$ a parity check matrix of $C$, a  $k \times n$ matrix $G$, 
a $k \times k$ invertible matrix $S$, and a
$n \times n$ permutation matrix $P$.\\
{\bf Public key :} $H'=SHP$.\\
{\bf Encryption:} $x \rightarrow y=H'x^T$, with $x$ of weight $t$.\\
{\bf Decryption:} Decode $y$ in $z$,then  $zP^{-1}$ gives $x$.\\

Variation on the McEliece scheme, permits to win 
on certain points (size of certain parameters). Security equivalent to McEliece scheme.

\end{slide}
\begin{slide}

{\bf Signature scheme, Courtois,Finiasz,Sendrier '01}

Encryption schemes are not all invertible like 
RSA 1-1. If one considers a random codeword of $F_2^n$ it
is not necessary decodable in $C$.

{\bf Idea:} consider a code with small covering radius (t=7-9),
then the density of decodable codewords is roughly $\frac{1}{t!}$

For $h(m)$ a hashing of $m$, iterate the computation $h(m,i_0)$
until a decodable word is found, for $i_0=1,2,\cdots$.

{\bf Weakness:} needs to decode $t!$ random codewords on the average
before finding a suitable codeword. Moreover for security
$t \approx 7-9$, implies a key of size 1 Moctet. But the 
signature can be very small (  $<100$ bits).

\end{slide}
\begin{slide}

{\bf Stern authentication scheme '93}

Based on zero-knowledge, probability to cheat 2/3.
Big size of key, also works for signature.

{\bf Sidelnikov scheme '94}

Variation on McEliece by considering 
the Reed-Muller codes rather than Goppa codes

{\bf Rank codes, Gabidulin '91}

Variation on McEliece by considering another
kind of distance: the rank distance.

\end{slide}
\begin{slide}

{\bf Usual attacks}

$\bullet$ {\bf Message attacks}

One tries to decode directly.

Information set decoding

Lee and Brickell '87, Stern '89, Leon '89, Canteaut-Chabaud '98


Let $C$ $[n,k,2t+1]$ code, one chooses a
random set of $k$ columns,
an error woth weight $t$ is said to be {\it decodable} 
when its support does not meet the $k$ random columns.

\end{slide}
\begin{slide} 

The probability for an error to be decodable is then 
$P_{dec}=\frac{\binom{n-k}{t}}{\binom{n}{t}}$,
let $P_{dec} = O(1).2^{-nH_2(t/n)-(1-k)H_2(t/(n-k))},$
for $H_2(x)=-xlog_2(x)-(1-x)log_2(1-x)$.\\ 

 One deduce then the work for finding ONE word of weight $t$:
$$ WF=\frac{P(k)}{P_{dec}},$$
where $P(k)$ is the cost of a Gaussian elimination.
$P(k)$ is roughly $O(k^3)$, the different methods
improve this facteur.

$\rightarrow$ exponential complexity for the best known algorithm.


\end{slide}
\begin{slide}

Original McEliece cryptosystem
broken in '98 by Canteaut-Sendrier.

Previous attack is optimized for 
$k \approx 2/3 n$.
 
To have a security of $2^{80}$: one needs
$n \ge 1400$, 

key $\approx$ $800$kbits.

The security of the system is not as such equivalent to a NP-complete
problem.
\end{slide}
\begin{slide}
 
{\bf Structural attacks}

One tries to recover the secret permutation matrix $P$ directly.

Support splitting algorithm  Sendrier '97

Algorithm which permits to decide when two codes are equivalent 
or not. Variation on an algorithm by Leon.

Complexity: $O(2^{dim(C \cap C^{\perp})})$.

Random codes have a small hull,
Reed-Muller have a big hull and are not vulnerable
to this attack.

Goppa codes : many of them. 


\end{slide}
\begin{slide}

{\bf Quasi-Cyclic McEliece scheme}

{\bf idea :} use quasi-cyclic codes to shorten the size of the public key

\begin{definition} A code of length $n$ is said to be {\it quasi-cyclic}
 of order $s$, for $s$ a divisor of $n$, if the shift of any codeword
by $s$ positions is still a codeword.
\end{definition}

A cyclic code is quasi-cyclic with order $s=1$.

\begin{lemma}
If $C$ is a cyclic code of length $n$ then $C$
is also quasi-cyclic of order $s$, for any $s$ which divides $n$.
\end{lemma}

\begin{lemma}
If $C$ is quasi-cyclic of order $s$, then 
$C^{\perp}$ is also quasi-cyclic of order $s$.
\end{lemma}


\end{slide}
\begin{slide}

{\bf Compact representation of quasi-cyclic codes}

Let $G$ be a generator matrix of a quasi-cyclic code.
\begin{proposition}\label{quasi}
Let $C$ be a $[n,k]$ quasi-cyclic code of order $s$ with $n=rs$
then there exists a $k' \times n $ matrix ${\cal G}$ (with $k' \ge k$),
wihich generates $C$, of the form: 
\[ 
{\cal G}= \left( \begin{array}{ccccc}
A_1  &  A_2 & A_3 & \cdots & A_r     \\
A_r  &  A_1 & A_2 & \cdots & A_{r-1}     \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
A_2  &  A_3 & A_4 & \cdots & A_1     \\
\end{array} \right),
\]
where $A_i$ are $\frac{k'}{r} \times s$ matrices.
\end{proposition}


\end{slide}
\begin{slide}

{\bf A new way to describe the key}

The $A_i$ permit to recover the generator matrix by a simple shifts.

{\bf Idea:} mask the structure of the matrix with a permutation $\Pi$ 
which preserves the quasi-cyclicity of the original matrix:

\[
\Pi= \left( \begin{array}{cccc}
\pi  &  0 & \cdots & 0  \\
0  &  \pi & \cdots & 0  \\
\vdots & \vdots & \vdots & \vdots \\
0  &  0 & \cdots & \pi     \\
\end{array} \right),
\]


\end{slide}
\begin{slide}
 
one constructs the masked matrix:

\[
{\cal G}'= \left( \begin{array}{ccccc}
A_1\pi  &  A_2\pi & A_3\pi & \cdots & A_r\pi     \\
A_r\pi  &  A_1\pi & A_2\pi & \cdots & A_{r-1}\pi     \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
A_2\pi  &  A_3\pi & A_4\pi & \cdots & A_1\pi     \\
\end{array} \right),
\]


\end{slide}
\begin{slide}


\end{slide}
\begin{slide}

{\bf An extraction procedure}

In the scheme the same matrix has to be used
to encrypt and to decrypt. The quasi-cyclic 
matrix may have more rows than $k$.
One has then to extract a matrix in unique way which can be used for decryption
and encryption. This procedure needs only to be done once.

This method can also be applied to the dual code
to have a smaller key.

\end{slide}
\begin{slide}

{\bf Quasi-cyclic McEliece scheme}


$\bullet$ {\bf Key creation:}

Let $C$ be a $[n,k,d]$ $s$-quasi-cyclic code. 
One chooses a random permutation $\pi$ on $s$ coordinates.

One constructs the $(n,k')$ matrix ${\cal G}$ associated to $C$,
one applies the permutation $\Pi$

to {\cal G} in order to construct  ${\cal G}'={\cal G}\Pi$.
The public key $K_P$ is then the  $\frac{k'}{r}$
first rows of ${\cal G}'$, the number of errors
 $t$ and the order of quasi-cyclicity.
The private key is the permutation $\pi$.\\


\end{slide}
\begin{slide}

$\bullet$ {\bf Encryption:}\\
From the public key one rebuilds the matrix ${\cal G}'$ by quasi-cyclicity
and one extracts the encryption matrix $G'$.
Then one proceeds as in the usual scheme.

$\bullet$ {\bf Decryption:}\\
As in the usual scheme.


\end{slide}
\begin{slide}

{\bf A big family of quasi-cyclic codes}

One needs : quasi-cyclicity, a good decoding algorithm, 
non vulnerability to known attacks.

{\bf Problem: } find such a family of codes!

Only 'small' families of quasi-cylic Goppa codes are known.
 
An example of quasi-cyclic Goppa codes: the primitive BCH codes
$[2^m-1,k,2t+1]$, which are cyclic and therefore quasi-cyclic
for any divisor of $n$.

\end{slide}
\begin{slide}

{\bf Quasi-cyclic subcodes}

\begin{proposition}\label{sub}
Let $C$ be a $[n,k]$ $s$-quasi-cyclic code with $n=rs$
then there exist quasi-cyclic subcodes
strictly included in $C$ of order $s$
and dimension greater or equal to
$k-r=k-\frac{n}{s}$.
\end{proposition}

{\bf sketch of proof}: one adds the $r$ $s$-shifts of a random codeword 
to a parity check matrix of $C$.

\begin{proposition}\label{num}
If $C$ is a $s$-quasi-cyclique $[n,k,d]$ code, then 
at least $2^{k-r}$ distincts codes can be constructed
by the previous proposition.
\end{proposition}

\end{slide}
\begin{slide}

{\bf A big family of suitable codes}

Since a subcode of a code is still decodable,
the previous results permit to construct
a big family of quasi-cyclic codes from any
quasi-cyclic code.

Price to pay: the dimension decreases.

One can then use the family of primitive BCH codes
or any known quasi-cyclic Goppa code as starting code.

\end{slide}
\begin{slide}

{\bf Security of the new scheme}

{\bf Message attack }

Algorithme d\'ecodage par ensemble d'informations. L'erreur $e$
ajout\'ee ne tient pas compte de la quasi-cyclicit\'e,
et donc l'algo ne peut \^etre vraiment optimis\'e pour cette structure.

Il n'existe pas d'algorithme g\'en\'eral de d\'ecodage \`a
distance born\'ee $d=(n-1)/2$ pour les codes cycliques
(et a fortiori quasi-cyclique).

\end{slide}
\begin{slide}

{\bf Attaque par perforation du support} T. Berger. 

On peut remarquer que si on perfore le support du 
code quasi-cyclique sur des parties du support
adequate, cette op\'eration conserve l'action
de la permutation. Et on peut alors
utiliser l'algorithme de s\'eparation des supports
pour retrouver $\pi$.

$\rightarrow$ comme on part d'un grand nombre de codes
             on ne peut appliquer cette attaque


\end{slide}
\begin{slide}

{\bf Proposition de param\`etres}


$\bullet$ Sch\'ema de McEliece 

\noindent {\bf Param\`etres de s\'ecurit\'e  A : $ n=4095, t=26 , s = 45$.}

\noindent Param\`etres du code:$[4095,3692,53]$\\
Ordre de g\'en\'eration de la matrice: $5$\\
Cl\'e publique: $20475$ bits\\ 
Taux de transmission: 0.9\\  
S\'ecurit\'e du message : $2^{100}$\\ 
S\'ecurit\'e structurelle : $2^{91}$


\end{slide}
\begin{slide}


\noindent {\bf  Param\`etres de s\'ecurit\'e  B : $ n=2047, t=31 , s = 23$.}

\noindent Param\`etres du code:$[2047,1617,63]$\\
Ordre de g\'en\'eration de la matrice: $5$\\
Cl\'e publique: $12282$ bits\\          
Taux de transmission: 0.79\                 
S\'ecurit\'e du message : $2^{80}$\\                
S\'ecurit\'e structurelle : $2^{85}$


D'autres param\`etres sont possibles.

M\^eme chose pour le sch\'ema de Niederreiter.


\end{slide}
\begin{slide}

{\bf Conclusion}


Ne s'adapte au sch\'ema de signature car le fait
de prendre un sous-code casse le faible
rayon de recourement du code.

Une famille importante de codes quais-cycliques
permettrait d'adapter cette id\'ee \`a la signature,
pour diminuer la cl\'e jusqu'\`a 100kbits-200kbits

Id\'ee d'utiliser des codes quasi-cycliques est naturelle
et a permis dans d'autres applications d'obtenir
de bons r\'esultats.

Rend la taille de la cl\'e quasi-lin\'eaire dans
la longueur du code.


\end{slide}

\end{document}