Wednesday, March 03, 2010

Can The Hill Cipher ever be used?

Alice and Bob want to sent a message so that even if Eve intercepts it, she cannot tell what it is. We will allow Alice and Bob a short private meeting to exchange information (or perhaps they will use RSA or Diffie-Helman for that). But the key can't be that long and has to be reusable (so one-time pad does not qualify). I describe below a well known cipher called the Hill Cipher. I think that there are circumstances where it could do well; however, I am curious what you think.
Let n be a parameter we pick later. Alice generates a random n x n matrix of elements from {0,...,25} and checks that the Det mod 26 is nonzero. (CORRECTION ADDED LATER: the Det has to have an inverse mod 26, so has to be rel prime to 26.) Alice gives this to Bob. Alice and Bob both compute its inverse. Alice and Bob can exchange messages by encoding every block of n by this matrix. So the first n letters of the text get multiplied by the matrix to get a diff n letters. Then the next n letters after that, etc.
  1. n has to be small enough so that Alice and Bob don't mind exchanging n2 elements of {0,...,25}
  2. n has to be large enough so that going through all possible n x n matrices is not practical for Eve.
  3. n has to be large enough so that tables of how often particular n-sized blocks occur are useless.
  4. Can combine with other techniques. Perhaps Alice and Bob first encode using the Vigenere Cipher and then apply the Matrix. They would then have to also share the Key for the Vigenere Cipher.
  5. QUESTION: If ALL Eve gets is the text then is this a good cipher? Clearly if Eve also somehow gets her hands on a message and what it was coded to she will easily crack the code. But if not then does this work well? This code is not used because Eve might get her hands on such, but I wonder if these are circumstances where it would be reasonable.
  6. Is there a value of n that is both big enough and small enough (a Goldilocks n).
  7. What else is known about this?


  1. “Is there a value of n that is both big enough and small enough (a Goldilocks n)”: this is one of the best maths jokes ever.

  2. Known plaintext attacks can be quite feasible in practice, so I wouldn't want to use any cipher vulnerable to them. For example guessing pieces of the plaintext allowed Britain and its allies to crack the German Enigma during World War II. See for that interesting story.

  3. the determinant being non-zero mod 26 does not imply that the matrix has an inverse. Consider the 1 times 1 matrix "2"

  4. Luca's right. According to Wikipedia a matrix is invertible iff its determinant is invertible, which in this case occurs if the determinant and 26 are relatively prime.

    Or just do mod 29 instead.

  5. Luca and Warren- thank you, I have made the correction in the origninal post.

  6. What is the motivation for using the Hill cipher in the first place? The efficiency advantage as compared to using AES (with a good mode of encryption) seems negligible. Or you could switch to a faster (but possibly less secure) stream cipher which would still be better than using the Hill cipher.

  7. @Anon: I imagine part of the motivation of the Hill cipher is its simplicity.

    @Gasarch: Actually, if Eve gets only a single plaintext-ciphertext pair, that only takes care of a rank 1 part of the matrix. She has to get her hands on n linearly independent plaintext-ciphertext pairs in order to completely crack the code. (Obviously the more such lin. indep. pairs she gets ahold of, the easier it is to crack.)

    c^{n^2} grows rather rapidly, so I assume that there is in fact a Goldilocks n. The number of atoms in the universe is ~10^{80}~ 26^{57} and the square root of 57 is a little under 8. So it seems like even n as small as n=8 might do okay (assuming Eve doesn't get her hands on any plaintext-ciphertext pairs). Even n=16 won't break the bank for Alice & Bob though.

  8. If you do a google search you can find some ciphertext-only attacks on the Hill cipher, better than brute force, relying on plaintext letter frequencies. Not sure what is the best known attack though.

  9. Can you use a Sudoku matrix? A Sudoku matrix is (a special case of) a Latin square. but I'm not quite convinced that any Latin square is invertible (especially mod n), although I think that this maybe says that they are.

    At any rate, if Sudoku matrices are invertible, then you could just send the initially-filled-in digits, which would take less space. Deciphering then requires solving a Sudoku, which is sort of the reverse of the computational postmarks used by, e.g., MS Exchange ("pay-to-read" instead of, or maybe as well as, "pay-to-write".)

    For all I know, using Latin squares make the Hill Cipher even weaker, so this isn't a serious suggestion. (Although Google searching found some crypto uses of Sudoku.)

  10. doing a Google search on
    crypto uses of Sudoku
    brings up this post as #1 now

  11. Very modest values of n lead to extremely large keyspace. A lower bound can be found as follows: the upper triangular matrix with 1s along the diagonal is guaranteed invertible, regardless of the remaining entries. If the matrix is n by n, then there are (n^2-n)/2 entries, so there are at least 26^((n^2-n)/2) matrices you can use.

    If n = 4, this gives you at least 26^6 = 308915776 different keys.

    For n = 8, the number of keys grows to 26^28, or about 4 x 10^39.

  12. I suggest to choose a field as alphabet, so you'll have valid keys for every non singular matrix. Z_29 is good for encoding plain english.
    GF(256) would be good for a more general encryptyon. Furthermore for n=8 the keyspace would have a lower bound of 256^28=2^224.

    What I was wondering about is: are there any other attacks apart from chosen plaintext?

    And,also: how using CBC would increase security?

  13. Hill cipher, being completely linear even in CBC mode, is weak to differential cryptanalisys and "meet-in-the-middle"s.
    There are few known results on the effects of adding some non-linear transformations, but then anyway it wouldn't be Hill cipher anymore...

  14. Please any one can provide python code for this procedure.
    I try to code it but its time complexity is not good. help if any one know more about this if possible.
    Thanks in advance!