12. Déchiffrement ECB octet par octet (simple)

Byte-at-a-time ECB decryption (Simple)

Copy your oracle function to a new function that encrypts buffers under ECB mode using a consistent but unknown key (for instance, assign a single random key, once, to a global variable).
Now take that same function and have it append to the plaintext, BEFORE ENCRYPTING, the following string:

Um9sbGluJyBpbiBteSA1LjAKV2l0aCBteSByYWctdG9wIGRvd24gc28gbXkg
aGFpciBjYW4gYmxvdwpUaGUgZ2lybGllcyBvbiBzdGFuZGJ5IHdhdmluZyBq
dXN0IHRvIHNheSBoaQpEaWQgeW91IHN0b3A/IE5vLCBJIGp1c3QgZHJvdmUg
YnkK

Spoiler alert.
Do not decode this string now. Don't do it.

Base64 decode the string before appending it. Do not base64 decode the string by hand; make your code do it. The point is that you don't know its contents.
What you have now is a function that produces:
AES-128-ECB(your-string || unknown-string, random-key)
It turns out: you can decrypt "unknown-string" with repeated calls to the oracle function!
Here's roughly how:

Feed identical bytes of your-string to the function 1 at a time --- start with 1 byte ("A"), then "AA", then "AAA" and so on. Discover the block size of the cipher. You know it, but do this step anyway.
Detect that the function is using ECB. You already know, but do this step anyways.
Knowing the block size, craft an input block that is exactly 1 byte short (for instance, if the block size is 8 bytes, make "AAAAAAA"). Think about what the oracle function is going to put in that last byte position.
Make a dictionary of every possible last byte by feeding different strings to the oracle; for instance, "AAAAAAAA", "AAAAAAAB", "AAAAAAAC", remembering the first block of each invocation.
Match the output of the one-byte-short input to one of the entries in your dictionary. You've now discovered the first byte of unknown-string.
Repeat for the next byte.

Congratulations.
This is the first challenge we've given you whose solution will break real crypto. Lots of people know that when you encrypt something in ECB mode, you can see penguins through it. Not so many of them can decrypt the contents of those ciphertexts, and now you can. If our experience is any guideline, this attack will get you code execution in security tests about once a year.

Le but de cet exercice est de montrer les faiblesses inhérentes au fait de pouvoir appeler librement une fonction de chiffrement.

Fonction de chiffrement

Comme demandé dans les consignes, voici une fonction de chiffrement sur laquelle on peut se baser pour l'exercice :

import os
from base64 import b64decode

secret_text = b64decode("Um9sbGluJyBpbiBteSA1LjAKV2l0aCBteSByYWctdG9wIGRvd24gc28gbXkgaGFpciBjYW4gYmxvdwpUaGUgZ2lybGllcyBvbiBzdGFuZGJ5IHdhdmluZyBqdXN0IHRvIHNheSBoaQpEaWQgeW91IHN0b3A/IE5vLCBJIGp1c3QgZHJvdmUgYnkK")
random_key = os.urandom(16)

store_method = ""
def append_ecb_encrypt(text):
    method = random.randint(0, 1)
    text = text+secret_text
    return ecb_encrypt(text, random_key)

On notera que tant que la fonction est appelé dans le même processus Python, la clef secrète sera toujours la même.

Détecter la taille de bloc

Comme demandé dans l'énoncé, pour détecter la taille d'un bloc de la fonction de chiffrement on peut utiliser ce code:

base_size = len(append_ecb_encrypt(b"\x00"))
i = 0
while True:
    if(base_size != len(append_ecb_encrypt(b"\x00"*(i)))):
        block_size = len(append_ecb_encrypt(b"\x00"*(i))) - base_size
        break
    i += 1
print("Block size:", block_size)

La logique est la suivante:

on soumet une première chaine d'un caractère et on mesure la longeur du résultat obtenu (on envoi un caractère car on imagine des scénarios où la fonction de chiffrement ne rajouterait pas de texte après le notre)
on chiffre notre texte auquel on ajoute des caractères un à un jusqu'à ce la longueur du résultat change
on mesure la différence entre cette longueur et la longueur d'origine

On obtient ainsi bien la réponse 16.

Détecter le type de chiffrement

Idem, comme demandé dans l'énoncé on peut détecter le type de chiffrement comme vu dans le problème précédent :

cipher_text = append_ecb_encrypt(b"\x00"*48)
if(cipher_text[16:32] == cipher_text[32:48]):
    print("AES in ECB mode is used...")

Déchiffrement octet à octet

Comme expliqué dans l'énoncé, la méthode de déchiffrement est de procéder comme suit:

sachant que ECB chiffre toujours un bloc de la même manière, si l'on soumet un texte de 15 caractères X, on aura: XXXXXXXXXXXXXXXY => AAAAAAAAAAAAAAAA où Y sera le premier caractère du texte secret ajouté par la fonction de chiffrement et AAAAAAAAAAAAAAAA le résultat chiffré.
de fait, si l'on chiffre toutes les combinaisons de 15 X suivi d'un caractère N, alors lorsqu'on obtient le même texte chiffré en résultat, on sait que le caractère N est le premier caractère du texte ajouté par la fonction de chiffrement. De manière simplifiée, on va d'abord chiffré XXXXXXXXXXXXXXXa, puis XXXXXXXXXXXXXXXb, puis XXXXXXXXXXXXXXXc etc, jusqu'à trouver le caractère qui permettra d'obtenir le même texte chiffré AAAAAAAAAAAAAAAA

On peut écrire le code suivant:

recovered_text = b""
base_cipher_text = append_ecb_encrypt(b"")

for block in range(0, len(base_cipher_text) // 16):
    for i in range(0,16):
        base_string = (b"\x00"*(16 - i - 1))
        cipher_text = append_ecb_encrypt(base_string)
        for c in range(0,256):
            cipher_char =  append_ecb_encrypt(base_string+recovered_text+bytes([c]))
            print(chr(c), cipher_text[block*16:(block*16)+16])
            if cipher_text[block*16:(block*16)+16] == cipher_char[block*16:(block*16)+16]:
                recovered_text += bytes([c])
                break
print(recovered_text.decode())

Qui nous donne le résultat:

Rollin' in my 5.0
With my rag-top down so my hair can blow
The girlies on standby waving just to say hi
Did you stop? No, I just drove by

Remarque: Au premier abord on pourrait considérer que le code ci-dessus a une complexité algorithmique de O(n^3), ce qui serait très élevé. En réalité la complexité est inférieure à O(n) car la deuxième et la troisième boucles for ont un nombre d'itération maximum fixes. Pour un texte de 16 caractères on aura au pire 1 x 16 x 256 = 4096 possibilités, pour un texte de 32 caractères on aura 2 x 16 x 256 = 8192 possibilités, etc