let's explore the basics of cryptography with RSA
A basic implementation of the most widespread encryption algorithm on the internet.
Hello, friends!
Welcome to ProgrammingZero (to HERO), where I simplify complex computer science concepts into easy-to-understand pieces.
In this blogpost I will outline the super basic RSA encryption scheme I constructed with Python. You can find the full project on my github if you're interested in seeing (and running) the whole prototype!
Here, we'll be diving into some of the more exciting parts of the code. Let's begin!
Why do we care about encryption in the first place?
In short, life would be really difficult if all our conversations were massively public (like on the scale of the internet public) all the time. We can illustrate this more clearly with the Two Generals Problem.
The Scenario
Two generals are stationed on opposite hills and need to coordinate an attack on a city in the valley below. They can only communicate by sending messengers through enemy territory, but there's a risk that any messenger could be intercepted by the enemy (and the message exposed).
The problem is: How can the generals guarantee that both sides attack at the same time?
Okay, so what's the problem?
Suppose General A sends a message: "Attack at dawn."
General B receives the message but needs to confirm receipt, so they send back "Message received. Attack confirmed."
General A now worries: What if the confirmation was intercepted? He won’t know for sure that General B will attack.
To solve this, General A could send another confirmation: "I got your confirmation. Proceed as planned."
But now General B has the same worry—what if this new message gets lost?
No matter how many confirmations are sent, there is always the risk that a given message has been lost. This makes it impossible to guarantee coordination without having a secure channel. You can imagine that doing things like logging in to your email inbox, doing online banking, and messaging your friends would be really challenging without coordinating with other parties on the internet!
Rivest-Shamir-Adleman (RSA)
RSA encryption, named after the three researchers who independently discovered the encryption scheme in 1977 while at MIT (though a previous implementation was discovered and promptly classified by the British government four years prior), is one of the most widely used cryptographic algorithms, securing everything from online banking to private messaging.
At its core, RSA relies on the mathematical principles of modular arithmetic, prime factorization, and number theory. And we'll break down each of these concepts while we examine my RSA implementation.
RSA: A high level summary
RSA relies on an encryption architecture called Public/Private Key Cryptography.
At its core, public-key cryptography uses two keys:
Public Key: Anyone can have this. It’s like an open mailbox where people can drop messages for you.
Private Key: Only you have this. It’s the only key that can open the mailbox and read the messages.
How It Works in Action
1. Encryption (Locking the Message)
Let’s say Alice wants to send Bob a secret message. Bob shares his public key with Alice. She uses this key to encrypt the message, locking it up in a way that only Bob’s private key can unlock. Even if someone intercepts the message, they can’t read it because they don’t have Bob’s private key.
2. Decryption (Unlocking the Message)
Bob receives the encrypted message and uses his private key to decrypt it. Now he can read what Alice sent, and no one else can.
3. Digital Signatures (Proving Who Sent a Message)
Encryption is one thing, but how do you know a message is actually from the person who claims to have sent it? That’s where digital signatures come in (though these aren't included in my prototype.)
Bob can use his private key to sign a message.
Anyone can verify it using Bob’s public key, proving the message really came from him.
How do we achieve this?
With a hell of a lot of math...that's how!
Here are some of the highlights:
Prime Factorization: The security of RSA is based on the difficulty of factoring large numbers into their prime components. This is what actually secures our messages. Try it on your own, I dare you!
Modular Arithmetic: RSA heavily relies on modular exponentiation, ensuring encryption and decryption work efficiently with large numbers. I'm talking primes with 200 digits large.
Greatest Common Divisor (GCD): The Euclidean algorithm ensures certain values are relatively prime, which is crucial for choosing your public/private key pair.
Modular Inverse: The Extended Euclidean Algorithm also helps find the modular inverse, which is needed to compute the private key so you can decrypt messages sent to you.
Phew! That was a lot of background - but I think we're finally ready to dive into some Python 🐍
Quick note: before you go about using this implementation of RSA to secure your own messages I want to point out there are several vulnerabilities with this particular prototype. Think of this project as more of a 'proof of concept' before you go out and build a full-scale version.
Breaking Down the Code
In this section I want to call out some of the specific functions that are crucial to my implementation. Again, if you want to see the full prototype (and run it with my super simple text-based UI) you can head over to the repository on github.
Edit: The code’s formatting is only retained on the desktop site! My apologies to the mobile users out there :)
Helper Functions
Let's start with the functions that are directly implementing the math concepts we talked about above. I've included extensive commenting to help you parse what each line is doing.
1. Fast Modular Exponentiation (FME)
The FME
function computes large exponentiation operations under a modulus (there are built in ways to do this in Python but I wanted the extra challenge of building it out myself). This will be useful when using our set of keys to encode and decode messages.
def FME(b, n, m):
'''Calculates the mod of numbers with large exponents. Accepts integers in format b^n mod m, returns integer.'''
result = 1
square = b % m # Initially, square is b mod m
while n > 0:
if n % 2 == 1: # If the least significant bit is 1
result = (result * square) % m # Update result with current square
square = (square ** 2) % m # Square the base
n //= 2 # Divide n by 2 (shifting right in binary)
return result
2. Finding the Greatest Common Divisor (GCD)
The GCD
function determines the greatest common divisor of two numbers using the Euclidean algorithm. This is crucial in RSA encryption to ensure that the public exponent e
is relatively prime to (p-1)(q-1)
where p
and q
are our original primes, making modular inversion possible. By iteratively computing remainders, the function finds the largest number that divides both inputs without leaving a remainder.
def Euclidean_Alg(a, b):
'''Computes the gcd of two numbers (Sriram's Algo) and returns gcd as an integer.'''
while (b > 0):
# Run until a % b = 0
k = a % b
# Set a, b respectively for next iteration
a = b
b = k
return a # The final value of a is the gcd
3. Extended Euclidean Algorithm (EEA)
The EEA
function (Extended Euclidean Algorithm) finds the greatest common divisor of two numbers while also computing the Bézout coefficients. These coefficients are essential for finding the modular inverse, which is required to determine the private key d
in RSA encryption. By iteratively updating coefficients, the function solves for values that satisfy the equation ax + by = GCD(a, b)
.
def EEA(a, b):
'''Computes the gcd of two numbers (Sriram's Algo) and returns gcd as an integer, along with Bézout coefficients as a tuple.'''
(s1, t1) = (1, 0) # Initial values for the Bézout coefficients of a
(s2, t2) = (0, 1) # Initial values for the Bézout coefficients of b
while (b > 0):
# Run until a % b = 0
k = a % b
q = a // b
# Set a, b respectively for next iteration
a = b
b = k
# Update s1, s2, t1, t2 using the recurrence relation
s1, s2 = s2, s1 - q * s2
t1, t2 = t2, t1 - q * t2
return a, s1, t1 # The final value of a is the gcd, and s1, t1 are the Bézout coefficients
Caller Functions
1. Generate Public Key
The Find_public_key_e
function selects a valid public exponent e by ensuring it is relatively prime to k = (p-1)(q-1)
. It starts with 13 (a common choice for RSA prototypes) and iterates through odd numbers until it finds an e where GCD(e, k) = 1
. This guarantees that e has a modular inverse, allowing the private key d
to be computed.
def Find_Public_Key_e(p, q):
'''Generates a public key based on two large primes'''
# Compute values n, k and set an initial e.
n = p * q
k = (p - 1) * (q - 1)
e = 13 # Set an initial prime for e (will most likely change)
# Find e such that 1 < e < 90000 and GCD(e, k) == 1 (relatively prime)
while Euclidean_Alg(k, e) != 1:
e += 2 # Start from 2 to avoid e == 1
return n, e
2. Generate Private Key
The Find_private_key_d
function calculates the private key exponent d
by finding the modular inverse of e modulo k = (p-1)(q-1)
. It uses the Extended Euclidean Algorithm (EEA) to solve for d
. This also ensures that d can correctly decrypt messages encrypted with the public key exponent e
.
def Find_Private_Key_d(e, p, q):
'''Finds a private key based on two large primes and e.'''
# Calculate k
k = (p - 1) * (q - 1)
# Find the Bézout Coefficients, s1 is our inverse of e mod k
a, s1, s2 = EEA(e, k)
# We know that d = s1
# The modular inverse of e mod k is s1 (from Bézout coefficients)
d = s1 % k # Ensure d is positive by taking it mod k
return d
3. Encode
The Encode
function prepares a message for transmission by using the public key to transform each character into an encrypted numerical form. The core of this process relies on the FME
function, which handles the actual encryption. This ensures that each encoded number is mathematically tied to the public key, allowing for proper decryption with the private key later.
One significant drawback of my implementation is that each character is encoded separately, making it vulnerable to frequency analysis.
def Encode(n, e, message):
'''Encodes a message using RSA. Accepts n, e, message as a string and returns list of integers.'''
cipher_text = []
integer_list = Convert_Text(message)
for number in integer_list:
# Append each encoding calculation to cipher
cipher_text.append(FME(number, e, n))
return cipher_text
4. Decode
The Decode
function processes a list of integers and decrypts them using the FME
function, this time applying the private key. Notably, only the private key can successfully decrypt messages, ensuring that the public key can be freely shared for encryption. Pretty slick.
def Decode(n, d, cipher_text):
'''Decodes a message using RSA. Accepts n, d, list of integers and returns a string.'''
integer_list = []
for number in cipher_text:
# Append each decoding calculation to integer list
integer_list.append(FME(number, d, n))
message = Convert_Num(integer_list)
return message
Putting It All Together
To wrap up here is a step-by-step breakdown of how RSA works in this implementation.
1. Generate Keys
Choose two large prime numbers,
p
andq
. Their size directly impacts the security of the encryption.Compute
n = p * q
. This value forms part of both the public and private keys.Calculate
k = (p-1)(q-1)
, which represents Euler’s totient function and is used to determine valid key pairs.Choose a public exponent
e
such thatGCD(e, k) = 1
. This ensurese
has an inverse, allowing proper decryption.Compute
d
, the modular inverse ofe
modk
, using the Extended Euclidean Algorithm. Thisd
serves as the private key.
2. Encryption
Convert the message into a numerical format. (In this implementation, an ASCII table is used to automatically map characters to numbers.)
Use the public key
(e, n)
to computecipher = message^e mod n
, securing the data.The result is a sequence of encrypted numbers that can be safely transmitted.
3. Decryption
Use the private key
(d, n)
to computemessage = cipher^d mod n
, reversing the encryption.Convert the resulting numbers back into ASCII characters to reconstruct the original message.
Since only the private key holder can decrypt the message, RSA ensures secure communication.
Thanks for taking the time to check out my implementation of RSA. I hope I was able to help demystify this extremely widespread algorithm!
Cheers,
Morgan