Jasper's blog: December 2016

Sunday, December 18, 2016

The one-time pad

A long, long time ago, on a blog right, right here, I briefly mentioned that I wanted to write a blog post about the one-time pad. Today is the day that I do so.

The one-time pad is an elementary part of cryptography and I feel that every computer scientist should know about it. Don't worry, though, I'll keep this understandable for just about everyone. In fact, the one-time pad is very simple and could probably be explained to a seven-year-old.

Let's say that you have written a love letter that you want to be readable only by the target of your affections. In order to hide its content you take piece of paper with completely random letters, which we will call the key.

Now you take the first letter of your love letter and remember its position in the alphabet. Then, you take the first letter of the key and also remember its number. You add the two, and if the total is higher than 26, subtract 26. You end up with a number that can once again be translated back to a letter. This will be the first letter of your encrypted message.

You do this for the second letter as well. And then for the third. In fact, you keep doing this until you have encrypted the entire love letter. You send this encrypted message to your beloved, who must also have a copy of the key; only people with the key can read your message.

Do note that you do not loop around on the key. This means that the key must be at least as long as the message itself. Moreover, you cannot use this key again (or at least the part of the key you used, if the key is longer than the message, you can use the remainder it next time). This is really an important part of the strength of the one-time pad.

In fact, the strength of the one-time pad is such that it is unbreakable. Actually, it has been mathematically proven that it cannot be broken without knowledge of the key. The proof goes something like this: there is any number of possible decrypted messages that the encrypted message could represent. Without any knowledge of the key, each of those is exactly equally likely to be the original message. This means that you can't discern between them in any way. Some might make sense while others may not, but there is absolutely no way to tell which of the decryptions that does make sense is the intended one.

It should be noted that in the way we did this above, we are "leaking" a lot of information because we are not encrypting spaces and punctuation, which compromises the strength of the encryption. There are of course ways to encrypt these as well. In fact, there is no reason we should stick to the traditional alphabetical encryption. Instead of using letters as a basis for the encryption, you could use "bytes of data represented in binary" and instead of asking the key to the message you could use a "bitwise xor". This way you would end up with system that has the same mathematical properties, makes more sense for computers and can represent any type of data.

Though the one-time pad is very strong, it is not used all that much. This is because it isn't very practical. You need to share a key with the recipient, which needs to be as long as your message. If you can do this securely, why wouldn't you just send the message instead? With that in mind, the only practical use of the one-time pad is that you can share the key at a moment when secure communications are possible and send the message later, when they aren't.

While the one-time pad may be the ultimate in symmetric encryption, most of the encryption we do today is actually asymmetric. This is when there are different keys for encryption and decryption. One of the two can be public knowledge while the other is kept secret. In order to sign something and prove that you have written it (and nobody has changed anything about it) you use a secret key to encrypt and a public key to decrypt your "signature". In order to make sure there are no eavesdroppers, you use a encryption key that is freely available that has a corresponding decryption that only the intended recipient has.

Using one-time pads isn't entirely trivial. First of all, your key needs to be truly random. Any kind of pattern in it reduces the strength of the encryption. True randomness actually isn't easy, especially not in the large amounts needed for a one-time pad. Computers actually often simulate randomness, which really doesn't cut it for this purpose.

The second problem is sharing the key. We already mentioned this, but the problem here becomes that you may not know how much data you will want to send later on. This means that you might need to share a very, very long key just to be sure you can send enough long messages later on. And having a long key lying around is a problem in itself, as a stolen key loses all its value...

Finally, you really really really can't reuse (parts of) a key. This really can't be stressed enough. During the Second World War, the Germans actually used one-time pads. Except... they reused keys and thus the encryption wasn't unbreakable and was in fact broken. Because of this, the Allies were able to listen in on their communications...

Allright, that is about all that I feel everyone ought to know about the one-time pad. I do intend to write about the ways I would use pads in different situations some other time, but I don't yet know when that will be. The situations I am thinking of would be modern day espionage and and an interstellar empire, but nothing is set in stone yet.

Tuesday, December 13, 2016

Taiga on my Orange Pi PC

Recently, one of my hobby projects made it to a very nice first prototype state. It doesn't do much yet, but it is in a state where I can show it to people and personally I think it shows a lot of potential that is in the ideas I had. However, working on the prototype was kept simple because it was so basic; there were only a couple of tasks to work on. In order to keep the further progress of the project structured, I decided I need some form of an issue tracker. I have used Taiga.io for such a project in the past and it worked well. Unfortunately, though, their hosted version only allows a single private project and I already had one.

Remember all the Single-Board Computers I wrote about a while back? The Pi 2 is still actively running Kodi, the newest Pi 3 is driving an official Pi touch screen and some of the CHIPs are part of the project I mentioned while their versatility means I have ideas for quite a few more of them. However, the Pine and the Orange Pi have been idle since my last post about them. However, for the Orange Pi, this has recently changed.

I took the Orange Pi and installed Taiga on it. It is quite nice to have my own "dev server" and I'm quite happy about the current situation. I won't describe the exact procedure I followed to get where I am now, as it was basically following their guide for installing it. However, I will post a list of notes. Some of them will relate to installing it on the Orange Pi (or on an SBC or on an ARM based device) whereas others will just be my personal gripes with the installation or the guide.

The guide uses the user "taiga" instead of telling you where to use your own username. This makes it easy to miss a spot if you are using a different user.
To make matters worse, they chose the name of their software, so not every instance of the word should in fact be replaced by your user.
I would probably have been better off creating a user and naming it taiga, though. The procedure installs quite a lot in the home directory and now that's under some other user.
The procedure calls for installing postgresql 9.5, but Debian only has 9.4 on stable. By simply replacing the 5 by a 4 it works, as the minimum requirement is version 9.4. (Alternatively, I could have used 9.5 from unstable.)
The guide installs python3 and then just uses Python 3.5. On Debian, you will have installed Python 3.4. This earlier version does simply seem to work for me.
Installation of lxml is known to be problematic on the Raspberry Pi, but can be achieved by increasing the amount of virtual memory. For me, it worked without such hacks, though I do not know if this is due to running Armbian, running a headless image or using an Orange instead of a Raspberry.
Circus isn't available in Debian stable, so I had to add the testing repositories and get it from there.
Pay attention to the output of "service circus status". This is the only way to tell if the backend is actually running.
I skipped everything that was even remotely optional, so I might revisit some of that in the future. The optional parts are sometimes somewhat out of order, which makes it seem you have to pick them up again before the end, but this isn't really the case.
I currently use a .local domain. I think that means I use mdns, but it works on Windows (which mdns shouldn't) so I'm not 100% sure how this is working. It definitely isn't working from android, though, which is a shame.

It's a real pity there is no "normal" installation procedure like running a script or installing a package. Nevertheless, I do now have the whole things running and I am quite happy with having my own development server.