Software | kzeise.com

Encryption Backdoors

21. February 2016 · Categories: Politics, Software

What are the options with current technology to provide encryption backdoors? And which policy goals should we have?

As policy, we want backdoors that provide inherent checks on abuse. When police searches a house, it is obvious to the neighbors, it requires the expense of manpower to do the search, and the subject of the search knows about it. Similarly, we want any mechanism to be expensive enough to deter casual use, obvious enough that it does not happen behind our backs, and safe enough that criminals cannot exploit it.

On these counts, a master key is awful policy. It would have to be kept secure in at most a handful offline locations. There would be a huge temptation to make more copies to provide better access, and most governments will want to have their own master key for their subjects. There will be so many copies in the end that at least one will be stolen. The cost for each decryption would be so low, and the act so invisible, that there would be no deterrent from overusing it. Worse, if the master key was stolen, it would be very hard to detect, and probably impossible to prove. And as Jonathan Zdziarski observes, the legal system will basically force the tool into the open for validation, making such a threat very plausible.

A much better alternative would be to include a random key on each device¹ used to store enough bits of the password that the full one could be brute forced on a 10 million dollar computer in a week or so. We would also need to update this regularly to account for the increasing processor power that a cracking device can have over time. One would retrieve the key by destroying the processor and carefully checking the nonvolatile storage on it to determine the device key, not an easy task given the small size of such structures. As policy, it would be much better, since it is expensive to hack each device. There is no scaling that after you cracked one, further ones would become much cheaper. You need the device in your physical possession so it is both an additional protection, and you cannot work undetected. And the special equipment needed to analyze the chips would make it difficult for criminals to acquire such equipment unnoticed.

This would require discussion about how difficult we want to make breaking into phones. What should be the price such cracking should be costing? The real big problem is that anything that would be worth spending millions on prosecuting would find it worthwhile to actually use proper encryption software that is effectively unbreakable, while we cannot make backdoors so weak that people would be subject to persecution by repressive regimes. In the end, backdoors are only effective against criminals doing awful stuff while being stupid enough not to employ proper encryption, or for checking on information stored by crime victims. Such a backdoor could be provided such that people can consciously activate it so that in case of death there is still a way to access information. But in general, we know that nowadays we leave such a thick digital trace in all our interactions, plus clandestine surveillance is now so powerful, that law enforcement has more than enough other venues to fight crimes.

The classical way to store secrets on a processor have been eFuses, which are relatively large, in the μm range, and can be read quite effectively. More modern approaches use anti fuses, for example from Sidense and Kilopass. These are much more difficult to read, and I suspect are the technology Apple is currently using to store their per device keys. To read out the keys, companies like Chipworks do look at chips very carefully, and it is interesting to read about the technology used. ⏎

Dealing with Strong Encryption

21. February 2016 · Categories: Politics, Software

When we are talking about how we should balance privacy and surveillance in the age of encryption, we basically all want the same thing:

Good Guys should be safe from intrusions, from identity theft, from banking fraud, from espionage, from exposure of their private lives that could make them vulnerable to extortion.

Bad Guys should be monitored so that they cannot do bad things, their private lives exposed as needed so that we can put them behind bars preventing further trouble, their plans visible so that we can counter them.

The trouble is that encryption, the technology, itself cannot distinguish between the good and the bad. In fact, our collective understanding of what is good changes over time. So there is no hope of ever achieving such a goal. For Hitler, Stauffenberg¹ was terrorist, and when we define any policy of how we are dealing with encryption, we should be careful what the implications are for the Hitlers of our world.

Often people are saying that we should add a backdoor for law enforcement to gain access when needed. This relies on a few key assumptions to work out well:

the police is fair, and will not abuse this power. This requires strong checks and balances to prevent the few bad apples from abusing their position. On the other hand, there are counties counting on citations to balance their budget: how can we trust them not to peak into people’s private lives to find some fines?
the key for the backdoor is kept safe. Again very difficult to believe given the data breaches governments have. Since it would be a universal key that would expose a few hundred million people, the stakes are high. Will we be willing to guard them as well as we do for nuclear launch codes² now? Can we guarantee that the guardians will do their job when the rewards would justify 100 million dollar bribes?
we have seen governments taken over by bad actors. Is any policy we are formulating robust for such a case?

The impact on foreign governments is important to consider: will they be happy that foreigners can access the phones? Will they demand that they get their own backdoor as well? Or will one universal backdoor be too widely know and quickly spread to thieves? Will they have the same regard for political rights as Western Governments? Wouldn’t the lack of universal encryption make it harder to fight for democracy? I believe the negative impact adding backdoors would have in repressive regimes is reason enough not to pursue this option.

Just as Americans accept thousands of gun casualties every year as the price for the right to own a gun, we need to be aware that we cannot achieve perfect security from terror, and that we need to accept somewhat less efficient crime and terror prevention as the price for keeping our data safe from criminals and espionage. And honestly, we cannot prevent people having awful plans. We can only work hard to deny them the tools, guns, bombs which enable them to become actually destructive.

We value our freedom of expression, we celebrate those who fought against injustice and made the world a better place. Privacy is important because it allows experimentation without public condemnation, because it prevents totalitarian oversight, because it keeps you safe from extortion. We must not allow fear to rule us, to cause us to limit the freedoms that have enabled so much progress.

Stauffenberg is now celebrated for the failed attempt to kill Hitler in 1944. ⏎
Actually, launch codes are easier to protect, as there is a human receiving them, and doing extra checks. Our devices would happily accept anyone with the right key. ⏎

Using Patronage to Finance Apps

25. November 2015 · Categories: Software

When Marco Arment decided to make his podcasting app Overcast free and ask for donations instead, there was some pushback that it would destroy developer pricing. Actually I believe that it is a viable model for popular apps, but that it will have less impact on developer pricing than free-to-play games.

Patronage basically changes the motivation to pay from “I want to use the app, so I pay” to “I am feeling better when I support this app”. Patronage will normally deliver less revenues, as only some of the people with a money surplus will pay, but then often more than what the market price would be. It requires a large base of affluent users that could become patrons to generate a good amount of revenue. So it is a viable business model for apps that deliver good value to a lot of customers. It essentially leaves a lot of money in the table, and when the value delivered is much greater than the cost to provide it, it suffices to have a small percentage of users supporting the development.

As could be already seen in the popularity of Pocket and Instapaper, such markets delivering huge surpluses are an attractive target for Venture Capital. VCs want to make money, so they attempt to corner the entire market so that they can extract money from ancillary services thanks to their market position. This is also why the influence on other software markets is more limited: Patronage only works in markets that are also attractive targets for VC founded companies, and is in my eyes vastly preferable to having a market dominated by a rent extracting startup. I simply trust the users to do better with their saved money than what a rentier would do with his surplus.

Critical Sections with Keil C++

07. May 2015 · Categories: Software

The Critical Sections on Cortex-M post has been expanded to cover the Keil compiler.

Open Source Founding

11. February 2015 · Categories: Software

We see that for open source projects, appropriate founding can be difficult to secure, even though people derive a lot of commercial benefit from it. It might be a good idea to get people selling stuff that includes an open source component to sell two variants: the normal variant and one which includes an open source contribution. This would make it much easier for people to support open source as it removes any payment friction: there is only the hassle to decide whether to support, not a new payment to be set up. And it could help that we ask for support at a moment where a customer is willing to part with money already.

Securing Web Accounts

26. January 2015 · Categories: Software

One problem we have with more and more services moving into the cloud is to provide appropriate security. The goals for the user are simple, if a bit contradictory:

her data must not be disclosed against her will,
no one else should be able to pretend to be her to others,
it should be easy to authenticate herself to the service, and
she does not want to loose anything should she forget her authentication.

The ideal solution would be a token, equipped with biometric sensors, to be your key to all your accounts, with a backup stored somewhere, which uses your DNA to authenticate you. Currently this is not feasible, and even the intermediate step of having a key storage that would do manual DNA checks is too expensive.

So we are stuck with using passwords, ungainly constructs that need almost 50 letters and/or numbers to encode a password of 256 bits strength. This can be reduced to 90 bits, or around 16 symbols, if you encode the password using a key derivation function set to 10ms, but this remains too unwieldy to remember for all but your master key.

If we could make the assumption that there is a secure storage for all our complicated passwords, then the ideal model for security would be simple: have one regular password, and a backup. The backup could only be changed with the knowledge of the current backup, and email addresses would no longer be sufficient to demonstrate your identity. One would move to this model by sending a conformation link to your email when you first enter the backup password, and publicizing the fact that you should add the back up password soon.

The closest we are currently are with two factor authentication methods that generate a recovery key, and where you can disable the confirmation channel to enforce using your recovery key. Unfortunately this does not work for Apple IDs: you always require two of account password, recovery key, and a trusted device to authenticate, and once you are authenticated you can reset the third factor. Given this weakness I’d assume that iCloud content can be accessed with with a subpoena. That the recovery key is a bit short with 14 characters or roughly 80 bits of entropy is a minor issue in comparison.

Tricky C++

07. July 2014 · Categories: Software

One of the problems with C++ is that it has a very brittle syntax, meaning that it is easy to trip over subtle differences:

class D {
  D();
  D( int);
}
void test() {
  D a;
  D b( 1);
  D c();
}

class D {

D();

D( int);

}

void test() {

D a;

D b( 1);

D c();

}

In this code, what is c? Given the definition of b, it sure looks like a class instantiation, but it actually is a function declaration that returns a class D. This is a problem that could easily be solved by requiring a function declaration without a parameter to use void, as in:

class D {
  D( void);
  D( int);
}
void test() {
  D a;
  D b( 1);
  D c( void);
}

class D {

D( void);

D( int);

}

void test() {

D a;

D b( 1);

D c( void);

}

This is more verbose, but it is also more difficult to get wrong. I believe this is what will kill C/C++ in the future: the unwillingness to sacrifice some backward compatibility for a safer syntax, even if one could do a perfect automatic translation.

Making Bit Fields Work

08. May 2014 · Categories: Software

I have previously railed about the lack of support for bit fields in embedded libraries. It turns out there is a good reason for this: they are not portable, and very ill defined in the standard.

In order to provide reasonably portable implementations, we should provide a well defined alternative to bit fields, which capture the essence of what we are now doing with bit manipulation.

Each bit field must be based on a basic integer type. This makes the mapping to storage explicit, and reduces the chance for counting errors.
You can explicitly choose whether the fields start with the highest or lowest bit
You can specify whether the access to the field must be atomic or not
You can define a force value for each field. This value is written to a volatile bit field whenever you update any field, unless you specifically have changed the value in the update statement, and it also serves as the default during initialization.
We force the compiler to collect consecutive write statements to a field, and update the field with exactly one write, when you use the comma operator to separate the statements.

We could define define bit fields as follows:

bitfield< little_endian, unsigned int, atomic> 
ClearFlags {
   int clear_a : 1 = 0;
   int : 7 = 0;
   int clear_b : 1 = 0;
   int : = 0;
};
volatile ClearFlags* pClear;

bitfield< big_endian, unsigned char, atomic> 
ChangeSettings {
   bool flag_1 : 1;
   bool clear_2 : 1 = false;
   int : 4 = 0;
   int : 2;
};
volatile ChangeSettings* pChangeSettings;

bitfield< little_endian, unsigned int, atomic>

ClearFlags {

int clear_a : 1 = 0;

int : 7 = 0;

int clear_b : 1 = 0;

int : = 0;

};

volatile ClearFlags* pClear;

bitfield< big_endian, unsigned char, atomic>

ChangeSettings {

bool flag_1 : 1;

bool clear_2 : 1 = false;

int : 4 = 0;

int : 2;

};

volatile ChangeSettings* pChangeSettings;

Then we can clear either one or both of flags easily:

// clear one flag
pClear->clear_a = 1;
// and the other 
pClear->clear_b = 1;
// or both together
pClear->clear_b = 1,
pClear->clear_a = 1;

// clear one flag

pClear->clear_a = 1;

// and the other

pClear->clear_b = 1;

// or both together

pClear->clear_b = 1,

pClear->clear_a = 1;

Things are a bit more complicated when we mix stuff:

// read the data, set flag_1 = true, clear_2 = false, int:4 = 0, and write it back
pChangeSettings->flag_1 = true;
// this will just update flag_1
ChangeSettings val = *pChangeSettings;
val.flag_1 = true;
*pChangeSettings = val;

// read the data, set flag_1 = true, clear_2 = false, int:4 = 0, and write it back

pChangeSettings->flag_1 = true;

// this will just update flag_1

ChangeSettings val = *pChangeSettings;

val.flag_1 = true;

*pChangeSettings = val;

And I wonder whether we should add new operators ->{ and .{. This would allow us to write pClear->{clear_a = 1, clear_b = 1}; to simplify updating multiple fields in one go.

Critical Sections on Cortex-M

12. April 2014 · Categories: Software

On the Cortex-M processors, you typically use critical sections to isolate accesses of interrupt handlers from the main program. The assembler needed for this is pretty straightforward, the question is how do we best implement it in C++? First a version for Gnu C++:

class InterruptLock {
public:
    InterruptLock() {
        __asm volatile( "MRS  %[oldLevel], PRIMASK": [oldLevel] "=r" ( oldLevel) ::);
        __asm volatile( "CPSID I" :: "r"( oldLevel):);
        __asm volatile( ""::: "memory");
    }
    ~InterruptLock() {
        __asm volatile( ""::: "memory");
        __asm volatile( "MSR  PRIMASK, %[oldLevel]": : [oldLevel] "r" (oldLevel):);
    }
private:
    unsigned char oldLevel;
};

class InterruptLock {

public:

InterruptLock() {

__asm volatile( "MRS %[oldLevel], PRIMASK": [oldLevel] "=r" ( oldLevel) ::);

__asm volatile( "CPSID I" :: "r"( oldLevel):);

__asm volatile( ""::: "memory");

}

~InterruptLock() {

__asm volatile( ""::: "memory");

__asm volatile( "MSR PRIMASK, %[oldLevel]": : [oldLevel] "r" (oldLevel):);

}

private:

unsigned char oldLevel;

};

Here we do a few things to ensure the compiler generates correct and optimal code for us:

The saved interrupt status is accessed directly as a register. This allows the compiler to keep the status in a register if beneficial, and since you need to move it into a register for access anyways, no performance is lost.
The fake dependency on line 5 ensures lines 4 and 5 are never swapped. We could do this in one __asm statement, but splitting it into two allows the compiler to insert a store between the two of them, if needed, which minimizes the time spent blocking interrupts.
Lines 6 and 9 are memory barriers which prevent the compiler from moving any memory accesses outside the lock.
The volatile keyword is needed to ensure that the compiler does not optimize these statements away, as they appear to him to have no effect.
Be careful when calling it, it must be InterruptLock var;

If you are using the Keil compiler, the lock would look pretty much the same. Here the intrinsic __schedule_barrier() is used to ensure that the compiler does not reorder code across the lock.

class InterruptLock {
public:
    InterruptLock() __attribute__((always_inline)) {
        register unsigned char PRIMASK __asm( "primask");
        oldLevel = PRIMASK;
        __asm( "CPSID I");
        __schedule_barrier();
    }
    ~InterruptLock() __attribute__((always_inline)) {
        __schedule_barrier();
        register unsigned char PRIMASK __asm( "primask");
        PRIMASK = oldLevel;
    }
private:
    unsigned char oldLevel;
    InterruptLock( const InterruptLock& other);
    InterruptLock& operator=( const InterruptLock&);
};

class InterruptLock {

public:

InterruptLock() __attribute__((always_inline)) {

oldLevel = PRIMASK;

__asm( "CPSID I");

__schedule_barrier();

}

~InterruptLock() __attribute__((always_inline)) {

__schedule_barrier();

PRIMASK = oldLevel;

}

private:

unsigned char oldLevel;

InterruptLock( const InterruptLock& other);

InterruptLock& operator=( const InterruptLock&);

};

Long Multiplication on the Cortex-M0

05. April 2014 · Categories: Software

One of the instructions cut from the M0 with regard to the Cortex-M3 core was SMULL. This instruction is extremely helpful when you want to do fixed point arithmetic with more than 16 bits. Compilers typically emulate this instruction, so that you can write:

int fixed_point_multiply( int x, int y) {
    return ((int64_t) x * (int64_t) y) >> 16;
}

int fixed_point_multiply( int x, int y) {

return ((int64_t) x * (int64_t) y) >> 16;

}

To implement its function, int64_t z = x * y, we need to add the individual products. Let ~ denote a sign extended halfword, and 0 a zeroed halfword, then the multiplication x × y or [a,b] * [c,d] can be calculated as 00[b*d] + ~[~a*d]0 + ~[~c*b]0 + [~a*~c]00.

To do this efficiently in assembly, we need to take into account that we only have 8 registers to work with, and that we have a carry to transport between the lower and upper word. So we will start with calculating the middle terms, and add the remaining ones at the end. The following code takes its parameters in r0 and r1, and returns the product in r1:r0.

PUSH {r4-r7}
ASRS r4, r0, #16  ; r4 = [~a]
UXTH r5, r0       ; r5 = [0b]
ASRS r6, r1, #16  ; r6 = [~c]
UXTH r7, r1       ; r7 = [0d]
MOV  r0, r4
MULS r0, r7, r0
ASRS r1, r0, #16  
LSLS r0, r0, #16  ; r1:r0 = ~[~a*0d]0
MOV  r2, r6
MULS r2, r5, r2
ASRS r3, r2, #16
LSLS r2, r2, #16  ; r3:r2 = ~[~c*0b]0
ADDS r0, r0, r2
ADCS r1, r1, r3   ; r1:r0 = ~[~a*0d]0 + ~[~c*0b]0
MULS r5, r7, r5
MOVS r7, #0
ADDS r0, r0, r5
ADCS r1, r1, r7   ; r1:r0 += 00[0b*0d]
MULS r4, r6, r4
ADDS r1, r1, r4   ; r1:r0 += [~a*~c]00
POP  {r4-r7}
BX   lr

PUSH {r4-r7}

ASRS r4, r0, #16 ; r4 = [~a]

UXTH r5, r0 ; r5 = [0b]

ASRS r6, r1, #16 ; r6 = [~c]

UXTH r7, r1 ; r7 = [0d]

MOV r0, r4

MULS r0, r7, r0

ASRS r1, r0, #16

LSLS r0, r0, #16 ; r1:r0 = ~[~a*0d]0

MOV r2, r6

MULS r2, r5, r2

ASRS r3, r2, #16

LSLS r2, r2, #16 ; r3:r2 = ~[~c*0b]0

ADDS r0, r0, r2

ADCS r1, r1, r3 ; r1:r0 = ~[~a*0d]0 + ~[~c*0b]0

MULS r5, r7, r5

MOVS r7, #0

ADDS r0, r0, r5

ADCS r1, r1, r7 ; r1:r0 += 00[0b*0d]

MULS r4, r6, r4

ADDS r1, r1, r4 ; r1:r0 += [~a*~c]00

POP {r4-r7}

BX lr