Predictions of, and calls for, the end of passwords have been ringing through the press for many years now. The first instance of this that Google can find is from Bill Gates in 2004, although I suspect it wasn't the first.
Nonetheless, the experience of most people is that passwords remain a central, albeit frustrating, feature of their online lives.
Security Keys are another attempt to address this problem: initially in the form of a second authentication factor but, in the future, potentially as a complete replacement. Security Keys have gotten more traction than many other attempts to solve this problem, and this post exists to explain and, to some extent, advocate for them to a technical audience.
Very briefly, Security Keys are separate pieces of hardware capable of generating public/private key pairs and signing with them. By being separate, they can hopefully better protect those keys than a general-purpose computer can, and they can be moved between devices to serve as a means of safely authorising multiple devices. Most current examples attach via USB, but NFC and Bluetooth devices also exist.
Contrasts with existing solutions

Security Keys are not the first attempt at solving the problems of passwords, but they do have different properties than many of the other solutions.
One common form of second-factor authentication is TOTP/HOTP. This often takes the form of an app on the user's phone (e.g. Google Authenticator) which produces codes that change every minute or so. It can also take the form of a token with an LCD display to show such codes (e.g. RSA SecurID).
These codes largely solve the problem of password reuse between sites as different sites have different seed values. Thus stealing the password (and/or seed) database from one site no longer compromises accounts at other sites, as is the case with passwords.
However, these codes are still phishable: the user may be tricked into entering their password and code on a fake site, which can promptly forward them to the real site and impersonate the user. The codes may also be socially engineered, since a confused user can be talked into reading one out over the phone.
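For concreteness, here's a minimal Python 3 sketch of how such a code is derived from a per-site seed (following RFC 4226/6238, and assuming the common 30-second step and six-digit codes; a real seed would be a random, base32-encoded secret provisioned at enrollment):

import hashlib
import hmac
import struct
import time

def totp(seed, step=30, digits=6):
    # The counter is the number of time steps since the Unix epoch.
    counter = struct.pack('>Q', int(time.time()) // step)
    mac = hmac.new(seed, counter, hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): read four bytes at an offset
    # given by the low nibble of the final byte, mask the sign bit.
    offset = mac[-1] & 0x0f
    code = struct.unpack('>I', mac[offset:offset + 4])[0] & 0x7fffffff
    return '%0*d' % (digits, code % 10 ** digits)

print(totp(b'some-per-site-seed'))

Since both sides hold the seed and the current time, they derive the same short-lived code without any communication.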
Another common form of second-factor authentication is SMS-delivered codes. These share all the flaws of HOTP/TOTP and add concerns around the social engineering of phone companies to redirect messages and, in extreme cases, manipulation of the SS7 network.
Lastly, many security guides advocate for the use of password managers. These, if used correctly, can also solve the password reuse problem and significantly help with phishing, since passwords will not be auto-filled on the wrong site. Thus this is sound advice and, uniquely amongst the solutions discussed here, can be deployed unilaterally by users.
Password managers, however, do not conveniently solve the problem of authenticating new devices, and their automated assistance is generally limited to a web context. They also change password authentication from an (admittedly weak) verification of the user to a verification of the device, an effect which has provoked hostility and counter-measures from relying parties.
In light of this, Security Keys should be seen as a way to improve upon, and exceed the abilities of, password managers in a context where the relying party is cooperating and willing to make changes. Security Keys are unphishable to a greater extent than password managers because credentials are bound to a given site, plus it’s infeasible to socially engineer someone to read a binary signature over the phone. Also, like TOTP/HOTP, they use fresh credentials for each site so that no reuse is possible. Unlike password managers, they can work outside of a web context and they can serve to authenticate new devices.
They aren't magic, however. The unphishability of Security Keys depends on the extent to which the user may be misled into compromising other aspects of their device. If the user can be tricked into installing malware, it could access the Security Key and request login signatures for arbitrary sites. Also, malware may compromise the user's login session in a browser after successfully authenticating with a Security Key. Still, that's a heck of a lot better than the common case of people using the same password across dozens of sites.
All the different terms

There is a lot of terminology specific to this topic, the first of which I've already used above: "relying parties". This term refers to any entity trying to authenticate a user. When logging into a website, for example, the website is the relying party.
The FIDO Alliance is a group of major relying parties, secure token manufacturers, and others which defines many of the standards around Security Keys. The term that FIDO uses for Security Keys is "Universal 2nd Factor" (U2F), so you'll often see "U2F security key" used; it's talking about the same thing. The terms "authenticator" and "token" are also often used interchangeably to refer to these devices.
At the time of writing, all Security Keys are based on version one of FIDO's "Client To Authenticator Protocol" (CTAP1). This protocol is split between documentation of the core protocol and separate documents that describe how the core protocol is transported over USB, NFC, and Bluetooth.
FIDO also defines a U2F JavaScript API for websites to be able to interact with and use Security Keys. However, no browser ever implemented that API prior to a forthcoming (at the time of writing) version of Firefox.
But sites have been able to use Security Keys with Google Chrome for some years because Chrome ships with a hidden, internal extension through which the U2F API can be implemented with a JavaScript polyfill, which Google also provides. (Extensions for Firefox were also available prior to native support in that browser.)
Thus all sites which supported Security Keys prior to 2018 used some polyfill in combination with Chrome’s internal extension, or one of the Firefox extensions, to do so.
The FIDO JavaScript API is not the future, however. Instead, the W3C is defining an official Web Authentication standard for Security Keys, commonly known by its short name, "webauthn". This standard is significantly more capable (and significantly more complex) than the U2F API but, by the end of 2018, it is likely that all of Edge, Chrome, and Firefox will support it by default.
The webauthn standard has been designed to work with existing (CTAP1-based) devices, but FIDO is working on an updated standard for tokens, CTAP2, which will allow them to take advantage of the new capabilities in webauthn. (The standards were co-developed so it’s equally reasonable to see it from the other direction and say that webauthn allows browsers to take advantage of the new capabilities in CTAP2.)
There are no CTAP2 devices on the market yet, but their major distinguishing feature will be that they can be used as a 1st (and only) factor. That is, they have enough internal storage to contain a username, and so can both provide an identity and authenticate it. This text will mostly skim over CTAP2 since the devices are not yet available, but developers should keep it in mind when dealing with webauthn, as it explains many otherwise-superfluous features in that standard.
CTAP1 Basics

Since all current Security Keys use CTAP1, and webauthn is backwards compatible with it, understanding CTAP1 is pretty helpful for understanding the space in general. Here I'll include some Python snippets for communicating with USB CTAP1 devices to make things concrete, although I'll skip over everything that deals with framing.
CTAP1 defines two operations: creating a new key, and signing with an existing key. I'll cover each in turn.
Creating a new key

This operation is called "registration" in CTAP1 terminology and it takes two 32-byte arguments: a "challenge" and an "application parameter". From the point of view of the token these arguments are opaque byte strings, but they're intended to be hashes, and the hash function has to be SHA-256 if you want to interoperate.
The challenge argument, when used with a web browser, ends up being the hash of a JSON-encoded structure that includes a random nonce from the relying party as well as other information. This nonce is intended to prove freshness: if it was signed by the newly generated key then the relying party could know that the key really was fresh and that this was the only time it had been registered. Unfortunately, CTAP1 doesn’t include any self-signature (and CTAP2 devices probably won’t either). Instead the situation is a lot more complex, which we’ll get to.
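For concreteness, the browser-side hashing looks roughly like this (a sketch based on the U2F raw-message format; the nonce value below is made up):

import hashlib
import json

# `challenge` is the websafe-base64 nonce supplied by the relying
# party; the value here is a made-up example.
client_data = json.dumps({
    'typ': 'navigator.id.finishEnrollment',
    'challenge': 'opsXqUifDriAAmWclinfbS0e-USY0CgyJHe_Otd7z8o',
    'origin': 'https://example.com',
})
challenge_parameter = hashlib.sha256(client_data.encode()).digest()

The relying party later receives the full client data alongside the signature, so it can check that its nonce and the expected origin are what was actually hashed.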
The application parameter identifies a relying party. In U2F it's a hash of the origin (e.g. SHA-256("https://example.com")) while in webauthn it's a hash of the domain (e.g. SHA-256("example.com")). As we'll see, the signing operation also takes an application parameter, and the token checks that it's the same value that was given when the key was created. A phishing site will operate on a look-alike domain, but when the browser hashes that domain, the result will be different. Thus the application parameter sent to the token will be different and the token will refuse to allow the key to be used. So keys are bound to specific origins (or, with webauthn, domains) and cannot be used outside of that context.
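To see why this works, note that the token only ever sees 32-byte hashes, so a look-alike domain produces an entirely unrelated application parameter (illustrative domains only):

import hashlib

# However similar the two origins appear to a human, their hashes,
# and thus their application parameters, share nothing.
print(hashlib.sha256(b'https://example.com').hexdigest())
print(hashlib.sha256(b'https://examp1e.com').hexdigest())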
Here’s some sample code that’ll shine some light on other aspects of the protocol, including the outputs from key creation:
while True:
    challenge_hash = hashlib.sha256('challenge').digest()
    app_param_hash = hashlib.sha256('https://example.com').digest()
    # transact() comes from the full source and handles the USB framing.
    status, reply = transact(1, 3, 0, challenge_hash + app_param_hash)
    if status == 0x6985:
        # The token wants a user-presence test; retry until touched.
        time.sleep(0.5)
        continue

    print 'Public key: ' + reply[1:66].encode('hex')
    key_handle_length = ord(reply[66])
    print 'Key handle: ' + reply[67:67 + key_handle_length].encode('hex')
    reply = reply[67 + key_handle_length:]

    # This is a fragile way of getting an ASN.1 length;
    # just to keep the code small.
    cert_len = struct.unpack('>H', reply[2:4])[0]
    print '-----BEGIN CERTIFICATE-----'
    print reply[:4 + cert_len].encode('base64'),
    print '-----END CERTIFICATE-----'
    print 'Signature: ' + reply[4 + cert_len:].encode('hex')
    break

(The full source is available if you want to play with it, although it'll only work on Linux.)
The first thing to note is that the operation runs in a loop. CTAP1 devices require a “user presence” test before performing operations. In practice this means that they’ll have a button or capacitive sensor that you have to press. While the button/sensor isn’t triggered, operations return a specific error code and the host is expected to retry the operation after a brief delay until it succeeds.
A user-presence requirement ensures that operations cannot happen without a human being physically present. This both stops silent authentication (which could be used to track people) and it stops malware from silently proxying requests to a connected token. (Although it doesn’t stop malware from exploiting a touch that the user believes is authorising a legitimate action.)
Once the operation is successful, the response can be parsed. In the spirit of short example code everywhere, errors aren’t checked, so don’t use this code for real.
Key generation, of course, produces a public key. For CTAP1 tokens, that key will always be an uncompressed, X9.62-encoded, ECDSA P-256 public key. That encoding happens to always be 65 bytes long.
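If you want to pull that encoding apart, the layout is fixed (a minimal Python 3 sketch, assuming well-formed input):

def parse_x962_uncompressed(pubkey):
    # Uncompressed X9.62: 0x04 || x (32 bytes) || y (32 bytes).
    assert len(pubkey) == 65 and pubkey[0] == 0x04
    x = int.from_bytes(pubkey[1:33], 'big')
    y = int.from_bytes(pubkey[33:65], 'big')
    return (x, y)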
After that comes the key handle. This is an opaque value that the token uses to identify this key, and this evinces another important facet of CTAP1: the tokens are nearly stateless in practice.
In theory, the key handle could be a small identifier for a key which is stored on the device. In practice, however, the key handle is always an encrypted version of the private key itself.
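To make that concrete, here's a hypothetical sketch of how a stateless token might wrap a private key into a key handle. The names and scheme here are illustrative, not any specific vendor's construction:

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Hypothetical: a single AES-256-GCM key baked into the token at
# manufacture; actual vendors' schemes differ.
MASTER_KEY = AESGCM.generate_key(bit_length=256)

def make_key_handle(private_key_bytes, app_param):
    # Using the application parameter as AEAD associated data means
    # the handle can only be unwrapped for the site it was made for.
    nonce = os.urandom(12)
    ciphertext = AESGCM(MASTER_KEY).encrypt(nonce, private_key_bytes, app_param)
    return nonce + ciphertext

def unwrap_key_handle(key_handle, app_param):
    nonce, ciphertext = key_handle[:12], key_handle[12:]
    # Raises InvalidTag for a wrong app_param or a forged handle.
    return AESGCM(MASTER_KEY).decrypt(nonce, ciphertext, app_param)

With a construction like this the token stores nothing per key: the relying party keeps the key handle and presents it again at signing time.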