Authentication in etcd

by: tobias
2020-09-22 8:41pm tech learnings

FReD uses etcd as a backend to persist configuration data.¹

By default, every etcd instance is open to the world. In this blog post, the author describes how easy it is to discover open etcd instances that have not been configured correctly (i.e., their maintainers didn’t set up any authentication or authorization) and, subsequently, how he was able to simply access their data. Among that data, he found 8781 key-value pairs named password and 650 with the key aws_secret_access_key. That should be enough motivation to put some form of security measures in place with our own etcd instances.

The most common form of security is probably the username/password combo. This is a form of authentication, i.e. we’re proving to somebody who we are: only the user with that username knows their password, so we must be them.² This is fairly easy for people because we use this form of authentication every day: for Facebook, e-mail, getting into exclusive sex clubs, etc.

Another form of authentication is public key cryptography. A user holds a pair of public and private keys: a message encrypted with the private key can only be decrypted with the matching public key, and vice versa. The user can now send a message encrypted with their private key to the service (in practice, this is a digital signature), and if the service is able to decrypt that message with the publicly available key, the user has proven to be who they claim to be. This method is more common in authentication between computers or services, because it doesn’t require the other party to have direct access to our password. It is, however, still susceptible to man-in-the-middle (or MITM) attacks: an attacker impersonates the victim by publishing their own public key as that of the victim. To prevent this, we have introduced certificate authorities (or CAs). A certificate authority signs a certificate that binds a public key to an identity, so anyone who trusts the CA can check that a key really belongs to whoever presents it. Wikipedia has a short entry explaining the concept a bit better.
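
To make the key pair idea a bit more tangible, here is a minimal sketch of such a challenge-response check with the openssl CLI. It uses throwaway keys in a temporary directory; the file names and the challenge text are made up for illustration, and real protocols such as TLS wrap this step in a lot more machinery.

```shell
# Scratch directory so no real keys are touched (all file names are illustrative).
d=$(mktemp -d)

# The user creates a key pair and publishes only the public half.
openssl genrsa -out "$d/user.key" 2048 2>/dev/null
openssl rsa -in "$d/user.key" -pubout -out "$d/user.pub" 2>/dev/null

# The service sends a fresh challenge ...
printf 'challenge-%s' "$(date +%s)" > "$d/challenge.txt"

# ... the user signs it with the private key ...
openssl dgst -sha256 -sign "$d/user.key" -out "$d/challenge.sig" "$d/challenge.txt"

# ... and the service checks the signature with the public key.
if openssl dgst -sha256 -verify "$d/user.pub" -signature "$d/challenge.sig" "$d/challenge.txt" >/dev/null 2>&1; then
    result="user proved possession of the private key"
else
    result="signature check failed"
fi
echo "$result"

# The same signature must not verify for a different message.
printf 'tampered' > "$d/other.txt"
if openssl dgst -sha256 -verify "$d/user.pub" -signature "$d/challenge.sig" "$d/other.txt" >/dev/null 2>&1; then
    tampered="accepted"
else
    tampered="rejected"
fi
echo "tampered challenge: $tampered"
```

Note that nothing here tells the service whose public key user.pub is. That is exactly the gap a CA fills.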

Anyway, back to the matter at hand. In etcd, there are three places where we need to put some form of authentication: transport security between client and server, client authentication, and transport security and authentication between peers.³

Encrypting Client↔Server Communication

Let’s start with transport security between client and server. Basically, we want all access to etcd to happen over secure HTTPS instead of insecure HTTP. HTTPS is short for HTTP over Transport Layer Security (TLS), the successor to SSL, which adds encryption on top of TCP. In its basic form, this only encrypts communication so that only server and client can decrypt data and nobody else can read our communication. It does not guarantee the client that it’s talking to the correct server or give the server any proof that the client is who they claim to be and has access to data. Yet it is a necessary first step.⁴

To enable HTTPS for etcd clients with our own certificate, we will need three things: the certificate of a certificate authority (CA), a certificate issued by that CA, and the private key belonging to that certificate. Now, getting a certificate from a CA can be pretty expensive. They gotta make some money, you see. Your operating system comes with a number of CA certificates from trusted sources such as Deutsche Telekom preinstalled (these are often called root certificates because all other certificates are derived from these certificates). Yet nothing stops us from running our own CA, right? The only problem we face is that our CA’s root certificate isn’t installed on everyone’s laptops, but for our purposes, this is completely fine. Instead, we will ship our CA’s root certificate with the clients (i.e., the FReD nodes).

So let’s generate some certificates! We will be following this great guide (at least I hope it’s any good, haven’t read it yet). First, we generate the CA private key file:

openssl genrsa -out ca.key 2048

Next, we will use this key to generate a CA x509 certificate file, using this command and entering information as prompted (later, we will automate this process a bit and pass in information as parameters):⁵

openssl req -x509 -new -nodes \
     -key ca.key -sha256 \
     -days 1825 -out ca.crt

$ You are about to be asked to enter information that will be incorporated
$ into your certificate request.
$ What you are about to enter is what is called a Distinguished Name or a DN.
$ There are quite a few fields but you can leave some blank
$ For some fields there will be a default value,
$ If you enter '.', the field will be left blank.
$ -----
$ Country Name (2 letter code) []:DE
$ State or Province Name (full name) []:Berlin
$ Locality Name (eg, city) []:Berlin
$ Organization Name (eg, company) []:MCC
$ Organizational Unit Name (eg, section) []:FRED
$ Common Name (eg, fully qualified host name) []:etcd
$ Email Address []:

For now, we have selected a period of validity of 1825 days, or five years. After these five years run out, the certificate expires. I hope I have a different job then.
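
Since the interactive prompts are the part that is hard to script, here is a minimal sketch of a non-interactive variant. It takes the same values as entered above from a config file and writes everything into a temporary directory. The file name ca.conf and the v3_ca section are my own additions, not something the guide prescribes. It then uses -checkend to confirm the validity period.

```shell
d=$(mktemp -d)

# Same distinguished name as entered at the prompts above.
# ca.conf and the v3_ca section are illustrative assumptions.
cat > "$d/ca.conf" <<'EOF'
[ req ]
prompt = no
default_md = sha256
distinguished_name = dn
x509_extensions = v3_ca

[ dn ]
C = DE
ST = Berlin
L = Berlin
O = MCC
OU = FRED
CN = etcd

[ v3_ca ]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
EOF

openssl genrsa -out "$d/ca.key" 2048 2>/dev/null
openssl req -x509 -new -nodes -key "$d/ca.key" -sha256 \
    -days 1825 -out "$d/ca.crt" -config "$d/ca.conf"

# Print the subject and the validity window.
openssl x509 -in "$d/ca.crt" -noout -subject -dates

# -checkend N exits 0 if the certificate is still valid N seconds from now.
if openssl x509 -in "$d/ca.crt" -noout -checkend $((4 * 365 * 86400)) >/dev/null; then
    in_four_years="valid"
else
    in_four_years="expired"
fi
if openssl x509 -in "$d/ca.crt" -noout -checkend $((6 * 365 * 86400)) >/dev/null; then
    in_six_years="valid"
else
    in_six_years="expired"
fi
echo "in four years: $in_four_years, in six years: $in_six_years"
```

The -checkend call is also handy in a cron job or CI check, so that the expiry date doesn’t come as a surprise.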

We’re already halfway there! We have created a keyfile called ca.key for our CA and a certificate called ca.crt based on that keyfile. Now, let’s generate a server certificate.

openssl genrsa -out server.key 2048

You may have noticed that this command doesn’t use any of our CA’s files. That’s because for now, it’s simply a standalone key for our server. We now want our CA to issue a certificate for that key, which we ask for with what is called a Certificate Signing Request (or CSR). The request is described by a config file; save this as csr.conf:

[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = v3_req
distinguished_name = dn

[ dn ]
C = DE
ST = Berlin
L = Berlin
O = MCC
OU = FRED
CN = etcd

[v3_req]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[alt_names]
IP.1 = 127.0.0.1
IP.2 = 172.26.1.1

The [ req ] section holds settings for the request itself: key size, digest, which sections contain the distinguished name and the requested extensions, and prompt = no so that openssl takes the values from the file instead of asking. The distinguished name has some further meta information that I have filled out like that of our CA x509 certificate - I hope it doesn’t matter. Finally, there is the subjectAltName, which lists the IP addresses (or, if present, the DNS names) under which our server is reachable. First, etcd requires that the loopback address is present, which we have. Second, there is the 172.26.1.1 address that we use for our testing etcd server. So let’s use that for now. Now, generate the CSR:

openssl req -new -key server.key -out server.csr -config csr.conf
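
Before handing the request to the CA, it can be worth checking what actually ended up in it. The following sketch regenerates a throwaway key and CSR from the same csr.conf contents in a temporary directory (so it doesn’t touch the real files), verifies the request’s self-signature, and prints the requested subject alternative names.

```shell
d=$(mktemp -d)

# Same contents as csr.conf above, written to a scratch location.
cat > "$d/csr.conf" <<'EOF'
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = v3_req
distinguished_name = dn

[ dn ]
C = DE
ST = Berlin
L = Berlin
O = MCC
OU = FRED
CN = etcd

[ v3_req ]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[ alt_names ]
IP.1 = 127.0.0.1
IP.2 = 172.26.1.1
EOF

openssl genrsa -out "$d/server.key" 2048 2>/dev/null
openssl req -new -key "$d/server.key" -out "$d/server.csr" -config "$d/csr.conf"

# Check that the CSR is signed by the key it contains.
if openssl req -in "$d/server.csr" -noout -verify >/dev/null 2>&1; then
    csr_signature="ok"
else
    csr_signature="bad"
fi
echo "CSR self-signature: $csr_signature"

# Dump the request and pick out the SAN lines.
csr_text=$(openssl req -in "$d/server.csr" -noout -text)
echo "$csr_text" | grep -A1 "Subject Alternative Name"
```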

We can now use the generated server.csr to create the certificate (let’s make it valid for five years again):

openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
-CAcreateserial -out server.crt -days 1825 \
-extfile csr.conf -extensions v3_req
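
A quick sanity check at this point saves debugging inside containers later. The sketch below is self-contained: it builds a throwaway CA and server certificate in a temporary directory (with shortened configs and a CA name of my own choosing), then checks that the certificate chains to the CA, that the key really belongs to the certificate, and which addresses the certificate covers.

```shell
d=$(mktemp -d)

# Throwaway CA (ca.conf is an illustrative helper, not part of the guide).
cat > "$d/ca.conf" <<'EOF'
[ req ]
prompt = no
distinguished_name = dn
x509_extensions = v3_ca
[ dn ]
CN = throwaway-ca
[ v3_ca ]
basicConstraints = critical, CA:TRUE
EOF
openssl genrsa -out "$d/ca.key" 2048 2>/dev/null
openssl req -x509 -new -key "$d/ca.key" -sha256 -days 1 \
    -out "$d/ca.crt" -config "$d/ca.conf"

# Throwaway server certificate with the same SANs as above.
cat > "$d/csr.conf" <<'EOF'
[ req ]
prompt = no
distinguished_name = dn
[ dn ]
CN = etcd
[ v3_req ]
extendedKeyUsage = serverAuth
subjectAltName = IP:127.0.0.1, IP:172.26.1.1
EOF
openssl genrsa -out "$d/server.key" 2048 2>/dev/null
openssl req -new -key "$d/server.key" -out "$d/server.csr" -config "$d/csr.conf"
openssl x509 -req -in "$d/server.csr" -CA "$d/ca.crt" -CAkey "$d/ca.key" \
    -CAcreateserial -out "$d/server.crt" -days 1 \
    -extfile "$d/csr.conf" -extensions v3_req 2>/dev/null

# 1. Does the certificate chain to our CA?
chain=$(openssl verify -CAfile "$d/ca.crt" "$d/server.crt")
echo "$chain"

# 2. Do certificate and private key share the same public key?
cert_pub=$(openssl x509 -in "$d/server.crt" -noout -pubkey)
key_pub=$(openssl pkey -in "$d/server.key" -pubout)
if [ "$cert_pub" = "$key_pub" ]; then key_match="yes"; else key_match="no"; fi
echo "key matches certificate: $key_match"

# 3. Which addresses is the certificate valid for?
sans=$(openssl x509 -in "$d/server.crt" -noout -text | grep -A1 "Subject Alternative Name")
echo "$sans"
```

Against the real files, the same three checks boil down to running the openssl verify, x509 and pkey commands with ca.crt, server.crt and server.key.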

And there we have our very own CA and first server certificate issued by that CA! Now, we need to put it into etcd somehow. As we start etcd from a docker-compose file, we need to extend that file as follows:

version: "3.7"

services:
  etcd:
    image: quay.io/coreos/etcd:v3.4.10
    container_name: etcd-1
    entrypoint: "etcd --name s-1 \
      --data-dir /tmp/etcd/s-1 \
      --listen-client-urls https://172.26.1.1:2379 \
      --advertise-client-urls https://172.26.1.1:2379 \
      --listen-peer-urls http://172.26.1.1:2380 \
      --initial-advertise-peer-urls http://172.26.1.1:2380 \
      --initial-cluster s-1=http://172.26.1.1:2380 \
      --initial-cluster-token tkn \
      --initial-cluster-state new \
      --cert-file=/cert/server.crt \
      --key-file=/cert/server.key"
    volumes:
      - ../../nase/tls/server.crt:/cert/server.crt
      - ../../nase/tls/server.key:/cert/server.key
    ports:
      - 2379:2379
      - 2380:2380
    networks:
      fredwork:
        ipv4_address: 172.26.1.1

networks:
  fredwork:
    external: true

We have changed the following:

For the --listen-client-urls and --advertise-client-urls we have changed http to https to tell etcd to use HTTP over TLS.
We have added the server.crt and server.key files as volumes (the part before the : tells docker-compose where it can find the files relative to the compose file), which are named /cert/server.crt and /cert/server.key inside the container, respectively.
We have told etcd to use the /cert/server.crt file as a certificate file and /cert/server.key as a key file.

Now, we only need to add the ca.crt CA root certificate to our client containers so they know that certificates signed using this root are worthy of their trust. To do this, we add the following lines to our Dockerfile:

RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
COPY nase/tls/ca.crt /usr/local/share/ca-certificates/ca.crt
RUN update-ca-certificates

I got this from this blog post, let’s see if that works. First, we update all repositories and then install the ca-certificates bundle. Then, we copy our own ca.crt to the certificates folder and run update-ca-certificates to import it.

Let’s run our tests… and it works!

Authenticating Clients

Now, we have already encrypted client traffic to etcd, which is nice. Yet so far, every client that has a network link to our etcd server is also able to issue commands and read data, as long as it accepts the certificate. We want to limit this to only clients who are authorized.⁶ With client certificate authentication enabled, etcd checks for every request whether it is accompanied⁷ by a certificate that was also issued by our CA. So we will need to generate a client certificate for our clients.

Thankfully, we have already created our CA and issued a first server certificate. Issuing a client certificate now isn’t that hard: we need both a client.crt and a client.key file, which we pass to the etcd client within our code. We’ll follow the same steps as above.

openssl genrsa -out client.key 2048

cat > client.conf <<EOF

[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = v3_req
distinguished_name = dn

[ dn ]
C = DE
ST = Berlin
L = Berlin
O = MCC
OU = FRED
CN = client

[v3_req]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[alt_names]
IP.1 = 127.0.0.1
IP.2 = 172.26.0.10
IP.3 = 172.26.0.11
IP.4 = 172.26.0.12

EOF

openssl req -new -key client.key -out client.csr -config client.conf

openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
-CAcreateserial -out client.crt -days 1825 \
-extfile client.conf -extensions v3_req

The only difference here is that we use client as the common name (CN) and that we have different subject alternative names (SANs for short), which are the IP addresses of our clients.⁸ Now we can use these new certificate files in our code when connecting to etcd with the clientv3 that we use:

// rather than using direct paths, we will of course put in some command flags for our program to choose the location of these files
tlsInfo := transport.TLSInfo{
    CertFile:      "/cert/client.crt",
    KeyFile:       "/cert/client.key",
    TrustedCAFile: "/cert/ca.crt", // the CA's certificate, not its private key
}

tlsConfig, err := tlsInfo.ClientConfig()
if err != nil {
    // include the underlying error so a bad path or unreadable file is easy to spot
    return nil, errors.Errorf("Error starting the etcd client: %v", err)
}

cli, err := clientv3.New(clientv3.Config{
    Endpoints:   endpoints,
    DialTimeout: 5 * time.Second,
    TLS:         tlsConfig,
})

Of course, we also need to tell etcd that we want to use client authentication now, and it needs the CA certificate to check client certificates. Let’s add this with the --client-cert-auth and --trusted-ca-file=/cert/ca.crt CLI flags, and mount ca.crt into the etcd container as /cert/ca.crt, just like the server certificate. And check if it works…

…and it doesn’t. Here are the two error messages we are getting:

transport: authentication handshake failed: remote error: tls: bad certificate

tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage

Have you noticed our error? No? Well, it’s not your fault. Remember when we generated our client certificate’s CSR and we put in extendedKeyUsage = serverAuth somewhere? Well, server authentication worked fine, but now we’re doing something else: client authentication. So, back to the drawing board (or, to the terminal, for that matter). We will need to extend our CSRs with:

extendedKeyUsage = serverAuth, clientAuth

Now we’ll re-generate, re-sign, and re-verify our certificates and see if it works… and it does! Damn, we’re great.
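
The key-usage mismatch behind the second error message can be reproduced without etcd. The sketch below is a rough stand-in for the check that happens during the TLS handshake, not what etcd itself runs: it issues two throwaway certificates from a throwaway CA and asks openssl verify whether each is acceptable for a given purpose. The helper names issue and accepts_as and all file names are made up for this example.

```shell
d=$(mktemp -d)

# Throwaway CA (config and names are illustrative).
cat > "$d/ca.conf" <<'EOF'
[ req ]
prompt = no
distinguished_name = dn
x509_extensions = v3_ca
[ dn ]
CN = throwaway-ca
[ v3_ca ]
basicConstraints = critical, CA:TRUE
EOF
openssl genrsa -out "$d/ca.key" 2048 2>/dev/null
openssl req -x509 -new -key "$d/ca.key" -sha256 -days 1 \
    -out "$d/ca.crt" -config "$d/ca.conf"

# issue NAME EKU: sign a short-lived certificate carrying the given extendedKeyUsage.
issue() {
    printf '[ req ]\nprompt = no\ndistinguished_name = dn\n[ dn ]\nCN = %s\n' "$1" > "$d/$1.conf"
    printf 'extendedKeyUsage = %s\n' "$2" > "$d/$1.ext"
    openssl genrsa -out "$d/$1.key" 2048 2>/dev/null
    openssl req -new -key "$d/$1.key" -out "$d/$1.csr" -config "$d/$1.conf"
    openssl x509 -req -in "$d/$1.csr" -CA "$d/ca.crt" -CAkey "$d/ca.key" \
        -CAcreateserial -out "$d/$1.crt" -days 1 -extfile "$d/$1.ext" 2>/dev/null
}

# accepts_as PURPOSE NAME: prints "accepted" or "rejected".
accepts_as() {
    if openssl verify -purpose "$1" -CAfile "$d/ca.crt" "$d/$2.crt" >/dev/null 2>&1; then
        echo accepted
    else
        echo rejected
    fi
}

issue old "serverAuth"              # what our first client.conf asked for
issue new "serverAuth, clientAuth"  # the corrected extension

old_as_server=$(accepts_as sslserver old)
old_as_client=$(accepts_as sslclient old)
new_as_client=$(accepts_as sslclient new)

echo "serverAuth only, used as server: $old_as_server"
echo "serverAuth only, used as client: $old_as_client"
echo "serverAuth + clientAuth, used as client: $new_as_client"
```

Running openssl verify -purpose sslclient against a freshly signed client.crt before deploying it would have caught our mistake a bit earlier.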

Authentication and Encryption Within a Cluster

And for our final trick, we will add certificates for intra-cluster communication. Or, rather, we won’t. Because our cluster contains only one node. Remember how I said our cluster is a test cluster and contains only one etcd node? Well. In theory, you can add as many nodes as you want⁹ and they will ensure consensus among themselves. In order to do that, they will need to communicate among each other. And again, we want a) this communication to be encrypted and b) that only ~~authenticated~~ authorized nodes can take part in that whole cluster communication thing.

In order to achieve that, we would need new certificates. If we were to generate a member1.crt and member1.key as we did before, we could add the --peer-client-cert-auth, --peer-trusted-ca-file=/cert/ca.crt, --peer-cert-file=/cert/member1.crt, and --peer-key-file=/cert/member1.key flags to encrypt and authorize. Then we could change the protocol in the --initial-advertise-peer-urls and --listen-peer-urls flags from http to https, along with the peer URLs listed in --initial-cluster so that they still match. Nice.

Automate All The Things!!

Of course, that was a bit difficult, wasn’t it? Many steps to keep track of, much to learn. Let’s pack it all up into a bash script for posterity.

#!/bin/bash

# usage: gen-cert.sh <name> <ip>
# check that we got the 2 parameters we needed or exit the script with a usage message
[ $# -ne 2 ] && { echo "Usage: $0 name ip"; exit 1; }

# give better names to parameter variables
NAME=$1
IP=$2

# generate a key
openssl genrsa -out "${NAME}".key 2048

# write the config file
# (the unquoted EOF lets the shell expand ${NAME} and ${IP} inside the heredoc)
cat > "${NAME}".conf <<EOF
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = v3_req
distinguished_name = dn

[ dn ]
C = DE
ST = Berlin
L = Berlin
O = MCC
OU = FRED
CN = ${NAME}

[v3_req]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = @alt_names

[alt_names]
IP.1 = 127.0.0.1
IP.2 = ${IP}
EOF

# generate the CSR
openssl req -new -key "${NAME}".key -out "${NAME}".csr -config "${NAME}".conf

# build the certificate
openssl x509 -req -in "${NAME}".csr -CA ca.crt -CAkey ca.key \
-CAcreateserial -out "${NAME}".crt -days 1825 \
-extfile "${NAME}".conf -extensions v3_req

# delete the config file and csr
rm "${NAME}".conf
rm "${NAME}".csr

And that concludes today’s installation of Fun Public Key Infrastructure Activities for Millennial Couples!

¹ Just like Kubernetes!
² The next step would be authorization: the service we authenticated against needs to know the permissions for this user. Is the user Tobias allowed to change some configuration or update a value?
³ I should note that the --auto-tls and --peer-auto-tls flags exist for the etcd command. These will have etcd create self-signed certificates and use them for client and peer communication, respectively. This is fine for TLS, i.e., communication encryption, but doesn’t help with server authentication, as there is no way for the clients to verify that the correct etcd server issued these certificates. Hence we will need a certificate authority.
⁴ After all: even if I go to great lengths to make sure that everyone is authenticated correctly and I can prove who everyone involved in our little data transaction is, what does it help me if everyone else in the world can read my traffic?
⁵ The year is 2035. We have still not automated this process as we said we would. But it’s fine, we only need to generate a CA certificate once, no need to do it again later.
⁶ I know I talked about how authorization ≠ authentication but in this case, we actually authorize the clients by giving them the correct certificates, which etcd checks. What they are actually doing, however, is authenticating against etcd. But we have already authorized them by giving them the certificates. Does that make sense? It’s complicated.
⁷ Encrypted by? Signed by? Signed by, to be precise: the CA signs each certificate with its private key.
⁸ Of course we could generate a different certificate for each of our clients, but… it’s half past 8pm already.
⁹ Five is enough in most cases, though.