TLS and DTLS - Commonalities and Differences

TLS and DTLS are both used to encrypt communication. Both exchange their messages over the Internet Protocol, either IPv4 or IPv6. TLS has become very popular, while DTLS still seems to be uncommon. Some may therefore ask why DTLS should be considered at all. So let's compare both to see the differences. Some readers may recognise that there are cases where DTLS shows benefits.

TLS

TLS is very well known. It is based on TCP, which offers connection-oriented, reliable transmission. You may consider it a “stream of ordered data without gaps”. For the most part, that’s what is wanted for communication: you send data and it is received reliably, in the same order and without gaps. A TCP stream is split into IP messages, and the reliability is based on retransmissions of these IP messages, as is common. In some rare cases, for example under bad connectivity, even these retransmissions fail; the connection is then dropped and no more data can be exchanged. TCP transmits these IP messages using a window buffer: it sends a number of messages without waiting for their ACKs. The non-obvious drawback shows up when some data is transmitted successfully but other data is lost. To guarantee the “ordered data without gaps”, data that was already received after such a gap must be dropped if the gap cannot be closed by retransmissions. If the time at which the received data is processed is critical to your application, these guarantees may turn into a drawback. The critical data may already have been received, but some less important, preceding data may be stuck in retransmission. In that case, the critical data will not be processed in time.


    +---+---+---+---+---+---+---+---+
    | 1 | 2 | 3 | x | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+

(IP messages 1-3 and 5-8 have already been received, but 4 is missing. In order to guarantee the “stream of ordered data without gaps”, the receiver has to wait for the missing message. That delays the processing of 5-8. If the missing message is lost even with retransmissions, the connection is dropped at message 3, and messages 5-8 are lost as well.)
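The delay described above can be illustrated with a small simulation. This is only a sketch: the arrival times are hypothetical, and the two functions are simplified models of in-order delivery (as with TCP) and per-message delivery (as with UDP), not real protocol implementations.

```python
def deliver_in_order(arrivals):
    """Model of a stream: only contiguous data is handed to the application.

    arrivals: list of (arrival_time, sequence_number), numbering starts at 1.
    Returns a dict mapping sequence_number to the time it is handed over.
    """
    delivered = {}
    buffered = {}
    next_seq = 1
    for time, seq in sorted(arrivals):
        buffered[seq] = time
        # Hand over everything that has become contiguous with this arrival.
        while next_seq in buffered:
            delivered[next_seq] = time
            del buffered[next_seq]
            next_seq += 1
    return delivered


def deliver_on_arrival(arrivals):
    """Model of datagrams: each message is handed over when it arrives."""
    return {seq: time for time, seq in arrivals}


# Hypothetical timings: messages 1-3 and 5-8 arrive at t=1..3 and t=5..8,
# the retransmission of message 4 only arrives at t=20.
arrivals = [(1, 1), (2, 2), (3, 3), (5, 5), (6, 6), (7, 7), (8, 8), (20, 4)]

in_order = deliver_in_order(arrivals)
on_arrival = deliver_on_arrival(arrivals)

print(in_order[5], on_arrival[5])  # prints "20 5"
```

With in-order delivery, message 5 is held back until message 4 arrives at t=20, although it was received at t=5.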

Usually such situations are extremely rare, which is why TCP is so widespread. But for some environments they are a challenge. Mobile and slow narrowband links in particular are affected by such trouble.

(Some might object that TCP offers a concept of urgent data. Read up on it before relying on it.)

TLS builds its encryption on that “stream of ordered data without gaps”. TLS splits the application data into encryption records. It is not designed to decrypt records if there is a gap in the record sequence. That does not introduce more drawbacks, but it cements the existing ones. In order to establish encryption, TLS uses a handshake, an exchange of several messages to select a common cipher suite and crypto parameters and to authenticate the peers. Especially so-called asymmetric cryptography may consume a lot of processing power and add some data overhead. X.509 certificates are very common for TLS. Using certificate-based cipher suites usually costs 2-4 kB of data overhead for the certificates on each handshake. So challenging environments get even more challenging with TLS. If the connection is dropped, a new handshake is required. And sometimes the transmission is just too unreliable to even complete the handshake.

DTLS

DTLS also provides encryption for communication, but the guarantees are different. It is based on UDP, which also uses IP messages. In contrast to TCP, UDP leaves retransmissions and ordering to other layers. An application may benefit from being able to process data with gaps and out of order, but in time. Some applications will benefit from that and some will not. Sometimes an application message does not fit into a single IP message, and sometimes even a single IP message may be transmitted in chunks (fragments) instead of as one message. Both may result in behaviour similar to TCP: data has already been received, but not completely, and so it is not handed over to the next layer. The advantage is gone. Therefore it is important to use small messages. This is not an absolute requirement; there may be some larger messages. But the more messages are small, the more an application may benefit from UDP.
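To get a feeling for what "small" means, the following sketch estimates how much payload fits into a single IP message. The header sizes are the standard minimum sizes (IPv4 without options, IPv6 without extension headers); the function names and the path MTU values in the usage are assumptions chosen for illustration. DTLS adds its own record overhead, which depends on the cipher suite, so the usable application payload is somewhat smaller than the values computed here.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
UDP_HEADER = 8    # bytes


def max_udp_payload(path_mtu, ipv6=False):
    """Largest UDP payload that fits into one unfragmented IP message."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - UDP_HEADER


def fits_single_datagram(message, path_mtu, ipv6=False):
    """True if the message can be sent without IP fragmentation."""
    return len(message) <= max_udp_payload(path_mtu, ipv6)


# 1280 bytes is the minimum MTU every IPv6 link has to support.
print(max_udp_payload(1280, ipv6=True))  # prints 1232
# 1500 bytes is the usual Ethernet MTU (assumed here for the whole path).
print(max_udp_payload(1500))             # prints 1472
print(fits_single_datagram(b"x" * 2000, 1500))  # prints False
```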

Unfortunately, UDP suffers from the default configuration of many components on the path (route) through the network. For example, NATs are very common. For UDP, it seems to be common to time out a NAT entry after a much shorter period without traffic than for TCP. In the past, 30s seems to have been widely used for that. In my experience, this is still valid today. If the NAT removes the entry after such a timeout, the peer that initiated the communication will no longer be able to receive messages from the peer on the other side of that NAT.


    +--------+         +-----+         +--------+
    | peer 1 | ==IP1=> | NAT | ==IPa=> | peer 2 |
    |        |         | 1=a |         |        |
    +--------+         +-----+         +--------+

    +--------+         +-----+         +--------+
    | peer 1 | ==IP1=> | NAT | ==IPa=> | peer 2 |
    |        |         | 1=a |         |        |
    +--------+         +-----+         +--------+

                     NAT timeout

    +--------+         +-----+         +--------+
    | peer 1 | |=???== | NAT | <=IPa== | peer 2 |
    |        |         | ?=a |         |        |
    +--------+         +-----+         +--------+

Peer 1 with inet address IP1 sends a message through a NAT, which translates that address to IPa. Peer 2 is only aware of the NATed address IPa. After the NAT timeout, peer 2 is no longer able to reach peer 1, because the mapping of the addresses IP1=IPa has been removed by the timeout.
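The effect of the timeout can be reproduced with a toy model of a NAT table. This is a sketch under assumptions, not the behaviour of any particular NAT: the class `SimulatedNat`, the external port range, and the rule that only outbound traffic refreshes a mapping are chosen for illustration, and time is passed in explicitly so that nothing has to wait for 30s.

```python
class SimulatedNat:
    """Toy model of a NAT with an idle timeout (not a real NAT)."""

    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self._mappings = {}      # internal address -> (external address, last used)
        self._next_port = 40000  # hypothetical external port range

    def _expire(self, now):
        expired = [internal
                   for internal, (_, last_used) in self._mappings.items()
                   if now - last_used > self.idle_timeout]
        for internal in expired:
            del self._mappings[internal]

    def outbound(self, internal, now):
        """Peer 1 sends: create or refresh the mapping, return the external address."""
        self._expire(now)
        entry = self._mappings.get(internal)
        if entry is None:
            external = ("nat", self._next_port)
            self._next_port += 1
        else:
            external = entry[0]
        self._mappings[internal] = (external, now)
        return external

    def inbound(self, external, now):
        """Peer 2 sends: return the internal address, or None if the mapping is gone."""
        self._expire(now)
        for internal, (mapped, _) in self._mappings.items():
            if mapped == external:
                return internal
        return None


nat = SimulatedNat(idle_timeout=30.0)
peer1 = ("IP1", 5684)

external = nat.outbound(peer1, now=0.0)          # peer 1 sends a message
reply_before = nat.inbound(external, now=10.0)   # reaches peer 1
reply_after = nat.inbound(external, now=45.0)    # mapping timed out: None
new_external = nat.outbound(peer1, now=50.0)     # new mapping, different port

print(reply_before, reply_after, new_external != external)
```

In this model, the message sent by peer 2 at t=45 is dropped, and the next message from peer 1 shows up at peer 2 with a different address-port combination.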

In some network environments, it may be possible to configure the NAT to statically forward the traffic for a UDP port to a specific peer. That bypasses the temporary mapping. In the example above, the NAT would then know that all traffic for a specific port is mapped to peer 1, and so peer 2 would be able to reach peer 1. Timeouts are not the only events that make the mapping fragile. In much rarer cases, NAT restarts or exhausted resources may also cause mappings to be removed earlier. That makes reliable communication hard. Finally, in some corporate networks the use of UDP seems to be forbidden altogether. Then it simply can’t work at all.

DTLS is based on the same cryptographic functions as TLS. It supports the behaviour of UDP by decrypting each record independently of other records. It is able to handle records even if others are dropped or reordered. There is no need to disconnect and force a new connection. Like TLS, DTLS requires handshakes, but for DTLS it is more common to support cipher suites using PSK (pre-shared key, comparable to a username/password, but without transmitting the secret/password to the other peer). That requires less data overhead, about 300 bytes. Raw Public Key is also more common for DTLS; it makes it possible to use asymmetric cryptography for authentication if an X.509 infrastructure is not available or doesn’t make much sense.
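A rough calculation shows how the handshake overhead adds up when connections have to be re-established often. The per-handshake figures are taken from the text (3000 bytes is assumed as the middle of the 2-4k range for certificates, 300 bytes for PSK); the number of handshakes per day is purely hypothetical.

```python
X509_OVERHEAD = 3000  # bytes per handshake, assumed middle of the 2-4k range
PSK_OVERHEAD = 300    # bytes per handshake, as stated in the text


def handshake_overhead(handshakes, bytes_per_handshake):
    """Total data spent on handshake overhead."""
    return handshakes * bytes_per_handshake


handshakes_per_day = 100  # hypothetical device on an unstable link

x509_total = handshake_overhead(handshakes_per_day, X509_OVERHEAD)
psk_total = handshake_overhead(handshakes_per_day, PSK_OVERHEAD)

print(x509_total, psk_total)  # prints "300000 30000"
```

Under these assumptions, the certificate-based handshakes cost ten times as much data per day as the PSK-based ones.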

DTLS suffers more from NATs than plain UDP. If a NAT removes an entry on timeout, the next message sent may be mapped to a different address-port combination. But DTLS without extensions uses that address-port combination to select the crypto context (e.g. the keys for writing and reading), and so the message cannot be decrypted. That results in frequent new handshakes, just to traverse NATs.


    +--------+         +-----+         +--------+
    | peer 1 | ==IP1=> | NAT | ==IPa=> | peer 2 |
    |        |         | 1=a |         | a:=xyz |
    +--------+         +-----+         +--------+

                     NAT timeout

    +--------+         +-----+         +--------+
    | peer 1 | ==IP1=> | NAT | ==IPb=> | peer 2 |
    |        |         | 1=b |         | b:=????|
    +--------+         +-----+         +--------+


Peer 2 associates the crypto keys for peer 1 with the address IPa (a:=xyz). After the NAT timeout, peer 1 may be translated to the new address IPb. With that, peer 2 is no longer able to select the right crypto keys.

One solution to overcome that is to identify the crypto context by a connection ID instead of the address-port combination.
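The difference between the two lookups can be sketched as follows. The `Record` class, the dictionaries, and the values in them are hypothetical and do not reflect the real DTLS record layout. A real implementation would also need to verify a record with the selected keys before trusting its new source address.

```python
from dataclasses import dataclass


@dataclass
class Record:
    source: tuple         # (address, port) as seen by the receiver
    connection_id: bytes  # carried with the record when connection IDs are used
    payload: bytes


# Peer 2 keeps the crypto context for peer 1, keyed in two different ways.
contexts_by_address = {("IPa", 40000): "keys-xyz"}
contexts_by_cid = {b"\x01\x02": "keys-xyz"}


def lookup_by_address(record):
    """DTLS without extensions: select the context by address and port."""
    return contexts_by_address.get(record.source)


def lookup_by_cid(record):
    """With a connection ID: the source address no longer matters."""
    return contexts_by_cid.get(record.connection_id)


before_timeout = Record(("IPa", 40000), b"\x01\x02", b"data")
after_timeout = Record(("IPb", 40001), b"\x01\x02", b"data")

print(lookup_by_address(before_timeout))  # prints keys-xyz
print(lookup_by_address(after_timeout))   # prints None
print(lookup_by_cid(after_timeout))       # prints keys-xyz
```

After the NAT has assigned a new address, only the lookup by connection ID still finds the keys, so no new handshake is needed.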

Summary

In scenarios with challenging environments, TLS/TCP may show drawbacks and DTLS/UDP demonstrates its strengths. Reliability may be added by the application; it is easy to resend data. If some data gets dropped even with retransmissions, the next received data may be processed without the previous message. Or, instead of resending some less important data, you may send a more important alert first. Both require that your application can cope with gaps and unordered data and that it mainly uses small messages. If that is possible, then using DTLS/UDP will help to overcome challenging environments. If the environment causes more TLS connections to be dropped, the amount of data needed to perform the handshakes increases as well. If that matters, then DTLS 1.2 using a connection ID lowers that amount by reducing these handshakes.
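Adding reliability at the application level can be as simple as the following sketch. All names are hypothetical, and the lossy channel is a deterministic stand-in for a real UDP exchange: `send` is assumed to return True when the message was acknowledged. A real implementation would add timeouts and a back-off between attempts.

```python
def send_with_retransmission(send, message, max_attempts=4):
    """Resend until acknowledged; return the attempt number, or None on failure."""
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    return None


def send_all(send, messages, max_attempts=4):
    """Send every message; give up on single messages, not on the whole exchange."""
    delivered, dropped = [], []
    for message in messages:
        if send_with_retransmission(send, message, max_attempts) is None:
            dropped.append(message)  # leave a gap and continue with the next message
        else:
            delivered.append(message)
    return delivered, dropped


def make_lossy_channel(always_lost, lost_once):
    """Deterministic stand-in for an unreliable link."""
    pending = set(lost_once)

    def send(message):
        if message in always_lost:
            return False
        if message in pending:
            pending.discard(message)  # the first attempt is lost, the next succeeds
            return False
        return True

    return send


channel = make_lossy_channel(always_lost={4}, lost_once={2})
delivered, dropped = send_all(channel, [1, 2, 3, 4, 5, 6, 7, 8])

print(delivered, dropped)  # prints "[1, 2, 3, 5, 6, 7, 8] [4]"
```

Message 4 is lost even with retransmissions, but messages 5-8 are still delivered, which is exactly the case where a dropped TCP connection would have lost them.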

List of recommended standards/drafts: