WebSocket protocol security in practice

Web applications are developing rapidly, and for some time now there has been a demand for asynchronous data exchange between the client and the application server. The commonly used HTTP protocol is stateless: it is based on a request sent to the server and the response returned, with no intermediate states. One proposed solution that extends the existing possibilities is the long polling technique.

With long polling, the client must assume that the server may not respond to the request immediately. The server, in turn, does not send an empty response when it has no data to send, but waits until data appears. Another option is to use asynchronous requests (XHR). In this case, however, bidirectional communication with as little delay as possible is achieved at the cost of a larger number of requests to the server. The demand for real two-way communication in web applications therefore led to the proposal of the WebSocket protocol.

What the WebSocket protocol is and how it works

WebSocket is a TCP-based protocol that provides bidirectional (full-duplex) communication between the client and the server. After the connection is established, both parties can exchange data at any time by sending a data frame. The party interested in establishing a connection (the client) sends a handshake request to the server. For reasons of compatibility with web servers, this request is almost identical to a standard HTTP request:
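As an illustration, a request of this kind can be assembled as in the minimal Python sketch below. The host, the path and the function name are hypothetical; the set of headers follows the client handshake described in RFC 6455.

```python
import base64
import os


def build_handshake_request(host: str, path: str = "/") -> tuple[str, str]:
    """Return (request_text, key) for a WebSocket opening handshake."""
    # Sec-WebSocket-Key carries 16 random bytes encoded with base64.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # HTTP headers end with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n", key


if __name__ == "__main__":
    # "server.example.com" and "/chat" are placeholder values.
    request, _ = build_handshake_request("server.example.com", "/chat")
    print(request)
```

A browser would normally add further headers (for example Origin and cookies); they are left out here to keep the sketch short.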

This request informs the web server that the application wants to establish a connection using the WebSocket protocol (the Upgrade header). At first glance, the Sec-WebSocket-Key header, which contains a base64-encoded string, also draws attention. One might suspect that it is a key used to encrypt the communication. In practice, it contains a sequence of randomly generated data, and its actual purpose is only to avoid problems with caching: it lets the client check that the response was produced for this particular handshake.

The application server answers a request prepared and sent in this way as follows:
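A response of this kind can be sketched in Python as shown below. The function names are hypothetical, and the sample key is the one used as an example in RFC 6455; how the value of the Sec-WebSocket-Accept header is derived is explained in the next paragraph.

```python
import base64
import hashlib

# Constant GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake_response(sec_websocket_key: str) -> str:
    """Build the text of a 101 response for the given client key."""
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Accept: {compute_accept(sec_websocket_key)}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Sample key taken from the example in RFC 6455.
    print(build_handshake_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

For the RFC sample key, the computed header value is s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which matches the value given in the RFC.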

Response code 101 means that the server supports the WebSocket protocol and agrees to establish a connection. As with the request, the response also contains a base64-encoded string (in the Sec-WebSocket-Accept header). In this case, it is the SHA-1 hash of the string previously sent in the Sec-WebSocket-Key header concatenated with the constant GUID “258EAFA5-E914-47DA-95CA-C5AB0DC85B11”. Once the connection has been successfully established, further communication takes place over the same TCP connection, without the HTTP protocol. The WebSocket frame looks like this:

Picture nr 1. Data frames of WebSocket protocol (source: https://tools.ietf.org/html/rfc6455)

At this stage, we are mainly interested in the opcode and payload data fields. The opcode defines how the data sent in the payload data field should be interpreted. The most important values that the opcode field can take are presented in the table.

Table – chosen values of opcode field

Other values not listed here are discussed, among others, in RFC 6455.
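For quick reference, the opcodes defined by RFC 6455 can also be written down as a small Python enumeration. The class and helper names are illustrative; the numeric values come from the RFC, which reserves the remaining values (0x3 to 0x7 and 0xB to 0xF) for future use.

```python
from enum import IntEnum


class Opcode(IntEnum):
    CONTINUATION = 0x0  # continuation of a fragmented message
    TEXT = 0x1          # payload is UTF-8 text
    BINARY = 0x2        # payload is binary data
    CLOSE = 0x8         # connection close
    PING = 0x9          # ping
    PONG = 0xA          # pong (reply to ping)


def opcode_of(first_byte: int) -> Opcode:
    """Extract the opcode from the first byte of a frame (low 4 bits)."""
    return Opcode(first_byte & 0x0F)


def is_control(opcode: Opcode) -> bool:
    """Control frames are those with the highest opcode bit set."""
    return opcode >= 0x8


if __name__ == "__main__":
    print(opcode_of(0x81).name)
```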

The mask bit and the masking-key field deserve a separate paragraph. According to the standard, every frame sent from the client to the server must have the mask bit set. If it is set, the payload field does not contain the data in plain form, but in masked form. By masking, we mean the result of the XOR function applied to the data being sent and the value of the masking-key field, repeated cyclically over the data. It is reasonable to ask what value this operation brings to the whole process, because it adds nothing from the point of view of the confidentiality of the transmitted data. The key is located just before the “masked” data, which means that recovering the original content should be treated as a trivial task. In the RFC document, however, we can read that this mechanism introduces protection against cache poisoning, that is, attacks aimed at influencing the cache of various types of proxy servers.
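The masking step can be sketched in Python as follows. This is a simplified illustration, not a full implementation: the function names are made up, only single-frame text messages with payloads of up to 125 bytes are handled, and the FIN bit is always set.

```python
import os
from typing import Optional


def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with the 4-byte key, repeated cyclically."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


def build_text_frame(text: str, key: Optional[bytes] = None) -> bytes:
    """Build a masked, single-frame text message as a client would send it."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    if key is None:
        key = os.urandom(4)  # a fresh random masking key
    first = 0x80 | 0x1              # FIN bit + opcode 1 (text)
    second = 0x80 | len(payload)    # mask bit + 7-bit payload length
    return bytes([first, second]) + key + mask_payload(payload, key)


if __name__ == "__main__":
    # Key and text from the masked "Hello" example in RFC 6455.
    frame = build_text_frame("Hello", bytes.fromhex("37fa213d"))
    print(frame.hex())
```

Because XOR is its own inverse, calling mask_payload again with the same key restores the original data, which is exactly why masking gives no confidentiality.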

What does an example frame look like in practice? When sending the string “Sekurak” to the server, we can intercept the following packet (e.g., using the Wireshark tool):

Picture nr 2. Intercepted packet

Here we can see that the opcode has a value of 1, which means that text is being sent. The payload length field has the value 7 (111 in binary), which corresponds to the length of the string sent (Sekurak). The frame also has the mask bit set and carries a 32-bit masking key. The last 7 bytes are the masked data. Importantly, Wireshark is able to recognize a WebSocket frame and presents its individual parts in a clear way:

Picture nr 3. Individual parts of the WebSocket frame recognized by Wireshark

Using a simple script, we can confront theory with practice. Our masked payload has the hexadecimal form 9d5376f1bc5776 and, according to what can be seen in the masking-key field, the key used is ce361d84.
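A script of this kind could look as follows. It is a minimal Python sketch with illustrative names that uses only the two hexadecimal values quoted above.

```python
def unmask(masked: bytes, key: bytes) -> bytes:
    """Reverse WebSocket masking: XOR each byte with the cyclic key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(masked))


# Values read from the intercepted frame.
MASKED_PAYLOAD = bytes.fromhex("9d5376f1bc5776")
MASKING_KEY = bytes.fromhex("ce361d84")

if __name__ == "__main__":
    print(unmask(MASKED_PAYLOAD, MASKING_KEY).decode("ascii"))  # prints "Sekurak"
```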

It seems that everything works as intended.

Simple client

In the next step, it is worth getting to know how WebSocket works in practice. For this purpose, you can use a simple client in JavaScript and an echo server shared by the websocket.org community.

Compared to the original, the code has been minimally adapted to our needs:

Source: https://www.websocket.org/echo.html

By saving the script under any name with the *.html extension and opening the file in a browser that supports the WebSocket protocol, we will automatically establish a connection to the echo server. It is recommended to use browsers based on Chromium (e.g., Google Chrome) because of the extensive functions related to WebSocket available in the development console.

Picture nr 4. Messages exchanged with the WebSocket server

After typing any sequence of characters in the text field, we can send it to the server by pressing the Enter key. To preview frames generated by the browser and received from the server, you can use Chrome Developer Tools (Development console -> Network tab -> echo.websocket.org position in the Name column -> Frames tab):

Picture nr 5. Data frames intercepted in the development console

Threats

Below, we describe the most important threats related to the use of the WebSocket protocol. They have been mapped to the most common web application vulnerabilities from the OWASP Top 10 list. The intention is not to preserve the order or the exact division, but to treat this list as a template for discussing the most important vulnerabilities associated with the described protocol. Before reading further, keep in mind that WebSocket is nothing more than another way to transfer data over the web. What happens to data transmitted in this way depends entirely on the application using the protocol.

Same-origin policy

The practical exercise with WebSocket was deliberately placed just after the theoretical introduction. Having exchanged a few messages with the echo server, we should note that we established the connection without any problems and, more importantly, that we received a response from an external server. Why did the browser not object on the grounds of the Same-origin Policy (SOP)? One of the main risks to consider when using WebSocket is precisely the Same-origin Policy, or more specifically the fact that it does not apply here. In other words, WebSocket connections made from web browsers are not subject to any restrictions on where they connect to. For HTTP requests, the SOP applies, possibly relaxed by appropriate CORS rules. Here, we have no such restrictions. At the moment, the only way to restrict WebSocket connections on the browser side is to use the Content Security Policy (CSP) with the connect-src directive.
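As an illustration, a server-side helper that produces such a policy might look like the Python sketch below. The function name and the endpoint wss://ws.example.com are hypothetical; the header name and the connect-src directive are standard CSP. The WebSocket endpoint is listed explicitly rather than relying on 'self' alone.

```python
def build_csp_header(ws_endpoints: list[str]) -> tuple[str, str]:
    """Return an HTTP header limiting where scripts may open connections."""
    for endpoint in ws_endpoints:
        # Only encrypted WebSocket endpoints are accepted in this sketch.
        if not endpoint.startswith("wss://"):
            raise ValueError(f"not a wss:// endpoint: {endpoint}")
    sources = ["'self'"] + list(ws_endpoints)
    return "Content-Security-Policy", "connect-src " + " ".join(sources)


if __name__ == "__main__":
    name, value = build_csp_header(["wss://ws.example.com"])
    print(f"{name}: {value}")
```

With such a header on the page, a script trying to open a WebSocket to any other host should be blocked by the browser.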

Incorrect authentication and session management

WebSocket itself does not implement authentication in any way. As with HTTP, the burden of verifying the identity of the client lies with the application based on this protocol.

Authorization bypass

As with authentication, granting rights to resources is the responsibility of the application using WebSocket. WebSocket defines a set of URI schemes similar to those of HTTP (ws and wss). It should be remembered that if the application does not enforce appropriate authorization, then, as with HTTP resources left without authentication, it will be possible to enumerate them here as well. When designing an application, it is tempting to make assumptions that significantly simplify the security implementation. One example of such a shortcut is placing too much trust in the headers sent by the client, especially the Origin header. This header contains information about the domain from which the request was sent and should be validated on the server side. Web browsers set its value automatically, and it cannot be changed, e.g., by JavaScript code. It should be remembered, however, that the client establishing the connection may be any application, which is not subject to this restriction.
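Server-side validation of the Origin header can be sketched as an exact comparison against an allowlist. The allowlist contents and the function name below are assumptions made for the example. Since a non-browser client can send any Origin value, a check like this complements authentication and authorization and does not replace them.

```python
from typing import Optional

# Hypothetical list of origins allowed to open WebSocket connections.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def is_origin_allowed(origin: Optional[str],
                      allowed: frozenset = ALLOWED_ORIGINS) -> bool:
    """Accept the handshake only if Origin exactly matches an allowed entry."""
    if origin is None:
        return False  # header missing, e.g. a non-browser client
    # Exact match, not a prefix or substring test, so that
    # "https://app.example.com.evil.test" is rejected.
    return origin.strip().lower() in allowed


if __name__ == "__main__":
    print(is_origin_allowed("https://app.example.com"))
    print(is_origin_allowed("https://app.example.com.evil.test"))
```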

Injections and incorrect handling of data

At this point, it should be recalled once again that WebSocket is only a data exchange protocol. What data is sent, and in what form, is up to the programmer. The burden of data validation, in turn, lies with the application. Information transmitted over this protocol should not be treated as trusted and should be handled in the same way as data received over any other protocol. If data received over WebSocket is to be stored in a database, the mechanism of prepared statements should be used. If the received data is to be attached to the DOM tree, the HTML control characters must be replaced with their entities.
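Both recommendations can be illustrated with a short Python sketch. The table, column and function names are invented for the example, and SQLite stands in for whatever database the application really uses; the escaping is shown for markup generated on the server, whereas in the browser the equivalent is inserting data as text rather than as HTML.

```python
import html
import sqlite3


def store_message(conn: sqlite3.Connection, author: str, body: str) -> None:
    """Store a received message using a parameterized (prepared) statement."""
    # Values are passed separately from the SQL text, so they cannot
    # change the structure of the query.
    conn.execute("INSERT INTO messages (author, body) VALUES (?, ?)",
                 (author, body))
    conn.commit()


def render_message(body: str) -> str:
    """Return an HTML fragment with special characters replaced by entities."""
    return "<li>" + html.escape(body) + "</li>"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (author TEXT, body TEXT)")
    store_message(conn, "alice", "x'); DROP TABLE messages;--")
    print(conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0])
    print(render_message("<script>alert(1)</script>"))
```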

Depletion of server resources

Running a WebSocket server may require rethinking the issue of resource exhaustion. By default, the client has virtually no restrictions on the number of connections. Opening several tabs in a browser with the same WebSocket application will result in the same number of new connections. Protection against excessive consumption of resources must be implemented on the server or infrastructure side.
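One possible form of such protection is a per-client counter of open connections, sketched below in Python. The class name, the limit of 5 and the use of an opaque client identifier (for example an IP address or an account ID) are assumptions; a real server would also need locking or an event-loop-safe equivalent.

```python
class ConnectionLimiter:
    """Track open connections per client and refuse those above a limit."""

    def __init__(self, max_per_client: int = 5) -> None:
        self.max_per_client = max_per_client
        self._active: dict[str, int] = {}

    def try_acquire(self, client_id: str) -> bool:
        """Call when a handshake arrives; False means reject the connection."""
        current = self._active.get(client_id, 0)
        if current >= self.max_per_client:
            return False
        self._active[client_id] = current + 1
        return True

    def release(self, client_id: str) -> None:
        """Call when a connection closes."""
        current = self._active.get(client_id, 0)
        if current <= 1:
            self._active.pop(client_id, None)
        else:
            self._active[client_id] = current - 1


if __name__ == "__main__":
    limiter = ConnectionLimiter(max_per_client=2)
    print([limiter.try_acquire("203.0.113.7") for _ in range(3)])
```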

Tunnelling of traffic

Many sources on WebSocket mention that this protocol allows tunnelling of any TCP traffic. The wsshd project is one example of such an application. Thanks to it, by installing several libraries and running the script on the server, we can expose our SSH server to the world and connect to it via WebSocket.

Wsshd provides a console client and a web interface to connect to the SSH server via WebSocket:

Picture nr 6. Launched wsshd server

The use of such solutions opens up new possibilities for bypassing network traffic filtering performed by firewalls.

Encrypted communication channel

As with HTTP, when using WebSocket we can decide whether the data is to be sent over an encrypted communication channel (TLS) or not. For applications using encryption, the wss scheme is provided (e.g., wss://securitum.com).

Ready-made solutions

In practice, few people decide to use the native WebSocket implementation through the basic JavaScript interface provided in browsers. A more popular approach is to use ready-made libraries and frameworks. The most interesting of them are:

  • Socket.io – one of the most popular solutions of this type, developed since 2010; the server part is written in Node.js,
  • Ratchet – something for people who want to stay with solutions based on PHP,
  • WebSocketHandler – a class available in the .NET environment from version 4.5,
  • Autobahn – if we work in a Python environment, this library is certainly worth a look; it also has implementations for other technologies (Node.js, Java, C++).

If you write your own implementation, you should remember about issues such as memory management.

Testing

To capture traffic and modify messages sent via WebSocket, it is strongly recommended to use OWASP Zaproxy (ZAP). Support for WebSocket in Burp Suite is in its infancy; the available capabilities are limited to basic capture of messages and a list of requests sent and responses received. To use ZAP for testing, download the JAR file and make sure that the proxy listens on port 8080 when it runs (Tools -> Options -> Local Proxy -> Port field). Next, you need to configure your browser to send traffic through the localhost:8080 proxy (instructions for configuring proxy settings are available for all popular browsers).

After configuring the browser and refreshing the file with our test client, a new WebSockets tab should appear in the proxy:

Picture nr 7. View of the WebSocket tab in OWASP Zaproxy

Here we will find information about each frame sent from the application, as well as received from the server. Right-clicking a selected item in the list opens a menu from which you can select Resend:

Picture nr 8. Resend option that opens a form for modifying the data frame

In the new window, we will have the following few useful options to choose from:

Picture nr 9. Form for editing a data frame

  • Opcode – from the drop-down list, you can select options such as TEXT, BINARY, CLOSE, PING, and PONG. In this way, we are able to simulate each stage of communication that can occur in the WebSocket protocol,
  • Direction – the direction in which the frame is to be sent (Outgoing – to the server, Incoming – to the application),
  • Channel – a list from which we can choose the connection affected by the modifications (if there is more than one at a given moment).

Confirm the changes with the “Send” button.

The options shown here are a substitute for the Burp Suite “Repeater” tool, which is often used to modify HTTP requests.

Threat modelling

To summarize the article, below is a sample list of questions that should be answered when modelling threats to an application that uses WebSocket:

  • Is an encrypted communication channel (wss) used?
  • Are the data received from the client via the WebSocket protocol properly validated?
  • Does the WebSocket server limit the number of possible parallel connections from one client?
  • How are authentication and authorization implemented for resources made available through WebSocket?
  • Is a well-known WebSocket server used, or a proprietary solution? Has a proprietary server undergone security verification?
  • Do we have a CSP policy in place that limits the sources with which we can establish a connection?
  • Is the Origin header validated on the server side, additionally taking into account the fact that it can be manipulated in case of using a client that is not a web browser?
  • Is a ready-made library used to handle the client and server parts responsible for the WebSocket protocol?
  • Does the firewall allow traffic to the port on which the WebSocket server listens only from specific sources?

Summary

WebSocket is an interesting solution, which, in the era of “rich” web applications, can find many applications, for example, in the case of applications where users simultaneously work on the same set of data. However, it should be remembered that, from a security perspective, it is only a data carrier and the burden of proper handling, as in the case of HTTP, lies on the application side.

References:

1. https://tools.ietf.org/html/rfc6455
2. http://websocket.org/echo.html