
Decoding HTTP/2 traffic is a challenge many of us run into. It gets easier once you understand the basics of protocol parsing, which matters most for complex binary protocols like HTTP/2. In this blog, I’ll share insights on why decoding HTTP/2 headers can be tricky, how HPACK adds a layer of complexity, and, most importantly, how eBPF uprobes can come to the rescue.

Visibility into the messages exchanged between services is crucial for understanding a system and troubleshooting issues effectively. Luckily, it is possible to trace this traffic, which lets you debug your HTTP/2 applications.

What does Wireshark do?

Wireshark is a popular open-source network protocol analyzer that allows you to capture and inspect the data travelling back and forth on a network in real time.

However, Wireshark sometimes fails to decode HTTP/2 traffic. HTTP/2 uses binary framing and compresses headers with HPACK, which keeps a per-connection table of previously sent headers, so a capture that starts after the connection was opened may not contain what is needed to decode later headers. The challenge intensifies with encrypted traffic or intricate sequences of frames, leaving developers in a quandary. Let’s look at a scenario where we attempt to inspect HTTP/2 traffic using Wireshark. We might encounter difficulties decoding headers because multiple streams are multiplexed within a single connection. Traditional tools, designed for simpler protocols, may falter in providing a clear interpretation, which calls for a more sophisticated solution.
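To make the binary framing concrete, here is a minimal sketch of parsing the fixed 9-byte header that starts every HTTP/2 frame (24-bit length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The function name `parse_frame_header` and the sample bytes are my own illustration, not taken from any library. The stream identifier is what lets frames from several multiplexed streams interleave on one connection.

```python
# Frame type codes as defined by the HTTP/2 specification.
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def parse_frame_header(data: bytes) -> dict:
    """Parse the fixed 9-byte header at the start of an HTTP/2 frame."""
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    length = int.from_bytes(data[0:3], "big")       # 24-bit payload length
    frame_type = data[3]                            # 8-bit frame type
    flags = data[4]                                 # 8-bit flags
    # The top bit of the last four bytes is reserved; mask it off.
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return {
        "length": length,
        "type": FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:02x})"),
        "flags": flags,
        "stream_id": stream_id,
    }


# Hypothetical sample: a HEADERS frame with a 4-byte payload on stream 1,
# with END_STREAM (0x1) and END_HEADERS (0x4) set.
sample = bytes.fromhex("000004010500000001")
print(parse_frame_header(sample))
```

The frame header itself is easy to read. The hard part is the payload of HEADERS frames, which is HPACK-compressed.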

python

This snippet showcases the challenge of interpreting binary HTTP/2 frames, which can be a stumbling block for tools like Wireshark. To see what the frame above contains, we can write a function such as decode_http2_headers.

python

Running the above code snippet gives the following output:

python

But this is a highly simplified example, and a real-world HTTP/2 header decoding function would need to handle a variety of scenarios, including HPACK compression, binary encoding, and more. The actual output would depend on the structure and content of the HTTP/2 headers in the given frame_data.
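To illustrate why HPACK makes decoding depend on state, here is a toy sketch that handles only HPACK "indexed header field" bytes (high bit set, 7-bit index). It is not a full HPACK decoder: the static table is a three-entry excerpt of the 61-entry table in RFC 7541, and the names `decode_indexed_fields` and `STATIC_TABLE` are my own. Indexes above 61 point into the dynamic table, which is built up from earlier header blocks on the same connection.

```python
# Excerpt of the HPACK static table (RFC 7541, Appendix A); only the
# entries used in the example below are included.
STATIC_TABLE = {
    2: (":method", "GET"),
    4: (":path", "/"),
    6: (":scheme", "http"),
}
STATIC_TABLE_SIZE = 61  # the full static table has 61 entries


def decode_indexed_fields(block: bytes, dynamic_table: list) -> list:
    """Decode a block made up solely of single-byte indexed header fields.

    dynamic_table holds the (name, value) pairs added by earlier header
    blocks on this connection, newest first.
    """
    headers = []
    for byte in block:
        if not byte & 0x80:
            raise NotImplementedError("this sketch only handles indexed fields")
        index = byte & 0x7F
        if index == 0 or index == 0x7F:
            raise NotImplementedError("invalid or multi-byte index")
        if index <= STATIC_TABLE_SIZE:
            if index not in STATIC_TABLE:
                raise NotImplementedError(f"static entry {index} not in excerpt")
            headers.append(STATIC_TABLE[index])
        else:
            position = index - STATIC_TABLE_SIZE - 1
            if position >= len(dynamic_table):
                raise LookupError(
                    f"index {index} refers to dynamic table state we never saw"
                )
            headers.append(dynamic_table[position])
    return headers


block = bytes([0x82, 0x86, 0x84, 0xBE])  # 0xBE is index 62: first dynamic entry

# A decoder that saw the connection from the start knows the dynamic table.
print(decode_indexed_fields(block, [(":authority", "www.example.com")]))

# A decoder that joined late has an empty table and cannot resolve index 62.
try:
    decode_indexed_fields(block, [])
except LookupError as err:
    print("decode failed:", err)
```

The same four bytes decode cleanly or fail depending only on what the decoder observed earlier, which is the situation a capture tool is in when it starts mid-connection.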

How does eBPF solve the issue?

So if we can’t properly decode HTTP/2 traffic without knowing the state, what can we do?

Thankfully, eBPF makes it possible to observe the HTTP/2 implementation itself and get the information we need without requiring that state. By attaching uprobes to the HTTP/2 library APIs that take clear-text headers as input, we can read the header content directly from application memory.

The first thing I need to do is find a specific function in my code that receives the important HTTP/2 header information. This function should use a straightforward argument structure so the data is easy to access from the eBPF code. The objective is to establish a reliable and adaptable foundation for observing and optimizing HTTP/2 interactions, which means strategically selecting a function that minimizes the manual pointer manipulation required in the eBPF code.

python

This is a simplified example of how we can do HTTP/2 tracing using eBPF uprobes. Now, let’s customize it so that the tracer is launched after the connection between the client and server is established.

python

Instead of trace_http2_recv_response_headers and trace_http2_send_request_headers, we are using the trace_http2_headers function which is associated with the HTTP/2 headers, and prints a message when headers are received.

We also probe the kernel’s tcp_v{4,6}_connect functions, which are called when a process initiates a TCP connection. Each time this happens, the probe updates a timestamp in a BPF hash table. You can refer to the sample app code on GitHub.
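The following is a rough sketch of that setup using BCC, not the sample app's actual code. It assumes BCC is installed and the script runs as root. The binary path and symbol are hypothetical command-line arguments you would point at the function in your HTTP/2 library that receives clear-text headers. The map name `connect_ts` and the helper `run_tracer` are my own, and the uprobe handler only reports the PID rather than reading header contents.

```python
import sys

# eBPF program written in BCC's restricted C. Names other than
# trace_http2_headers are illustrative assumptions.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

// pid -> timestamp (ns) of the most recent TCP connect attempt
BPF_HASH(connect_ts, u32, u64);

int trace_connect(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 ts = bpf_ktime_get_ns();
    connect_ts.update(&pid, &ts);
    return 0;
}

int trace_http2_headers(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    bpf_trace_printk("HTTP/2 headers received, pid=%d\n", pid);
    return 0;
}
"""


def run_tracer(binary_path: str, symbol: str) -> None:
    """Attach the probes and stream trace output until interrupted."""
    # Imported here so the file can be loaded on machines without BCC.
    from bcc import BPF

    b = BPF(text=BPF_PROGRAM)
    # Record a timestamp whenever a process starts a TCP connection.
    b.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect")
    b.attach_kprobe(event="tcp_v6_connect", fn_name="trace_connect")
    # Fire whenever the chosen user-space function is entered.
    b.attach_uprobe(name=binary_path, sym=symbol, fn_name="trace_http2_headers")
    print(f"Tracing {symbol} in {binary_path}, press Ctrl-C to stop")
    try:
        b.trace_print()
    except KeyboardInterrupt:
        pass


# Only start tracing when both arguments are supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    run_tracer(sys.argv[1], sys.argv[2])
```

Saved under a hypothetical name such as tracer.py, this would be started with the path to the binary or shared library and the name of the function to probe as its two arguments.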

Now, when I run the Flask app and access it through my browser, I will get output on my terminal, which will look something like this:

yaml

The messages indicate when HTTP/2 headers are received, and the associated PID identifies the process handling the HTTP/2 traffic.

Conclusion

Tracing HTTP/2 activity is hard because its headers are compressed with a stateful method called HPACK. In this post, we showed a different way to capture the messages. Instead of dealing with HPACK directly, we used eBPF uprobes to track certain functions in the HTTP/2 library. This gives us a clearer view of what’s happening with the messages in our HTTP/2 traffic.

The main advantage is the ability to trace messages regardless of when the tracer was deployed. Our goal was an approach that works out of the box whatever the deployment order, which is what led us to the eBPF uprobe-based approach.

Author

  • Animesh Pathak

    Animesh Pathak is a developer specializing in backend systems and API-driven architectures. He focuses on improving application performance and building robust, scalable solutions.


