Hello. I’m Len Shustek, and I’m going to talk to you today about DPI: Deep Packet Inspection.
We’re going to talk about four different things. One is what DPI is. The second is some examples of how it is used. The third is the challenges in deploying this kind of technology. And the fourth is some history of DPI products, starting in the 1980s when the technology was first developed.
Now, who am I, and why do I have the right to talk about this sort of thing? I’m a computer scientist, a former university professor, and the founding Chairman Emeritus of the Computer History Museum in Mountain View, California. But more to the point for today’s talk, I was a co-founder of two networking companies in the 1970s and 1980s. Nestar Systems was a manufacturer of local area networking systems for personal computers. Network General was a manufacturer of diagnostic tools for local area networks, many of which used Deep Packet Inspection.
So, what is DPI?
Computer networks have lots of packets flowing between participants in conversations on a network. DPI is passive inspection of the contents of those packets by processes that are not participating in the conversation. It is a passive observer, connected to any link of any network of any kind, and it looks deeply inside the packets that are passing. As Yogi Berra, the famous New York Yankees catcher, said, “You can observe a lot by watching”.
DPI is a process. So it could be running in a separate computer that gets clipped on to the network at some point. Or it could be a process running inside some computer that is itself participating on the network. But the DPI process itself is just a passive observer.
It applies to all sorts of networks: local area networks, wide area networks, and metropolitan area networks. And it doesn’t matter if they’re wired networks or wireless networks.
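To make that concrete, here is a minimal sketch of such a passive observer, written in C against the widely used libpcap capture library. This is an illustration under assumptions, not a finished tool: “eth0” is an assumed interface name, and the process only receives copies of packets; it never transmits anything.

```c
/* Minimal passive packet observer using libpcap.
   A sketch only: "eth0" is an assumed interface name;
   substitute whatever link you want to watch. */
#include <stdio.h>
#include <pcap.h>

static void on_packet(u_char *user, const struct pcap_pkthdr *h,
                      const u_char *bytes)
{
    /* The DPI process only looks; it never transmits. */
    printf("captured %u bytes (of %u on the wire)\n", h->caplen, h->len);
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    /* promiscuous = 1: receive all packets, not just our own */
    pcap_t *p = pcap_open_live("eth0", 65535, 1, 1000, errbuf);
    if (!p) { fprintf(stderr, "%s\n", errbuf); return 1; }
    pcap_loop(p, -1, on_packet, NULL);  /* run until interrupted */
    pcap_close(p);
    return 0;
}
```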
Here are some examples of how DPI aids in diagnosing network problems.
A simple one is that it exposes the real details about errors. Often, when things don’t work, you get a message from the application software that is not very helpful, like “host not available”. What DPI does is look inside the packets that are flowing, and tell you exactly the details of what is causing that particular error.
Networks are incredibly complicated these days. They’re assembled from hardware and software pieces that come from multiple vendors, and in order for it all to work there has to be compatibility. That’s especially challenging as new versions are deployed. Every once in a while you’ll deploy a new version, and something will stop working. Or it will work, but it will be downgraded to a fail-safe mode with lower performance because it’s not compatible with the newer software.
There are many errors that happen in networks in credentialing, passwords, and certificates. Here too, the error messages reported by the applications are often not very helpful. This is an example I saw in a web post: “A fatal error occurred while creating a TLS client credential; the internal error state is 10013”. Well, that really doesn’t help. But doing protocol analysis with DPI shows that the client needed to have an upgraded version of that particular software installed. There is a lot of chatter from network managers on the web about how to use DPI and protocol analysis to help with these kinds of network problems.
There are many issues that come up related to configuration. The configuration of a network is distributed across many nodes, switches, routers, and servers in a very complex system. Often multiple people make incompatible changes. You discover that only when something stops working, or the performance seriously degrades. DPI can help you understand what’s happening.
It can also help in the presence of hostile attacks, like intrusion, spoofing, and distributed denial of service. Naive network monitors help, but they often generate false positives, and after a while network managers start ignoring the warnings those monitors issue. DPI can help determine whether anomalous traffic is a threat, or if it’s something that doesn’t require the network manager’s attention.
The DPI process, once it makes that determination, can recommend mitigating action. It can also take mitigating action itself, if it is enabled to do so, like directing participants in the conversation — the nodes, the routers, the switches, and the servers — to block the traffic that it has identified as harmful.
There are many examples of performance issues that are masked by robust protocol retry rules. The people who designed the protocols really wanted them to work in all cases, but that robustness means they don’t necessarily work at their most efficient. TCP window size is an example. TCP — the Transmission Control Protocol — was originally designed for networks that were orders of magnitude slower. So various enhancements have been added, like window scaling. But sometimes they are not enabled, and sometimes they are not configured correctly.
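To give a feel for what looking deeply inside a packet means here, this is a hedged sketch in C of checking whether a captured TCP SYN actually negotiated window scaling (option kind 3, defined in RFC 7323). The function is hypothetical, for illustration only, and assumes `tcp` points at the raw bytes of a complete TCP header.

```c
/* Sketch: return the window-scale shift count negotiated in a TCP
   SYN, or -1 if the option is absent (scaling not enabled).
   Assumes `tcp` points at the raw bytes of a full TCP header. */
#include <stdint.h>

int window_scale_shift(const uint8_t *tcp)
{
    int hdr_len = (tcp[12] >> 4) * 4;         /* data offset, in bytes */
    const uint8_t *opt = tcp + 20;            /* options follow the fixed header */
    const uint8_t *end = tcp + hdr_len;
    while (opt < end) {
        if (opt[0] == 0) break;               /* kind 0: end of option list */
        if (opt[0] == 1) { opt++; continue; } /* kind 1: NOP padding */
        if (opt + 1 >= end || opt[1] < 2 || opt + opt[1] > end)
            break;                            /* malformed option */
        if (opt[0] == 3 && opt[1] == 3)
            return opt[2];                    /* kind 3: window scale shift */
        opt += opt[1];
    }
    return -1;
}
```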
There are other problems, like multiple acks. If you see multiple acknowledgements for the same packet, it often indicates that there has been packet loss or a performance mismatch somewhere in the network. The network continues to work, but it’s not working in its most efficient manner.
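A DPI tool can watch for that pattern directly. Here is a small illustrative sketch; the structure and the threshold of three duplicates (the classic fast-retransmit trigger) are assumptions for the example, not any product’s code.

```c
/* A sketch of duplicate-ACK detection for one TCP flow: three
   ACKs in a row carrying the same acknowledgment number is the
   classic sign of a lost segment. */
#include <stdint.h>
#include <stdio.h>

struct flow_state {
    uint32_t last_ack;   /* last acknowledgment number seen */
    int      dup_count;  /* consecutive repeats of that number */
};

void note_ack(struct flow_state *f, uint32_t ack_no)
{
    if (ack_no == f->last_ack) {
        if (++f->dup_count == 3)
            printf("3 duplicate ACKs for %u: probable packet loss\n", ack_no);
    } else {
        f->last_ack  = ack_no;
        f->dup_count = 0;
    }
}
```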
There are many timeouts in networks, and there’s a complex relationship between them, the various nodes, and network bandwidth. Network timeouts are often set by network managers “by the seat of their pants”, without a good understanding of what effect adjusting those timeouts has on network performance.
There is almost always a lot of traffic that network managers might not even know is on their network. There are hidden applications or processes running in the background that are competing with the things they do know about. DPI can tell you about that. Load balancing between servers is another area where DPI can help with problems.
There are challenges in the deployment of Deep Packet Inspection, and I’ll go through eight of them, some of which are solved, some of which are not.
In the earliest days, it was a problem just to do promiscuous reception. Remember that DPI processes are passive receivers of packets that are not directed to them. Some network controllers do not allow you to receive packets that aren’t addressed to you. The first network sniffer we did at Network General was for ARCNET, which was a logical token ring on a physical star network. It was implemented by an integrated circuit that did not allow you to receive packets whose destination was not your station. So in order to develop the Sniffer, we had to engineer a piece of hardware that sat like a wart on the back of the network card. It secretly changed the first instance of the destination ID to zero — representing a broadcast — so that this particular sniffer station, the DPI station, could receive all packets.
This is less true of modern network controllers. Most of them now allow you to configure a mode where you can receive all the traffic. One of the interesting questions is: does promiscuous packet reception represent a security issue? I’ll talk about that a little bit later.
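As an illustration of what configuring that mode looks like today, here is a sketch using Linux packet sockets. It needs administrator privileges, and “eth0” is an assumed interface name.

```c
/* Sketch: ask the kernel for promiscuous reception on Linux
   (contrast with the ARCNET hardware "wart"). Needs root;
   "eth0" is an assumed interface name. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>
#include <net/if.h>
#include <arpa/inet.h>

int main(void)
{
    /* AF_PACKET + ETH_P_ALL: deliver every frame on the link */
    int s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (s < 0) { perror("socket"); return 1; }

    struct packet_mreq mr;
    memset(&mr, 0, sizeof mr);
    mr.mr_ifindex = if_nametoindex("eth0");
    mr.mr_type = PACKET_MR_PROMISC;  /* accept frames not addressed to us */
    if (setsockopt(s, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
                   &mr, sizeof mr) < 0) {
        perror("setsockopt");
        return 1;
    }
    /* from here, recv() on s sees all traffic on the segment */
    return 0;
}
```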
Another challenge in using DPI effectively is to turn the series of bits that are on the wire into something that a human can make sense of. There are many different protocols and structures for the packets that are flowing on the network. Some of them are standardized, like TCP/IP. OSI (Open Systems Interconnection) used to be another set of very complicated standardized protocols. Some of them are proprietary to individual corporations but documented, like protocols from Digital Equipment Corporation, IBM, and Xerox. Some of them are proprietary and were not documented, like those from Banyan and Apple. In those cases you had to do a lot of reverse engineering to try to understand what the formats of those packets were.
Also, packets are nested at multiple levels. The OSI model has seven levels, ranging from the bottom physical link layer to the top application layer. Each of these has a format that is nested within the previous format.
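Here is a hedged sketch in C of what peeling that nesting looks like for one common stack: an Ethernet frame wrapping an IPv4 header wrapping a TCP header, each layer’s type field saying what is inside it. It assumes an untagged IPv4/TCP frame and omits most of the validation a real decoder needs.

```c
/* Sketch of nested decoding: Ethernet -> IPv4 -> TCP.
   Assumes an untagged frame; a real decoder branches on every
   type field and checks every length. */
#include <stdint.h>
#include <stdio.h>

void decode(const uint8_t *frame, int len)
{
    /* Layer 2: Ethernet. The EtherType names what is nested inside. */
    if (len < 34) return;                   /* too short for Ethernet+IPv4 */
    uint16_t ethertype = (frame[12] << 8) | frame[13];
    if (ethertype != 0x0800) return;        /* 0x0800 = IPv4 */

    /* Layer 3: IPv4. Header length is given in 32-bit words. */
    const uint8_t *ip = frame + 14;
    int ip_hdr_len = (ip[0] & 0x0f) * 4;
    if (ip[9] != 6) return;                 /* IP protocol 6 = TCP */
    if (len < 14 + ip_hdr_len + 4) return;  /* no room for TCP ports */

    /* Layer 4: TCP. The port numbers point at the application above. */
    const uint8_t *tcp = ip + ip_hdr_len;
    uint16_t sport = (tcp[0] << 8) | tcp[1];
    uint16_t dport = (tcp[2] << 8) | tcp[3];
    printf("TCP %u -> %u\n", sport, dport);
}
```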
Back in the 1970s and 1980s, people’s attitude was to let a thousand flowers bloom, and have a different network protocol for every different use. Here is a chart we produced in 1993 at Network General, demonstrating the Tower of Babble of network protocols. And this was not even all of them; it was only some of them. The good news is that 75 percent of these protocols are obsolete now, or nearly so. But that still leaves hundreds of important protocols that DPI tools need to understand and decode.
Another problem is that packets are not necessarily entirely self-describing. The participating end nodes in the conversation maintain private state information that provides the context for interpreting the packets. DPI tools, in order to do the decoding, have to maintain duplicate state information for all of the connections they see. Sometimes that is only possible if the tool sees the initial connection establishment sequence and remembers the context. If it doesn’t see that, then it has to use heuristics, or try multiple ways of decoding the packet. The Sniffer that I’ll talk about later, which we developed at Network General, does this in spades.
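As a sketch of that duplicate state, a tool might keep a record like the following for every conversation it observes. The field names here are illustrative assumptions for this example, not any particular product’s design.

```c
/* Illustrative only: per-connection state a DPI tool might keep,
   keyed by the two endpoints' addresses and ports, and filled in
   when the handshake is observed. */
#include <stdint.h>

struct conn_key {
    uint32_t src_ip, dst_ip;      /* the two endpoints */
    uint16_t src_port, dst_port;
};

struct conn_state {
    struct conn_key key;
    int      saw_handshake;       /* did we capture the SYN exchange? */
    uint8_t  window_shift[2];     /* negotiated window scale, per direction */
    uint32_t next_seq[2];         /* next expected sequence number, per direction */
    int      app_protocol;        /* guessed by heuristics if never seen */
};
```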
Another problem is that network packet data is often encrypted. In the early days that was less true than it is today. We would take a Sniffer to a network manager’s site and connect it to the network, and they were amazed to see passwords flowing in clear text. Sometimes their reaction was, “This is a dangerous tool — the Sniffer shouldn’t be allowed on my network!” Our response was, “No, it is not the Sniffer that is dangerous. Sending passwords in clear text is what is dangerous!”
It’s less common now for network packets to be unencrypted, and encryption presents a problem for DPI. Even so, when the data is encrypted, it’s often the case that the protocol headers are in clear text, which allows DPI to work. Also, if the network manager supplies the secret keys, the protocol interpreters can decrypt and decode the data. Wireshark does that, for example. Weak encryption techniques can sometimes be broken by brute force. But unless one of those things applies, the existence of encrypted packets on the network does present a challenge for DPI technology.
Another problem is accessing networks that are inaccessible, either physically or logically. What we did at Network General to address that problem was to have DPI monitors attached to the hidden networks, which relay information back to a consolidating station that can display it. We called that the Distributed Sniffer System.
Another issue is that you want not only to provide protocol interpretation, but also to analyze what you’re seeing and diagnose problems at a higher level. Network General in 1991 released a system called the Expert Sniffer. For that we used the Blackboard architectural model that was developed at Stanford, where knowledge sources that understand the protocols contribute to the blackboard, and a control shell coordinates problem solving. This was artificial intelligence before artificial intelligence was as popular as it is now!
Here are some examples, which I won’t go through in detail, of higher level problems that a little bit of AI can help DPI products diagnose. Sometimes the acknowledgement times for frames are too long. But sometimes they are too short, which can also, surprisingly, result in reduced performance. Sometimes broadcasts can trigger each other and create a storm. Sometimes there are intruders on the network, and packet traffic that looks innocuous may actually be an indicator of some nefarious activity. And there are lots of misconfiguration situations, like for firewalls, that high-level analysis using AI can diagnose.
Another challenge for DPI is that network bandwidth is ever increasing, which makes capturing and analyzing all of the traffic extremely difficult. To solve that problem you can use capture filters that reduce the effective speed of the traffic that you’re seeing. You can use triggers to limit the number of packets that are stored. And you can capture on network segments that only carry a subset of the traffic that you are interested in.
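Capture filters are typically written in BPF syntax and compiled so the kernel discards uninteresting packets before the analyzer ever copies them. Here is a sketch with libpcap, assuming `p` is an open capture handle like the one in the earlier sketch; the port choice is arbitrary.

```c
/* Sketch of a capture filter: compile a BPF expression so the
   kernel keeps only the traffic of interest. Assumes `p` is an
   open pcap handle. */
#include <pcap.h>

int apply_filter(pcap_t *p)
{
    struct bpf_program prog;
    /* keep only TCP traffic to or from port 443 */
    if (pcap_compile(p, &prog, "tcp port 443", 1,
                     PCAP_NETMASK_UNKNOWN) < 0)
        return -1;
    int rc = pcap_setfilter(p, &prog);
    pcap_freecode(&prog);
    return rc;
}
```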
Another issue is that networks have evolved from the original Ethernet and Token Ring and ARCNET. They have changed from a shared medium, where all of the computers sit on a common bus, and looking at that bus at any point can disclose all of the traffic, to more star-like configurations, where no single link carries all the traffic. In order to deal with that, you can embed the packet inspection code inside router nodes that link these various network segments. Or you can deploy DPI traffic relays on many of the segments, like the Distributed Sniffer System did.
These are all challenges. But none of them are killer issues for DPI, and DPI is used successfully in today’s networks in many, many ways.
Let me talk a little bit about the history. Most of the DPI products started in the 1980s. Hewlett-Packard had a protocol analyzer, the HP 4951, that they introduced in 1984. Excelan, which was one of the early networking companies, introduced the Nutcracker in 1984. It was really a set of boards that went into a server and turned it into a DPI analyzer. They introduced the LANalyzer in 1986, a separate tool that attached to the network. And in 1986 Network General introduced the Sniffer, and we became the largest seller of network DPI tools for the next ten years. We started in 1986 by selling four units, and ten years later we were selling 175 million dollars’ worth of Sniffers a year — close to 10,000 units a year.
We were motivated, as I said, by having started a networking company before that and realizing how difficult diagnosing network problems really was. We started humbly and, well, kind of amateurishly. We marketed and sold directly to end users through manufacturers’ reps. If they took one of our units in the field and demonstrated its use on a customer’s network, 99 out of 100 times that became a sale. The network manager’s eyes would open wide and they would say, “I just gotta have one of those things!”
One of the reasons it was so successful is that it had a very simple user interface that easily generated high-level displays. This may look like gibberish to you, but this is the language that network managers speak and understand. The Sniffer established the three-window synchronized, highlighted display that is now standard. The top window shows you a list of packets sent or received. The middle window shows you a decoding of the selected packet, based on the protocols discovered inside it. And the bottom window, if you’re interested, shows you the details of the bits and bytes that went into generating that interpretation.
We were also successful because we were network agnostic. We worked with any network, whether standardized or not: from IBM Token Ring and Token Bus, to Ethernet, to Datapoint ARCNET, AppleTalk, ISDN, Frame Relay, and so forth. Many of these are, thankfully, now obsolete. We were also network protocol agnostic. The Sniffer came with a list of a hundred or so standard network protocol interpreters. And if that wasn’t enough, you could write your own custom protocol interpreters in the C language and link them into the Sniffer. So customers could add their own interpreters, for protocols that we hadn’t yet gotten around to decoding, or for protocols they had created which they wanted to keep secret.
You could also force the interpretation of a protocol within a packet, because sometimes packets were encoded within other packets in nonstandard ways. Sometimes frames were damaged in a way that didn’t allow you to identify what the protocol was. And sometimes the Sniffer wasn’t able to capture the session establishment exchange and therefore didn’t understand enough about the context for the packet.
As I said, we started using AI back in the early 1990s — when it wasn’t as trendy as it is now — for understanding the high-level analysis of the packets we saw on the network. That was called the Expert Sniffer. We also ran a “Sniffer University” for network managers that trained people in diagnostic techniques for networks generally, and in the use of the Sniffer as a tool to aid in that. We advertised heavily in trade magazines. I love this 1987 ad for the Sniffer that says “Nobody can find the problem, but everybody knows whose fault it is”, and has a picture of two people pointing fingers at each other. There was a lot of finger pointing that went on in trying to understand what was going wrong with networks. We distributed thousands of free demo disks which you could pop into a personal computer to get a sense of what it was like to use a Sniffer on your network.
For a while we tried to protect the word “Sniffer” as a trademark that we had registered. We ran full-page ads, like this one in the Wall Street Journal in 1995, that tried to remind people that it was our trademark. But we filed no patents, only trademark registrations. So the Sniffer is prior art for Deep Packet Inspection technology, but because it doesn’t appear in the patent database, it is sometimes hard to discover. As of about ten years ago the trademark had lapsed, and now “sniffer” is a generic term for DPI tools.
Well, our market dominance couldn’t last, especially at the price point of about 20,000 dollars per Sniffer! In the 1990s a series of second-generation DPI monitors became available and began to be deployed. The most popular after a while was Ethereal, which has become Wireshark and remains to this day the most popular free combination network analyzer, sniffer, and DPI monitor. There’s another talk in this series by Gerald Combs, who was responsible for both Ethereal and Wireshark. I recommend that you watch that talk for more information about how DPI is being used today.
Thank you.
NOTE: This transcript has received minor edits from the original to improve readability.