Hi, my name is Jerry Mancini. I’m the senior director at NetScout within the CTO office, and I’m here today to talk about deep packet inspection for cybersecurity. Specifically, I’m talking about attackers who access your network and look for data to steal or data to compromise, by kicking off ransomware or exfiltrating data for their own use.
The attackers have many goals and many ways to penetrate your network, find data, and cause damage to your enterprise. My colleague Andrew talked about DDoS. DDoS is a different way to attack things. That is an attacker who’s trying to take down your network-facing systems, your public-facing systems, to cause damage to your network. It’s the same kind of network cybersecurity issue, but what we’re talking about here is attacks that include data exfiltration and corruption.
When we talk about cybersecurity, that’s the content of this discussion. As we get into it, here is a brief agenda. We’re going to start with a history of network cybersecurity, of how we’ve defended against these types of attacks over the last couple of decades. We’re going to talk about packet-based analysis, which is the analysis of a single packet to determine whether that single packet shows malicious intent, and how that has evolved to flow-based analysis, which is looking at many different packets, looking at a connection and all the data that’s transferred across it, and not only at a single packet.
We’ll look at how the network has changed. It’s no longer a single network with a defined network border. We have clouds, we have colos, we have people working from home. There are VPNs and zero trust. A lot of different things have changed the network, and we’ll talk about how that impacts cybersecurity. And finally, we’ll talk about how attackers go about their techniques and tactics for attacking your network, and how we take a defensive posture against those.
Let’s get into it. We’re going to start with the brief history of network cybersecurity. The whole focus here is network and deep packet inspection. If we go back a couple of decades and look at how networks were defined, there was a defined network perimeter. There was the internet, and then there were the enterprise’s internal assets, its internal network. Typically, a firewall was in place, and the firewall would decide whether to allow incoming data through or not. Now, the firewall a couple of decades ago was really built for inline processing of data. It had to keep up with the network, look at each packet, and decide whether to forward it or not. So, it’s looking at every single packet and making decisions.
The next device in the network would have been an intrusion prevention system, or IPS, with its close relative the intrusion detection system, or IDS. The IPS would be in line, and it would have the ability to look deeply into the packet. It took a little more time, because once the firewall let the packet through, the IPS could look deeper into it and decide whether to allow it to go forward or not. The IDS would be out of band, and it could sometimes do prevention using a TCP reset or maybe an integration with the firewall. So the IPS and IDS played different roles depending on whether they were in line or not, but the techniques of looking deeply into the packet are the same.
And finally, over the last couple of decades this market, and its definition, has changed many times. The current term that everyone understands is NDR, or network detection and response. Prior to that it had been called network traffic analysis, or NTA, and prior to that we had terms like NBAD and many others that defined this kind of market. Typically, this is a device that’s looking deeper into the flow, using NetFlow or metadata derived from the packets, and making decisions on that accumulated set of data rather than on any single packet like the firewall and the IPS.
This set of three different devices is what was really built to protect your internal network from outside attacks coming through that network perimeter. Now let’s talk about how this has evolved. On this chart, I’m showing those three different devices over time: the firewall; the IDS or the IPS, you can swap those if you’d like; and metadata, which is your NTA, your NDR, which is looking not packet by packet but across the important data that’s extracted from those packets.
Firewalls really started back in the late eighties. Now, I’ve got links in here. If you have access to the PowerPoint, you’ll be able to follow those links and see where I’m citing this data. In the late eighties DEC, Digital Equipment Corporation, put together a packet filter, which really built this idea. Later you had your stateful firewalls and your application firewalls. So, by the early nineties you really have these devices that are put in networks, and they’re able to look at ports and protocols and block access based on port and protocol, with stateful analysis and application analysis. So that’s really where that was.
The IDS ideas were coming around in the same time frame. Even as early as nineteen eighty there was a concept produced by NSA that later started to evolve into some products that hit the market. In the late nineties Snort comes about. Snort is, and continues to be, an open-source IDS implementation. It was brought to market by a company called Sourcefire, who took that open-source technology, put support around it, and sold it as a system. That’s now owned by Cisco. In 2010 Suricata started to come about, another open-source IDS, and we’ll see that Suricata, while continuing to be independent, is now absorbed into other types of technology as its own function.
If we now move ahead and look at the metadata, Cisco NetFlow came out in ninety-six. NetFlow, which we’ll talk about, is really also known as the five-tuple. So, it’s your source IP address, your source port, destination IP, destination port, and protocol. This can really help you understand which devices are talking to which devices, who’s talking to whom, and how they’re communicating. In the late nineties Bro came about. Later it was renamed Zeek, but Bro is really the ability to start to extract metadata. So, based on the type of protocol, I can extract other header information, and I have a deeper understanding of that communication than the five-tuple alone. For example, for HTTP I can add the user agent, the URL, the referrer, etcetera. That’s really where Bro and Zeek came about.
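As a rough illustration of the five-tuple idea, the sketch below groups packets into flows keyed by source IP, source port, destination IP, destination port, and protocol. The packet dictionaries and field names are assumptions made for this example; they are not the NetFlow export format or any product's schema.

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Flow key: who is talking to whom, and how."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def aggregate_flows(packets):
    """Group packets by five-tuple and total the packets and bytes per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = FiveTuple(pkt["src_ip"], pkt["src_port"],
                        pkt["dst_ip"], pkt["dst_port"], pkt["protocol"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["length"]
    return dict(flows)


# Hypothetical packets: two belong to one DNS flow, one to an HTTPS flow.
packets = [
    {"src_ip": "192.168.1.10", "src_port": 51000, "dst_ip": "192.168.1.2",
     "dst_port": 53, "protocol": "UDP", "length": 74},
    {"src_ip": "192.168.1.10", "src_port": 51000, "dst_ip": "192.168.1.2",
     "dst_port": 53, "protocol": "UDP", "length": 90},
    {"src_ip": "192.168.1.11", "src_port": 44321, "dst_ip": "203.0.113.5",
     "dst_port": 443, "protocol": "TCP", "length": 1500},
]
flows = aggregate_flows(packets)
for key, totals in flows.items():
    print(key, totals)
```

Protocol-specific metadata of the kind Bro and Zeek extract would add further fields to each flow, beyond these five.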
Now, you’ll notice that the firewall line also changed. In two thousand eight Palo Alto Networks came out with the next-generation firewall, which starts to bring in deep packet inspection. So, in addition to looking at ports and protocols like those versions in the early nineties, the next-generation firewall starts to bring in IDS capability. It’s now looking deeper into the packet, not only at ports and protocols, and it can essentially take over the job of the IPS. The firewall is an inline device built to forward packets at speed, and it can now look deeper into the packet using DPI. More recently, Palo Alto came out with their proactive next-generation firewall, which starts to use machine learning.
In addition to the IDS type of capability from a decade earlier, that now begins to add machine learning. On the metadata line, from twenty ten or so up until the present time, we have the emergence of the network detection and response market, or NDR, which is typically an out-of-band device. The response can be in line in some cases, from some vendors, but mostly the response is by integration with an inline device like a firewall. The NDR detects a problem by using machine learning and AI across the big data set of metadata that’s extracted from all of this traffic, and then works with other devices to prevent what has been detected.
We’re going to go through some more of these details. But if you look at the timeline here, it’s been several decades that this network security problem has been developing, through different products and product lines emerging primarily from three different markets. There are actually other markets embedded in there, because it’s not simply a firewall. You have a web application firewall, you have API security, etcetera.

If you look at what’s changed over the last twenty years, let’s start at the firewall. I mentioned this on the prior slide, but there used to be a time a couple of decades ago when your chat sessions, your social media, whatever they were, all ran on different ports and protocols. Over time everything has evolved to be on port eighty, which is HTTP, or four forty-three, which is HTTPS. And DNS is what makes the internet work.
The firewall has moved on from being a port-blocking device. It can no longer simply block ports that you don’t want on your network, because you can’t block port eighty or four forty-three. The firewall now includes DPI capability and IPS capability, and what you’ve seen in many cases is that this capability in the firewall has really replaced the IPS. You don’t need a separate device doing intrusion prevention when you’ve got the firewall doing that work. In the detection world, the IDS has really moved from looking at a single packet to looking at flows, because you still need to complement the firewall. If the firewall can do the IPS job, the packet-by-packet inspection deciding which packets go through or not, you still have the need to look at flows. The IDS can do that, and so can the NTA and NDR market.

There are other drivers that we’ll get into later. In the early two thousands, very few network sessions were encrypted. Today almost everything is encrypted. So, we’ll talk about what impact that has on security.
The move to the cloud has really changed the way networks are designed and the way networks look, and that changes how you handle security as well. And then we’ll talk about remote workers. A couple of decades ago that was rare. As we came up to the pandemic in twenty twenty there were more and more remote workers, and now that we’re past the pandemic it’s very prevalent, and that has changed the network design as well.

Let’s think about the prevent side of things. We’re talking about attackers, your adversaries, trying to come into your network. If you look at the graphic at the bottom, your threat prevention tools are blocking a lot. They’re blocking almost everything. But the inline device typically lets something through, because you can’t block everything, and that’s been the reality. Breaches do occur. The firewall works very well, but even if it’s, let’s say, ninety-eight or ninety-nine percent effective, breaches continue to happen.
Prevent can’t be the entire answer. From a defender’s point of view, you have to assume that breaches occur, and you now need to worry about how to detect them. That gets you to the detect side of the world, your IDS or your IPS, which have come in many forms over time. There’s the host IDS, or HIDS. There’s the network IDS, or NIDS. There’s the protocol IDS. There’s the application IDS. They look at and understand different things. As time has come along, you see the HIDS embedded in a lot of EDR, or endpoint detection and response, capabilities.
From a security standpoint, the world has really come to worrying about EDR, which includes HIDS. You also have HIDS embedded in your operating systems. So your host IDS is within the operating system, your Windows, your Apple, your iOS, and a lot of it comes down to how you configure that and set it up. The IT department can do that, or if you’re on a personal computer you can set it up yourself. The other forms, the network IDS, the protocol IDS, the application IDS, you see those evolving into techniques that are embedded within the NDR market, or network detection and response.
So the evolution from a security standpoint has been that all these different IDS concepts evolved into EDR and NDR. But all of these capabilities are still available as standalone systems, still being built and supported. There are inventions happening from the developers of those systems, as well as in the EDR and NDR markets, which include the same capabilities or in some cases embed them through sharing or open-source development. I talked earlier about these different terms, NIDS, NBAD, etcetera.
Let’s just talk about the terms really quickly. Network behavior anomaly detection, or NBAD, is a term that has been out there and still exists. You’ll still hear people talking about it. Same with network traffic analysis. Network detection and response has really become the main term, or the main market, today, and you have network analysis and visibility, or NAV, which is also used out in the marketplace. From my point of view, these are really four different terms for similar capabilities: you are looking at metadata and trying to establish information from it. The way the network-based IDS works is that you need to write a signature. You’re looking at packets and building a signature that says, I want to look for this activity in a packet because I know it to be malicious. That requires that somebody knows about the threat, and it’s packet-based analysis.
When you look at NDR, or any of the other terms listed here, we’re really analyzing traffic patterns, looking for anomalies or malicious behavior, not necessarily looking at any single packet at a time. The goal of NDR is that you don’t require knowledge of the specific threat. You don’t need to have seen it first, because you’re looking for anomalies, for strange behavior, and you’re looking at flows rather than a single packet. When you look at the NDR market, almost every vendor that puts out an NDR includes the NIDS capability. So the IDS capability has not gone away. It’s still available as a standalone market, where you have products that are purely IDS or IPS capability, but you also have it in NDR. With the advent of open-source Suricata, you’ll see that a lot of NDR vendors include open-source capability to support that NIDS. So when you know about something and you have a signature written to detect some threat, you can utilize those signatures: this is known behavior, this is known malware, this is known malicious behavior. We’re including all of that, and the NDR is now looking for that in addition to the unknown.
Let’s talk about the evolution from packet-based to flow-based, and we’re going to get into a little bit of detail here. This is an example of a DNS query. You see a screenshot from a NetScout product that shows the packets that have been captured for a DNS event. What I would like to do in this example is alert on any connection to a dot p w domain. I shouldn’t say connection, because this is DNS. I’m looking for a request, for somebody that’s requesting to connect with a dot p w domain. P w stands for professional web. Per Wikipedia, it’s one of several domains that are open to the public, where anyone can create an account. You don’t need to be an enterprise. And according to Wikipedia, seventy-five percent of dot p w domains are malicious domains.
I would want to see anyone trying to connect to those, so I can create a rule. This is getting into Suricata; that’s the Suricata rule. If you look into the details here, we’re looking for a UDP connection between my home net and the external net. Now, you define what home net is. That’s my network. What do I own? What is my network? That can sometimes be harder to answer when your network includes cloud, colo, and on-prem devices, but you can typically define what your network looks like. What are your public IP addresses? You’re not going to have many public IP addresses. So you can define what that is, and then everything else is outside.
If you want to go through the details of this rule, it’s UDP from something on my internal network to my external network, on port fifty-three, which is DNS. There’s a message, which is what would be reported if I find this. And when you get to content, now we’re looking for specific bits. We’re looking into the packet. If you look at the specific bits that are highlighted in the rule and you look at the packet, you can see where they match. That content really refers to a DNS request for something in dot p w. You’re looking at a certain depth and offset, and the content is your p w here. Suricata rules can get very detailed and very precise. They can look at anything within the packet.
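The rule itself is on the slide and is not reproduced in this transcript, so the following is only a Python sketch of the same kind of check, not the Suricata rule. It assumes a plain DNS-over-UDP payload with a 12-byte header followed by length-prefixed labels and no name compression. The helper names are hypothetical. It shows two approaches: a raw byte match for the encoded `pw` label, similar in spirit to a content match, and a parse of the query name.

```python
import struct

DNS_HEADER_LEN = 12           # fixed-size DNS header precedes the question
PW_LABEL = b"\x02pw\x00"      # length byte 2, "pw", then the root terminator


def build_dns_query(name, txid=0x1234):
    """Build a minimal DNS query payload for testing (type A, class IN)."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split("."))
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)


def parse_qname(payload):
    """Decode the length-prefixed labels of the first question name."""
    labels = []
    offset = DNS_HEADER_LEN
    while offset < len(payload):
        length = payload[offset]
        if length == 0:
            break
        label = payload[offset + 1: offset + 1 + length]
        labels.append(label.decode("ascii", errors="replace"))
        offset += 1 + length
    return ".".join(labels)


def raw_content_match(payload):
    """Byte-pattern check on the bytes after the header (case-sensitive)."""
    return PW_LABEL in payload[DNS_HEADER_LEN:]


def queries_tld(payload, tld="pw"):
    """Parsed check: is the last label of the query name the given TLD?"""
    return parse_qname(payload).lower().split(".")[-1] == tld


query = build_dns_query("example.pw")
print(parse_qname(query), raw_content_match(query), queries_tld(query))
```

The raw byte match is fast but literal. The parsed check handles upper-case names as well, at the cost of decoding the question section.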
It’s purely a deep packet inspection capability, allowing you to define what you want to look for and, if you find it, what message to present. Following the same example, I’ve got the response to that request. If I look at the query and the response, the response came back successfully. So DNS provided the IP address for the requested dot p w host. That means the host can now make a connection there, because it knows the IP address. I showed you how we found that dot p w by looking at a single packet, but sometimes that’s not enough.
Here’s a different example of DNS, and this is another screenshot from a NetScout product, where we’re looking over time at all of these different DNS requests. You can see the query and you can see the time here. The client is requesting data from the DNS server, and this is your server address over here. But the response code is an error, which means I’ve queried for a domain, like this one that’s highlighted, and the DNS server said name not found. So, what could that mean? It could mean that somebody’s trying to access a site that doesn’t exist. But if you see this repeated over and over again, as we’re showing here, it could mean that you have malware that’s trying to phone home. We’ve seen this in threat activity, which we’ll talk about later, where we’ve accumulated all the different threats that are out there. One way this happens is that a piece of malware will try to phone home and find its command-and-control site, but I’ve got malware that’s distributed everywhere.
I may not have a command-and-control site available yet, or maybe I have it built but I only want it to respond at a certain date and time or for a certain activity. So when you see this repeated request, that could mean that malware is trying to phone home. If you look at these queries over time, you also see these long names that are unreadable. There are many legitimate websites out there, ad servers and other things like that, plus malicious servers, that generate names automatically. That could be what this is.
There are other techniques for attackers when they want to steal data. They know DNS is open. They know port fifty-three is open. So, speaking as the attacker: I’m going to take all that data, I’m going to encrypt it, and I’m going to send it out as a series of queries. I own this DNS server. I get a name error because these sites don’t exist, but I am actually exfiltrating the data that I want to steal. I’ve encoded it with some method and sent it to a server that I control, so I can capture all this data and decode it. And now I’ve stolen all your data. So this set of queries could mean many things. If you look at any single packet, there could be many legitimate reasons that this client is querying for this domain name. But when you look at the whole thing put together, I may have a problem. So here’s your single DNS request. There is no indication of a problem here. It’s a request out to a certain site. I don’t have a Suricata rule that’s going to fire there.
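To make the "look at the whole thing put together" point concrete, here is a minimal sketch that summarizes DNS metadata per client and flags clients with many name errors or many unusually long query names. The record fields, the length cutoff of 40 characters, and the count threshold of 20 are all assumptions chosen for illustration, not values from the talk or from any product.

```python
NXDOMAIN = 3  # DNS response code 3 is "name error" (name not found)


def summarize_by_client(records, long_name_len=40):
    """Count queries, name errors, and long query names for each client."""
    stats = {}
    for rec in records:
        s = stats.setdefault(
            rec["client"], {"queries": 0, "nxdomain": 0, "long_names": 0})
        s["queries"] += 1
        if rec["rcode"] == NXDOMAIN:
            s["nxdomain"] += 1
        if len(rec["query"]) >= long_name_len:
            s["long_names"] += 1
    return stats


def flag_clients(stats, min_count=20):
    """Return clients whose name-error or long-name count reaches min_count."""
    return sorted(
        client for client, s in stats.items()
        if s["nxdomain"] >= min_count or s["long_names"] >= min_count)


# Hypothetical metadata: one client sends 25 long, failing queries;
# another makes one normal query and one typo.
records = [
    {"client": "192.168.1.10", "query": f"{i:048x}.example.net",
     "rcode": NXDOMAIN}
    for i in range(25)
]
records.append({"client": "192.168.1.11", "query": "www.example.com",
                "rcode": 0})
records.append({"client": "192.168.1.11", "query": "typo-exmaple.com",
                "rcode": NXDOMAIN})

stats = summarize_by_client(records)
flagged = flag_clients(stats)
print(flagged)
```

No single record here is suspicious on its own. Only the per-client totals separate the two clients.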
On the response side I don’t have an indication that there’s a problem either, because a name error is common. It happens all the time. It doesn’t necessarily mean that something malicious is occurring. But as I mentioned earlier, it could be used to extract your data. As a defender, I can use deep packet inspection to extract metadata and perform my flow analysis over the full set. If I go back a slide, do I need the full packet to analyze this? I don’t. I just need the metadata, and you see the fields here for DNS. This is a subset of the fields that would be there for DNS. I don’t need the whole packet to come up with this analysis.
There are many examples of metadata. The common one is NetFlow; I talked about flow earlier. You see the fields listed here, also known as the five-tuple. You’ll hear people talk about the five-tuple. Many times the common fields in metadata also include the start and end time and how many bytes were transferred. That is common across every protocol, but then you get into protocol-specific things. You see a few examples listed here. For DNS there’s the query value, the query type, the class, the resolved address, the TTL, the response code. For HTTP there’s a subset of the data there, and for SMB you see some of the data as well. You could list all the common protocols that you see on the network and come up with the metadata that you need. This is just a small subset for example purposes. If you look at what it takes to store all this data, metadata is typically about two percent of the storage required for the full packets, but we’re pulling out the most important data.
There are a lot of occasions, especially when you’re investigating and looking for forensic data, when you need the full packet to truly understand what was transferred. But you can find many examples of malicious activity just by looking at the metadata, which doesn’t cost you as much to store as the full packets.
Maybe you store the metadata for a longer period of time than you do the packets. Here’s an example for HTTP. This would have been a very wide screenshot, so we broke it up into two sections for readability. You can see the URI, the user agent, the referrer, the query values, the query names, the response code, etcetera. It’s all there. This is the most important data that shows an HTTP connection between a client and a server. For DNS, here’s an example in JSON.
The security tools have ways to communicate with each other. They have ways to extract data and send data. There are many different formats out there, and JSON has become quite popular. This is showing the JSON for a request from a certain IP address in the one ninety-two one sixty-eight range to a DNS server, another internal server at dot two, on port fifty-three. You see in here the bytes, the response code, and how many packets were transferred, all stored in JSON format.
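The JSON shown on the slide is not in the transcript, so the sketch below only illustrates what such a DNS metadata record could look like when serialized. Every field name and value is an assumption made for this example. It follows the description above: a 192.168 client, an internal DNS server at .2 on port 53, a response code, and packet and byte counts.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DnsFlowRecord:
    """Hypothetical DNS metadata record; the field names are illustrative only."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    query: str
    response_code: str
    packet_count: int
    byte_count: int


record = DnsFlowRecord(
    src_ip="192.168.1.10", src_port=51000,
    dst_ip="192.168.1.2", dst_port=53, protocol="UDP",
    query="example.pw", response_code="NOERROR",
    packet_count=2, byte_count=164,
)

# Serialize for another security tool, then read it back as that tool would.
encoded = json.dumps(asdict(record), indent=2)
decoded = json.loads(encoded)
print(encoded)
```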
Machine learning and AI have become very popular in network cybersecurity. But these techniques were developed for many big data problems, not necessarily designed or originally created for use in cybersecurity. Think about how data works. Everyone has had the experience where you’re looking to buy, let’s say, a new pair of shoes. You do some searches on shoes, and then you go to your social media and all the ads popping up are about shoes. These are ad-level capabilities. This is big data. This is capturing data about what you’re doing as a user and pushing that back into creating ads that will be interesting for you.
A lot of big data is essentially just capturing a lot of data, capturing user activity, capturing network activity, and then using all that data for many different purposes. The algorithms behind it are very similar, and they’re all based on the same research and the same techniques. So whether you’re working in marketing or sales for a company or you’re in cybersecurity, you’re essentially following the same techniques that have been invented for working through lots of data.
In cybersecurity, the metadata is the data that is being used for these different techniques. Supervised learning requires labeled data. For security, that means I need to know what’s good behavior and what’s bad behavior, and label that by extracting features. Features are your metadata. This could be usernames, it could be the URLs that you’re accessing, it could be the queries that you’re doing for DNS, all of that metadata. You extract the features and you label them. Meaning, how do you know what’s bad?
Back to the DNS example that I gave before: for supervised learning, bad would be a repeated list of long DNS queries from a client to a server. It could be a repeated list of name-error responses. Those were two different things that we talked about earlier, but you can describe what is bad behavior within the realm of good.
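As a sketch of what "extracting features" from DNS metadata might look like before a supervised model is trained, the code below turns a query name into a few numeric features and pairs them with labels. The specific features (name length, entropy of the first label, digit ratio) and the two labeled examples are assumptions for illustration. The talk does not say which features or model are used.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy in bits per character; higher means more random-looking."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def dns_features(query):
    """Turn one DNS query name into numeric features for a classifier."""
    first_label = query.split(".")[0]
    digits = sum(ch.isdigit() for ch in query)
    return {
        "name_length": len(query),
        "label_entropy": shannon_entropy(first_label),
        "digit_ratio": digits / len(query) if query else 0.0,
    }


# Hypothetical labeled data: 0 = good, 1 = bad.
labeled = [
    ("www.example.com", 0),
    ("a9f3c1e07b5d2468ace013579bdf2468.example.net", 1),
]
training_rows = [(dns_features(query), label) for query, label in labeled]
for features, label in training_rows:
    print(label, features)
```

Rows like these, in much larger numbers, are what a supervised learning algorithm would be trained on.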
Unsupervised learning requires you to find some kind of baseline. What is normal? Normal can be hard to describe. Earlier I said there’s a long list of name errors. Well, maybe there is normal behavior that includes name errors, but is it two of them? Is it ten of them? And over what period of time? What is normal? Then you look for anomalies, for things that are different from normal.
Say I look at all the DNS queries that have occurred within my network, and I find that within, let’s say, a one-hour period it’s normal to have an average of five responses coming back from DNS with the response code name unknown. Then you look for an anomaly, which would be a count that is one standard deviation, or many standard deviations, away from that normal. That’s really unsupervised learning.
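The baseline-and-deviation idea can be sketched with a simple z-score: how many standard deviations is this hour's count from the historical mean? The baseline counts and the threshold of three standard deviations below are assumptions for illustration. A real system would likely use a more robust model of normal.

```python
from statistics import mean, stdev


def z_score(baseline, observed):
    """Number of standard deviations the observed count is from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Flat baseline: any change at all is infinitely unusual.
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma


def is_anomaly(baseline, observed, threshold=3.0):
    """Flag counts that sit more than `threshold` deviations above normal."""
    return z_score(baseline, observed) > threshold


# Hypothetical hourly counts of name-unknown responses, averaging five.
hourly_name_errors = [4, 5, 6, 5, 4, 6, 5, 5]

print(is_anomaly(hourly_name_errors, 6))    # a slightly busy hour
print(is_anomaly(hourly_name_errors, 60))   # far outside the baseline
```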
Then you get into neural networks. Neural networks and deep learning use a multitude of different algorithms, and that’s shown on the chart on the right here. You have input from many different sources, all of your data here. So maybe I have a supervised learning algorithm that’s looking at that DNS traffic, and it determines that this looks bad because I have long query names. My query names are long too often, and this matches a feature behavior that I’ve learned.
Maybe I have a different model for unsupervised learning running here that’s looking at what’s normal based on how many name-unknown responses I get, and that has detected something as well. Now, the supervised learning has some false positive results, because it’s hard to define good versus bad. What I think looks bad, I’m going to alert on and say this is bad, but it really isn’t. That’s called a false positive. But say I take that supervised learning model, my example here was the long query names, and I couple it with another piece of data, the unsupervised learning for the response codes, and something there is out of whack. It’s an anomaly. It’s many standard deviations away from normal. And I go through this set of algorithms.
Then I can find a way to reduce my false positives, because I’m combining the outputs of many different sources to come up with a final output that has fewer false positives. In the cybersecurity world, that’s where neural networks and deep learning really come into play: pulling together a multitude of different techniques to analyze your big data and come out with an outcome. Similar algorithms are used in other areas, like the marketing and sales campaigns I mentioned earlier, and other types of big data.
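As a toy illustration of combining detector outputs, the sketch below feeds two scores (one from a supervised long-query-name model, one from an unsupervised name-error anomaly model) into a single logistic unit. This is not a trained neural network. The weights, bias, and threshold are hand-picked assumptions so that an alert fires only when both inputs are high. In practice those weights would be learned from data.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def combined_score(supervised_score, anomaly_score,
                   weights=(4.0, 4.0), bias=-6.0):
    """One logistic unit over two detector outputs, each scaled to 0..1."""
    z = weights[0] * supervised_score + weights[1] * anomaly_score + bias
    return sigmoid(z)


def should_alert(supervised_score, anomaly_score, threshold=0.5):
    """Alert only when the combined evidence crosses the threshold."""
    return combined_score(supervised_score, anomaly_score) >= threshold


# Long query names alone: possibly just an ad server, so no alert.
print(should_alert(1.0, 0.0))
# Long query names plus a name-error anomaly: alert.
print(should_alert(1.0, 1.0))
```

With these weights, either detector firing alone stays below the threshold, which is the false-positive reduction described above.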
What is NDR? Network detection and response. There are over fifty vendors out there trying to sell solutions in this NDR category. NetScout is one of them. NDR really needs deep packet analysis for metadata extraction. Many times, it will include the IDS or signature type of capability to detect known activity in a single packet or a flow of packets. It usually includes indicators of compromise, which means it becomes known that a certain IP address, domain, website, or file is malicious, for example that the website is a command-and-control server or a malware distributor.
I can push out feeds, these lists of known IP addresses, domains, file names, etcetera, and I can match against those, so I know that activity. So between the IDS and the IOC capabilities, that’s really finding your known. This has been known to be bad, and now I can detect it. Then I can extract my metadata and use ML or AI across that big data set to determine the unknown. So NDR really comes out of these different categories, with fifty vendors all trying to find something different or do something better than the others, and that’s really where the inventions are coming about: adapting techniques that were built for other domains into security, and combining that with deep packet inspection to extract the right data to feed into the algorithms, and also to perform the detection within the packets themselves.
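Matching against an indicator-of-compromise feed can be as simple as set lookups over the extracted metadata. In the sketch below the feed contents use reserved documentation addresses and made-up `.example` domains, and the record fields are hypothetical. Real feeds are much larger and are updated continuously.

```python
# Hypothetical IOC feed entries (documentation IP ranges, made-up domains).
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}
KNOWN_BAD_DOMAINS = {"malware-c2.example", "bad-downloads.example"}


def match_iocs(record):
    """Return the indicators that a metadata record matches, if any."""
    hits = []
    if record.get("dst_ip") in KNOWN_BAD_IPS:
        hits.append(("ip", record["dst_ip"]))
    domain = record.get("query", "").lower().rstrip(".")
    for bad in KNOWN_BAD_DOMAINS:
        # Match the listed domain itself or any subdomain of it.
        if domain == bad or domain.endswith("." + bad):
            hits.append(("domain", bad))
    return hits


print(match_iocs({"dst_ip": "203.0.113.7", "query": "www.example.com"}))
print(match_iocs({"dst_ip": "192.0.2.1", "query": "beacon.malware-c2.example"}))
print(match_iocs({"dst_ip": "192.0.2.1", "query": "www.example.com"}))
```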
That covers the known detection capabilities from an IDS or an IOC. Now let’s look at modern networks and the impact of encryption, cloud, and the remote workforce. Let’s start with encryption. Encryption is about privacy. It’s protecting your own private data. But from a security standpoint, it allows attackers to hide. So how do you deal with that? There are many algorithms out there that attempt to find malicious activity within encrypted data.
You can look at JA3. You can look at the encryption type and the encryption algorithms. There are certain things that you can find there, and you can find malicious activity. You can look at the rate of traffic and the size of the traffic, where you’re trying to determine whether there is a person on the end who’s creating the network activity, or whether it’s being auto-generated by malware or an algorithm that an attacker would use.
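For context on JA3: as publicly documented, it fingerprints a TLS client by taking five fields from the Client Hello (TLS version, cipher suites, extensions, elliptic curves, and elliptic curve point formats), joining the values within each field with dashes and the fields with commas, and taking the MD5 hash of that string. The sketch below assumes those fields have already been parsed out of the packet as decimal integers. It omits details such as dropping GREASE values, and the sample values are illustrative only.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated, dash-joined JA3 input string."""
    fields = [str(version)]
    for values in (ciphers, extensions, curves, point_formats):
        fields.append("-".join(str(v) for v in values))
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string; used as a fingerprint, not for security."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Illustrative Client Hello values (assumed already parsed from the packet).
sample = dict(version=769, ciphers=[47, 53], extensions=[0, 10, 11],
              curves=[23, 24], point_formats=[0])
print(ja3_string(**sample))
print(ja3_hash(**sample))
```

Because the Client Hello is sent before encryption starts, this fingerprint can be computed without decrypting anything. That is why it can help identify client software, including some malware, inside otherwise encrypted traffic.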
There are inventions happening around detecting malicious activity within encryption. But what’s used most commonly is what you see here, a break-and-inspect tool, where encrypted traffic flows through the device and the device itself can decrypt TLS if it has the keys. If you provide it with the keys for SSH, it can decrypt that kind of activity as well, and send it off to your other tools, like your network performance analysis tools and your security tools, which really work better with unencrypted data.
If you have a break-and-inspect device, you’ll see inventions happening around break and inspect and what occurs in this middle device, so that the security tools and any other kind of network tool, like network performance, can perform at their best by having access to the full packets.
Next is digital transformation. Data is now in the cloud, so how do you get access to the packets? That can be difficult. It has been difficult, but the main cloud vendors, AWS, Azure, Google, Oracle, have recognized that and have offered solutions to tap the data. So you can still put a network device into your private cloud, so that you can understand the data that’s flowing through your private cloud. It sometimes requires you to route the traffic to get it to the security device.
There is a little bit of extra work involved in setting up the environment, but there are capabilities to tap the data, extract your packets, and provide them to a packet inspection solution. In some cases the NDR solution, the network detection and response solution, provides other ways to work, using logs, whether that’s a cloud log or a firewall log or any other type of data. But now you’re bringing in not the packet but limited data. Sometimes that’s easier to get to than the packets, and that’s why it has become prevalent. But if attackers are getting into your network, getting into your private cloud, and they can compromise systems, then they can also compromise those logs.
And you always have to ask, when you’re dealing with logs, whether you’re dealing with the truth, because you don’t actually see the packets. You don’t have a device that is uncompromised. The packets are uncompromised, because packets are how attackers and legitimate traffic alike flow across your network, whether it’s on prem or in the cloud. NDR devices need to find a way either, as in the top bullet, to get a device into the cloud network to find and read the packets, or to get logs into the system and do as well as possible with the log data.
As I mentioned earlier in this presentation, the remote workforce was always growing, but it was really accelerated by the pandemic a few years ago. So now you’re looking at employees and contractors who can access data over either a VPN or a zero trust solution. Some of these are now offered as a SaaS solution in the cloud, so you don’t need to put a VPN device on your network. You can route everything through analysis that’s done in the cloud by a SaaS provider. But DPI is still used to analyze that traffic, whether it’s the SaaS provider’s device or your own. You can analyze the traffic, but the issue you have now is that when you have SaaS applications, you don’t have a single solution that can see end to end.
You have to patch all these systems together to really analyze all the data that’s flowing and getting into your critical applications. The remote workforce plus the cloud has really turned the traditional network into something that involves a bunch of other terms that I’ve listed here. SASE is very popular: secure access service edge, which is really taking the firewall, IDS, and NDR that I described earlier and pushing them into the cloud. Your users route traffic through the cloud, your customers who are accessing your website are routed through the same thing, and all of that analysis is now done in the cloud rather than on devices that are on prem.
CASB is also like a firewall or an IDS, or even a data loss prevention solution, that’s really looking at data going to the cloud. Zero trust solutions have come about that are really combining your IDS and multiple different versions of IDS, including host-based IDS. They allow users to get access to your data based not only on port and protocol, like before, but also on many other factors about the status of the host. Does the host look to be compromised?
What data are you trying to access? Where is the host located? If it’s a laptop, it could be in many different places. And then full network visibility is also a solution that’s out there, which is really trying to provide visibility into the full network end to end, really trying to define what the enterprise data is, how users are getting to it, and whether there is malicious traffic across the whole thing. So full network visibility is not only at the edge of your network, where your firewall would be, but across all of your internal communications, whether those are people on prem, people accessing it from outside, or people coming through the cloud.
In the last section here, we’re going to look at the attacker playbooks and how you can take a defensive posture, or how a defender reacts to this information. Let’s start by talking about the ATT&CK framework, which was developed by MITRE. They update it every six months or so. As of the last time I updated this slide, there are fourteen tactics and over two hundred and twenty-five techniques.
The tactics are the column headings across the top: reconnaissance, resource development, initial access, etcetera. Those are your tactics, and you can think about them as the attacker starting on the left and moving to the right. When they finally hit exfiltration or impact, they’ve accomplished their mission. The techniques are listed vertically under each tactic, and they are the ways an attacker can accomplish that tactic. So in order to move to the right, I need to accomplish certain things by deploying a technique.
In some cases a single technique will get you to the next tactic. In other cases you need many techniques to get you to the next tactic. Let’s take a deeper dive into the tactics.
This is the same list, from the same screenshot, but it’s been moved so that you can read it, and that’s what the arrow is supposed to represent. Let’s talk about things at a very high level. Reconnaissance and resource development is really the attacker doing their R&D. With reconnaissance, they’re trying to figure out what’s on your network. How can I get to your network? What ports and protocols are open? How does your enterprise communicate with the rest of the world? Once I have that information, I can go off and do my R&D, which is to develop my malware and develop my infrastructure. Maybe I can reuse things that are already built. Maybe I can go on the black market and buy things. But this is all about the attacker understanding who their target is, whether the target is worth their time to attack, and then developing how they’re going to go about it.
Once they have that built, they figure out how to get their initial access. The goal here is to find a way to compromise at least one computer that has access to your network. It may not need to be on your network. I can access that computer and compromise it while you’re sitting at your laptop working in a coffee shop or in a hotel. So it doesn’t necessarily have to be connected to the network, but I get initial access and I execute my plan for how to compromise that system.
The next few stages here are persistence, privilege escalation, and defense evasion. The whole goal here is this: I found a way to attack somebody and I’ve compromised that system, and I want to make sure that I can continue to access it. So whether they reboot or whatever they do, I want the highest privilege on that system so I can continue to get access. I also want to execute defense evasion, which you can think about this way: I’ve compromised a computer, and I want to make sure that the computer doesn’t know I’m there, and that despite other tools that are in place, like the EDR and the firewalls, I can continue to get to that device and nobody’s going to find it.
Once I’ve gotten to this stage, I’ve compromised the system. I can get into that computer. Maybe it’s a personal laptop, maybe it’s a web server, maybe it’s something in the cloud, maybe it’s a cloud application. Whatever it may be, I have compromised it and I have access to it. I could stop here. Let’s say it’s a personal laptop: I steal all your data, I compromise you, and then I’m done. But when we’re talking about an enterprise that is being attacked, I’ve only accomplished the first step, because I’ve compromised a single device. Now where’s the rest of the data?
And that’s really what happens next. This is where I get credentials: I steal logins, I steal usernames and passwords. I can discover what is in the network. Where is the data? Where’s the data that I want to steal? How do I move from one compromised system to another and compromise another system? That’s your lateral movement. So I can move to a system that has more data or better access. Collection is collecting all the data that I’ve found and want to steal, but I’m going to keep it inside the network.
Exfiltration is when I pull it all out. I could collect it all first and pull it all out at once, or I could exfiltrate it a little bit at a time. There are many different techniques for that. Command and control can really be chronologically anywhere along this whole path, but that is the way that I, as an attacker, communicate with the devices that I’ve compromised. And then finally, impact is where I would execute my ransomware, or tell the enterprise I’ve stolen all this data and ask for a ransom, whatever my final impact is. There are many different techniques within the impact tactic.
Now let’s look at it from a DPI point of view. Where does DPI really work? At the reconnaissance phase, the attacker may be running some scans and doing some probes to try to figure out what ports and protocols are open. You can see that with DPI. Now, I’ll also tell you that you will find constant attacks on every network. There’s constant reconnaissance occurring, where many different people on the internet are just searching for what’s open, who has what, and what ports and protocols are open.
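One simple way to picture how scans show up in inspected traffic is to count how many distinct destination ports a source touches on a single target. The sketch below does that in Python; the function name, the tuple layout, and the `port_threshold` of 20 are assumptions for illustration only.

```python
from collections import defaultdict

def find_scanners(connection_attempts, port_threshold=20):
    """Flag (source, target) pairs where the source probed many distinct ports.

    connection_attempts: iterable of (src_ip, dst_ip, dst_port) tuples
    port_threshold: illustrative cutoff (an assumption), not a recommended value
    Returns a dict mapping (src_ip, dst_ip) to the number of distinct ports seen.
    """
    ports_by_pair = defaultdict(set)
    for src_ip, dst_ip, dst_port in connection_attempts:
        ports_by_pair[(src_ip, dst_ip)].add(dst_port)
    return {
        pair: len(ports)
        for pair, ports in ports_by_pair.items()
        if len(ports) >= port_threshold
    }
```

A client that hits port 443 on one server thirty times is not flagged, because it only ever touches one port. Given how constant this background scanning is, a detection like this is more useful as context for later stages than as an alert on its own.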
Initial access is really where the attacker first accesses a system. Now, that DPI could be on the endpoint itself, which would work whether you’re on the network or not. Or if it’s on your network, or in the cloud, or wherever your critical assets are, this is how you would look at the information that has passed. Reconnaissance and initial access are typically referred to as north-south traffic. This is the traffic that is coming from the attacker, across the internet, into your enterprise. Once they’ve compromised the system, the traffic is called east-west.
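The north-south versus east-west distinction can be sketched as a simple address check. The example below uses Python’s standard `ipaddress` module and assumes, purely for illustration, that the enterprise’s internal space is the three RFC 1918 private ranges; a real deployment would list its own networks, including cloud address space.

```python
import ipaddress

# Assumption for the example: internal space is the RFC 1918 private ranges.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_internal(ip):
    """True if the address falls inside one of the internal networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETWORKS)

def traffic_direction(src_ip, dst_ip):
    """Label a connection by whether it crosses the enterprise boundary."""
    src_in, dst_in = is_internal(src_ip), is_internal(dst_ip)
    if src_in and dst_in:
        return "east-west"    # both ends inside the enterprise
    if src_in or dst_in:
        return "north-south"  # one end inside, one end outside
    return "external"         # neither end is ours
```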
I’m now inside the network. I’ve got a compromised system, and I’m going to use that system to try to get credentials, discover what’s out there, move to other systems, etcetera. So this is working from a compromised host, or from many different compromised hosts, and it’s typically east-west. This is where DPI would play. Initial access and command and control would be north-south, and then how the attacker moves within the network is east-west. The practical application really comes in when you discover something and tag it with the MITRE technique. If I have these different tags, I can use them as part of my metadata.
Any single detection can be labeled, like in this case, as lateral movement. I can couple that detection with others to form the entire attack pattern, or the entire story. MITRE provides a lot of language to help you understand what lateral movement is, and in this case the technique was lateral tool transfer. There’s a lot of different data available.
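Here is a minimal sketch of what tagging and coupling detections might look like in Python. The `Detection` record and the `attack_story` function are hypothetical names for this example; the tactic names follow the fourteen enterprise tactics in left-to-right order, and the technique IDs shown are the published ATT&CK identifiers for those technique names.

```python
from dataclasses import dataclass

# The fourteen enterprise tactics, in left-to-right matrix order.
TACTIC_ORDER = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion",
    "Credential Access", "Discovery", "Lateral Movement", "Collection",
    "Command and Control", "Exfiltration", "Impact",
]

@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record carrying its ATT&CK tags as metadata."""
    timestamp: float
    host: str
    tactic: str
    technique: str

def attack_story(detections):
    """Order tagged detections by tactic, then time, to read as one story."""
    ordered = sorted(
        detections,
        key=lambda d: (TACTIC_ORDER.index(d.tactic), d.timestamp),
    )
    return [f"{d.tactic}: {d.technique} on {d.host}" for d in ordered]

# Illustrative detections (made-up hosts and times).
detections = [
    Detection(300.0, "db-01", "Exfiltration", "T1041 Exfiltration Over C2 Channel"),
    Detection(200.0, "db-01", "Lateral Movement", "T1570 Lateral Tool Transfer"),
    Detection(100.0, "laptop-17", "Credential Access", "T1003 OS Credential Dumping"),
]
```

Sorting by tactic first is a design choice for readability. Since command and control can occur anywhere along the path, sorting purely by timestamp may tell the story better for some investigations.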
In summary, on the state of network cyber security: DPI, deep packet inspection, is the core technology. Everything in the network starts with packets. Packet headers are used to create NetFlow, which tells you who talks to whom. That’s the five-tuple that we talked about earlier. Deep packet inspection leads to packet inspection, with your IDS signatures as an example. DPI is also used to derive metadata, which can then be a historical representation of important content.
URLs, IP addresses, DNS query names, all the examples that we talked about earlier: you can keep that data for a while. It also provides your flow analysis, so you’re not looking at a single packet but at the important data from many packets over time, or within a single connection. And that is the data that is then sent to machine learning and AI algorithms, which, as we talked about earlier, are used for many different big data problems.
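To show how packet headers roll up into flow records keyed by the five-tuple, here is a minimal Python sketch. The packet tuple layout, `FlowStats`, and `aggregate_flows` are assumptions for the example and not the NetFlow export format itself; each direction of a conversation is counted as its own flow.

```python
from dataclasses import dataclass

@dataclass
class FlowStats:
    """Running totals for one flow."""
    packets: int = 0
    byte_count: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0

def aggregate_flows(packets):
    """Group packet headers into flows keyed by the five-tuple.

    packets: iterable of
        (timestamp, src_ip, src_port, dst_ip, dst_port, protocol, length)
    Returns a dict mapping the five-tuple to its FlowStats.
    """
    flows = {}
    for ts, src_ip, src_port, dst_ip, dst_port, proto, length in packets:
        key = (src_ip, src_port, dst_ip, dst_port, proto)  # the five-tuple
        stats = flows.get(key)
        if stats is None:
            stats = flows[key] = FlowStats(first_seen=ts, last_seen=ts)
        stats.packets += 1
        stats.byte_count += length
        stats.last_seen = max(stats.last_seen, ts)
    return flows
```

Records like these are far smaller than the packets they summarize, which is what makes it practical to keep them for a while and feed them to the algorithms.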
Cyber security is one of them, and it’s based on the metadata that is pulled out of the packets. All the advancements that we’re talking about from a network security standpoint require deep packet inspection. This is a public-domain screenshot from a reseller that lays out all the different products that are available out there. This slide is hard to read, and that’s kind of the point: there are many different markets.
Network security is in this box, and so is web security, which is also network traffic. If you look through every one of these different boxes, the majority of them require DPI, deep packet inspection, to take the data that’s coming off of any communication and understand, within each of these markets, what you are trying to prevent.
Cybersecurity is a complicated process and a complicated issue, with many different vendors, many markets, and many different products, most of which are using DPI at the core of their capabilities. DPI is really what provides the data that a lot of different algorithms and techniques rely on.
Thank you for your time. It’s been a pleasure to go through network cyber security and talk about the history and the different capabilities that are out there across many different markets. Thank you.