Mastering Palo Alto Networks
上QQ阅读APP看书,第一时间看更新

Understanding App-ID and Content-ID

App-ID and Content-ID are two technologies that go hand in hand and make up the core inspection mechanism. They ensure applications are identified and act as expected, threats are intercepted and action is applied based on a configurable policy, and data exfiltration is prevented.

How App-ID gives more control

Determining which application is contained within a specific data flow is the cornerstone of any next-generation firewall. It can no longer be assumed that any sessions using TCP port 80 and 443 are simply plaintext or encrypted web browsing, as today's applications predominantly use these ports as their base transport and many malware developers have leveraged this convergence to well-known ports in an attempt to masquerade their malware as legitimate web traffic while exfiltrating sensitive information or downloading more malicious payloads into an infected host:

Figure 1.4 – How App-ID classifies applications

When a packet is received, App-ID will go through several stages to identify just what something is. First, the 6-Tuple is checked against the security policy to verify whether a certain source, destination, protocol, and port combination is allowed or not. This will take care of low-hanging fruit if all the unnecessary ports have been closed off and unusual destination ports can already be rejected. Next, the packets will be checked against known application signatures and the app cache to see if the session can be rapidly identified, followed by a second security policy check against the application, now adding App-ID to the required set of identifiers for the security policy to allow the session through.

If at this time or in future policy checks, it is determined that the application is SSH, TLS, or SSL, a secondary policy check is performed to verify whether decryption needs to be applied. If a decryption policy exists, the session will go through decryption and will then be checked again for a known application signature, as the session encapsulated inside TLS or SSH may be something entirely different.

If in this step, the application has not been identified (a maximum of 4 packets after the handshake, or 2,000 bytes), App-ID will use the base protocol to determine which decoder to use to analyze the packets more deeply. If the protocol is known, the decoder will go ahead and decode the protocol, then run the payload against the known application signatures again. The outcome could either be a known application, or an unknown generic application, like unknown-tcp. The session is then again re-matched against the security policy to determine whether it is allowed to pass or needs to be rejected or dropped.

If the protocol is unknown, App-ID will apply heuristics to try and determine which protocol is used in the session. Once it is determined which protocol is used, another security policy check is performed. Once the application has been identified or all options have been exhausted, App-ID will stop processing the packets for identification.Throughout the life of a session, the identified application may change several times as more information is learned from the session through inspecting packet after packet. A TCP session may be identified as SSL, which is the HTTPS application as the firewall detects an SSL handshake. The decryption engine and protocol decoders will then be initiated to decrypt the session and identify what is contained inside the encrypted session. Next, it may detect application web-browsing as the decoder identifies typical browsing behavior such as an HTTP GET. App-ID can then apply known application signatures to identify flickr. Each time the application context changes, the firewall will quickly check whether this particular application is allowed in its security rule base.

If at this point flickr is allowed, the same session may later switch contexts again as the user tries to upload a photo, which will trigger another security policy check. The session that was previously allowed may now get blocked by the firewall as the sub-application flickr-uploading may not be allowed.

Once the App-ID process has settled on an application, the application decoder will continuously scan the session for expected and deviant behavior, in case the application changes to a sub-application or a malicious actor is trying to tunnel a different application or protocol over the existing session.

How Content-ID makes things safe

Meanwhile, if the appropriate security profiles have been enabled in the security rules, the Content-ID engine will apply the URL filtering policy and will continuously, and in parallel, scan the session for threats like vulnerability exploits, virus or worm infections, suspicious DNS queries, command and control (C&C or C2) signatures, DoS attacks, port scans, malformed protocols, or data patterns matching sensitive data exfiltration. TCP reassembly and IP defragmentation are performed to prevent packet-level evasion techniques:

Figure 1.5 – How Content-ID scans packets

All of this happens in parallel because the hardware and software were designed so that each packet is simultaneously processed by an App-ID decoder and a Content-ID stream-based engine, each in a dedicated chip on the chassis or through a dedicated process in a Virtual Machine (VM). This design reduces latency versus serial processing, which means that enabling more security profiles does not come at an exponential cost to performance as is the case with other firewall and IPS solutions.

Hardware and VM design is centered on enabling the best performance for parallel processing while still performing tasks that cost processing power that could impede the speed at which flows are able to pass through the system. For this reason, each platform is split up into so-called planes, which we'll learn about in the next section.