Lecture 11

Feb. 20, 2007

Firewalls, Filters and NAT

The last lecture covered an analysis of filtering and blacklisting and each method's effectiveness against viruses. The results proved grim even against a random-propagation worm. The top 100 ISPs would all have to implement content filtering to limit the infection rate within 24 hours. A well-designed worm, or a flash worm (with a pre-built target list), would require a reaction time within minutes. The analysis showed that content filtering was much more effective than blacklisting.

Content filtering technology is still rare in firewalls.

There are several different types of firewalls, and all must be configured appropriately to be effective.

NAT

NAT devices break internet invariance by allowing alternate IP naming schemes behind the NAT device. NAT leverages dual-facing interfaces, masking the IP addresses of all machines on the network behind it by rewriting packet headers. The device accomplishes this by replacing the port numbers in the headers with NAT id numbers. A NAT id is the same size as a port number in the packet header. NATs overload the meaning of ports to keep track of inbound and outbound requests. The NAT device, in essence, uses the combination of the IP address and the port number as a machine identifier.

NAT devices keep internal tables which map ports to hosts for this translation. This effectively shields internal hosts because an attacker cannot easily guess the NAT id for a specific host. Additionally, the device garbage-collects table entries on a fixed time schedule.
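The table lookup described above can be pictured with a minimal Python sketch. The class name `NatTable`, the starting id of 40000, and the example addresses are assumptions made for illustration; the sketch ignores id exhaustion and the timed garbage collection that real devices perform.

```python
import itertools


class NatTable:
    """Toy NAT translation table (illustrative only, not a real device's logic)."""

    def __init__(self, public_ip, first_id=40000):
        self.public_ip = public_ip
        # NAT ids occupy the 16-bit port field of the packet header.
        self._next_id = itertools.count(first_id)
        self._out = {}  # (internal_ip, internal_port) -> nat_id
        self._in = {}   # nat_id -> (internal_ip, internal_port)

    def translate_outbound(self, src_ip, src_port):
        """Rewrite an outbound source address to (public_ip, nat_id)."""
        key = (src_ip, src_port)
        if key not in self._out:
            nat_id = next(self._next_id)
            self._out[key] = nat_id
            self._in[nat_id] = key
        return self.public_ip, self._out[key]

    def translate_inbound(self, dst_port):
        """Map an inbound NAT id back to the internal host, or None to drop."""
        return self._in.get(dst_port)


nat = NatTable("203.0.113.5")
print(nat.translate_outbound("192.168.1.10", 51000))  # ('203.0.113.5', 40000)
print(nat.translate_inbound(40000))                   # ('192.168.1.10', 51000)
print(nat.translate_inbound(12345))                   # None: no mapping, so the packet is dropped
```

An inbound packet whose port matches no table entry has nowhere to go, which is the shielding effect the notes describe.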

Port forwarding can be enabled on most NAT devices. This allows the device to be configured to always route requests on certain ports to specific hosts by adding a static entry to the NAT routing tables. This pokes a hole in the firewall functionality of the NAT device, but is necessary in cases where a server needs to run on a well-known port and serve requests from beyond the NAT device. For instance, the NAT device could add an entry that always maps inbound port 80 requests to a specific internal host IP on port 80. This allows the NAT device to appear as if it supports all well-known protocols.
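As a rough illustration of the port 80 example, a static entry can be thought of as a mapping that is consulted before the dynamic table. The names `STATIC_FORWARDS`, `dynamic_entries`, and `route_inbound`, and all addresses, are hypothetical.

```python
# Static port-forwarding entries: public port -> (internal_ip, internal_port).
# These never expire, unlike entries created by outbound traffic.
STATIC_FORWARDS = {80: ("192.168.1.20", 80)}

# Dynamic entries created by outbound traffic: nat_id -> (internal_ip, internal_port).
dynamic_entries = {40001: ("192.168.1.10", 51000)}


def route_inbound(dst_port):
    """Return the internal destination for an inbound packet, or None to drop it."""
    if dst_port in STATIC_FORWARDS:
        return STATIC_FORWARDS[dst_port]
    return dynamic_entries.get(dst_port)


print(route_inbound(80))     # forwarded to the internal web server
print(route_inbound(40001))  # reply to an outbound request
print(route_inbound(22))     # no entry: dropped
```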

Rewriting packet headers does have disadvantages, however. The IP address is only rewritten in the header, so protocols that carry IP information within the packet body might not function properly over a NAT device's connection. Additionally, protocols that utilize header-based encryption could fail due to the header rewriting. Because the header contents change, NAT devices must also recompute the header checksums of the packets they rewrite so that the new packets carry accurate checksums. Tunneling protocols can also break down across NAT devices.
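To make the checksum step concrete, here is a small sketch that rewrites the source address of a 20-byte IPv4 header and recomputes its header checksum with the standard 16-bit one's-complement sum. The function names are assumptions for illustration; a real NAT would also have to adjust the TCP or UDP checksum, which this sketch leaves out.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_source(header: bytes, new_src: bytes) -> bytes:
    """Replace the IPv4 source address (bytes 12-15) and fix the checksum (bytes 10-11)."""
    hdr = bytearray(header)
    hdr[12:16] = new_src
    hdr[10:12] = b"\x00\x00"  # the checksum field is zero while summing
    hdr[10:12] = struct.pack("!H", internet_checksum(bytes(hdr)))
    return bytes(hdr)
```

Running `internet_checksum` over a header whose checksum field is already correct returns 0, which is a convenient way to check a rewritten header.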

There is an argument that the widespread implementation of NAT has slowed the adoption of IPv6. The argument states that because NAT has eased the IP address space burden (by providing what can amount to an entire class A address space behind each IP address), there is less demand for the increased address space provided by IPv6. Also, NAT corrupts IP addressing schemes by effectively allowing the combination of a port number and an IP address to be used as a machine address.

Some NAT devices also allow primitive packet filtering.

Filter Firewalls

Conceptually, filter firewalls are supposed to stop bad packets at the perimeter. Traditional network models, however, do not always map to real-life topologies. Sometimes the network leaks outside of the firewall, blurring the concept of a perimeter.

Filtering firewalls can also become a bottleneck to network traffic. Typically filter rules are based on headers only (rather than packet content). Some firewalls can, however, monitor state, allowing conditional filtering on connections based upon their state (for instance, dropping an ACK without a corresponding SYN).
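The ACK-without-SYN check can be sketched as follows. This is a drastically simplified model under stated assumptions: endpoints are hypothetical `(ip, port)` tuples, flags are a set of strings, and there is no connection teardown, timeout, or sequence-number tracking.

```python
class StatefulFilter:
    """Toy connection tracker: an ACK passes only if a SYN was seen first."""

    def __init__(self):
        self.connections = set()  # (initiator, responder) pairs that sent a SYN

    def allow(self, src, dst, flags):
        if "SYN" in flags and "ACK" not in flags:
            # A bare SYN opens a new connection entry.
            self.connections.add((src, dst))
            return True
        if "ACK" in flags:
            # Accept traffic in either direction of a known connection.
            return (src, dst) in self.connections or (dst, src) in self.connections
        return False


fw = StatefulFilter()
client, server, stranger = ("10.0.0.5", 50000), ("192.0.2.1", 80), ("198.51.100.9", 4444)
print(fw.allow(client, server, {"SYN"}))         # True: opens the connection
print(fw.allow(server, client, {"SYN", "ACK"}))  # True: reply on a known connection
print(fw.allow(stranger, client, {"ACK"}))       # False: ACK with no corresponding SYN
```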

Filtering firewalls consult internal rules tables and construct allow/deny instructions that determine whether to pass a packet or drop it. Ideally, firewalls should be configured with least privilege, allowing the smallest number of packets possible while preserving legitimate traffic. Typical firewalls either enable communication on certain ports (default deny) or disable communication on certain ports (default allow).

Firewalls often rely on well-known ports, so they can be circumvented by placing services on non-standard ports, or by running non-standard services on well-known ports.

Firewalls can be configured in multiple ways (GUI, shell scripts, config files, etc.), but all construct a rule-based table. Typically there is a precedence among the rules, and the first match in the table governs the firewall's treatment of a packet. If a packet reaches the end of the table without matching any of the rules, it is either allowed or denied, depending on the specific firewall.
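First-match evaluation with a fall-through default can be sketched in a few lines. The `Rule` fields and the `decide` function are hypothetical and match only on protocol and destination port; real firewalls match on many more header fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                      # "allow" or "deny"
    proto: Optional[str] = None      # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, proto, dst_port):
        return self.proto in (None, proto) and self.dst_port in (None, dst_port)


def decide(rules, proto, dst_port, default="deny"):
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.matches(proto, dst_port):
            return rule.action
    return default


rules = [
    Rule("deny", "tcp", 23),   # listed first, so it wins over the broader rule below
    Rule("allow", "tcp"),      # any other TCP port
]
print(decide(rules, "tcp", 23))  # deny
print(decide(rules, "tcp", 80))  # allow
print(decide(rules, "udp", 53))  # deny (no rule matched, default applies)
```

Swapping the two rules would let telnet through, which is why rule order matters as much as rule content.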

Filtering can occur at four points:

		--------->1		3---------->
Internal Net			Firewall		External Net
		<---------2		4<---------

Some firewalls allow control over only some of the points of contact with the firewall. Allowing configuration at all four points of contact is most effective. For instance, if a packet arrives at point 1 (from within the local network) with a source IP address that does not belong to the internal network, then the firewall can assume it is a bad packet and drop it. Similarly, if a packet arrives at point 4 with a source address on the internal network, it can be discarded as a forgery.
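The two spoofing checks at points 1 and 4 can be written down directly with Python's standard `ipaddress` module. The internal range `192.168.0.0/16` and the function name `allow_at_point` are assumptions for illustration.

```python
import ipaddress

# Assumed address range of the internal network.
INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")


def allow_at_point(point, src_ip):
    """Apply source-address sanity checks at filtering points 1 and 4."""
    src_internal = ipaddress.ip_address(src_ip) in INTERNAL_NET
    if point == 1:
        # Arriving from the internal net: the source must be internal.
        return src_internal
    if point == 4:
        # Arriving from the external net: an internal source is a forgery.
        return not src_internal
    return True  # points 2 and 3 are not checked in this sketch


print(allow_at_point(1, "192.168.1.5"))  # True
print(allow_at_point(1, "10.0.0.1"))     # False: does not belong on the internal net
print(allow_at_point(4, "192.168.1.5"))  # False: forged internal source
```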

As a rule of thumb, firewalls should be designed to filter as early as possible. In this way the firewall reduces the amount of traffic that has to be processed at the earliest possible point, which reduces overhead on the network. For instance, a DDoS attack against a machine on the network could be mitigated by dropping inbound packets at point 4. Of course, in this example the firewall is absorbing the brunt of the processing and may itself become the victim of the DDoS.

Rule authoring can be made considerably easier or more difficult depending on how configurable the firewall is. The ability to write rules at all four points of contact with the firewall allows for greater conditional filtering. With fewer configurable points of contact, firewall rules must become more permissive. This could allow some bad packets to slip past the firewall.

Because firewalls make assumptions about conventions (such as well-known ports), an attacker can exploit these assumptions.

Port filtering firewalls are some of the most commonly deployed.

Snort IDS (http://www.snort.org)

Snort resembles a firewall in that it does real-time traffic analysis and logging on a per-packet basis. Unlike most firewalls, however, Snort is able to do content filtering. Snort does not do TCP stream reconstruction, so it cannot do signature matching on payloads spread across multiple packets (fragmented or otherwise). Also, Snort is a reactive intrusion detection system, meaning that it cannot actively manipulate routing. It can be configured to log messages, send email alerts, and perform other actions when it detects a rule match.
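The limitation of per-packet matching can be shown with a toy example. This is not Snort's rule syntax or API; the signature and the two helper functions are made up purely to show why a payload split across packets escapes a matcher that never reassembles the stream.

```python
SIGNATURE = b"EVILPAYLOAD"  # hypothetical content signature


def per_packet_alerts(packets):
    """Indices of packets that individually contain the signature."""
    return [i for i, payload in enumerate(packets) if SIGNATURE in payload]


def stream_alert(packets):
    """True if the signature appears in the reassembled byte stream."""
    return SIGNATURE in b"".join(packets)


split = [b"GET /EVILPAY", b"LOAD HTTP/1.0"]
print(per_packet_alerts(split))  # []: neither packet matches on its own
print(stream_alert(split))       # True: the reassembled stream does match
```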

Configuring Snort rules can be difficult, but many rule sets are freely available.

Automated Worm Fingerprinting

Automated worm fingerprinting makes assumptions that worm writers could potentially exploit to blunt its effectiveness. It assumes that there is some signature content in the worm, and that worm propagation will increase the occurrence of this invariant content on the network. If one were to monitor the large portion of unused IP addresses on the internet, logging and analyzing all traffic bound for those addresses, one could create an 'internet telescope'. The occurrence of invariant content in the telescope could be used to indicate a worm (rather than popular but legitimate traffic). Over time, the number of hosts interacting with the telescope should grow if they are infected with a worm, and their distribution should be uniform, which would further serve to distinguish worm traffic from legitimate traffic.

The telescope content could be examined to pick out the shortest possible string to use as a signature for the worm. The longer the string, the harder it would be to match. Measuring traffic on the network would allow one to show divergence from regular traffic and correlate it with the occurrence of the worm signature string.
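One way to picture the telescope analysis is to count how many distinct sources send each fixed-length substring and keep the ones seen from many sources. This is only a sketch: the function name, the substring length of 8, and the source threshold are arbitrary assumptions, not values from the lecture.

```python
from collections import defaultdict


def candidate_signatures(observations, length=8, min_sources=3):
    """Return substrings seen from at least `min_sources` distinct source IPs.

    `observations` is an iterable of (src_ip, payload_bytes) pairs logged
    for traffic sent to unused (telescope) addresses.
    """
    sources = defaultdict(set)
    for src_ip, payload in observations:
        # Use a set so a repeated substring counts once per packet.
        substrings = {payload[i:i + length] for i in range(len(payload) - length + 1)}
        for sub in substrings:
            sources[sub].add(src_ip)
    return {sub: len(ips) for sub, ips in sources.items() if len(ips) >= min_sources}


log = [
    ("198.51.100.1", b"aaWORMCODEbb"),
    ("198.51.100.2", b"ccWORMCODEdd"),
    ("198.51.100.3", b"eeWORMCODEff"),
    ("203.0.113.7", b"harmless traffic"),
]
print(candidate_signatures(log))  # {b'WORMCODE': 3}
```

Counting distinct sources rather than raw occurrences reflects the observation that the number of hosts contacting the telescope grows as a worm spreads.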

Naive Content Sifting

Naive content sifting looks through packets to sift out signatures and then uses a threshold to determine whether a packet should raise an alarm. Packets would have to be indexed on a 40-byte substring. Indexing this traffic, however, is enormously resource intensive. Subsampling traffic would reduce overhead but would slow the identification of a worm by logging a much smaller proportion of relevant events. Instead, packets can be hashed to a fixed size. With this method, packets must be hashed three times in order to identify and avoid hash collisions. The destination port is one of the most indicative factors in worm traffic and is included in the hash. Destination ports can also be used to identify the software target of a worm by identifying the software that runs on the well-known port targeted by the worm.
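The hashing idea can be sketched as three independent counter arrays, where content raises an alarm only if its counter is high in all three, so a collision in one array is not enough. This is one possible reading of "hashed three times", not necessarily the scheme presented in lecture; the class name, bucket count, threshold, and use of SHA-256 are all assumptions.

```python
import hashlib


class ContentSifter:
    """Toy content sifter: counts 40-byte substrings keyed with the destination port."""

    WINDOW = 40   # substring length from the notes
    STAGES = 3    # number of independent hashes

    def __init__(self, buckets=1024, threshold=3):
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(self.STAGES)]

    def _bucket(self, stage, key):
        # Prefixing the stage number gives three different hash functions.
        digest = hashlib.sha256(bytes([stage]) + key).digest()
        return int.from_bytes(digest[:4], "big") % self.buckets

    def observe(self, payload, dst_port):
        """Count each substring; return True if any reaches the threshold in all stages."""
        alarm = False
        port = dst_port.to_bytes(2, "big")
        for i in range(len(payload) - self.WINDOW + 1):
            key = port + payload[i:i + self.WINDOW]
            counts = []
            for stage in range(self.STAGES):
                b = self._bucket(stage, key)
                self.counters[stage][b] += 1
                counts.append(self.counters[stage][b])
            if min(counts) >= self.threshold:
                alarm = True
        return alarm
```

Feeding the same 40-byte payload to `observe` on the same destination port three times trips the default threshold on the third call, while the memory used stays fixed regardless of how much traffic is seen.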

Rabin fingerprinting will be discussed in the next class...