At its most basic (and I’ll explain all the terms in a moment), a firewall is a stateful device that applies network layer access control to packets passing through it. It can also do network address translation (NAT). And one of the most important yet most overlooked features of a basic firewall is extensive logging.
Stateful in this context means the firewall keeps a table of every active session passing through it. If I’m allowed to make an outbound connection to a web site on the Internet, the state table knows it should allow inbound packets back from the same site. But after my session ends and the firewall is no longer expecting those inbound packets, it should block them.
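To make the state table idea concrete, here is a minimal sketch in Python. The `Flow` and `StateTable` names are my own illustration, not any vendor's API, and a real firewall would also track TCP flags and idle timeouts rather than relying on an explicit close.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One direction of a session, identified by its 5-tuple."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


class StateTable:
    """Hypothetical state table: remembers permitted outbound sessions."""

    def __init__(self):
        self._sessions = set()

    def open_outbound(self, flow):
        # Called once an outbound connection has been permitted by policy.
        self._sessions.add(flow)

    def close(self, flow):
        # Called when the session ends (a real device would also time it out).
        self._sessions.discard(flow)

    def allows_inbound(self, packet):
        # A reply has source and destination swapped relative to the
        # outbound flow that created the session.
        expected = Flow(packet.dst_ip, packet.dst_port,
                        packet.src_ip, packet.src_port, packet.proto)
        return expected in self._sessions
```

While the session is in the table, the matching reply is allowed. Once `close` is called, the same inbound packet is no longer expected and would be blocked.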
Network layer access controls are simple rules that permit or deny traffic based on information in the packet headers. This information could include IP addresses, protocols, or port numbers.
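As a rough illustration of such rules, the sketch below evaluates a packet's header fields against an ordered rule list. The `Rule` class, the first-match ordering, and the default deny at the end are assumptions I've made for the example. They reflect a common convention, but individual products differ.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical access rule; None means 'match anything'."""
    action: str                      # "permit" or "deny"
    src: str                         # source network in CIDR form
    dst: str                         # destination network in CIDR form
    proto: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, src_ip, dst_ip, proto, dst_port):
        return (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
            and (self.proto is None or self.proto == proto)
            and (self.dst_port is None or self.dst_port == dst_port)
        )


def evaluate(rules, src_ip, dst_ip, proto, dst_port):
    # First matching rule wins; anything unmatched is denied (assumed default).
    for rule in rules:
        if rule.matches(src_ip, dst_ip, proto, dst_port):
            return rule.action
    return "deny"


# Example policy: internal hosts may reach HTTPS anywhere, nothing else.
RULES = [
    Rule("permit", "10.0.0.0/8", "0.0.0.0/0", "tcp", 443),
    Rule("deny", "0.0.0.0/0", "0.0.0.0/0"),
]
```

With this policy, `evaluate(RULES, "10.1.2.3", "203.0.113.10", "tcp", 443)` is permitted, while the same host trying port 23 is denied.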
Network address translation is a pretty well-understood concept. At a minimum, I want to hide internal private addresses from the Internet. So when forwarding a packet out to the Internet, the firewall needs to replace private addresses with public addresses that can be routed on the Internet. In many cases, I also want to make internal resources publicly accessible, which again means I need to create a mapping rule that associates the internal resource with a public IP address.
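Both directions of translation can be sketched with a pair of lookup tables. This is a simplified illustration, assuming a single public address and port-based translation. The `Nat` class and its method names are hypothetical, and it ignores port exhaustion and entry timeouts.

```python
import itertools


class Nat:
    """Hypothetical NAT table using one public IP and translated ports."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}    # (private_ip, private_port) -> public_port
        self._back = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        # Hide the private source behind the public address, reusing the
        # existing mapping if this source has been seen before.
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def add_static(self, public_port, private_ip, private_port):
        # Mapping rule that publishes an internal resource on a public port.
        self._back[public_port] = (private_ip, private_port)

    def translate_inbound(self, public_port):
        # Returns the internal destination, or None if nothing is mapped.
        return self._back.get(public_port)
```

For example, after `nat = Nat("198.51.100.1")`, an outbound packet from `10.0.0.5:51000` leaves as `198.51.100.1:20000`, and `nat.add_static(443, "10.0.0.80", 8443)` makes an internal web server reachable on the public address.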
Logging is a little more subtle. A firewall should be capable of logging information about every successful session. In most cases, you’ll also want to log information about unsuccessful sessions through the firewall. You may also want information about all of the NAT translations the firewall has done and all administrative activities done to the firewall. Ideally, this information should be sent to a central server so you can sort through it and look for interesting patterns that might indicate something bad is going on.
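Here is one way that pattern hunting might look on the central server. The JSON record layout and the field names (`kind`, `src`, `dst`, `action`) are invented for this sketch. Real firewalls each have their own log formats.

```python
import json
from collections import Counter


def make_record(kind, src, dst, action):
    # kind might be "session", "nat", or "admin" in this made-up schema.
    return json.dumps({"kind": kind, "src": src, "dst": dst, "action": action})


def top_denied_sources(lines, n=3):
    """Count denied sessions per source address, most frequent first."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record["kind"] == "session" and record["action"] == "deny":
            counts[record["src"]] += 1
    return counts.most_common(n)
```

A source address that keeps showing up at the top of this list is the kind of pattern worth a closer look.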
Another feature commonly found on basic firewalls is high availability. This involves having a second firewall configured to automatically take over in case the first one fails. To do this without dropping active sessions, it’s important that the secondary device have all state information about the sessions.
In most cases, the failover mode is active-standby: the secondary device doesn’t pass packets until the primary device fails. Then it takes over all processing. In some cases, it’s possible to build an active-active failover model in which two firewalls share the traffic load in some way. These active-active failover models are invariably much more complicated to both build and manage, though.
A next-generation firewall (NGFW) has all the features of a basic firewall plus some or all of the additional features I discuss below.
Geolocation is the ability to associate IP addresses with physical locations. Rather than specifying a bunch of IP address ranges that will change over time, you can specify a whole country.
I’ve often used geolocation to restrict access from countries where I know the company has no legitimate business (cough, North Korea). However, you could also use geolocation to create a special NAT rule that sends all your North American traffic to one web server and all your European traffic to a different one. Because IP address allocations change rather frequently, geolocation requires regular updates to remain current.
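Under the hood, a geolocation rule boils down to a prefix-to-country lookup followed by a policy check. The sketch below uses a tiny made-up database with documentation address ranges and the placeholder country codes `XA` and `XB`. A real firewall gets this mapping from a regularly updated vendor feed.

```python
import ipaddress

# Made-up prefix-to-country data; real allocations come from a vendor feed.
GEO_DB = {
    "192.0.2.0/24": "XA",
    "198.51.100.0/24": "XB",
}

BLOCKED_COUNTRIES = {"XB"}


def country_of(ip, db=GEO_DB):
    """Return the country of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for cidr, country in db.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, country)
    return best[1] if best else None


def allowed(ip):
    # Addresses with no known country are allowed here; that default is
    # a policy choice assumed for this sketch.
    return country_of(ip) not in BLOCKED_COUNTRIES
```

The policy then only names countries. When address allocations change, only `GEO_DB` needs refreshing.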
Intrusion detection or prevention systems look at the contents of packets going through the firewall and try to spot things that look like attacks. In most cases, IDS/IPS devices use signatures to detect known attacks. They also look for generic types of attacks, which are less signature dependent. Because new attacks appear constantly, IDS/IPS devices tend to become less useful over time unless their signatures are regularly updated. This typically requires a subscription service from the vendor.
As files are uploaded or downloaded, they pass through the firewall and it can do a basic examination. In most cases, this will be signature-based analysis, looking at checksums and scanning inside the file for patterns of bytes that have been seen in known malware in the past. This feature obviously requires that files aren’t encrypted and the firewall has a recently updated set of signatures.
In truth, simple malware scanning at the firewall isn’t terribly effective because it’s so easy to hide malware with encryption. You’ll usually have better anti-virus scanning on the destination computer.
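The signature checks described above, and the reason encryption defeats them, can be shown in a few lines. The signature set and the `EVIL_MARKER` byte pattern are placeholders I've invented. A real product ships a large, frequently updated signature database.

```python
import hashlib

# Placeholder signatures standing in for a vendor-supplied database.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"pretend-malware-sample").hexdigest()}
BAD_PATTERNS = [b"EVIL_MARKER"]


def scan(data: bytes) -> str:
    """Return 'hash-match', 'pattern-match', or 'clean' for a file body."""
    # Whole-file checksum comparison against known samples.
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_SHA256:
        return "hash-match"
    # Byte patterns previously seen inside known malware.
    for pattern in BAD_PATTERNS:
        if pattern in data:
            return "pattern-match"
    return "clean"


def xor_obfuscate(data: bytes, key: int = 0x5A) -> bytes:
    # Trivial stand-in for encryption: every byte changes, so neither
    # the checksum nor the byte pattern survives.
    return bytes(b ^ key for b in data)
```

A file containing the pattern is caught, but the same file passed through `xor_obfuscate` scans as clean, which is exactly the weakness of scanning at the firewall.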
A better form of malware scanning is called a sandbox. A sandbox is an isolated testing environment in which programs can be run or files executed without affecting the system or platform that hosts them. Without sandboxing, an application or other system process could have unlimited access to all the user data and system resources on a network. In this context, the sandbox is essentially a virtual machine (VM) running a common target operating system such as Windows. The firewall intercepts the file download and sends it over to the sandbox VM where it’s “detonated,” meaning the VM tries to run the file as if it were the target computer. The sandbox then looks for common types of malicious behaviour such as connecting to command and control (C&C) networks. Once the file has been analysed, the VM is safely deleted and a new one is created.
In some cases, the sandbox is a separate physical box sitting at the network edge. In other cases, it’s a cloud service. It tends to be less effective to put a sandbox inside the firewall because the sandbox requires so much memory and CPU to run.