Troubleshooting Guide: Packet capture of VLAN, non-VLAN, and Q-in-Q traffic with tcpdump

Version 10

    Introduction

     

    Packet captures are an excellent way of troubleshooting FireEye error messages because they contain all the information that passes from one device to another, regardless of compression and encryption.

     

    If it didn't happen in the packet capture, then it didn't happen, regardless of what the error message says.

     

    Example: SSL handshake failed

     

    This error indicates that a connection was made and the certificate or cipher looked wrong. But when you examine the capture, you see that there never was a handshake at all; a reset was sent at the beginning of the connection attempt. Instead of trying to figure out what ciphers everyone is using and what device is mucking up the certificate/SSL stream, you can now focus on the simple fix of changing a firewall setting to unblock the connection.
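
    One way to check for this pattern is to capture only the SYN and RST segments of the suspect connection. The sketch below is illustrative: the interface (eth0), the port (443), and the host address are assumptions to be replaced with values from your environment. It builds the filter and prints the command rather than running it.

```shell
# Assumptions: eth0, port 443, and the host address are illustrative only.
IFACE='eth0'
HOST='173.194.115.81'

# Match only segments with SYN or RST set, so a reset at the start
# of the connection attempt stands out.
FILTER="host $HOST and tcp port 443 and (tcp[tcpflags] & (tcp-syn|tcp-rst) != 0)"

# Print the command; run it on the capture host once the values are adjusted.
CMD="tcpdump -nni $IFACE '$FILTER'"
echo "$CMD"
```

    A RST that arrives right after the first SYN, with no handshake records in between, points at something blocking the connection rather than at a certificate or cipher problem.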

     

    Examples of when to use packet capture for troubleshooting
    1. Assess communication loss for connection errors and security content or firmware update downloading errors
    2. Verify MTA connectivity and confirm that emails are directed to an email appliance (EX) correctly
    3. Verify that traffic is reaching the Web appliance (NX) and determine if it sees complete TCP streams
    4. If an alert is malformed or can't be found, see exactly what information and format is being sent to your logging servers to determine if the problem is at the FireEye appliance or somewhere else

    Examples of things you can do with a packet capture
    1. Reconstruct objects (Web pages, block messages, files, and so on) that passed through the network during packet capture
    2. Check for latency
    3. Check for lost packets, network disruption, or misbehaving devices
    4. Determine if packets leaving one device arrive where and as expected
    5. Determine if packets are malformed or incorrect for a device
    6. Duplicate a problem by replaying captures in the lab
    7. Determine if SSL interception is happening
    8. See what cipher is being used and check details about the certificate
    9. See if decrypted and encrypted versions of the same stream are being combined at the NX (breaking detection for malicious objects)
    10. See LDAP and DNS timeouts and failures

     

    NX Deployment Check Tool

     

    The NX includes a deployment check tool that will monitor all ports for 150 seconds or 100,000 packets.

    • From the Web UI, go to About > Deployment Check > Network Deployment Check
    • From the CLI:
      > enable
      # configure terminal
      (config) # file tcpdump upload deployment_check.pcap <*p://username[:password]@hostname/path/filename>
      Note: ftp, tftp, scp, and sftp are supported. You can also use an scp utility to access the file at /var/opt/tms/tcpdumps.

     

    TCPDUMPs

    Capturing VLAN and non-VLAN traffic

     

    You can capture VLAN and non-VLAN traffic at the same time, but this requires the parts of the filter to be in a specific order. Each use of vlan changes the decoding offsets for the remainder of the expression.

     

    For example, here we search the host first to catch packets with no VLAN, then we search for vlan and the same host to catch VLAN packets with that host.

    tcpdump -nnvvi eth0 'host 173.194.115.81 or (vlan and host 173.194.115.81)'

     

    If we reversed this to '(vlan and host 173.194.115.81) or host 173.194.115.81', we would not find packets in the native or untagged VLANs, because the second mention of host would be evaluated at an offset shifted by 4 bytes (the length of one VLAN tag), causing it to fail.
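
    The shift can be worked out by hand. This sketch assumes a standard 14-byte Ethernet II header, a 4-byte 802.1Q tag, and the IPv4 source address at byte 12 of the IP header. It prints the frame offset at which a host test has to look for each tag depth.

```shell
ETH_HDR=14   # Ethernet II header length in bytes
TAG_LEN=4    # length of one 802.1Q tag in bytes
IP_SRC=12    # offset of the source address inside the IPv4 header

# Offset of the IPv4 source address for untagged, single-tagged,
# and double-tagged (Q-in-Q) frames.
for tags in 0 1 2; do
    echo "tags=$tags ip_src_offset=$((ETH_HDR + tags * TAG_LEN + IP_SRC))"
done
```

    Once a filter mentions vlan, every later test uses the shifted offset, which is why the untagged host test has to come first.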

     

    Q-in-Q

    Because Q-in-Q uses two VLAN tags next to each other in the header, we use vlan twice to move the offset past both tags for the rest of the filter.

    tcpdump -nnvvi eth0 'vlan and vlan and host 173.194.115.81'

     

    Although it seems intuitive that (host x.x.x.x or (vlan and host x.x.x.x) or (vlan and vlan and host x.x.x.x)) would catch all the types of traffic (non-VLAN, VLAN, and Q-in-Q traffic), the third use of vlan pushes the offset 4 bytes too far. The solution is to repeat the single vlan clause, (vlan and host x.x.x.x), a second time. The second use of vlan moves the offset 4 more bytes, which lines it up with the Q-in-Q header.
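
    The pattern of repeating the single vlan clause can be generated rather than typed by hand. The helper below, build_tag_filter, is a hypothetical name introduced for illustration; it appends one (vlan and ...) clause per tag level. A depth of 2 reproduces the form described above; deeper nesting is an untested extrapolation.

```shell
# build_tag_filter BASE DEPTH
# Hypothetical helper: emits "(BASE)" followed by DEPTH copies of
# " or (vlan and (BASE))". Each vlan keyword shifts the offset by one tag.
build_tag_filter() {
    base="$1"
    depth="$2"
    filter="($base)"
    i=0
    while [ "$i" -lt "$depth" ]; do
        filter="$filter or (vlan and ($base))"
        i=$((i + 1))
    done
    printf '%s\n' "$filter"
}

# Untagged, single-tagged, and Q-in-Q traffic for one host.
ALL_TAGS=$(build_tag_filter 'host 173.194.115.81' 2)
echo "$ALL_TAGS"
```

    The resulting string can be passed to tcpdump as the filter expression, for example tcpdump -nnvvi eth0 "$ALL_TAGS".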

     

    Examples

     

    Let's say you want to catch a set of addresses and you're not sure if they have a VLAN. The hosts are in the following ranges: 173.194.115.82, 173.194.115.81, and 173.1.1.0 to 173.1.1.15.

     

    Capturing normal and single VLANs
    tcpdump -nnvvi eth0 '(host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28) or (vlan and (host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28))'

     

    Capturing normal and Q-in-Q packets
    tcpdump -nnvvi eth0 '(host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28) or (vlan and vlan and (host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28))'

     

    Capturing all three packet types
    tcpdump -nnvvi eth0 '(host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28) or (vlan and (host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28)) or (vlan and (host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28))'

    Use the -w <filename> option to write the capture to a file, which can then be accessed in /var/opt/tms/tcpdumps/.
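
    Keeping the host list in a variable makes the long filter easier to read and pairs naturally with -w. In this sketch the file name vlan_capture.pcap and the 1000-packet limit (-c) are assumptions chosen for illustration. The capture runs only when RUN_CAPTURE=1 is set; otherwise the command is printed for review.

```shell
# Host list from the examples above.
HOSTS='host 173.194.115.82 or host 173.194.115.81 or net 173.1.1.0/28'

# Untagged clause first, then the single vlan clause twice (VLAN, then Q-in-Q).
FILTER="($HOSTS) or (vlan and ($HOSTS)) or (vlan and ($HOSTS))"

OUTFILE='vlan_capture.pcap'   # hypothetical file name
COUNT=1000                    # assumed packet limit for -c

if [ "${RUN_CAPTURE:-0}" = 1 ]; then
    # -w writes raw packets to the file instead of decoding them to the screen.
    tcpdump -nni eth0 -c "$COUNT" -w "$OUTFILE" "$FILTER"
else
    echo "tcpdump -nni eth0 -c $COUNT -w $OUTFILE '$FILTER'"
fi
```

    After copying the file off the appliance, tcpdump -nn -e -r vlan_capture.pcap reads it back; the -e option prints the link-level header, which includes any VLAN tags.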

     

    Thanks to george.anderson and layth.darraji for contributing this article!