29. Reducing Bandwidth and Data Size

There are several ways that NXLog can be configured to reduce the size of log data. This can lower bandwidth requirements during transport, storage requirements for retained log data, and licensing costs for commercial SIEM systems that charge based on data volume.

The following sections cover three approaches.

  • Removing unnecessary or duplicate events at the source means there is less data to transport and store, which reduces the data size during all subsequent stages of processing. See Filtering Events.

  • Similarly, removing extra content or fields from event records can reduce the total amount of log data. See Trimming Events.

  • For information about reducing data requirements during transport, see Compressing During Transport.

To achieve the best results, it is important to understand how fields work in NXLog and which fields are being transferred or stored. For example, removing or modifying fields without modifying $raw_event will not reduce data requirements at all for an output module instance that uses only $raw_event. See Event Records and Fields for details, as well as the explanation in Compressing During Transport below.
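The relationship between fields and $raw_event can be illustrated with a minimal sketch. It assumes Syslog input parsed by xm_syslog and an om_tcp output that forwards only $raw_event. The substitution on $Message is only a stand-in for any field modification, and the destination port 1514 is a hypothetical value. Without the final to_syslog_bsd() call, the modified $Message would never reach the output, because $raw_event would still hold the original datagram.

```
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Input syslog_udp>
    Module  im_udp
    Host    0.0.0.0
    Port    514
    <Exec>
        parse_syslog();
        # Placeholder modification: this changes the $Message field only;
        # $raw_event still contains the original message at this point
        $Message =~ s/\s+$//;
        # Rebuild $raw_event from the fields so that the change is
        # reflected in what om_tcp sends
        to_syslog_bsd();
    </Exec>
</Input>

<Output out>
    Module  om_tcp
    Host    10.2.0.2
    # Hypothetical port for this sketch
    Port    1514
</Output>

<Route r>
    Path    syslog_udp => out
</Route>
```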

29.1. Filtering Events

Depending on the logging requirements and the log source, it may be possible to simply discard certain events. NXLog can be configured to filter events based on nearly any set of criteria. See also Filtering Messages.

Example 144. Dropping Unnecessary Events

In this example, an NXLog agent is configured to collect Syslog messages from devices on the local network. Events are parsed with the xm_syslog parse_syslog() procedure, which sets the SeverityValue field. Any event with a normalized severity lower than 3 (warning) is discarded.

nxlog.conf
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Input syslog>
    Module  im_udp
    Host    0.0.0.0
    Port    514
    Exec    parse_syslog(); if $SeverityValue < 3 drop();
</Input>
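Other criteria can be combined with the severity check in the same way. The following sketch is a variation on the example above, not part of the original configuration: the source name 'cron' and the message pattern are hypothetical values chosen for illustration. The conditions are joined into a single test so that drop() is called at most once per event.

```
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Input syslog>
    Module  im_udp
    Host    0.0.0.0
    Port    514
    <Exec>
        parse_syslog();
        # Discard low-severity events, events from a (hypothetical) noisy
        # source, and messages matching a (hypothetical) pattern
        if ($SeverityValue < 3) or
           ($SourceName == 'cron') or
           ($Message =~ /^session (opened|closed)/)
            drop();
    </Exec>
</Input>
```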

Similarly, the pm_norepeat module can be used to detect, count, and discard duplicate events. In their place, pm_norepeat generates a single event with a "last message repeated n times" message.

Example 145. Dropping Duplicate Events

With this configuration, NXLog collects Syslog messages from hosts on the local network with im_udp and parses them with the xm_syslog parse_syslog() procedure. Events are then routed through a pm_norepeat module instance, where the $Hostname, $Message, and $SourceName fields are checked to detect duplicate messages. Last, events are sent to a remote host with om_batchcompress.

nxlog.conf
<Extension _syslog>
    Module      xm_syslog
</Extension>

<Input syslog_udp>
    Module      im_udp
    Host        0.0.0.0
    Port        514
    Exec        parse_syslog();
</Input>

<Processor norepeat>
    Module      pm_norepeat
    CheckFields Hostname, Message, SourceName
</Processor>

<Output out>
    Module      om_batchcompress
    Host        10.2.0.2
    Port        2514
</Output>

<Route r>
    Path        syslog_udp => norepeat => out
</Route>

29.2. Trimming Events

NXLog can be configured to parse events into various fields in the event record. In this case, a whitelist can be used to retain a set of important fields. See Rewriting and Modifying Messages for more information about modifying events.

Example 146. Discarding Extra Fields via Whitelist

This configuration reads from the Windows EventLog with im_msvistalog and uses an xm_rewrite module instance to discard any fields in the event record that are not included in the whitelist. The xm_rewrite instance below could be used with multiple sources; for example, the whitelist would also be suitable for the xm_syslog fields.

Note
The xm_rewrite module does not remove the $raw_event field.
nxlog.conf
<Extension whitelist>
    Module  xm_rewrite
    Keep    AccountName, Channel, EventID, EventReceivedTime, EventTime, Hostname, \
            Severity, SeverityValue, SourceName
</Extension>

<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='0'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    Exec    whitelist->process();
</Input>
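Where only a few fields are unwanted, the opposite approach may be simpler: naming the fields to discard rather than the fields to keep. The sketch below assumes the Delete directive of xm_rewrite and a set of im_msvistalog field names picked for illustration; the instance name blacklist is hypothetical, and the field list should be adjusted to the fields actually present in the collected events.

```
<Extension blacklist>
    Module  xm_rewrite
    # Fields assumed to be unneeded later in the log path
    Delete  Keywords, Opcode, OpcodeValue, ProviderGuid, Task, Version
</Extension>

<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='0'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    # Remove the listed fields; all other fields are left in place
    Exec    blacklist->process();
</Input>
```

As with the whitelist, this does not change the $raw_event field.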

In some cases, event messages contain a lot of extra data that is duplicated across multiple events of the same type. One example of this is the "descriptive event data" which has been introduced by Microsoft for the Windows EventLog. By removing this verbose text from common events, event sizes can be reduced significantly while still preserving all the forensic details of the event.

Example 147. Removing Descriptive Data From Event Messages

The following configuration collects events from the Application, Security, and System channels. Rules are included for truncating the messages of Security events with IDs 4688 and 4769.

Note
In this example, the $Message field is truncated. However, the $raw_event field is not. For most input modules, $raw_event will include the contents of $Message and other fields (see the im_msvistalog $raw_event field). To update the $raw_event field, include a statement for this (see the comment in the configuration example). See also Compressing During Transport below for more details.
Input Sample (Event ID 4769)
A Kerberos service ticket was requested.

Account Information:
        Account Name:           WINAD$@TEST.COM
        Account Domain:         TEST.COM
        Logon GUID:             {55a7f67c-a32c-150a-29f1-7e173ff130a7}

Service Information:
        Service Name:           WINAD$
        Service ID:             TEST\WINAD$

Network Information:
        Client Address:         ::1
        Client Port:            0

Additional Information:
        Ticket Options:         0x40810000
        Ticket Encryption Type: 0x12
        Failure Code:           0x0
        Transited Services:     -

This event is generated every time access is requested to a resource such as a computer or a Windows service.  The service name indicates the resource to which access was requested.

This event can be correlated with Windows logon events by comparing the Logon GUID fields in each event.  The logon event occurs on the machine that was accessed, which is often a different machine than the domain controller which issued the service ticket.

Ticket options, encryption types, and failure codes are defined in RFC 4120.
nxlog.conf
<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Application">
                    *[System[(Level&lt;=4)]]</Select>
                <Select Path="Security">
                    *[System[(Level&lt;=4)]]</Select>
                <Select Path="System">
                    *[System[(Level&lt;=4)]]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    <Exec>
        if ($Channel == 'Security') and ($EventID == 4688)
            $Message =~ s/\s*Token Elevation Type indicates the type of .*$//s;
        else if ($Channel == 'Security') and ($EventID == 4769)
            $Message =~ s/\s*This event is generated every time access is .*$//s;
        # Additional rules can be added here
        # ...
        # Optionally, update the $raw_event field
        #$raw_event = $EventTime + ' ' + $Message;
    </Exec>
</Input>
Output Sample
A Kerberos service ticket was requested.

Account Information:
        Account Name:           WINAD$@TEST.COM
        Account Domain:         TEST.COM
        Logon GUID:             {55a7f67c-a32c-150a-29f1-7e173ff130a7}

Service Information:
        Service Name:           WINAD$
        Service ID:             TEST\WINAD$

Network Information:
        Client Address:         ::1
        Client Port:            0

Additional Information:
        Ticket Options:         0x40810000
        Ticket Encryption Type: 0x12
        Failure Code:           0x0
        Transited Services:     -

29.3. Compressing During Transport

There are several ways that event data can be transported between NXLog agents, including the *m_tcp and *m_ssl modules. However, those modules do not provide data compression. The im_batchcompress and om_batchcompress modules, available in NXLog Enterprise Edition, can be used to transfer events in compressed (and optionally, encrypted) batches.

The following chart compares the data requirements for the *m_tcp, *m_ssl (with TLSv1.2), and *m_batchcompress module pairs. It is based on a sample of BSD Syslog records parsed with parse_syslog(). The values shown reflect the total bi-directional bytes transferred at the packet level. Of course, ratios will vary from this in practice based on network conditions and the compressibility of the event data.

Note that the om_tcp and om_ssl modules (among others) transfer only the $raw_event field by default, but can be configured to transfer all fields with OutputType Binary. The om_batchcompress module transfers all fields in the event record, but it is possible to send only the $raw_event field by first removing the other fields (see Generating $raw_event and Removing Other Fields below).
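As a rough sketch of the OutputType Binary option mentioned above, the following fragments show an om_tcp sender and an im_tcp receiver configured to exchange all fields. The port number 1514 and the instance names are hypothetical; the receiving side is assumed to need the matching InputType Binary setting.

```
# Sending agent: transfer all fields instead of only $raw_event
<Output out>
    Module      om_tcp
    Host        10.2.0.2
    Port        1514
    OutputType  Binary
</Output>

# Receiving agent: read the events in the same format
<Input in>
    Module      im_tcp
    Host        0.0.0.0
    Port        1514
    InputType   Binary
</Input>
```

This transfers the full event record but, unlike the *m_batchcompress modules, does not compress it.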

Data Requirements for Various Transfer Methods

Simply configuring the *m_batchcompress modules for the transfer of event data between NXLog agents can significantly reduce the bandwidth requirements for that part of the log path.

Example 148. Batched Log Transfer

With the following configuration, an NXLog agent uses om_batchcompress to send events in compressed batches to a remote NXLog agent.

Tip
The *m_batchcompress modules also support SSL/TLS encryption; see the im_batchcompress and om_batchcompress configuration details.
nxlog.conf (Sending Agent)
<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Port    2514
</Output>

The remote NXLog agent receives and decompresses the batches with im_batchcompress. All fields in an event are available to the receiving agent.

nxlog.conf (Receiving Agent)
<Input in>
    Module      im_batchcompress
    ListenAddr  10.2.0.2
    Port        2514
</Input>
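For the encrypted variant mentioned in the tip above, the sending agent might be configured along the following lines. This is a sketch only: the UseSSL, CAFile, CertFile, and CertKeyFile directives are given here as assumptions and should be checked against the om_batchcompress reference, and the certificate paths are hypothetical. The receiving im_batchcompress instance needs corresponding certificate settings, which are not shown.

```
<Output out>
    Module       om_batchcompress
    Host         10.2.0.2
    Port         2514
    # Assumed directive names; verify against the module reference
    UseSSL       TRUE
    # Hypothetical certificate locations
    CAFile       /opt/nxlog/cert/ca.pem
    CertFile     /opt/nxlog/cert/client-cert.pem
    CertKeyFile  /opt/nxlog/cert/client-key.pem
</Output>
```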

To further reduce the size of the batches transferred by the *m_batchcompress modules, and if only the $raw_event field will be needed later in the log path, the extra fields can be removed from the event record prior to transfer. This can be done with an xm_rewrite instance for multiple fields or with the delete() procedure (see Renaming and Deleting Fields).

Example 149. Generating $raw_event and Removing Other Fields

In this configuration, events are collected from the Windows EventLog with im_msvistalog, which sets the $raw_event and many other fields. To reduce the size of the events, only the $raw_event field is retained; all the other fields in the event record are removed by the xm_rewrite module instance (called by clean->process()).

Note
Rather than using the default im_msvistalog $raw_event field, it would also be possible to customize it with something like $raw_event = $EventTime + ' ' + $Message or to_json().
nxlog.conf
<Extension clean>
    Module  xm_rewrite
    Keep    raw_event
</Extension>

<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='0'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Exec    clean->process();
</Output>
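The customization suggested in the note for this example could look roughly like the following. It is a variation on the output instance above, assuming the same clean extension instance: $raw_event is first rebuilt from $EventTime and $Message, and only then are the remaining fields removed. The order matters, because the fields are no longer available after clean->process() has run.

```
<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    <Exec>
        # Build a compact $raw_event from two fields
        $raw_event = $EventTime + ' ' + $Message;
        # Remove every field except $raw_event
        clean->process();
    </Exec>
</Output>
```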

Alternatively, if the various fields in the event record will be handled later in the log path, the $raw_event field can be set to an empty string (but see the warning below).

Example 150. Emptying $raw_event and Sending Other Fields

This configuration collects events from the Windows EventLog with im_msvistalog, which writes multiple fields to the event record. In this case, the $raw_event field contains the same data as other fields. Because the om_batchcompress module instance will send all the fields in the event record, the $raw_event field can be emptied.

Warning
Many output modules operate on the $raw_event field only. It should not be set to an empty string unless the output module sends all the event fields (om_batchcompress or a module using the Binary OutputType), and the same must hold for every subsequent agent and module in the log path. Otherwise, a module instance will encounter an empty $raw_event. For this reason, the following example is in general not recommended.
nxlog.conf
<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='1'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Exec    $raw_event = '';
</Output>
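To illustrate the warning for this example, the receiving agent must regenerate $raw_event before handing events to any module that writes $raw_event only. The sketch below is one possible way to do this, assuming the xm_json extension and an om_file output; the listening port, the file path, and the instance names are hypothetical.

```
<Extension _json>
    Module  xm_json
</Extension>

<Input in>
    Module      im_batchcompress
    ListenAddr  10.2.0.2
    # Assumed to match the port used by the sending agent
    Port        2514
</Input>

<Output file>
    Module  om_file
    # Hypothetical destination file
    File    '/var/log/nxlog/eventlog.json'
    # om_file writes $raw_event, which arrives empty; rebuild it from
    # the received fields as JSON
    Exec    to_json();
</Output>

<Route r>
    Path    in => file
</Route>
```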