29. Reducing Bandwidth and Data Size
There are several ways that NXLog can be configured to reduce the size of log data. This can help lower bandwidth requirements during transport, storage requirements, and licensing costs for commercial SIEM systems that charge based on data volume.
The following three sections cover these approaches.
- By removing unnecessary or duplicate events at the source, there is less data to be transported and stored, which reduces the data size during all subsequent stages of processing. See Filtering Events.
- Similarly, removing extra content or fields from event records can reduce the total amount of log data. See Trimming Events.
- For information about reducing data requirements during transport, see Compressing During Transport.
To achieve the best results, it is important to understand how fields work in NXLog and which fields are being transferred or stored. For example, removing or modifying fields without modifying $raw_event will not reduce data requirements at all for an output module instance that uses only $raw_event. See Event Records and Fields for details, as well as the explanation in Compressing During Transport below.
29.1. Filtering Events
Depending on the logging requirements and the log source, it may be possible to simply discard certain events. NXLog can be configured to filter events based on nearly any set of criteria. See also Filtering Messages.
In this example, an NXLog agent is configured to collect Syslog messages from devices on the local network. Events are parsed with the xm_syslog parse_syslog() procedure, which sets the SeverityValue field. Any event with a normalized severity lower than 3 (warning) is discarded.
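The filtering described above can be sketched as follows. This is a minimal, illustrative configuration rather than a definitive one: the instance names (_syslog, syslog_udp), the listening address, and the port are assumptions, and events are discarded with the core drop() procedure.

```
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Input syslog_udp>
    Module  im_udp
    Host    0.0.0.0
    Port    514
    <Exec>
        # Parse the Syslog message; this sets $SeverityValue
        parse_syslog();

        # Discard events below warning severity (3)
        if $SeverityValue < 3 drop();
    </Exec>
</Input>
```

Because the event is dropped in the input instance, it never reaches any processor or output instance in the route.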
Similarly, the pm_norepeat module can be used to detect, count, and discard duplicate events. In their place, pm_norepeat generates a single event with the message "last message repeated n times".
With this configuration, NXLog collects Syslog messages from hosts on the local network with im_udp and parses them with the xm_syslog parse_syslog() procedure. Events are then routed through a pm_norepeat module instance, where the $Hostname, $Message, and $SourceName fields are checked to detect duplicate messages. Last, events are sent to a remote host with om_batchcompress.
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Input syslog_udp>
    Module  im_udp
    Host    0.0.0.0
    Port    514
    Exec    parse_syslog();
</Input>

<Processor norepeat>
    Module       pm_norepeat
    CheckFields  Hostname, Message, SourceName
</Processor>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Port    2514
</Output>

<Route r>
    Path    syslog_udp => norepeat => out
</Route>
29.2. Trimming Events
NXLog can be configured to parse events into various fields in the event record. In this case, a whitelist can be used to retain a set of important fields. See Rewriting and Modifying Messages for more information about modifying events.
This configuration reads from the Windows EventLog with im_msvistalog and uses an xm_rewrite module instance to discard any fields in the event record that are not included in the whitelist. The xm_rewrite instance below could be used with multiple sources; for example, the whitelist would also be suitable for the xm_syslog fields.
Note: The xm_rewrite module does not remove the $raw_event field.
<Extension whitelist>
    Module  xm_rewrite
    Keep    AccountName, Channel, EventID, EventReceivedTime, EventTime, Hostname, \
            Severity, SeverityValue, SourceName
</Extension>

<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='0'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    Exec    whitelist->process();
</Input>
In some cases, event messages contain a lot of extra data that is duplicated across multiple events of the same type. One example of this is the "descriptive event data" which has been introduced by Microsoft for the Windows EventLog. By removing this verbose text from common events, event sizes can be reduced significantly while still preserving all the forensic details of the event.
The following configuration collects events from the Application, Security, and System channels. Rules are included for truncating the messages of Security events with IDs 4688 and 4769.
Note: In this example, the $Message field is truncated. However, the $raw_event field is not. For most input modules, $raw_event will include the contents of $Message and other fields (see the im_msvistalog $raw_event field). To update the $raw_event field, include a statement for this (see the comment in the configuration example). See also Compressing During Transport below for more details.
Input sample (the $Message field of a Security event with ID 4769):

A Kerberos service ticket was requested.

Account Information:
    Account Name: WINAD$@TEST.COM
    Account Domain: TEST.COM
    Logon GUID: {55a7f67c-a32c-150a-29f1-7e173ff130a7}

Service Information:
    Service Name: WINAD$
    Service ID: TEST\WINAD$

Network Information:
    Client Address: ::1
    Client Port: 0

Additional Information:
    Ticket Options: 0x40810000
    Ticket Encryption Type: 0x12
    Failure Code: 0x0
    Transited Services: -

This event is generated every time access is requested to a resource such as a computer or a Windows service. The service name indicates the resource to which access was requested.

This event can be correlated with Windows logon events by comparing the Logon GUID fields in each event. The logon event occurs on the machine that was accessed, which is often a different machine than the domain controller which issued the service ticket.

Ticket options, encryption types, and failure codes are defined in RFC 4120.
<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Application">*[System[(Level&lt;=4)]]</Select>
                <Select Path="Security">*[System[(Level&lt;=4)]]</Select>
                <Select Path="System">*[System[(Level&lt;=4)]]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    <Exec>
        if ($Channel == 'Security') and ($EventID == 4688)
            $Message =~ s/\s*Token Elevation Type indicates the type of .*$//s;
        else if ($Channel == 'Security') and ($EventID == 4769)
            $Message =~ s/\s*This event is generated every time access is .*$//s;

        # Additional rules can be added here
        # ...

        # Optionally, update the $raw_event field
        #$raw_event = $EventTime + ' ' + $Message;
    </Exec>
</Input>
Output sample (the truncated $Message field):

A Kerberos service ticket was requested.

Account Information:
    Account Name: WINAD$@TEST.COM
    Account Domain: TEST.COM
    Logon GUID: {55a7f67c-a32c-150a-29f1-7e173ff130a7}

Service Information:
    Service Name: WINAD$
    Service ID: TEST\WINAD$

Network Information:
    Client Address: ::1
    Client Port: 0

Additional Information:
    Ticket Options: 0x40810000
    Ticket Encryption Type: 0x12
    Failure Code: 0x0
    Transited Services: -
29.3. Compressing During Transport
There are several ways that event data can be transported between NXLog agents, including the *m_tcp and *m_ssl modules. However, those modules do not provide data compression. The im_batchcompress and om_batchcompress modules, available in NXLog Enterprise Edition, can be used to transfer events in compressed (and optionally, encrypted) batches.
The following chart compares the data requirements for the *m_tcp, *m_ssl (with TLSv1.2), and *m_batchcompress module pairs. It is based on a sample of BSD Syslog records parsed with parse_syslog(). The values shown reflect the total bi-directional bytes transferred at the packet level. Of course, ratios will vary from this in practice based on network conditions and the compressibility of the event data.
Note that the om_tcp and om_ssl modules (among others) transfer only the $raw_event field by default, but can be configured to transfer all fields with OutputType Binary. The om_batchcompress module transfers all fields in the event record, but it is possible to send only the $raw_event field by first removing the other fields (see Generating $raw_event and Removing Other Fields below).
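As an illustration of the OutputType setting mentioned above, an om_tcp instance that sends all fields rather than only $raw_event might be sketched like this. The host address and port are placeholders, and the receiving im_tcp instance would presumably need a matching InputType Binary setting.

```
<Output out>
    Module      om_tcp
    Host        10.2.0.2
    Port        1514        # placeholder port, not taken from this guide
    # Send the full event record (all fields) in NXLog's binary format
    OutputType  Binary
</Output>
```

Sending all fields this way increases the amount of data on the wire compared with the default, so it is mainly useful when the receiving agent needs the parsed fields.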
Simply configuring the *m_batchcompress modules for the transfer of event data between NXLog agents can significantly reduce the bandwidth requirements for that part of the log path.
With the following configuration, an NXLog agent uses om_batchcompress to send events in compressed batches to a remote NXLog agent.
Tip: The *m_batchcompress modules also support SSL/TLS encryption; see the im_batchcompress and om_batchcompress configuration details.
<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Port    2514
</Output>
The remote NXLog agent receives and decompresses the received batches with im_batchcompress. All fields in an event are available to the receiving agent.
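A receiving configuration might look like the sketch below. The instance name and listening address are assumptions, the port is chosen to match the sending example, and the directive names should be checked against the im_batchcompress reference for the NXLog version in use.

```
<Input batch_in>
    Module      im_batchcompress
    # Accept compressed batches on all interfaces (assumed address)
    ListenAddr  0.0.0.0
    Port        2514
</Input>
```

This instance can then be used in a route like any other input, with the fields set by the sending agent available for further processing.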
To further reduce the size of the batches transferred by the *m_batchcompress modules, and if only the $raw_event field will be needed later in the log path, the extra fields can be removed from the event record prior to transfer. This can be done with an xm_rewrite instance for multiple fields or with the delete() procedure (see Renaming and Deleting Fields).
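When only a few fields need to be removed, the delete() procedure could be called directly in an Exec block. The sketch below is illustrative only: the fields chosen ($Keywords and $Opcode) are assumed examples of fields that may not be needed downstream, not a recommendation.

```
<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    <Exec>
        # Remove individual fields before the event is batched and sent
        delete($Keywords);
        delete($Opcode);
    </Exec>
</Output>
```

For longer lists of fields, an xm_rewrite instance with a Keep or Delete directive is likely to be easier to maintain than many separate delete() calls.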
In this configuration, events are collected from the Windows EventLog with im_msvistalog, which sets the $raw_event and many other fields. To reduce the size of the events, only the $raw_event field is retained; all the other fields in the event record are removed by the xm_rewrite module instance (called by clean->process()).
Note: Rather than using the default im_msvistalog $raw_event field, it would also be possible to customize it with something like $raw_event = $EventTime + ' ' + $Message or to_json().
<Extension clean>
    Module  xm_rewrite
    Keep    raw_event
</Extension>

<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='0'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Exec    clean->process();
</Output>
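If a customized $raw_event is preferred, one possible variation is to serialize the event to JSON before removing the other fields. The sketch below assumes the xm_json extension is loaded (the instance name _json is arbitrary) and reuses the clean instance and the eventlog input from the example above.

```
<Extension _json>
    Module  xm_json
</Extension>

<Extension clean>
    Module  xm_rewrite
    Keep    raw_event
</Extension>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    <Exec>
        # Write all fields to $raw_event as JSON, then drop the other fields
        to_json();
        clean->process();
    </Exec>
</Output>
```

The order matters here: to_json() has to run while the fields are still present, so it is called before clean->process().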
Alternatively, if the various fields in the event record will be handled later in the log path, the $raw_event field can be set to an empty string (but see the warning below).
This configuration collects events from the Windows EventLog with im_msvistalog, which writes multiple fields to the event record. In this case, the $raw_event field contains the same data as other fields. Because the om_batchcompress module instance will send all the fields in the event record, the $raw_event field can be emptied.
Warning: Many output modules operate on the $raw_event field only. It should not be set to an empty string unless the output module sends all the event fields (om_batchcompress, or a module using the Binary OutputType), and the same holds for every subsequent agent and module in the log path. Otherwise, a module instance will encounter an empty $raw_event. For this reason, the following example is in general not recommended.
<Input eventlog>
    Module  im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id='1'>
                <Select Path='Security'>*[System/Level&lt;=4]</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Output out>
    Module  om_batchcompress
    Host    10.2.0.2
    Exec    $raw_event = '';
</Output>