NXLog User Guide

25.3. Using Buffers

The following sections describe the various types of buffering features provided by NXLog and give examples for configuring buffering in specific scenarios.

25.3.1. Read and Write Buffers

Input and output module instances have read and write buffers, respectively. These buffers can be configured for a particular module instance with the BufferSize directive.

Example 69. Read/Write Buffers in a Simple Route

This example shows the default read and write buffers used by NXLog for a simple route. Each buffer is limited to 65,000 bytes.

nxlog.conf


<Input file>
    Module      im_file
    File        '/tmp/in.log'

    # Set read buffer size, in bytes (default)
    BufferSize  65000
</Input>

<Output tcp>
    Module      om_tcp
    Host        192.168.1.1

    # Set write buffer size, in bytes (default)
    BufferSize  65000
</Output>

<Route r>
    Path        file => tcp
</Route>

25.3.2. Log Queues

Every processor and output module instance has an input log queue for events that have not yet been processed by that module instance. When the preceding module has processed an event, it is placed in this queue. Because log queues are enabled by default for all processor and output module instances, they are the preferred way to adjust buffering behavior.

The size of a module instance’s log queue can be configured with the LogqueueSize directive.

Example 70. A Log Queue in a Basic Route

This example shows the default log queue used by NXLog in a simple route. Up to 100 events will be placed in the queue to be processed by the om_batchcompress instance.

nxlog.conf


<Input eventlog>
    Module          im_msvistalog
</Input>

<Output batch>
    Module          om_batchcompress
    Host            192.168.2.1

    # Set log queue size, in events (default)
    LogqueueSize    100
</Output>

<Route r>
    Path            eventlog => batch
</Route>

By default, log queues are stored in memory. NXLog can be configured to persist log queues to disk with the PersistLogqueue directive. With the SyncLogqueue directive, NXLog additionally syncs every write to a disk-based queue. These directives can be used to prevent data loss in case of interrupted processing, at the expense of reduced performance, and can be set either globally or for a particular module instance. For more information, see Reliable Message Delivery.

Note	Any events remaining in the log queue will be written to disk when NXLog is stopped, regardless of the value of PersistLogqueue.

Example 71. A Persistent Log Queue

In this example, the om_elasticsearch instance is configured with a persistent and synced log queue. Each time an event is added to the log queue, the event will be written to disk and synced before processing continues.

nxlog.conf


<Input acct>
    Module          im_acct
</Input>

<Output elasticsearch>
    Module          om_elasticsearch
    URL             http://192.168.2.2:9200/_bulk

    # Set log queue size, in events (default)
    LogqueueSize    100

    # Use persistent and synced log queue
    PersistLogqueue TRUE
    SyncLogqueue    TRUE
</Output>

<Route r>
    Path            acct => elasticsearch
</Route>
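
As a variation on the example above, the same directives can be placed in the global scope, since they can be set both globally and per module. The following is a minimal sketch rather than a tested configuration. It assumes that global directives are written outside any module block and that a module-level value takes precedence over the global one for that instance.

```conf
# Global scope: default for all processor and output instances
PersistLogqueue TRUE
SyncLogqueue    FALSE

<Output elasticsearch>
    Module          om_elasticsearch
    URL             http://192.168.2.2:9200/_bulk

    # Assumption: overrides the global default for this instance only
    SyncLogqueue    TRUE
</Output>
```

With this layout, every log queue would be persisted to disk, but only the elasticsearch queue would pay the cost of syncing each write.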

25.3.3. Flow Control

To effectively leverage buffering, it is important to understand NXLog’s flow control feature. Flow control has no effect unless the following sequence of events occurs in a route:

a processor or output module instance is not able to process log data at the incoming rate,
that module instance’s log queue becomes full, and
the preceding input or processor module instance has flow control enabled.

In this case, flow control will cause the input or processor module instance to suspend processing until the succeeding module instance is ready to accept more log data.

Example 72. Flow Control Enabled

This example shows NXLog’s default flow control behavior with a basic route. Events are collected from the Windows Event Log with im_msvistalog and forwarded with om_tcp. The om_tcp instance will be blocked if the destination is unreachable or the network cannot handle the events quickly enough.

nxlog.conf


<Input eventlog>
    Module      im_msvistalog

    # Flow control enabled (default)
    FlowControl TRUE
</Input>

<Output tcp>
    Module      om_tcp
    Host        192.168.1.1
</Output>

<Route r>
    Path        eventlog => tcp
</Route>

The om_tcp instance is unable to connect to the destination host and its log queue is full. Because the im_msvistalog instance has flow control enabled and the next module in the route is blocked, it has been paused. No events will be read from the Event Log until the tcp instance becomes unblocked.

Flow control is enabled by default, and can be set globally or for a particular module instance with the FlowControl directive. Generally, flow control provides automatic, zero-configuration handling of cases where buffering would otherwise be required. However, there are some cases where flow control should be disabled and buffering configured explicitly as required.
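
Setting flow control globally and then overriding it for one instance might look like the sketch below. It is illustrative only. It assumes that the global FlowControl directive is written outside any module block and that a module-level FlowControl value overrides it. The instance name and file path are placeholders.

```conf
# Global scope: disable flow control by default
FlowControl FALSE

<Input file>
    Module      im_file
    File        '/tmp/in.log'

    # Assumption: re-enables flow control for this instance only,
    # so the file is simply read later if the route is blocked
    FlowControl TRUE
</Input>
```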

Example 73. Flow Control Disabled

In this example, Linux Audit messages are collected with im_linuxaudit and forwarded with om_http. Flow control is disabled for im_linuxaudit to prevent processes from being blocked due to an Audit backlog. To avoid loss of log data in this case, the LogqueueSize directive could be used as shown in Increasing the Log Queue Size to Protect Against UDP Message Loss.

nxlog.conf


<Input audit>
    Module      im_linuxaudit
    <Rules>
        -D
        -w /etc/passwd -p wa -k passwd
    </Rules>

    # Disable flow control to prevent Audit backlog
    FlowControl FALSE
</Input>

<Output http>
    Module      om_http
    URL         http://192.168.2.1:8080/
</Output>

<Route r>
    Path        audit => http
</Route>

The om_http instance is unable to forward log data, and its log queue is full. Because the im_linuxaudit instance has flow control disabled, it remains active and continues to process log data. However, all events will be discarded until the om_http log queue is no longer full.

25.3.4. The pm_buffer Module

Log queues are enabled by default for processor and output module instances, and are the preferred way to configure buffering behavior in NXLog. However, for cases where additional features are required, the pm_buffer module can be used to add a buffer instance to a route in addition to the above buffers normally used by NXLog.

Additional features provided by pm_buffer include:

both memory- and disk-based buffering types,
a buffer size limit measured in kilobytes,
a WarnLimit threshold that generates a warning message when crossed, and
functions for querying the status of a pm_buffer buffer instance.
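
The status functions can be used from an Exec directive inside the pm_buffer instance. The sketch below is a hedged illustration, not a recommended configuration. It assumes that buffer_size() reports the current buffer size in bytes and that the core log_warning() procedure is available. The threshold and message text are arbitrary.

```conf
<Processor buffer>
    Module      pm_buffer
    Type        Mem

    # Limits are given in kilobytes
    MaxSize     1024
    WarnLimit   512

    # Assumption: buffer_size() returns bytes; 768k is 768 * 1024
    Exec        if buffer_size() >= 768k log_warning("pm_buffer is above 75% of MaxSize");
</Processor>
```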

Note

In a disk-based pm_buffer instance, events are not written to disk unless the log queue of the succeeding module instance is full. For this reason, a disk-based pm_buffer instance does not reduce performance in the way that a persistent log queue does. Additionally, pm_buffer (and other processor modules) should not be used if crash-safe processing is required; see Reliable Message Delivery.

Example 74. Using the pm_buffer Module

This example shows a route with a large disk-based buffer provided by the pm_buffer module. A warning message will be generated when the buffer size crosses the threshold specified.

nxlog.conf


<Input udp>
    Module      im_udp
</Input>

<Processor buffer>
    Module      pm_buffer
    Type        Disk

    # 40 MiB buffer
    MaxSize     40960

    # Generate warning message at 20 MiB
    WarnLimit   20480
</Processor>

<Output ssl>
    Module      om_ssl
    Host        10.8.0.2
    CAFile      %CERTDIR%/ca.pem
    CertFile    %CERTDIR%/client-cert.pem
    CertKeyFile %CERTDIR%/client-key.pem
</Output>

<Route r>
    Path        udp => buffer => ssl
</Route>

The SSL/TLS destination is unreachable, and the disk-based buffer is filling.

25.3.5. Other Buffering Functionality

Buffering in NXLog is not limited to the functionality covered above. Other modules implement or provide additional buffering-related features, such as the ones listed below. (This is not intended to be an exhaustive list.)

The UDP modules (im_udp, om_udp, and om_udpspoof) can be configured to set the socket buffer size (SO_RCVBUF or SO_SNDBUF) with the respective SockBufSize directive.
The external program and scripting support (im_exec, im_perl, im_python, im_ruby, om_exec, om_perl, om_python, om_ruby) can be used to implement custom buffering solutions.
Some modules (such as om_batchcompress, om_elasticsearch, and om_webhdfs) buffer events internally in order to forward events in batches.
The pm_blocker module can be used to programmatically block or unblock the log flow in a route, and in this way control buffering. It can also be used to test buffering.
The om_blocker module can be used to test buffering behavior by simulating a blocked output.
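
To observe how a route behaves when its output is blocked, an om_blocker instance can stand in for the real destination. This is a minimal sketch. It assumes that om_blocker requires no module-specific directives. The instance names, file path, and queue size are illustrative.

```conf
<Input file>
    Module          im_file
    File            '/tmp/in.log'
</Input>

<Output blocked>
    # Simulates a destination that never accepts events
    Module          om_blocker

    # Small log queue so that the full state is reached quickly
    LogqueueSize    10
</Output>

<Route test>
    Path            file => blocked
</Route>
```

Because flow control is enabled by default, the file instance would be expected to pause once the log queue of the blocked instance is full.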

Example 75. All Buffers in a Basic Route

The following diagram shows all buffers used in a simple im_udp => om_tcp route. The socket buffers are only applicable to networking modules.

25.3.6. Receiving Logs via UDP

Because UDP is connectionless, log data sent via plain UDP must be accepted immediately. Otherwise the log data is lost. For this reason, it is important to add a buffer if there is any possibility of the route becoming blocked. This can be done by increasing the log queue size of the following module instance or adding a pm_buffer instance to the route.

Example 76. Increasing the Log Queue Size to Protect Against UDP Message Loss

In this configuration, log messages are accepted with im_udp and forwarded with om_tcp. The log queue size of the output module instance is increased to 5000 events to buffer messages in case the output becomes blocked. To further reduce the risk of data loss, the socket buffer size is increased with the SockBufSize directive and the route priority is increased with Priority.

nxlog.conf


<Input udp>
    Module          im_udp

    # Raise socket buffer size
    SockBufSize     150000000
</Input>

<Output tcp>
    Module          om_tcp
    Host            192.168.1.1

    # Keep up to 5000 events in the log queue
    LogqueueSize    5000
</Output>

<Route udp_to_tcp>
    Path            udp => tcp

    # Process events in this route first
    Priority        1
</Route>

The output is blocked because the network is not able to handle the log data quickly enough.

Example 77. Using a pm_buffer Instance to Protect Against UDP Message Loss

Instead of raising the size of the log queue, this example uses a memory-based pm_buffer instance to buffer events when the output becomes blocked. A warning message will be generated if the buffer size exceeds the specified WarnLimit threshold.

nxlog.conf


<Input udp>
    Module      im_udp

    # Raise socket buffer size
    SockBufSize 150000000
</Input>

<Processor buffer>
    Module      pm_buffer
    Type        Mem

    # 5 MiB buffer
    MaxSize     5120

    # Warn at 2 MiB
    WarnLimit   2048
</Processor>

<Output http>
    Module      om_http
    URL         http://10.8.1.1:8080/
</Output>

<Route udp_to_http>
    Path        udp => buffer => http

    # Process events in this route first
    Priority    1
</Route>

The HTTP destination is unreachable, the http instance log queue is full, and the buffer instance is filling.

25.3.7. Reading Logs From /dev/log

Syslog messages can be read from the /dev/log socket with the im_uds module. However, if the route becomes blocked and the im_uds instance is suspended, the syslog() system call will block in programs attempting to log a message. To prevent that, flow control should be disabled.

With flow control disabled, events will be discarded if the route becomes blocked and the route’s log queues become full. To reduce the risk of lost log data, the log queue size of a succeeding module instance in the route can be increased. Alternatively, a pm_buffer instance can be used as in the second UDP example above.

Example 78. Buffering Syslog Messages From /dev/log

This configuration uses the im_uds module to collect Syslog messages from the /dev/log socket, and the xm_syslog parse_syslog() procedure to parse them.

To prevent the syslog() system call from blocking as a result of the im_uds instance being suspended, the FlowControl directive is set to FALSE. The LogqueueSize directive raises the log queue limit of the output instance to 5000 events. The Priority directive indicates that this route’s events should be processed first.

nxlog.conf


<Extension _syslog>
    Module          xm_syslog
</Extension>

<Input dev_log>
    Module          im_uds
    UDS             /dev/log
    Exec            parse_syslog();

    # This module instance must never be suspended
    FlowControl     FALSE
</Input>

<Output elasticsearch>
    Module          om_elasticsearch
    URL             http://192.168.2.1:9022/_bulk

    # Keep up to 5000 events in the log queue
    LogqueueSize    5000
</Output>

<Route syslog_to_elasticsearch>
    Path            dev_log => elasticsearch

    # Process events in this route first
    Priority        1
</Route>

The Elasticsearch server is unreachable and the log queue is filling. If the log queue becomes full, events will be discarded.
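
The alternative mentioned earlier in this section, a pm_buffer instance instead of (or in addition to) a larger log queue, might look like the following sketch. It reuses the dev_log and elasticsearch instances from the example above. The buffer instance name and the MaxSize and WarnLimit values are illustrative assumptions, not recommendations.

```conf
<Processor buffer>
    Module      pm_buffer
    Type        Mem

    # 5 MiB buffer, warning at 2 MiB (values in kilobytes)
    MaxSize     5120
    WarnLimit   2048
</Processor>

<Route syslog_to_elasticsearch>
    Path        dev_log => buffer => elasticsearch

    # Process events in this route first
    Priority    1
</Route>
```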

25.3.8. Forwarding Logs From File

Because flow control will pause an im_file instance automatically, it is normally not necessary to use any additional buffering when reading from files. If the route is blocked, the file will not be read until the route becomes unblocked. If the im_file SavePos directive is set to TRUE (the default) and NXLog is stopped, the file position of the im_file instance will be saved and used to resume reading when NXLog is started.

Example 79. Forwarding From File With Default Buffering

This configuration reads log messages from a file with im_file and forwards them with om_tcp. No extra buffering is necessary because flow control is enabled.

nxlog.conf


<Input file>
    Module      im_file
    File        '/tmp/in.log'

    # Enable flow control (default)
    FlowControl TRUE

    # Save file position on exit (default)
    SavePos     TRUE
</Input>

<Output tcp>
    Module      om_tcp
    Host        10.8.0.2
</Output>

<Route r>
    Path        file => tcp
</Route>

The TCP destination is unreachable, and the im_file instance is paused. No messages will be read from the source file until the om_tcp instance becomes unblocked.

Sometimes, however, there is a risk of the input log file becoming inaccessible while the im_file instance is suspended (due to log rotation, for example). For this case, the tcp log queue size can be increased (or a pm_buffer instance added) to buffer more events.

Example 80. Forwarding From File With Additional Buffering

In this example, log messages are read from a file with im_file and forwarded with om_tcp. The om_tcp log queue size has been increased in order to buffer more events because the source file may be rotated away.

nxlog.conf


<Input file>
    Module          im_file
    File            '/tmp/in.log'
</Input>

<Output tcp>
    Module          om_tcp
    Host            192.168.1.1

    # Keep up to 2000 events in the log queue
    LogqueueSize    2000
</Output>

<Route r>
    Path            file => tcp
</Route>

The TCP destination is unreachable and the om_tcp instance is blocked. The im_file instance will continue to read from the file (and events will accumulate) until the tcp log queue is full; then it will be paused.

25.3.9. Discarding Events

NXLog’s flow control mechanism ensures that input module instances will pause until all output module instances can write. This can be problematic in some situations when discarding messages is preferable to blocking. For this case, flow control can be disabled or the drop() procedure can be used in conjunction with the pm_buffer module. These two options differ somewhat in behavior, as described in the examples below.

Example 81. Disabling Flow Control to Selectively Discard Events

This example sends UDP input to two outputs, a file and an HTTP destination. If the HTTP transmission is slower than the rate of incoming UDP packets or the destination is unreachable, flow control would normally pause the im_udp instance. This would result in dropped UDP packets. In this situation it is better to selectively drop log messages in the HTTP route than to lose them entirely. This can be accomplished by simply disabling flow control for the input module instance.

Note

This configuration will also continue to send events to the HTTP destination in the unlikely event that the om_file output blocks. In fact, the input will remain active even if both outputs block (though in this particular case, because UDP is lossy, messages will be lost regardless of whether the im_udp instance is suspended).

nxlog.conf


<Input udp>
    Module          im_udp

    # Never pause this instance
    FlowControl     FALSE
</Input>

<Output http>
    Module          om_http
    URL             http://10.0.0.3:8080/

    # Increase the log queue size
    LogqueueSize    2000
</Output>

<Output file>
    Module          om_file
    File            '/tmp/out.log'
</Output>

<Route udp_to_http_and_file>
    Path            udp => http, file
</Route>

The HTTP destination cannot accept events quickly enough. The om_http instance is blocked and its log queue is full. New events are not being added to the HTTP output queue but are still being written to the output file.

Example 82. Selectively Discarding Events With pm_buffer and drop()

In this example, process accounting logs collected by im_acct are both forwarded via TCP and written to file. A separate route is used for each output. A pm_buffer instance is used in the TCP route, and it is configured to discard events with drop() if its size goes beyond a certain threshold. Thus, the pm_buffer instance will never become full and will never cause the im_acct instance to pause, so a blocked TCP destination does not prevent events from being written to the output file.

Note	Because the im_acct instance has flow control enabled, it will be paused if the om_file output becomes blocked.

nxlog.conf


<Input acct>
    Module      im_acct

    # Flow control enabled (default)
    FlowControl TRUE
</Input>

<Processor buffer>
    Module      pm_buffer
    Type        Mem
    MaxSize     1000
    WarnLimit   800
    Exec        if buffer_size() >= 80k drop();
</Processor>

<Output tcp>
    Module      om_tcp
    Host        192.168.1.1
</Output>

<Output file>
    Module      om_file
    File        '/tmp/out.log'
</Output>

<Route acct_to_tcp>
    Path        acct => buffer => tcp
</Route>

<Route acct_to_file>
    Path        acct => file
</Route>

The TCP destination is unreachable and the om_tcp log queue is full. Input accounting events will be added to the buffer until it gets full, then they will be discarded. Input events will also be written to the output file, regardless of whether the buffer is full.

25.3.10. Scheduled Buffering

While buffering is typically used when a log destination becomes unavailable, NXLog can also be configured to buffer logs programmatically. For this purpose, the pm_blocker module can be added to a route.

Example 83. Buffering Logs and Forwarding by Schedule

This example collects log messages via UDP and forwards them to a remote NXLog agent. However, events are buffered with pm_buffer during the week and only forwarded on weekends.

During the week, the pm_blocker instance is blocked and events accumulate in the large on-disk buffer.
During the weekend, the pm_blocker instance is unblocked and all events, including those that have accumulated in the buffer, are forwarded.

nxlog.conf


<Input udp>
    Module      im_udp
    Host        0.0.0.0
</Input>

<Processor buffer>
    Module      pm_buffer

    # 500 MiB disk buffer
    Type        Disk
    MaxSize     512000
    WarnLimit   409600
</Processor>

<Processor schedule>
    Module  pm_blocker
    <Schedule>
        # Start blocking Monday morning
        When    0 0 * * 1
        Exec    schedule->block(TRUE);
    </Schedule>
    <Schedule>
        # Stop blocking Saturday morning
        When    0 0 * * 6
        Exec    schedule->block(FALSE);
    </Schedule>
</Processor>

<Output batch>
    Module      om_batchcompress
    Host        10.3.0.211
</Output>

<Route scheduled_batches>
    Path        udp => buffer => schedule => batch
</Route>

It is currently a weekday and the schedule pm_blocker instance is blocked.

If it is possible to use flow control with the log sources, then it is not necessary to use extra buffering. Instead, the inputs will be paused and read later when the route is unblocked.

Example 84. Collecting Log Data on a Schedule

This configuration reads events from the Windows Event Log and forwards them to a remote NXLog agent in compressed batches with om_batchcompress. However, events are only forwarded during the night. Because the im_msvistalog instance can be paused and events will still be available for collection later, it is not necessary to configure any extra buffering.

During the day, the pm_blocker instance is blocked, its log queue becomes full, and the eventlog instance is paused.
During the night, the pm_blocker instance is unblocked. The events in the schedule log queue are processed, the eventlog instance is resumed, and all pending events are read from the Event Log and forwarded.

nxlog.conf


<Input eventlog>
    Module      im_msvistalog
</Input>

<Processor schedule>
    Module  pm_blocker
    <Schedule>
        # Start blocking at 7:00
        When    0 7 * * *
        Exec    schedule->block(TRUE);
    </Schedule>
    <Schedule>
        # Stop blocking at 19:00
        When    0 19 * * *
        Exec    schedule->block(FALSE);
    </Schedule>
</Processor>

<Output batch>
    Module      om_batchcompress
    Host        10.3.0.211
</Output>

<Route scheduled_batches>
    Path        eventlog => schedule => batch
</Route>

The current time is within the specified "day" interval and pm_blocker is blocked.