108.17. Multi-Line Parser (xm_multiline)
This module can be used for parsing log messages that span multiple lines. All lines in an event are joined to form a single NXLog event record, which can be further processed as required. Each multi-line event is detected through some combination of header lines, footer lines, and fixed line counts, as configured. The name of the xm_multiline module instance is specified by the input module’s InputType directive.
The module maintains a separate context for each input source, allowing multi-line messages to be processed correctly even when coming from multiple sources (specifically, multiple files or multiple network connections).
Warning: UDP is treated as a single source and all logs are processed under the same context. It is therefore not recommended to use this module with im_udp if messages will be received from multiple UDP senders (such as Syslog).
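Where senders can be switched to a connection-oriented transport, a sketch like the following keeps each sender in its own context, because every TCP connection counts as a separate source. The instance names, port, and header pattern are hypothetical. ListenAddr is assumed to be the listener directive of the installed im_tcp version; older NXLog releases use Host and Port instead.

```nxlog
<Extension multiline>
    Module      xm_multiline
    # Hypothetical header: each event starts with a date such as 2012-11-23
    HeaderLine  /^\d\d\d\d-\d\d-\d\d /
</Extension>

<Input tcp_in>
    Module      im_tcp
    # Assumed listener address; older NXLog versions use Host and Port instead
    ListenAddr  0.0.0.0:1514
    # Every accepted connection is assembled in its own multi-line context
    InputType   multiline
</Input>
```

As with any configuration that relies on HeaderLine alone, the last event of a connection stays buffered until the next header arrives. AutoFlush does not help here because it only affects im_file.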
108.17.1. Configuration
The xm_multiline module accepts the following directives in addition to the common module directives. One of FixedLineCount and HeaderLine must be specified.
- FixedLineCount
  This directive takes a positive integer that defines the number of lines to concatenate. This is useful when receiving log messages spanning a fixed number of lines. When this number is defined, the module knows where the event message ends and will not hold a message in the buffers until the next message arrives.
- HeaderLine
  This directive takes a string or a regular expression literal. This will be matched against each line. When the match is successful, the successive lines are appended until the next header line is read. This directive is mandatory unless FixedLineCount is used.
  Note: Until a new message arrives with its associated header, the previous message is stored in the buffers because the module does not know where the message ends. The im_file module will forcibly flush this buffer after the configured PollInterval timeout. If this behavior is unacceptable, disable AutoFlush, use an end marker with EndLine, or switch to an encapsulation method (such as JSON).
- AutoFlush
  If set to TRUE, this boolean directive specifies that the corresponding im_file module should forcibly flush the buffer after its configured PollInterval timeout. The default is TRUE. If EndLine is used, AutoFlush is automatically set to FALSE to disable this behavior. AutoFlush has no effect if xm_multiline is used with an input module other than im_file.
- EndLine
  This is similar to the HeaderLine directive. This optional directive also takes a string or a regular expression literal to be matched against each line. When the match is successful, the message is considered complete.
- Exec
  This directive behaves almost identically to the Exec directive used by the other modules, with the following differences:
  - each line is passed in $raw_event as it is read, and the line terminator is included; and
  - other fields cannot be used, and captured strings cannot be stored as separate fields.
  This is mostly useful for rewriting lines or filtering out certain lines with the drop() procedure.
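The interaction between HeaderLine, AutoFlush, the per-line Exec, and the PollInterval of im_file can be sketched as follows. This is only an illustration under stated assumptions: the instance names, the file path, and the date-based header pattern are hypothetical and not taken from a real deployment.

```nxlog
<Extension app_multiline>
    Module        xm_multiline
    # Hypothetical header: a new event starts at a line beginning with a date
    HeaderLine    /^\d\d\d\d-\d\d-\d\d /
    # TRUE is the default: im_file flushes a pending event after PollInterval.
    # Set this to FALSE to wait for the next header line instead.
    AutoFlush     TRUE
    # Runs on every line before it is joined; blank lines are discarded
    Exec          if $raw_event =~ /^\s*$/ drop();
</Extension>

<Input app_log>
    Module        im_file
    # Hypothetical path
    File          "/var/log/app/app.log"
    InputType     app_multiline
    # Seconds between polls; also the timeout after which the buffer is flushed
    PollInterval  1
</Input>
```

With AutoFlush enabled, an event whose remaining lines are written later than the PollInterval timeout may be emitted in two parts. Disabling it avoids the split, at the cost of holding the last event until the next header line arrives.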
108.17.2. Examples
XML is commonly formatted as indented multi-line to make it more readable. In the following configuration file the HeaderLine and EndLine directives are used to parse the events. The events are then converted to JSON after some timestamp normalization.
<Extension multiline>
    Module      xm_multiline
    HeaderLine  /^<event>/
    EndLine     /^<\/event>/
</Extension>

<Extension xmlparser>
    Module      xm_xml
</Extension>

<Extension json>
    Module      xm_json
</Extension>

<Input filein>
    Module      im_file
    File        "modules/extension/multiline/xm_multiline5.in"
    InputType   multiline
    <Exec>
        # Discard everything that doesn't seem to be an XML event
        if $raw_event !~ /^<event>/ drop();

        # Parse the XML event
        parse_xml();

        # Rewrite some fields
        $EventTime = parsedate($timestamp);
        delete($timestamp);
        delete($EventReceivedTime);

        # Convert to JSON
        to_json();
    </Exec>
</Input>

<Output fileout>
    Module      om_file
    File        'tmp/output'
</Output>

<Route parse_xml>
    Path        filein => fileout
</Route>
<?xml version="1.0" encoding="UTF-8">
<event>
<timestamp>2012-11-23 23:00:00</timestamp>
<severity>ERROR</severity>
<message>
Something bad happened.
Please check the system.
</message>
</event>
<event>
<timestamp>2012-11-23 23:00:12</timestamp>
<severity>INFO</severity>
<message>
System state is now back to normal.
</message>
</event>
{"SourceModuleName":"filein","SourceModuleType":"im_file","severity":"ERROR","message":"\n Something bad happened.\n Please check the system.\n ","EventTime":"2012-11-23 23:00:00"}
{"SourceModuleName":"filein","SourceModuleType":"im_file","severity":"INFO","message":"\n System state is now back to normal.\n ","EventTime":"2012-11-23 23:00:12"}
In this example, each log message begins with a header (TIMESTAMP INTEGER SEVERITY), which is used as the message boundary. A regular expression matching this header is defined with the HeaderLine directive. Each log message is prepended with an additional line containing dashes and is written to a file.
<Extension dicom_multi>
    Module      xm_multiline
    HeaderLine  /^\d\d\d\d-\d\d-\d\d\d\d:\d\d:\d\d\.\d+\s+\d+\s+\S+\s+/
</Extension>

<Input filein>
    Module      im_file
    File        "modules/extension/multiline/xm_multiline4.in"
    InputType   dicom_multi
</Input>

<Output fileout>
    Module      om_file
    File        'tmp/output'
    Exec        $raw_event = "--------------------------------------\n" + $raw_event;
</Output>

<Route parse_dicom>
    Path        filein => fileout
</Route>
2011-12-1512:22:51.000000 4296 INFO Association Request Parameters:
Our Implementation Class UID: 2.16.124.113543.6021.2
Our Implementation Version Name: RZDCX_2_0_1_8
Their Implementation Class UID:
Their Implementation Version Name:
Application Context Name: 1.2.840.10008.3.1.1.1
Requested Extended Negotiation: none
Accepted Extended Negotiation: none
2011-12-1512:22:51.000000 4296 DEBUG Constructing Associate RQ PDU
2011-12-1512:22:51.000000 4296 DEBUG WriteToConnection, length: 310, bytes written: 310, loop no: 1
2011-12-1512:22:51.015000 4296 DEBUG PDU Type: Associate Accept, PDU Length: 216 + 6 bytes PDU header
02 00 00 00 00 d8 00 01 00 00 50 41 43 53 20 20
20 20 20 20 20 20 20 20 20 20 52 5a 44 43 58 20
20 20 20 20 20 20 20 20 20 20 00 00 00 00 00 00
2011-12-1512:22:51.031000 4296 DEBUG DIMSE sendDcmDataset: sending 146 bytes
--------------------------------------
2011-12-1512:22:51.000000 4296 INFO Association Request Parameters:
Our Implementation Class UID: 2.16.124.113543.6021.2
Our Implementation Version Name: RZDCX_2_0_1_8
Their Implementation Class UID:
Their Implementation Version Name:
Application Context Name: 1.2.840.10008.3.1.1.1
Requested Extended Negotiation: none
Accepted Extended Negotiation: none
--------------------------------------
2011-12-1512:22:51.000000 4296 DEBUG Constructing Associate RQ PDU
--------------------------------------
2011-12-1512:22:51.000000 4296 DEBUG WriteToConnection, length: 310, bytes written: 310, loop no: 1
--------------------------------------
2011-12-1512:22:51.015000 4296 DEBUG PDU Type: Associate Accept, PDU Length: 216 + 6 bytes PDU header
02 00 00 00 00 d8 00 01 00 00 50 41 43 53 20 20
20 20 20 20 20 20 20 20 20 20 52 5a 44 43 58 20
20 20 20 20 20 20 20 20 20 20 00 00 00 00 00 00
--------------------------------------
2011-12-1512:22:51.031000 4296 DEBUG DIMSE sendDcmDataset: sending 146 bytes
The following configuration will process messages having a fixed string header containing dashes. Each event is then prepended with a hash mark (#) and written to a file.
<Extension multiline>
    Module      xm_multiline
    HeaderLine  "---------------"
</Extension>

<Input filein>
    Module      im_file
    File        "modules/extension/multiline/xm_multiline1.in"
    InputType   multiline
    Exec        $raw_event = "#" + $raw_event;
</Input>

<Output fileout>
    Module      om_file
    File        'tmp/output'
</Output>

<Route parse_multiline>
    Path        filein => fileout
</Route>
---------------
1
---------------
1
2
---------------
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
ccccccccccccccccccccccccccccccccccccc
dddd
---------------
#---------------
1
#---------------
1
2
#---------------
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
ccccccccccccccccccccccccccccccccccccc
dddd
#---------------
The following configuration will process messages having a fixed line count of four. Lines containing only whitespace are ignored and removed. Each event is then prepended with a hash mark (#) and written to a file.
<Extension multiline>
    Module          xm_multiline
    FixedLineCount  4
    Exec            if $raw_event =~ /^\s*$/ drop();
</Extension>

<Input filein>
    Module          im_file
    File            "modules/extension/multiline/xm_multiline2.in"
    InputType       multiline
</Input>

<Output fileout>
    Module          om_file
    File            'tmp/output'
    Exec            $raw_event = "#" + $raw_event;
</Output>

<Route parse_multiline>
    Path            filein => fileout
</Route>
1
2
3
4
1asd
2asdassad
3ewrwerew
4xcbccvbc
1dsfsdfsd
2sfsdfsdrewrwe
3sdfsdfsew
4werwerwrwe
#1
2
3
4
#1asd
2asdassad
3ewrwerew
4xcbccvbc
#1dsfsdfsd
2sfsdfsdrewrwe
3sdfsdfsew
4werwerwrwe
Often, multi-line messages are logged over Syslog and each line is processed as an event, with its own Syslog header. It is commonly necessary to merge these back into a single event message.
Nov 21 11:40:27 hostname app[26459]: Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
Nov 21 11:40:27 hostname app[26459]: eth2 1500 0 16936814 0 0 0 30486067 0 8 0 BMRU
Nov 21 11:40:27 hostname app[26459]: lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
Nov 21 11:40:27 hostname app[26459]: tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU
Nov 21 11:40:28 hostname app[26459]: Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
Nov 21 11:40:28 hostname app[26459]: eth2 1500 0 16945117 0 0 0 30493583 0 8 0 BMRU
Nov 21 11:40:28 hostname app[26459]: lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
Nov 21 11:40:28 hostname app[26459]: tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU
Nov 21 11:40:29 hostname app[26459]: Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
Nov 21 11:40:29 hostname app[26459]: eth2 1500 0 16945270 0 0 0 30493735 0 8 0 BMRU
Nov 21 11:40:29 hostname app[26459]: lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
Nov 21 11:40:29 hostname app[26459]: tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU
The following configuration strips the Syslog header from the netstat output stored in a traditional Syslog-formatted file. Each message is then written out again, preceded by a line of dashes that serves as a separator.
<Extension syslog>
    Module          xm_syslog
</Extension>

<Extension netstat>
    Module          xm_multiline
    FixedLineCount  4
    <Exec>
        parse_syslog_bsd();
        $raw_event = $Message + "\n";
    </Exec>
</Extension>

<Input filein>
    Module          im_file
    File            "modules/extension/multiline/xm_multiline3.in"
    InputType       netstat
</Input>

<Output fileout>
    Module          om_file
    File            'tmp/output'
    <Exec>
        $raw_event = "-------------------------------------------------------" +
                     "-----------------------------\n" + $raw_event;
    </Exec>
</Output>

<Route parse_multiline>
    Path            filein => fileout
</Route>
------------------------------------------------------------------------------------
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth2 1500 0 16936814 0 0 0 30486067 0 8 0 BMRU
lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU
------------------------------------------------------------------------------------
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth2 1500 0 16945117 0 0 0 30493583 0 8 0 BMRU
lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU
------------------------------------------------------------------------------------
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth2 1500 0 16945270 0 0 0 30493735 0 8 0 BMRU
lo 16436 0 277217234 0 0 0 277217234 0 0 0 LRU
tun0 1500 0 316943 0 0 0 368642 0 0 0 MOPRU