
116. Patterns

Patterns provide a way to extract important information (e.g. user names, IP addresses, URLs, etc.) from free-form log messages.

Many sources, Syslog for example, generate log messages in an unstructured but human readable format, typically a short sentence or sentence fragment. Consider the following message generated by the SSH server when an authentication failure occurs:

Failed password for john from 127.0.0.1 port 1542 ssh2

To create a report about authentication failures, the username (john in the above example) needs to be extracted. Patterns support simple string matching and also allow the use of regular expressions for this purpose. Patterns also go beyond simple string extraction in several ways:

  • The matching executed against the field(s) can be an exact match or a regular expression.

  • Patterns contain match criteria to be executed against one or more fields, matching the pattern only if all fields match. This technique allows patterns to be used with structured logs as well.

  • Patterns can extract data from strings using captured substrings and store these in separate fields.

  • Patterns can modify the log by setting additional fields. This is useful for message classification.

  • Patterns can contain test cases for validation.

  • Patterns can be collected into Pattern Groups, greatly simplifying their application to specific sources.
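The properties listed above can be illustrated with a minimal Python sketch. It is not how the NXLog agent implements pattern matching: the dictionary layout and the `apply_pattern` helper are hypothetical, and Python's `re` module stands in for the regular expression engine. The sketch only shows the idea that every match criterion must succeed and that captured substrings end up in separate fields.

```python
import re

# Hypothetical in-memory form of one pattern. Every entry in "match" must
# succeed; capture groups of a REGEXP criterion are stored, in order, under
# the field names listed in "capture".
PATTERN = {
    "match": [
        {"field": "SourceName", "type": "EXACT", "value": "sshd"},
        {
            "field": "Message",
            "type": "REGEXP",
            "value": r"^Failed password for (\S+) from \S+ port \d+ ssh2",
            "capture": ["AccountName"],
        },
    ],
}


def apply_pattern(pattern, event):
    """Return a copy of event with captured fields added, or None if no match."""
    result = dict(event)
    for criterion in pattern["match"]:
        value = event.get(criterion["field"])
        if value is None:
            return None  # a missing field can never match
        if criterion["type"] == "EXACT":
            if value != criterion["value"]:
                return None
        else:  # REGEXP
            m = re.search(criterion["value"], value)
            if m is None:
                return None
            for name, captured in zip(criterion.get("capture", []), m.groups()):
                result[name] = captured
    return result


event = {
    "SourceName": "sshd",
    "Message": "Failed password for john from 127.0.0.1 port 1542 ssh2",
}
matched = apply_pattern(PATTERN, event)
print(matched["AccountName"])  # john
```

Because the criteria operate on named fields rather than on one raw string, the same approach works for structured logs that already arrive as a set of fields.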

Patterns are used by the NXLog agent. This makes it possible to distribute pattern matching tasks to the agents and to receive pre-processed, ready-to-store logs instead of parsing all logs at the central log server, which can significantly reduce CPU load on the server.

For more information about the patterns used by the NXLog agent, please refer to the pm_pattern module documentation in the NXLog Reference Manual.

116.1. Pattern Groups

Pattern groups are used to collect together those patterns which are used to match log messages generated by a particular application or log source. Some pattern groups are not applicable to specific log sources. With pattern groups it is easy to exclude (or include) patterns which cannot match at all because the source would never generate such log messages. For example, when there is no SSH service on a system, there is no need to match patterns in the SSHd group against the logs coming from this system.

Pattern groups also serve an optimization purpose. A group can have optional match criteria: one or more fields can be specified using either an EXACT or a REGEXP match. The log message is first checked against these criteria, and the patterns belonging to the group are matched against the log message only if it matches.

To create a pattern group, the following form needs to be filled out.

Creating a pattern group

After form submission, the pattern group can be viewed:

Viewing a pattern group

In the above example the ssh patterns will only be checked against the log if the field SourceName matches the string sshd. The SourceName field must be extracted from the Syslog message with a syslog parser prior to running the logs through the pattern matcher.
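The effect of the group-level match can be sketched in Python as follows. This is an illustration under assumptions, not the agent's implementation: the `SSH_GROUP` structure and both helper functions are hypothetical, and the two regular expressions are example patterns rather than the contents of a shipped pattern group.

```python
import re

# Hypothetical group: the group-level criterion is evaluated once, and the
# member patterns are tried only when it matches.
SSH_GROUP = {
    "name": "ssh",
    "match": {"field": "SourceName", "type": "EXACT", "value": "sshd"},
    "patterns": [
        re.compile(r"^Failed password for (\S+) from (\S+) port \d+ ssh2"),
        re.compile(r"^Accepted password for (\S+) from (\S+) port \d+ ssh2"),
    ],
}


def group_applies(group, event):
    """Check the group's own match criterion against the event."""
    criterion = group["match"]
    value = event.get(criterion["field"], "")
    if criterion["type"] == "EXACT":
        return value == criterion["value"]
    return re.search(criterion["value"], value) is not None


def match_in_group(group, event):
    """Return (pattern index, match object) for the first hit, or None."""
    if not group_applies(group, event):
        return None  # none of the group's patterns is evaluated
    for index, regex in enumerate(group["patterns"]):
        m = regex.search(event.get("Message", ""))
        if m:
            return index, m
    return None


message = "Failed password for john from 127.0.0.1 port 1542 ssh2"
hit = match_in_group(SSH_GROUP, {"SourceName": "sshd", "Message": message})
skipped = match_in_group(SSH_GROUP, {"SourceName": "cron", "Message": message})
print(hit[0], hit[1].group(1))  # 0 john
print(skipped)                  # None
```

An event whose SourceName is not sshd costs one cheap comparison instead of one regular expression evaluation per pattern in the group.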

116.2. Creating a Pattern

Patterns can be created directly by clicking on the CREATE PATTERN menu item. In this case an empty form must be filled out.

Pattern information block

Here, enter the basic pattern information. Make sure the Pattern Group is set.

Next, define at least one field and value to match. For example, a message field:

Pattern’s Match block pre-populated with values

This can be made more generic as needed so that, for example, the pattern can extract the user name and the destination IP address from the message:

Pattern’s Match block

The parts of the message that are not static are replaced with regular expression constructs, (\S+) in the above example. Captured substrings are stored in the selected fields. In the above example, AccountName and DestinationIPv4Address are used to store the values extracted with (\S+).

If necessary, add more than one field to match against. The match type can be either an EXACT or a REGEXP match. If this is toggled to REGEXP, NXLog Manager will offer to escape special characters:

Escaping special characters in a pattern
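Why escaping matters when a literal message is turned into a regular expression can be shown with a short Python sketch. Python's `re.escape` is used here only as an analogous tool; it is an assumption that it behaves similarly to the escaping offered by NXLog Manager, and the exact characters escaped may differ.

```python
import re

literal = "Failed password for john from 127.0.0.1 port 1542 ssh2"

# Used as-is, each dot in the IP address matches any single character.
unescaped = re.compile("^" + literal + "$")

# re.escape() backslash-escapes characters that are special in a regexp,
# so the dots only match literal dots.
escaped = re.compile("^" + re.escape(literal) + "$")

lookalike = "Failed password for john from 127x0y0z1 port 1542 ssh2"

print(bool(unescaped.search(lookalike)))  # True: "." matched x, y and z
print(bool(escaped.search(lookalike)))    # False
print(bool(escaped.search(literal)))      # True
```

After escaping, the variable parts of the message can be replaced with capture groups such as (\S+) by hand.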

If the regular expression does not start with the caret (^), the regular expression engine will try to find an occurrence anywhere in the subject string. This is a costly operation. Typically, the regular expression is intended to match the start of the string, and for this reason the interface shows a hint:

Consider using ^
Note
The regular expressions are compiled and executed by the NXLog engine using the PCRE library. The regular expression must be PCRE compatible in order to work.
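The difference between an anchored and an unanchored expression can be seen in the following Python sketch. Python's `re` module is used as a stand-in for PCRE (the two agree on this simple syntax), and the `prefixed` string is a made-up example. The sketch shows the matching semantics only and makes no claim about timing.

```python
import re

message = "Failed password for john from 127.0.0.1 port 1542 ssh2"
# Hypothetical message that merely contains the same text further in.
prefixed = "note: " + message

unanchored = re.compile(r"Failed password for (\S+)")
anchored = re.compile(r"^Failed password for (\S+)")

# Without ^ the engine looks for an occurrence at any offset in the subject.
print(unanchored.search(prefixed).group(1))  # john

# With ^ only the start of the subject can match.
print(anchored.search(prefixed))             # None
print(anchored.search(message).group(1))     # john
```

Anchoring also makes the intent explicit: a message that only quotes the text somewhere in the middle is not treated as an authentication failure.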

The last block is for optional test cases:

Test case for a pattern

This built-in testing interface is extremely useful for verifying the functionality of pattern definitions, without the costly overhead of loading the pattern into the agent and running it against a set of logs.

After clicking the Calculate Fields button, the captured field values appear. Field values are populated with the content of the log message used when the pattern was created.

Note
If the field values are not appearing or if the values are unexpected, closely review the regular expression(s) in use. The syntax of regular expressions is very compact and oversights are not uncommon.
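The idea behind a test case, a sample message paired with the field values it is expected to produce, can be sketched in Python. The `calculate_fields` and `run_test_cases` helpers are hypothetical names chosen to mirror the Calculate Fields button; they are not part of NXLog Manager, and Python's `re` module stands in for PCRE.

```python
import re

PATTERN = re.compile(r"^Failed password for (\S+) from (\S+) port \d+ ssh2")
# Fields that receive the captured substrings, in capture-group order.
CAPTURED_FIELDS = ["AccountName", "DestinationIPv4Address"]

# Each test case pairs a sample message with the expected field values.
TEST_CASES = [
    {
        "message": "Failed password for john from 127.0.0.1 port 1542 ssh2",
        "expected": {
            "AccountName": "john",
            "DestinationIPv4Address": "127.0.0.1",
        },
    },
]


def calculate_fields(message):
    """Return the captured fields for message, or None if it does not match."""
    m = PATTERN.search(message)
    if m is None:
        return None
    return dict(zip(CAPTURED_FIELDS, m.groups()))


def run_test_cases():
    """Return a list of (message, actual) pairs for every failing test case."""
    failures = []
    for case in TEST_CASES:
        actual = calculate_fields(case["message"])
        if actual != case["expected"]:
            failures.append((case["message"], actual))
    return failures


print(run_test_cases())  # [] when every test case passes
```

Keeping the test case alongside the pattern means a later change to the regular expression can be checked against the original sample immediately.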

116.3. Message Classification with Patterns

Patterns load values from captured substrings into fields, but they can also be used to create additional fields and populate them with values. This feature can be used for message classification and for tagging log messages with special values that can be used later in the processing chain.

Set block of the Create Pattern form

Event taxonomy fields allow events to be handled in a uniform manner, regardless of their source.

NXLog Manager comes with five special fields for this purpose. Their names all begin with Taxonomy. A dictionary of permissible values for these fields is provided.

These fields are optional, but using them is strongly recommended. Custom fields, with their own permissible values, can also be created.

If there is no need to classify the event with a Taxonomy field, click Delete to remove it.
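Classification through set fields can be sketched in Python as follows. The field names TaxonomyAction and TaxonomyStatus and their values are placeholders chosen for this example; they are assumptions, not entries taken from the dictionary provided with NXLog Manager. The `classify` helper is likewise hypothetical.

```python
import re

PATTERN = re.compile(r"^Failed password for (\S+) from \S+ port \d+ ssh2")

# Hypothetical Set block: fields added to every event the pattern matches.
# Names and values are placeholders, not the shipped taxonomy dictionary.
SET_FIELDS = {"TaxonomyAction": "authenticate", "TaxonomyStatus": "failure"}


def classify(event):
    """Return the event with captured and set fields added if it matches."""
    m = PATTERN.search(event.get("Message", ""))
    if m is None:
        return event  # unmatched events pass through unchanged
    tagged = dict(event)
    tagged["AccountName"] = m.group(1)
    tagged.update(SET_FIELDS)
    return tagged


events = [
    {"Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"},
    {"Message": "Server listening on 0.0.0.0 port 22."},
]
classified = [classify(e) for e in events]

# Later in the processing chain, events can be selected by classification
# alone, without knowing which source or pattern produced them.
failures = [e for e in classified if e.get("TaxonomyStatus") == "failure"]
print(len(failures), failures[0]["AccountName"])  # 1 john
```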

116.4. Searching Patterns

The pattern list has a simple search input box in the upper right corner. It searches the entries in the list and shows the rows that contain the specified keyword.

Pattern list

There is also a more powerful search interface which allows searching in any of the patterns' properties (fields, test cases, etc.). Click on the SEARCH PATTERN menu item under the PATTERN menu.

Searching for patterns

116.5. Exporting and Importing Patterns

NXLog Manager can export and import patterns in an XML format. This is the same format used by the NXLog agent. To export a pattern or a pattern group, check its checkbox in the list and click Export. Import a pattern database file by clicking on the IMPORT PATTERN menu item or the Import button under the pattern list.

116.6. Using Patterns

Patterns are used and executed by the NXLog engine. Unlike other log analysis solutions which utilize a single pattern matcher in the central engine, the architecture of NXLog Manager allows patterns to be used on the agents as well.

To use the patterns in an NXLog agent, add a pm_pattern processor module and select the appropriate pattern groups:

Configuring the pm_pattern module

The patterns will be pushed to the NXLog agent after clicking Update config and they will take effect after a restart. See the Agents chapter for more information about agent configuration details.

Note
Some patterns work with a set of fields, which in some cases requires preprocessing (e.g. syslog parsing). Instead of writing a regular expression to match a full Syslog line including the header (priority, timestamp, hostname, etc.), it is much more efficient to have a syslog parser store the header information in separate fields before the pattern matching, and to write the regular expression to match the Message field instead of the raw_event field. Patterns written this way also remain usable when the same message is collected over a different protocol.
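The benefit of matching the Message field rather than the raw line can be sketched in Python. The header parser below is a deliberately simplified regular expression for BSD-style Syslog lines, written for this illustration only; it is not the parser used by NXLog. The sample raw line, host name, and the second, already structured event are made up.

```python
import re

# Simplified BSD-syslog header parser, for illustration only.
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog(raw_event):
    """Split a raw Syslog line into header fields and the Message field."""
    m = SYSLOG_RE.match(raw_event)
    if m is None:
        return {"raw_event": raw_event}
    return {
        "raw_event": raw_event,
        "Hostname": m.group("hostname"),
        "SourceName": m.group("tag"),
        "Message": m.group("message"),
    }


# The pattern only needs to know about the message text, not the header.
MESSAGE_PATTERN = re.compile(r"^Failed password for (\S+) from (\S+) port \d+ ssh2")

raw = ("<38>Oct 11 22:14:15 host1 sshd[4721]: "
       "Failed password for john from 127.0.0.1 port 1542 ssh2")
from_syslog = parse_syslog(raw)

# The same message collected over another transport, already structured.
from_other = {
    "SourceName": "sshd",
    "Message": "Failed password for john from 127.0.0.1 port 1542 ssh2",
}

users = [MESSAGE_PATTERN.search(e["Message"]).group(1)
         for e in (from_syslog, from_other)]
print(users)  # ['john', 'john']
```

A pattern written against the full raw line would match only the first event, because the second one never had a Syslog header.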