Return to
Portfolio

25.5. Detecting a Dead Agent or Log Source

It is a common requirement to detect conditions when there are no log messages coming from a source. This usually indicates a problem such as a broken network connection, a server down, or a stuck application or system service. Usually this problem should be detected by monitoring tools (such as Nagios or OpenView), but the absence of logs can also be a good reason to investigate.

Note
The im_mark module is related: it can generate messages periodically in order to show that the agent is still functioning.

The solution to this problem is the combined use of statistical counters and Scheduled checks. The input module can update a statistical counter configured to calculate events per hour. In the same input module a Schedule block checks the value of the statistical counter periodically. When the event rate is zero or drops below a certain limit, an appropriate action can be executed such as sending out an alert email or generating an internal warning message. Note that there are other ways to solve this issue and this method may not be optimal for all situations.

Example 86. Alerting on Absence of Log Messages

The following configuration example creates a statistical counter in the context of the im_tcp module to calculate the number of events received per hour. The Schedule block within the context of the same module checks the value of the msgrate statistical counter and generates an internal error message when there were no logs received in the past hour.

nxlog.conf [Download file]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
<Input in>
    Module  im_tcp
    Port    2345
    <Exec>
        create_stat("msgrate", "RATE", 3600);
        add_stat("msgrate", 1);
    </Exec>
    <Schedule>
        Every   3600 sec
        <Exec>
            create_stat("msgrate", "RATE", 10);
            add_stat("msgrate", 0);
            if defined get_stat("msgrate") and get_stat("msgrate") <= 1
                log_error("No messages received from the source!");
        </Exec>
    </Schedule>
</Input>