104. Debugging NXLog
When other troubleshooting fails to identify (or resolve) an issue, inspecting the NXLog agent itself can prove useful. Some techniques are outlined below.
104.1. Dump Debug Info to NXLog’s Internal Log
A simple way to quickly get a more complete picture of NXLog’s current status is to dump debug info into the internal log. This information can be helpful in debugging, for example, why an input module is not sending to an output module. Normally, internal events are written to the log file configured with the LogFile directive.
-
On Linux, send SIGUSR1 to the application.
# kill -SIGUSR1 $PID
-
On Windows, send the service control command "200" to the application.
> sc control nxlog 200
The dumped entry in the internal log looks similar to the following.
2017-03-29 10:05:19 INFO event queue has 2 events;jobgroup with priority 10;job of module in/im_file, events: 0;job of module out/om_null, events: 0;non-module job, events: 0;jobgroup with priority 99;non-module job, events: 0;[route 1]; - in: type INPUT, status: RUNNING queuesize: 0; - out: type OUTPUT, status: RUNNING queuesize: 0;
The status is the most important piece of information in the dumped log entries. A status of PAUSED means the input module is not able to send because the output module queue is full. In such a case the queuesize for the corresponding output(s) would be over 99. A status of STOPPED means the module is fully stopped, usually due to an error (e.g. TCP disconnection for om_tcp).
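When the dumped entry is long, the status fields can be extracted mechanically. The following is a minimal sketch, assuming the entry format matches the sample above (semicolon-separated fields, with module lines of the form "- name: type T, status: S queuesize: N"). The function name non_running_modules is hypothetical and not part of NXLog.

```shell
#!/bin/sh
# Hypothetical helper (not part of NXLog): read dumped status entries on stdin
# and print "<module> <status> <queuesize>" for every module not RUNNING.
# Assumes the semicolon-separated format shown in the sample entry.
non_running_modules() {
    # Split the single-line entry into one field per line.
    tr ';' '\n' |
    # Keep only the per-module lines and reduce them to three columns.
    sed -n 's/^ *- \([^:]*\): type [A-Z]*, status: \([A-Z]*\) queuesize: \([0-9]*\).*$/\1 \2 \3/p' |
    # Drop modules that are running normally.
    awk '$2 != "RUNNING"'
}
```

For example, with the internal log path stored in a placeholder variable LOGFILE, the most recent dump could be checked with: grep 'event queue has' "$LOGFILE" | tail -n 1 | non_running_modules. Empty output means every listed module reported RUNNING.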
104.2. Switch to DEBUG Log Level
NXLog’s log level can be switched to DEBUG at runtime, without requiring a restart. For extended debugging sessions, consider setting the LogLevel directive to DEBUG in the configuration instead.
-
On Linux, send SIGUSR2.
# kill -SIGUSR2 $PID
-
On Windows, send service control command 201.
> sc control nxlog 201
104.3. Generate Core Dumps
Core dumps can act as a helpful resource for the NXLog development and support teams for debugging issues.
104.3.1. Core Dumps on Linux
Note
It is necessary to install the NXLog debug symbols package in order to produce useful core dump files.
-
Remove the User and Group directives from the configuration. NXLog needs to be running as root:root to produce a core dump.
-
Use ulimit to remove the core file size limit.
# ulimit -c unlimited
-
Run NXLog manually to test that it can create a core dump.
# /opt/nxlog/bin/nxlog -f
-
Find the NXLog process and kill it with the SIGABRT signal.
# kill -ABRT `ps aux | grep '[/]opt/nxlog/bin/nxlog' | awk '{print $2}'`
-
Verify that a core dump file was created at /opt/nxlog/var/spool/nxlog/core.
# ls -l /opt/nxlog/var/spool/nxlog/
total 26708
-rw------- 1 root root 27348992 Oct 30 08:51 core
-
If the core dump file was created successfully, run NXLog again as root in order to catch the next crash.
# /opt/nxlog/bin/nxlog -f
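Before triggering the test crash, it can help to confirm that the shell and kernel settings allow a core file to be written. The sketch below is an illustration, not part of NXLog: the helper names are hypothetical. The check of /proc/sys/kernel/core_pattern assumes a Linux kernel, where a leading "|" means cores are piped to a handler program (for example systemd-coredump) rather than written as a file in the working directory.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NXLog) for checking core dump prerequisites.

# Succeeds when the core file size limit, as printed by `ulimit -c`, is unlimited.
core_limit_ok() {
    [ "$1" = "unlimited" ]
}

# Succeeds when a core_pattern value pipes cores to a helper program.
core_pattern_is_pipe() {
    case "$1" in
        '|'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Print a warning for each setting that may prevent a core file from appearing.
check_core_prereqs() {
    core_limit_ok "$(ulimit -c)" ||
        echo "core size limit is $(ulimit -c); run: ulimit -c unlimited"
    if [ -r /proc/sys/kernel/core_pattern ]; then
        pattern=$(cat /proc/sys/kernel/core_pattern)
        if core_pattern_is_pipe "$pattern"; then
            echo "cores are piped to a handler: $pattern"
        fi
    fi
    return 0
}
```

Running check_core_prereqs in the same root shell that will start NXLog prints nothing when both settings look suitable.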
104.3.2. Core Dumps on Windows
Core dumps can be generated on Windows by using ProcDump from Microsoft Sysinternals.
Note
ProcDump runs on Windows Vista and higher, and Windows Server 2008 and higher.
For example, run the following to write a full dump of the nxlog process when its handle count exceeds 10,000:
> procdump -ma nxlog -p "\Process(nxlog)\Handle Count" 10000
104.4. Inspect Memory Leaks
If NXLog’s memory usage exceeds 200 MB, there is likely a memory leak.
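On Linux, the 200 MB threshold can be checked from a shell. This is a minimal sketch under two assumptions that the text above does not state: that resident set size (RSS) is a reasonable measure of memory usage, and that ps reports RSS in KiB (as procps does). The function name rss_exceeds_mb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of NXLog).
# Succeeds when an RSS value in KiB (as printed by `ps -o rss=`) is greater
# than a limit given in MB (1 MB is treated as 1024 KiB here).
rss_exceeds_mb() {
    rss_kib=$1
    limit_mb=$2
    [ "$rss_kib" -gt $((limit_mb * 1024)) ]
}
```

For example, with the NXLog process ID in PID: rss_exceeds_mb $(ps -o rss= -p "$PID") 200 && echo "memory usage above 200 MB". The command substitution is left unquoted on purpose so that the padding printed by ps is stripped.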
104.4.1. Inspecting Memory Leaks on Linux
We recommend using Valgrind on GNU/Linux to debug memory leaks.
-
Install the debug symbols (-dbg) package (for example, nxlog-dbg_3.0.1759_amd64.deb).
Note
The NXLog debug symbols package is currently only available for Linux. This package is not included with NXLog by default, but can be provided on request.
-
Install Valgrind.
-
Set the NoFreeOnExit directive to TRUE in the NXLog configuration file. This directive ensures that modules are not unloaded when NXLog is stopped, which allows Valgrind to properly resolve backtraces into modules.
-
Start NXLog under Valgrind with the following command. If User is set to nxlog in the configuration, then the command must be executed with su, otherwise Valgrind will not be able to create the massif.out file at the end of the sampling process.
# cd /tmp
# su -lc "valgrind --tool=massif --pages-as-heap=yes /opt/nxlog/bin/nxlog -f" nxlog
-
Let NXLog run for a while until the Valgrind process shows the memory increase, then interrupt it with Ctrl+C. The output is written to /tmp/massif.out.xxxx.
-
Send the massif.out.xxxx file with a bug report.
-
Optionally, create a report from the massif.out.xxxx file with the ms_print command:
# ms_print massif.out.xxxx
The output of the ms_print report contains an ASCII chart at the top showing the increase in memory usage. The chart shows the sample number with the highest memory usage, marked with (peak). This is normally at the end of the chart (the last sample). The backtrace from this sample indicates where the most memory is allocated.
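For a quick check without reading the full chart, the highest sample can also be located in the raw file. The sketch below assumes the massif.out.xxxx file contains "snapshot=N" and "mem_heap_B=BYTES" lines for each sample, which is how Massif output files are generally laid out. ms_print remains the authoritative way to read the file, and the function name massif_peak is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Valgrind or NXLog): read a Massif output
# file on stdin and print "<snapshot number> <mem_heap_B>" for the snapshot
# with the largest mem_heap_B value (the first one, if there is a tie).
massif_peak() {
    awk -F= '
        $1 == "snapshot" { snap = $2 }
        $1 == "mem_heap_B" && (!found || $2 + 0 > max + 0) {
            max = $2; peak = snap; found = 1
        }
        END { if (found) print peak, max }
    '
}
```

For example: massif_peak < massif.out.xxxx. If the snapshot number printed is the last one in the file, memory was still growing when sampling stopped, which matches the pattern described above.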
104.4.2. Inspecting Memory Leaks on Windows
Windows Process Explorer from Microsoft Sysinternals can be used to inspect memory use of all running programs.
Once a potential source of excessive memory use has been determined, use DebugView from Microsoft Sysinternals to inspect the application’s debug output.