25.4. Character Set Conversion

Normalizing logs to UTF-8 is recommended. The xm_charconv module provides two ways to convert character sets: the convert_fields() procedure converts an entire message (all event fields), and the convert() function converts a single string.
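
To illustrate the single-string variant, here is a minimal sketch that converts only the raw event with convert(). It assumes every line of the input is ISO 8859-2 encoded and that convert() takes the source string, the source encoding, and the destination encoding, in that order. The instance names and the file path are illustrative only.

```
<Extension _charconv>
    Module  xm_charconv
</Extension>

<Input filein>
    Module  im_file
    File    "tmp/input"
    # Assumption: all lines in this file are ISO 8859-2 encoded.
    # Only $raw_event is converted; other fields are left untouched.
    Exec    $raw_event = convert($raw_event, "iso8859-2", "utf-8");
</Input>
```

This form fits cases where the source encoding is known in advance and only one field needs conversion. When the encoding varies from line to line, convert_fields() with auto-detection is likely the better choice.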

Example 85. Character Set Auto-Detection of Various Input Encodings

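This configuration demonstrates character set auto-detection. Lines in the input file may use different encodings. Because convert_fields() is called with "auto" as the source encoding, the module tries the character sets listed in AutodetectCharsets and converts all event fields to UTF-8.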

nxlog.conf
<Extension _charconv>
    Module              xm_charconv
    AutodetectCharsets  utf-8, euc-jp, utf-16, utf-32, iso8859-2
</Extension>

<Input filein>
    Module              im_file
    File                "tmp/input"
    Exec                convert_fields("auto", "utf-8");
</Input>

<Output fileout>
    Module              om_file
    File                "tmp/output"
</Output>

<Route r>
    Path                filein => fileout
</Route>