Using Graylog Extractors to Split Logs
If you’re new to Graylog, Graylog Extractors are a great way to pull out information from your logs for easier storing and manipulation.
If, like me, you sometimes hit a “String fields longer than 32kb” indexing error on one of your fields, one way to mitigate it is to use an extractor to split that field into two.
I wouldn’t recommend this every time, since the obvious fix is to set that particular field to be non_indexable. But in cases where you need the full data and still want to be able to search it, this might be the better solution.
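To get a feel for when a field would trip that error, here is a small Python sketch that measures a string the way the index does, in UTF-8 bytes rather than characters. The 32766-byte figure is my assumption for what the "32kb" in the error message refers to, and the function name is made up for illustration.

```python
# Assumed limit: the "32kb" in the error message is taken to mean 32766 bytes.
MAX_TERM_BYTES = 32766


def exceeds_term_limit(value, limit=MAX_TERM_BYTES):
    """Return True if the UTF-8 encoding of value is longer than limit bytes."""
    return len(value.encode("utf-8")) > limit


# Plain ASCII: one byte per character.
ascii_at_limit = exceeds_term_limit("a" * 32766)
ascii_over_limit = exceeds_term_limit("a" * 32767)

# "\u00e9" (e with acute accent) takes two bytes in UTF-8,
# so far fewer characters are needed to go over the limit.
two_byte_over_limit = exceeds_term_limit("\u00e9" * 16384)
```

The point is that character count and byte count are not the same thing, which matters when choosing where to split.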
Extractors are small instructions that you attach to your Graylog inputs. They process every incoming log and perform an extraction task on it, based on a regex that you provide.
There are two types of task that can be performed: cut and copy.
Cut matches your regex, cuts the matching string out of the original field and pastes it into a new field that you define.
Copy, well, does exactly the same thing except that it leaves the original field untouched.
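If the difference between cut and copy is easier to see in code, here is a rough Python model of it. This is only a sketch of the behaviour described above, not Graylog's actual implementation; the function name, the sample message and the "body" field are all made up for illustration.

```python
import re


def apply_extractor(message, source_field, target_field, pattern, strategy):
    """Simplified model of a regex extractor with one capture group.

    Returns a new dict; the input message is left unmodified.
    """
    result = dict(message)
    value = result.get(source_field, "")
    match = re.search(pattern, value)
    if match is None:
        return result  # regex did not match, nothing to extract

    # Both strategies write the captured text to the target field.
    result[target_field] = match.group(1)

    if strategy == "cut":
        # Cut additionally removes the captured text from the source field.
        start, end = match.span(1)
        result[source_field] = value[:start] + value[end:]
    return result


msg = {"response": "status=200 body=hello"}
copied = apply_extractor(msg, "response", "body", r"body=(\w+)", "copy")
cut = apply_extractor(msg, "response", "body", r"body=(\w+)", "cut")
```

In this model both results get a "body" field containing "hello", but only the cut result has that text removed from "response".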
So let’s get started.
Remember the goal is to: Split a long string into two.
- First, in your Graylog Web Interface, hit the System/Inputs menu at the top of your navigation bar. Click Input and you should see a whole list of inputs.
- Choose the input that you’d like to place your extractor in and click on Manage extractors.
- Next, hit Import under the Actions bar.
- Paste the following JSON:
{
  "extractors": [
    {
      "condition_type": "regex",
      "condition_value": "^.{16383,}$",
      "converters": [],
      "cursor_strategy": "cut",
      "extractor_config": {
        "regex_value": "^.{0,16383}(.*)"
      },
      "extractor_type": "regex",
      "order": 0,
      "source_field": "response",
      "target_field": "responseTail",
      "title": "cut response"
    }
  ],
  "version": "1.2.2 (91c7822)"
}
Remember to replace the value of source_field ("response" here) with the actual field that you’d like to split, and the value of target_field ("responseTail" here) with the field that you want the cut-off tail moved into.
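To check what the two regexes in the JSON do before importing it, here is a small Python sketch that applies the same patterns to a test string. Python's re module is not the regex engine Graylog runs, so treat this as an approximation; the split_field helper is made up for illustration.

```python
import re

# Same patterns as condition_value and regex_value in the JSON above.
CONDITION = re.compile(r"^.{16383,}$")
SPLIT = re.compile(r"^.{0,16383}(.*)")


def split_field(value):
    """Return (head, tail) roughly as the cut extractor would leave them.

    tail is None when the condition does not match, meaning the
    extractor would not run and the field stays as it is.
    """
    if not CONDITION.search(value):
        return value, None
    match = SPLIT.search(value)
    start, _ = match.span(1)
    # Everything before the capture group stays in the source field,
    # the captured remainder goes to the target field.
    return value[:start], match.group(1)


short_head, short_tail = split_field("a" * 100)
long_head, long_tail = split_field("a" * 16383 + "b" * 500)
```

Here the short value is left alone, while the long one ends up as a 16383-character head and a 500-character tail. One thing to watch for: "." does not match line breaks by default, so a value containing newlines would likely not satisfy the condition as written.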