David - Musings of an SRE

Logging Tornado Logs To Graylog (with JSON)

If you’re building your microservices on Tornado and your chosen deployment strategy is to treat it like cattle instead of pets, logging is going to have to play a very important part of your architecture.

To ensure that all your information is logged and stored in a centralized location, I built a full HA cluster of graylog nodes and elasticsearch nodes which I will use to keep track of my application.

This little guide is for users who already have a graylog server installed and setup and is really meant as a development log to note down the steps needed to get your Tornado application ready for production logging.

Package

Install the graypy package.

$ pip install graypy

Setup the GELF handlers

In your main application .py file, hook tornado’s inbuilt logging module with the logging handler provided by graypy

# app.py

def main():
    app = TornadoApplication()
    ...
    # Add:
    handler = grapy.GELFHandler('host_to_graylog_node', port, localname="name_of_your_app_identifier")

    logging.getLogger("tornado.access").addHandler(handler)
    logging.getLogger("tornado.application").addHandler(handler)
    logging.getLogger("tornado.general").addHandler(handler)
    ...
    tornado.ioloop.IOLoop.current().start()

For more information about what tornado.access, tornado.application, tornado.general` logs, check out the main tornado.log documentation

At this point, if you were to restart your application and your GELFHandler is configured correctly, you should see something like the following logged onto your Graylog.

404 GET /users/profiles/37 (127.0.0.1) 1.28ms

This is great, now you’re all set!

Bonus: Lets add additional fields

By default, Graylog logs everything to your full_message field in Elasticsearch. In order to better take advantage of Elasticsearch’s indexing, why not generate better formatted logs that would store your application information into a spread of indexable fields.

For eg. wouldn’t it be great to be able to search directly for status: 400 for all requests with the status_code of 400?

The good thing about python’s logging module is that it allows you to format your logs in whatever way you want.

JSON-ify your logs

Lets first make your logs into JSON.

To do so, we will make use of the log_function method that is provided by Tornado.

def log_function(handler):
    info = {
        'status': handler.get_status(),
        'method': handler.request.method,
        'url': handler.request.uri,
        'remote_ip': handler.request.remote_ip,
        'duration': (handler.request.request_time()*1000)
    }
    log_extra = logging.LoggerAdapter(logging.getLogger("tornado.access"),
            info)
    log_extra.info(info)

def main():
    app = TorandoApplication()
    app.log_request = log_function
    ...
    tornado.ioloop.IOLoop.current().start()

This will create an output that looks like:

{"duration": "15.03ms", "method": "GET", "status": 500, "remote_ip": "127.0.0.1", "url": "/tours"}

And logs it as a full_message in graylog. Not only that, those JSON fields like duration, method etc. will also appear as fields in your logging entry.

Now you’re all set!

References