David - Musings of an SRE

Parsing S3 Logs with GoAccess and s3cmd Sync

In this post, I’d like to demonstrate a way of analyzing the logs that AWS collects for you using the awesome tool GoAccess, along with a quick walkthrough of setting up s3cmd sync to pull your logs down quickly from your S3 buckets.

Why GoAccess?

So far, GoAccess has given me a great suite of analytics views, ranging from terminal output to HTML reports.

Let’s get started!

Prerequisites:

  • You’re already storing AWS Elastic Load Balancer logs
  • You’re already storing AWS S3 Logs

If you’re not sure how to start, check the AWS documentation on enabling access logs for ELB and on enabling server access logging for S3.

Installing GoAccess

The best way to install GoAccess, in my opinion, is still from source.

$ git clone https://github.com/allinurl/goaccess.git
$ cd goaccess
$ autoreconf -fi
$ ./configure --enable-geoip --enable-utf8
$ make
$ sudo make install
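Before building, it can save some time to check that the build tools are actually on your PATH. The helper below is only a sketch: the function name missing_tools is made up for this post, and the tool list in the usage line is my assumption based on the commands above (git, autoreconf, make and a C compiler). GoAccess also needs the ncurses development headers, and --enable-geoip needs the GeoIP library, which this check does not cover.

```shell
#!/bin/sh
# Print the name of every listed tool that is NOT found on PATH.
# Prints nothing when all tools are available.
missing_tools() (
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
)

# Hypothetical usage (the tool list is an assumption, adjust as needed):
#   missing_tools git autoreconf make cc
```

If the usage line prints anything, install those tools with your package manager before running autoreconf.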

Install the various configuration scripts

GoAccess has several predefined log formats that it currently supports:

  • Common Log Format (CLF)
  • Combined Log Format (XLF/ELF)
  • W3C Format (IIS)
  • Amazon CloudFront (Download Distribution).

Surprisingly, at the time of writing, Amazon’s ELB log format and S3 access log format are not included by default.

But no worries, you can easily do it on your own.

Open ~/.goaccessrc, creating it if it doesn’t exist yet.

Add the following:

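# ~/.goaccessrc
# The two blocks below use different date formats, so only one can be in
# effect at a time: comment out the block for the logs you are not parsing.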

# AWS S3 Log Format
date-format %d/%b/%Y
log-format %^ %^ [%d:%^] %h %^ %^ %^ %^ "%^ %r %^" %s %^ %b %^ %^ %^ "%^" "%u" %^

# AWS Elastic Load Balancer Log Format
log-format %dT%t.%^ %^ %h:%^ %^ %T %^ %^ %^ %s %^ %b "%r" "%u"
date-format %Y-%m-%d
time-format %H:%M:%S

Note: Make sure each log-format is written on a single line.
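If you parse both kinds of logs regularly, one option is to keep each format in its own file and pick one per run with GoAccess’s --config-file option, instead of editing ~/.goaccessrc every time. The sketch below writes the two formats from this post, unchanged, into separate files. The function name and the file names goaccess-s3.conf and goaccess-elb.conf are hypothetical.

```shell
#!/bin/sh
# Write one GoAccess config file per log type into the directory given as $1.
# File names are hypothetical; the formats are copied from the post as-is.
write_goaccess_configs() (
    dir=$1
    mkdir -p "$dir" || exit 1

    # The quoted delimiter ('EOF') keeps %, ^ and " exactly as typed.
    cat > "$dir/goaccess-s3.conf" <<'EOF'
# AWS S3 Log Format
date-format %d/%b/%Y
log-format %^ %^ [%d:%^] %h %^ %^ %^ %^ "%^ %r %^" %s %^ %b %^ %^ %^ "%^" "%u" %^
EOF

    cat > "$dir/goaccess-elb.conf" <<'EOF'
# AWS Elastic Load Balancer Log Format
date-format %Y-%m-%d
time-format %H:%M:%S
log-format %dT%t.%^ %^ %h:%^ %^ %T %^ %^ %^ %s %^ %b "%r" "%u"
EOF
)
```

After running write_goaccess_configs "$HOME/.goaccess", you could then point GoAccess at the file you need, for example goaccess --config-file="$HOME/.goaccess/goaccess-elb.conf" -a.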

Pull your logs using S3cmd

Prerequisite:

  • Download and Install S3cmd.

A quick way to set up s3cmd is to install it with your favorite package manager.

# Ubuntu
$ sudo apt-get install s3cmd

# RHEL/CentOS/Fedora
$ sudo yum install s3cmd

# Homebrew (OSX)
$ brew install s3cmd

Once that is done, have your AWS Access Key and Secret Key ready. You’ll need to enter them when you configure s3cmd:

$ s3cmd --configure

Just go through the various prompts. The default answers are usually fine, and it should be quite straightforward.

Now all you need to do is sync your logs down with:

$ s3cmd sync s3://bucket-name/path/ /path/to/your/local/dir

And you’re done!
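If you sync often, the command can be wrapped in a small helper that creates the local directory first and normalizes the trailing slashes. This is just a sketch: the name sync_logs is made up, and any extra arguments are passed straight through to s3cmd sync, so you can add s3cmd’s --dry-run flag to preview what would be downloaded.

```shell
#!/bin/sh
# Sync an S3 prefix into a local directory, creating the directory first.
#   $1 = S3 source, e.g. s3://bucket-name/path/
#   $2 = local destination directory
#   remaining arguments are passed to "s3cmd sync" unchanged
sync_logs() (
    if [ "$#" -lt 2 ]; then
        echo "usage: sync_logs S3_URI LOCAL_DIR [s3cmd options...]" >&2
        exit 2
    fi
    src=${1%/}/     # exactly one trailing slash on the source
    dest=${2%/}/    # and on the destination
    shift 2
    mkdir -p "$dest" || exit 1
    s3cmd sync "$@" "$src" "$dest"
)
```

For example, sync_logs s3://bucket-name/path/ /path/to/your/local/dir --dry-run shows what a real run would fetch, and dropping --dry-run performs the sync.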

Analyze Away!

Now, go to your /path/to/your/local/dir/ directory and run find . -type f -exec cat {} + | goaccess -a and you should see GoAccess parsing your logs. The -type f keeps find from handing directories to cat.
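The find-and-pipe step can also be wrapped up so that it works from any directory. The sketch below uses two made-up function names, cat_logs and analyze_logs. It restricts find to regular files so that directories are never passed to cat, and forwards any extra arguments to goaccess.

```shell
#!/bin/sh
# Print the contents of every regular file under the directory $1.
# -type f skips directories, which cat cannot read.
cat_logs() {
    find "$1" -type f -exec cat {} +
}

# Pipe all logs under $1 into GoAccess.
# Remaining arguments are passed through to goaccess unchanged.
analyze_logs() (
    dir=$1
    shift
    cat_logs "$dir" | goaccess -a "$@"
)
```

Running analyze_logs /path/to/your/local/dir opens the terminal view. Assuming a GoAccess 1.x release, where the -o option is available, analyze_logs /path/to/your/local/dir -o report.html should write an HTML report instead.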

I hope this is useful for those of you looking to make sense of your AWS logs.
