Parsing S3 Logs with GoAccess and s3cmd Sync
In this post, I’d like to demonstrate a way of analyzing logs gathered through Amazon AWS’s log collection using the awesome tool GoAccess, along with a quick walkthrough of setting up s3cmd sync to pull your logs down quickly from your S3 buckets.
Why GoAccess?
So far, GoAccess has given me a great suite of analytics tools, from terminal output to HTML reports.
Let’s get started!
Prerequisites:
- You’re already storing AWS Elastic Load Balancer logs
- You’re already storing AWS S3 Logs
If you’re not sure how to start, please check out the link here for setting up ELB logs and here for setting up S3 logs.
Installing GoAccess
The best way to install GoAccess, in my opinion, is still from source.
$ git clone https://github.com/allinurl/goaccess.git
$ cd goaccess
$ autoreconf -fi
$ ./configure --enable-geoip --enable-utf8
$ make
# make install
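Before running the build, it can help to confirm that the tools these commands rely on are on your PATH. The snippet below is only a sketch: missing_tools is a made-up helper name, and the tool list simply mirrors the commands above (git, autoreconf, make). It does not check for libraries such as ncurses or GeoIP, which ./configure should report on its own.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of GoAccess.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage: prints nothing when everything needed for the build is installed.
# missing_tools git autoreconf make
```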
Install the various configuration scripts
GoAccess has a couple of predefined log formats that it currently supports:
- Common Log Format (CLF)
- Combined Log Format (XLF/ELF)
- W3C Format (IIS)
- Amazon CloudFront (Download Distribution).
Surprisingly, they haven’t added Amazon’s ELB log format and S3 access format by default.
But no worries, you can easily do it on your own.
Open up ~/.goaccessrc, or create it if it doesn’t exist yet.
Add the following:
# ~/.goaccessrc
# AWS S3 Log Format
date-format %d/%b/%Y
log-format %^ %^ [%d:%^] %h %^ %^ %^ %^ "%^ %r %^" %s %^ %b %^ %^ %^ "%^" "%u" %^
# AWS Elastic Load Balancer Log Format
log-format %dT%t.%^ %^ %h:%^ %^ %T %^ %^ %^ %s %^ %b "%r" "%u"
date-format %Y-%m-%d
time-format %H:%M:%S
Note: Make sure each log-format entry sits on a single line, with no line breaks inside it.
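If you’d rather keep the S3 and ELB formats apart instead of having both in one file, one option is to pass them on the command line. This is a rough sketch, assuming your GoAccess build accepts the --log-format, --date-format and --time-format options (check goaccess --help for your version). The wrapper names goaccess_s3 and goaccess_elb and the variable names are made up for this example; the format strings are copied from the config above.

```shell
# Formats copied from the ~/.goaccessrc entries above; each stays on one line.
S3_LOG_FORMAT='%^ %^ [%d:%^] %h %^ %^ %^ %^ "%^ %r %^" %s %^ %b %^ %^ %^ "%^" "%u" %^'
S3_DATE_FORMAT='%d/%b/%Y'
ELB_LOG_FORMAT='%dT%t.%^ %^ %h:%^ %^ %T %^ %^ %^ %s %^ %b "%r" "%u"'
ELB_DATE_FORMAT='%Y-%m-%d'
TIME_FORMAT='%H:%M:%S'

# Hypothetical wrappers: extra arguments (e.g. -f FILE -a) pass straight through.
goaccess_s3() {
    goaccess --log-format="$S3_LOG_FORMAT" --date-format="$S3_DATE_FORMAT" \
             --time-format="$TIME_FORMAT" "$@"
}

goaccess_elb() {
    goaccess --log-format="$ELB_LOG_FORMAT" --date-format="$ELB_DATE_FORMAT" \
             --time-format="$TIME_FORMAT" "$@"
}

# Usage:
# goaccess_s3 -f s3-access.log -a
```

Keeping the formats in shell variables also makes it harder to break one across two lines by accident.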
Pull your logs using S3cmd
Prerequisite:
- Download and Install S3cmd.
A quick way to set up s3cmd is to install it with your favorite package manager.
# Ubuntu
$ sudo apt-get install s3cmd
# RHEL/CentOS/Fedora
$ sudo yum install s3cmd
# Homebrew (OSX)
$ brew install s3cmd
Once that is done, prepare your AWS Access Key and Access Secret. You’ll need to put those details in when you configure it:
$ s3cmd --configure
Just go through the various prompts. The default answers are usually fine, and it should be quite straightforward.
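By default, s3cmd --configure saves your answers to ~/.s3cfg. If you want a quick sanity check that the keys actually made it into the file, something like the following would do. It is a minimal sketch: s3cfg_ready is a made-up helper name, and it only checks that the access_key and secret_key entries are non-empty, not that they are valid.

```shell
# Return success if the given s3cmd config (default: ~/.s3cfg) has non-empty
# access_key and secret_key entries. s3cfg_ready is a hypothetical helper.
s3cfg_ready() {
    cfg="${1:-$HOME/.s3cfg}"
    [ -f "$cfg" ] || return 1
    grep -Eq '^access_key[[:space:]]*=[[:space:]]*[^[:space:]]' "$cfg" &&
    grep -Eq '^secret_key[[:space:]]*=[[:space:]]*[^[:space:]]' "$cfg"
}

# Usage:
# s3cfg_ready && echo "s3cmd is configured"
```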
With s3cmd configured, all you need now is to sync your logs down with:
$ s3cmd sync s3://bucket-name/path/ /path/to/your/local/dir
And you’re done!
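If you plan to pull logs regularly, the sync command can be wrapped in a small function. This is just a sketch under a few assumptions: sync_logs and the S3CMD override variable are names invented for this example, and any extra arguments (such as s3cmd’s --dry-run option) are passed straight through to s3cmd sync.

```shell
# Sync a bucket path into a local directory, creating the directory first.
# sync_logs and the S3CMD override are hypothetical names for this sketch.
sync_logs() {
    src="$1"    # e.g. s3://bucket-name/path/
    dest="$2"   # e.g. /path/to/your/local/dir
    shift 2
    mkdir -p "$dest" || return 1
    # Trailing slash so s3cmd treats the destination as a directory.
    "${S3CMD:-s3cmd}" sync "$@" "$src" "${dest%/}/"
}

# Usage:
# sync_logs s3://bucket-name/path/ /path/to/your/local/dir
# sync_logs s3://bucket-name/path/ /path/to/your/local/dir --dry-run
```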
Analyze Away!
Now, go to your /path/to/your/local/dir/ directory and run:
$ find . -exec cat {} \; | goaccess -a
You should see GoAccess parsing your logs.
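A slightly tidier variant of that pipeline restricts find to regular files, so cat is never handed a directory and doesn’t print "Is a directory" errors into your terminal. The helper name cat_logs is made up for this sketch.

```shell
# Concatenate every regular file under a directory to stdout.
# -type f skips directories, which a plain "find . -exec cat" also hands to cat.
# cat_logs is a hypothetical helper name.
cat_logs() {
    find "${1:-.}" -type f -exec cat {} +
}

# Usage:
# cat_logs /path/to/your/local/dir | goaccess -a
```

Using + instead of \; lets find pass many files to a single cat call, which tends to be faster when a bucket holds thousands of small log files.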
I hope this is useful for those of you looking to make sense of your AWS logs.