Skip navigation

Geek Speak

4 Posts authored by: mellowd

Log aggregation

Posted by mellowd Nov 2, 2014

Way back in the past I used to view logs after an event has happened. This was painfully slow, especially when viewing the logs of many systems at the same time.


Recently I've been a big fan of log aggregators. On the backend it's a standard log server, while all the new intelligence is on the front end.


One of the best uses of this in my experience is seeing what events have occurred and which users have made changed just before. Most errors I've seen are human error. Someone has either fat fingered something or failed to take into account all the variables or effects their change could have. The aggregator can very quickly show you that x amount of routers have OSPF flapping, and that x user just made a change 5 minutes ago.


What kind of intelligent systems are you using on your logs? Do you use external tools, or perhaps home grown tools to run through your logs and pull relevant information and inform you? Or, do you simply use logs as a generic log in the background to only go through when something goes wrong?


Online logging

Posted by mellowd Oct 27, 2014

There are a number of companies doing log analysis in 'the cloud' - What do people think of the security implications of this?


Your logs that are uploaded are generally inside some sort of private container, however there have been a number of high profile security concerns. This includes holes in regular open-source software as well as lax security by companies providing cloud services.


If you're uploading security logs to a remote system, and that system is compromised, you're essentially giving a blueprint for how to get into your network for those who now have your logs.


What's the best strategy for this? I have a few, each with advantages and disadvantages:

  • Never use one of these services - Keep it all in house, though you lose a ton of the analytics they provide unless you've got developers inhouse to do this.
  • Filter what you upload -  This gives a broken picture. Partial logs don't mean much and it will be difficult to figure out what you should be filtering.
  • Put your trust in them -  Famous last words? I err on the side of caution and trust no-one.


Each of these has advantages and disadvantages and I'm eager to see what others feel.

When saving logs I like to have as verbose data as possible to be stored. However when viewing a log I may only be looking at specific parts of that log. Another concern is if I need to give my logs to a third party and I don't want to reveal certain information to that 3rd party. I'll go over a couple of things that I use on a day to day basis. Note that entire books have been written about SED and AWK so my use of them is very limited compared to what could be done.



The way I use SED is very similar to VI's search and replace tool. A good example of this is that my blog ( sits behind a reverse proxy. I have an IPTables rules that logs any blocked traffic. Now I'd like to share my deny logs, but I don't want you to see my actual server IP, I only want you to see my reverse proxy IP. If I just showed you my raw logs, you'd see my actual IP address. By grepping this log through sed, I can change it on the fly. The format to do so is sed s/<source pattern>/<destination pattern>/

I've used sed to change my IP to and now can happily show the logs. Note this is done in real-time so piping tail through sed is possible:

Oct 19 14:43:11 mellowd kernel: [10084876.715244] IPTables Packet Dropped: IN=venet0 OUT= MAC= SRC= DST= LEN=118 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=34299 DPT=1900 LEN=98

Oct 19 14:49:33 mellowd kernel: [10085258.251596] IPTables Packet Dropped: IN=venet0 OUT= MAC= SRC= DST= LEN=434 TOS=0x00 PREC=0x00 TTL=52 ID=0 DF PROTO=UDP SPT=5063 DPT=5060 LEN=414

Oct 19 14:52:53 mellowd kernel: [10085458.580901] IPTables Packet Dropped: IN=venet0 OUT= MAC= SRC= DST= LEN=40 TOS=0x00 PREC=0x00 TTL=104 ID=62597 PROTO=TCP SPT=6000 DPT=3128 WINDOW=16384 RES=0x00 SYN URGP=0



AWK is very handy to get only the information you require to show. In the above example there is a lot of information that I might now want to know about. Maybe I'm only interested in the date, time, and source IP address. I don't care about the rest. I can pipe the same tail command I used above through awk and get it to show me only the fields I care about. By default, awk uses the space as the field separation character and then each field is numbered sequentially. The format for this is awk '{ print <fields you want to see> }'


I'll now simply cat the syslog file, and use awk to show me what I want to see:

sudo cat /var/log/syslog |  grep IPTables | awk '{ print $1" "$2" "$3"\t"$13 }'

Oct 19 14:27:47    SRC=

Oct 19 14:29:04    SRC=

Oct 19 14:37:06    SRC=

Oct 19 14:40:32    SRC=

Oct 19 14:40:32    SRC=

Oct 19 14:41:36    SRC=

Oct 19 14:43:11    SRC=

Oct 19 14:49:33    SRC=

Oct 19 14:52:53    SRC=

Oct 19 14:54:50    SRC=

Oct 19 14:58:04    SRC=

Oct 19 15:01:49    SRC=

Oct 19 15:05:38    SRC=

Oct 19 15:06:48    SRC=

Oct 19 15:07:35    SRC=

Oct 19 15:13:42    SRC=

I've included spaces and a tab character between the fields to ensure I get the output looking as I want it. If you count the original log you'll see that field 1 = Oct, field 2 = 19, field 3 = the time, and field 13 = SRC=IP

I may not want to see SRC= in the output, so use sed to replace it with nothing:

sudo cat /var/log/syslog |  grep IPTables | awk '{ print $1" "$2" "$3"\t"$13 }' | sed s/SRC=//

Oct 19 14:27:47

Oct 19 14:29:04

Oct 19 14:37:06

Oct 19 14:40:32

Oct 19 14:40:32

Oct 19 14:41:36

Oct 19 14:43:11

Oct 19 14:49:33

Oct 19 14:52:53

Oct 19 14:54:50

Oct 19 14:58:04

Oct 19 15:01:49

Oct 19 15:05:38

Oct 19 15:06:48

Oct 19 15:07:35

Oct 19 15:13:42


I'm eager to see any other handy sed/awk/grep commands you use in a similar scope to the ones I use above.


Log time lengths

Posted by mellowd Oct 13, 2014

How long do you keep your logs for? The answer can vary wildly depending on the industry you work for. As an example, most VPN providers specifically note that they do not hold logs, so even if a government requested certain logs, they would not have them. The logs they don’t keep are likely to be only user access logs. They’ll still have internal system logs.


Ultimately keeping logs for long is of little benefit unless there is a security reason to do so. The recent shellshock bug is a great example of when older logs can be useful. You may have a honeypot out in the wild and once a known issue comes to the fore, scan your logs to see if this particular bug has been exploited before it was well known.


Country and industry regulations will also influence the amount of times logs are kept. Many countries require that documentation and logging data be kept for a certain amount of years for any number of reasons.


I’m interested to know how long you keep logs for. What particular logs as well as why that length of time was chosen.

Filter Blog

By date: By tag: