
Data retention policies: lessons learned & what stays on the boat.

Level 10

What seems like a lifetime ago, I worked for a few enterprises doing various things like firewall configuration, email system optimization and hardening of NetWare, NT4, AIX and HP-UX servers. There were three good-sized employers: a bank and two huge insurance companies that both had financial components. While working at each and every one of them, I was subject to their security policy (one of which I helped to craft, but that is a different path altogether), and none of those policies really addressed data retention. When I left those employers, they archived my home directories, remaining email boxes and whatever other artifacts I left behind. None of this was really an issue for me, as I never brought any personal or sensitive data in and everything I generated on site was theirs by the nature of what it was. What did not occur to me then, though, was that this was essentially a digital trail of breadcrumbs that could exist indefinitely. What else was left behind, and was it also archived? Mind you, this was in the 1990s and network monitoring was fairly clunky, especially at scale, so the likely answer to this question is "nothing", but I assert that the answer has changed significantly in this day and age.

Liability is a hard pill for businesses to swallow. Covering bases is key, and that is where data retention is a double-edged sword. Thinking like I am playing a lawyer on TV, keeping data on hand is useful for forensic analysis of potentially catastrophic data breaches, but it can also be a liability in that it can prove culpability in employee misbehavior on corporate time, with corporate resources and on the corporation's behalf. Is it worth it?

Since that time oh so long ago, I have found that the benefit has far outweighed the risk in retaining the information, especially traffic data such as proxy, firewall, and network flow logs. The real issues I have, as noted in previous posts, are the correlation of said data and, more often than not, the archival method for what can amount to massive amounts of disk space.

If I can offer one nugget of advice, learned through years of having to decide what goes, what stays and for how long, it is this: Buy the disks. Procure the tape systems. Do what you need to do to keep as much of the data as you can get away with, because once it is gone it is highly unlikely that you can ever get it back.

Level 17

Very true about not being able to get your data back. Anyone who has experienced a DB crash knows this: last backup 2 1/2 days ago... sales, engineering, and production all get very angry as the past two days seem to not exist.

  * That case came down to needing more frequent backups, and the guy supposedly in charge was the cause of it.

Level 10

This can be very hairy nowadays. It's not quite as simple as backing everything up and throwing it in a cave. Now there are compliance standards to go by: PCI, SOX, FISMA, etc. PCI (financial/credit info), for example, seems to push for shorter retention times, and some types of data, such as CVV numbers on cards, cannot be stored at all, while any data that is stored must be heavily protected. FISMA usually follows NIST/SANS guidelines, and SOX can vary from 3-7 years to permanent. It can get even worse for older tech companies, which may have stored different types of data spread over multiple media. Some of those tech companies need to retain data as long as possible for such things as patent cases and may have data that falls under different compliance standards. Extracting all of that data from R2R or DAT tapes, parsing it out, and backing it up again to standards can be an arduous process. The best bet is to retain non-sensitive data (read: data that does not fall under a compliance standard) for as long as possible. Also, for DR purposes, having off-site storage would be prudent. If in doubt, thoroughly check the data. You don't want to be responsible for not having data backed up to company standards and/or losing a patent lawsuit, but you definitely don't want to be dinged by the government - or, even worse, be responsible for having sensitive information compromised.
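To make the "different rules for different data" point concrete, here is a minimal sketch of a per-category retention lookup. Everything in it is an assumption for illustration: the category names, the `RetentionRule` type, the `disposition` helper and especially the day counts are hypothetical and would have to come from your own compliance and legal people. The only parts taken from the discussion are that CVV data cannot be kept at all, that non-sensitive traffic data can be kept as long as possible, and that anything unclassified should be checked before a decision is made.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    may_store: bool               # False: this data must never be retained
    max_age_days: Optional[int]   # None: keep for as long as possible


# Hypothetical policy table. The numbers are placeholders, not regulatory guidance.
POLICY = {
    "card_cvv": RetentionRule(may_store=False, max_age_days=0),
    "financial_record": RetentionRule(may_store=True, max_age_days=7 * 365),
    "network_flow": RetentionRule(may_store=True, max_age_days=None),
}


def disposition(category: str, created: date, today: date) -> str:
    """Return 'keep', 'purge', or 'review' for one piece of data."""
    rule = POLICY.get(category)
    if rule is None:
        # Unknown category: if in doubt, check the data before deciding.
        return "review"
    if not rule.may_store:
        return "purge"
    if rule.max_age_days is None:
        return "keep"
    age_days = (today - created).days
    return "purge" if age_days > rule.max_age_days else "keep"
```

The useful design choice here is the explicit "review" outcome: data that nobody has classified is neither silently kept nor silently deleted.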

Level 15

In my experience, the type of industry you are in and its compliance requirements have generally been the deciding factor for the length of data retention periods. In my long career, I have seen everything from the cave with 1000s of backup tapes for which the tape drive has long since died to the need to have everything purged in 180 days.


Your closing paragraph really makes a great point. And while retaining certain data may be a liability, it's a greater risk to the business entity when you start deleting things. Because no one keeps track of what they've deleted. And you'll find yourself asking, "Did I lose that file, or did I delete that file?"

Level 10

Bit rot is another oft-overlooked issue in this same vein. The tapes with no drive? Borderline useless unless you can find a used replacement. Flash media failing over time, optical media being scratched or losing their burn? Also a big problem, and one that undercuts the usefulness of offsite backups. Cloud solves some of these problems, or pushes them off and makes them the problem of another entity that assumes liability, but then you run into issues of confidentiality. I have settled on CrashPlan for most of my personal stuff with 4 total locations: 2 offsite, one onsite and one cloud. I use custom encryption keys to help ease my mind, but with the compliance issues that some entities face, this is a non-starter or requires some custom work for stripping and pruning.

It's a whole job in and of itself.
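One common way to at least notice bit rot before you need the data is to keep a checksum manifest alongside an archive and re-verify it on a schedule. The sketch below is a minimal illustration of that idea using SHA-256 from the Python standard library; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for this example, and this is not how CrashPlan or any particular backup product does it.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems: files that are missing or whose content changed."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {rel}")
    return problems
```

A manifest like this only detects damage; it does not repair it, which is why it pairs with multiple copies in multiple locations. Store the manifest somewhere other than the media it describes.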

Level 13

What's sad is that in many cases, the system owners or management don't want to pay for the needed storage, but when that "aged-out" data is later needed, they demand explanations as to why it isn't available and can't be reconstructed. It's funny how the funding suddenly becomes available once that happens.

Level 10

This is spot on, Terry. Keeping email threads and change / request management requests is key here. For all of its faults and draconian implementations, there is real value to some of the ITIL practices in this respect. Even if it is secondary or tertiary to the problem, having an accounting trail to point back at when asked those questions is key to not making the same mistakes over and over.

Level 13

I agree that retaining logs is great from a network admin point of view. However, any given organization's legal department may have the final say. Regulatory compliance is part of the answer, but discoverable data retention must also be considered, as Nick hints at in the main post. Many organizations limit email data retention in order to limit the scope of any evidentiary search they may be subject to at a later time.

For example, suppose a given organization has a policy that emails must be deleted after 30 days, but a subsequent search (such as one prompted by a subpoena) shows that network logs exist for a time period earlier than the 30 days before the search began. Now the grilling begins, with questions such as "Why are the policies not consistent?", "What other data sources exist?", "What is hidden in the backups?", "Why are you obstructing the investigation?" and "What else are you hiding?".

All good reasons for the logs to be managed and owned by the InfoSec/Compliance groups rather than the individual producer groups (i.e. app, server, network).
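A central owner could catch the inconsistency in the email-versus-network-logs example before a subpoena does. The sketch below is one hypothetical way to do that: compare each data source's stated retention window against the oldest data actually found on disk or tape. The function name `audit_retention`, the source names and the idea of feeding it an inventory of "oldest record" dates are all assumptions for illustration, not a description of any real tool.

```python
from datetime import date
from typing import Dict, List, Optional


def audit_retention(
    policy_days: Dict[str, Optional[int]],  # stated window per source; None = keep indefinitely
    oldest_found: Dict[str, date],          # oldest record actually discovered per source
    today: date,
) -> List[str]:
    """List sources that hold data older than policy allows, or have no policy at all."""
    findings = []
    for source in sorted(oldest_found):
        age_days = (today - oldest_found[source]).days
        if source not in policy_days:
            # Answers "What other data sources exist?" before someone else asks.
            findings.append(
                f"{source}: no stated retention policy (oldest data is {age_days} days old)"
            )
            continue
        limit = policy_days[source]
        if limit is not None and age_days > limit:
            findings.append(f"{source}: data is {age_days} days old, policy says {limit}")
    return findings
```

For instance, with a 30-day policy on both `email` and `firewall_logs`, an inventory showing firewall logs from five months back would be flagged, and so would a `proxy_logs` archive that no policy mentions.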


Pretty much everyone is spot on... no need to re-iterate what has been said.

About the Author
15+ years IT experience ranging from networking, UNIX, security policy, incident response and anything else interesting. Mostly just a networking guy with hobbies including film, beer brewing, boxing, MMA, jiu jitsu/catch wrestling/grappling, skateboarding, cycling and being a Husband and Dad. I don't sleep much.