When you create and store a file, or read an email, the probability that you will look at that file or email again after a few days or weeks is close to 0%.  The same goes for records in a database or a monthly statement. Once created, these documents may be reviewed once or twice and then never seen again.

Data retention is a topic chief financial officers, data center managers and storage administrators used to worry a lot more about when data storage was crazy expensive (e.g., $1,000 a gigabyte wasn’t uncommon in the mid-90s). Today, storing data is amazingly cheap (measured in one or two pennies per gigabyte or less); consequently, companies are not as concerned about costs as they are about client satisfaction, IT resiliency and the regulatory, legal and technology requirements around maintaining and storing so much data.  

As more states pass expanding regulations around privacy and the protection of personally identifiable information (PII), and as requirements increase for companies to notify stakeholders when they experience a security breach or data loss, data retention is once again a topic of interest.


If a client asks how long you are retaining their PII, can you answer this question? Do you know how to figure this out?


For example, in the State of California, under the revised California Consumer Privacy Act (CCPA), consumers now have the right to know how long a business will retain their “sensitive personal information.”  If a client called you, could you answer this question? Do you know how to find the answer?

Within the European Union, the General Data Protection Regulation (GDPR) also addresses personal data and retention, but it is relatively vague on the retention topic.  In short, the regulation says data cannot be kept longer than you need it. There is no specific time requirement (with exceptions), but the guidance is to be mindful of the data you collect.

Even the IRS, which is more specific about how long data must be preserved, provides a range: between three and seven years, depending on the filing.

Time limits imposed by regulatory and legislative bodies are often vague or not specified at all. To avoid litigation, technology costs and internal confusion, companies should implement policies that document their retention needs. This means weighing a number of attributes, including those mentioned above (e.g., privacy regulations and federal or state requirements), when developing a data retention policy.




Here are some additional considerations:

Legal Requirements:  Data you keep for a long time could be a liability or an asset.  Does retaining data help you in case of litigation, or will it hurt you?  Email is a perfect example of this retention dilemma. Some companies force email deletions after 12 months, while other businesses consciously keep emails forever.

Client Satisfaction:  How far back do you need to go to access a client’s transaction record?  Bank charges and credit card statements have long retention periods due to concerns of litigation and because of federal and state regulations (e.g., records related to mortgage applications could be retained up to 50 years). Transportation companies can typically release records after a few months, or maybe even a few years, once a delivery is completed. Have you ever wondered how a hotel property remembers you at check-in even though your last visit was 4 years ago? It’s because they have a retention system that keeps records for at least 4 years.

Disaster Recovery:  Responsible companies back up data to protect against a business disruption (e.g., an IT failure or a ransomware attack).  However, the longer a business retains data, the higher the costs associated with both primary storage and the storage required to manage backups.

If you retain data for periods measured in years, backed up data will just keep growing and growing. Consequently, even though the primary storage required to run the firm may be considered relatively finite and “inexpensive,” as more and more backup copies accrue, the cost to maintain and store this ever-growing backup data will grow at an accelerated rate.
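To make the growth concrete, here is a rough back-of-the-envelope sketch. It is not a sizing tool: the function name, the 10 TB starting point, the 20% annual growth and the monthly full backups are all assumptions chosen for illustration, and it ignores incremental backups, deduplication and compression, which real backup products use to shrink these numbers.

```python
def backup_storage_tb(primary_tb: float, annual_growth: float,
                      fulls_per_year: int, retention_years: int) -> float:
    """Estimate total TB of retained full backups (hypothetical model).

    Assumes every full backup taken during the retention window is kept,
    and that primary data grows by `annual_growth` once per year.
    """
    total = 0.0
    size = primary_tb
    for _ in range(retention_years):
        # Each full backup taken this year is as large as the primary data.
        total += size * fulls_per_year
        # Primary data grows before the next year's backups are taken.
        size *= 1 + annual_growth
    return total


if __name__ == "__main__":
    # Illustrative inputs only: 10 TB primary, 20% growth, monthly fulls.
    for years in (1, 3, 7):
        stored = backup_storage_tb(10, 0.20, 12, years)
        print(f"{years} year(s) of retention: about {stored:.0f} TB of backups")
```

Even with flat primary storage, the retained backups scale with the retention period; once the primary data itself grows each year, the backup footprint grows faster than the retention period does.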

Also, more data means a longer recovery time. Recovering data on the average laptop computer runs around 12 hours, and recovering data on a multi-terabyte server could take days. Even with solid state disks, the recovery process takes a long time.
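As a rough illustration of why recovery time scales with data volume, the sketch below divides the amount of data by a sustained restore throughput. The function name and the throughput figure are assumptions for illustration; real restores are also slowed by network limits, many small files, verification and application restart, so treat the result as a lower bound.

```python
def restore_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Lower-bound restore time in hours (hypothetical model).

    Uses decimal units: 1 TB = 1,000,000 MB.
    """
    seconds = (data_tb * 1_000_000) / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative inputs only: sustained 100 MB/s restore throughput.
    for size_tb in (0.5, 5, 50):
        print(f"{size_tb} TB: at least {restore_hours(size_tb, 100):.1f} hours")
```

Under this model, ten times the data takes ten times as long to restore, which is one reason to purge data that no longer needs to be retained.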

Technology Changes:  Rarely does anyone consciously think of this one, but technology is constantly evolving. Every year, storage is denser, faster and less expensive. Disk and tape storage products sold in 2015 are no longer available for sale, yet many companies still store their data on these antiquated devices.  Unfortunately, at some point vendors stop supporting these older technologies, and companies need to migrate the data stored on them to new platforms. Do you have tape or optical storage in your shop? Get ready to replace it!

In the 1980s, my colleagues and I imagined a time would come when someone would have a full-time job doing nothing but moving data from antiquated storage technology to the newest vintage.




As outlined, data retention policies vary greatly depending on a company’s business and fiduciary requirements. Careful consideration should be given to developing specific retention policies for different types of data even within the same organization. These policies must be designed to keep data long enough to meet regulations and business needs, but at the same time avoid becoming excessively expensive and difficult to restore.
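One way to picture "specific retention policies for different types of data" is a small lookup table with a purge-date calculation. This is a minimal sketch, not a compliance tool: the record types, the `RETENTION_DAYS` table and both function names are hypothetical, and the periods simply echo examples from this article (12 months for email, seven years as the upper end of the IRS range, four years for hotel guest records). Your own periods should come from your legal and business requirements.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days, for illustration only.
RETENTION_DAYS = {
    "email": 365,              # some companies delete email after 12 months
    "tax_record": 7 * 365,     # upper end of the three-to-seven-year range
    "guest_profile": 4 * 365,  # hotel example: keep guest records 4 years
}


def purge_date(record_type: str, created: date) -> date:
    """Return the date on which a record becomes eligible for deletion.

    Raises KeyError for a record type with no policy, so that a missing
    policy is noticed instead of silently meaning "keep forever".
    """
    return created + timedelta(days=RETENTION_DAYS[record_type])


def is_expired(record_type: str, created: date, today: date) -> bool:
    """True if the record has reached the end of its retention period."""
    return today >= purge_date(record_type, created)


if __name__ == "__main__":
    created = date(2023, 1, 1)
    print("Email purge date:", purge_date("email", created))
    print("Expired by mid-2024?", is_expired("email", created, date(2024, 6, 1)))
```

With a table like this in place, answering "how long do you retain my data?" becomes a lookup rather than a research project.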

Sadly, a lot of IT managers and executives never make a conscious decision on how to manage and retain data; therefore, no decision IS a decision to “keep data forever.” Although kicking the can down the road is an easy option, ultimately this plan does nothing more than make data retention someone else’s problem.

Managing data retention policies is a delicate balancing act between retention requirements, technology lifespan and costs.  Yet time spent carefully considering the relevant factors can both save the company money and prove invaluable when a data loss, security breach or legal situation occurs.

The post CCPA, GDPR and The Dry Topic of Data Retention appeared first on Puldy Resiliency Partners.

Michael Puldy

Michael has over three decades of technology, information risk management, and operations experience including two-plus decades in leadership roles at IBM.  Michael is passionately focused on ways companies can improve their offensive and defensive posture towards internal and external threats.


Michael has the distinction of being named a Ponemon Fellow by the Ponemon Institute. He is an award-winning speaker and author of professional and peer-reviewed papers and blogs, and has published two books. Michael has a patent pending with the United States Patent and Trademark Office, and he is currently writing The Renaissance of Resiliency, discussing the evolution from data-center-centric IT disaster recovery to business continuity to a future where total resiliency is a way of life.

He holds a bachelor's degree in Computer Science from Clemson University and a Master of Business Administration from the University of North Florida.