Three Steps to University Data Protection
And How They Differ for Structured and Unstructured Data
By Madhu Shashanka, Concentric Founder and Chief Scientist
It is no mystery that university networks are rife with private information that is extremely attractive to cyber thieves, given all of the personal, academic and financial information universities are tasked with protecting. Because of this, data breaches will continue, such as the Accellion file transfer software breach earlier this year, which leaked files of sensitive information from Stanford University; the University of Maryland, Baltimore; the University of Miami; the University of California, Merced; the University of Colorado; and Yeshiva University. Universities need to understand that different forms of stored personally identifiable information (PII) call for different security tactics.
Every effective university PII protection effort needs to address three critical requirements: data discovery, access governance and risk mitigation. On-campus IT teams grappling with privacy mandates need to consider these factors across their unstructured and structured stored data. And while regulations such as the Family Educational Rights and Privacy Act (FERPA) and Payment Card Industry (PCI) compliance outline expectations for handling PII, they do not help when it comes to the methods you need to follow. It is important to understand effective approaches—and how they differ—across data housed in databases versus data found in documents, spreadsheets, presentations and other more informal (but equally important) data storage locations.
Data Discovery
A typical university manages unstructured data in thousands of files containing everything from student records to university financial information to staff human resource information. Discovering PII in these files remains one of the toughest data security challenges for campus IT teams, and it’s easy to understand why. It is, on the other hand, a bit harder to understand why discovery of the other type of data—structured data—can also be so tough.
Structured databases should provide an easy way to discover university PII – but databases are often not designed with privacy regulations in mind. As a result, sensitive information may be found scattered across different databases, in different tables and in different fields. Sometimes PII is duplicated across tables or in unrelated databases. Finding it all can be tougher than you think, but it’s a critical first step. In short, university PII protection starts with its discovery.
Fortunately, emerging automated PII discovery tools can help find university PII in both structured and unstructured data. In the unstructured data world, rules and end-user classification programs have long been used in an attempt to identify PII, but they haven’t been effective or manageable. Finding PII across a university’s databases requires determining which databases and tables contain regulated data, identifying duplications and assessing risks. Recent artificial intelligence (AI) innovations show promise in automating discovery for both structured and unstructured university data.
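To make the structured-data side of this concrete, here is a minimal sketch of what "determining which tables contain regulated data" and "identifying duplications" might look like. It is a simple pattern-based baseline, not the AI-driven discovery described above, and everything in it is an assumption for illustration: the two regular expressions, the SQLite demo schema, and the function names `discover_pii`, `duplicated_pii` and `demo_database`.

```python
import re
import sqlite3
from collections import defaultdict

# Illustrative patterns only; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def discover_pii(conn, sample_size=100):
    """Return {pii_type: [(table, column), ...]} for columns whose sampled
    non-null values all match one of the patterns."""
    findings = defaultdict(list)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            rows = conn.execute(
                f'SELECT "{column}" FROM "{table}" '
                f'WHERE "{column}" IS NOT NULL LIMIT ?',
                (sample_size,)).fetchall()
            values = [str(row[0]) for row in rows]
            for pii_type, pattern in PII_PATTERNS.items():
                if values and all(pattern.match(v) for v in values):
                    findings[pii_type].append((table, column))
    return dict(findings)


def duplicated_pii(findings):
    """Keep only the PII types that show up in more than one place."""
    return {t: locs for t, locs in findings.items() if len(locs) > 1}


def demo_database():
    """Build a tiny in-memory database with made-up tables and values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE students (id INTEGER, name TEXT, ssn TEXT, email TEXT);
        CREATE TABLE payroll (emp_id INTEGER, tax_id TEXT, salary REAL);
        INSERT INTO students VALUES (1, 'A. Student', '123-45-6789', 'a@example.edu');
        INSERT INTO payroll VALUES (7, '987-65-4321', 52000.0);
    """)
    return conn
```

Calling `discover_pii(demo_database())` flags both `students.ssn` and `payroll.tax_id` as SSN-like, even though the column names differ. That is the scattered, duplicated PII problem in miniature: the sketch inspects values rather than trusting column names.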
Data Access Governance
A complete and clear understanding of who can access university PII, and how they can do it, is critical to understanding risk and implementing mitigation strategies. But access methods for “who and how” differ for structured and unstructured data. For example, large-scale databases supporting web applications, such as those handling student self-service operations, typically connect those applications to databases via a handful of service accounts. Tracing who has access isn’t usually a problem. Increasingly, API connections to databases extend access, sometimes outside the organization itself to service providers managing everything from employee health plans to financial aid. It goes without saying that even though it may be simple to determine who has access, each connection needs careful oversight.
Cataloging access for unstructured data stored by universities is far more complicated. Empowered students and end users make access control decisions, and those decisions can be dispersed and ungoverned. Inappropriate sharing with external or personal emails, link sharing (especially unprotected or non-expiring links), files stored outside of designated locations, and unclassified files that slip by data loss prevention services are just a few ways university data can be lost. Understanding and managing access in this context is an enormous governance challenge for campus IT teams.
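As a rough illustration of the sharing risks listed above, the sketch below checks a set of share records for external recipients and for unprotected or non-expiring links. The `ShareRecord` structure, the `example.edu` domain and the risk labels are all hypothetical; a real file-sharing platform would expose this information through its own audit logs or APIs, which are not modeled here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional

INTERNAL_DOMAIN = "example.edu"  # hypothetical campus email domain


@dataclass
class ShareRecord:
    path: str
    recipient: Optional[str] = None        # email address for a direct share
    has_link: bool = False                 # True if a sharing link exists
    link_protected: bool = False           # e.g. password or sign-in required
    link_expires: Optional[datetime] = None


def share_risks(record: ShareRecord) -> List[str]:
    """List the risk labels that apply to a single share record."""
    risks = []
    if record.recipient is not None:
        domain = record.recipient.rsplit("@", 1)[-1].lower()
        internal = (domain == INTERNAL_DOMAIN
                    or domain.endswith("." + INTERNAL_DOMAIN))
        if not internal:
            risks.append("external recipient")
    if record.has_link:
        if not record.link_protected:
            risks.append("unprotected link")
        if record.link_expires is None:
            risks.append("non-expiring link")
    return risks


def audit(records: List[ShareRecord]) -> Dict[str, List[str]]:
    """Map each file path to its risks, leaving out files with none."""
    report: Dict[str, List[str]] = {}
    for record in records:
        risks = share_risks(record)
        if risks:
            report.setdefault(record.path, []).extend(risks)
    return report
```

A check like this only covers shares the platform knows about. Files stored outside designated locations would need a separate inventory step before they could be audited at all.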
As with the data discovery process, recent innovations in AI can clarify who has access and whether their access to university PII is appropriate. Replacing legacy approaches that rely on file locations, pattern-matching rules or end-user document markup, AI can assess risk based on document content and the security practices in use for similar content.
Risk Mitigation
Campus IT teams, now equipped with an understanding of what data they have and where the risks are, can develop more effective PII protection strategies. The tactics for protecting structured and unstructured data are, again, quite different. Here are some key tips for protecting structured university data:
- Refactor your database to eliminate duplication, clarify data structure, and make PII discovery easier for whoever will inherit the job once you’re gone.
- Tokenize and/or encrypt sensitive fields to add an extra layer of security on top of your access control best practices.
- Delete what you don’t need. A major university PII spill of unneeded years-old data is, to be blunt, an unforced error.
- Explore emerging technologies for API security and granular database access controls. Most service accounts currently have very broad access, and as a result a poor API design or implementation can be a weak link. See what you can do to tighten things up.
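The tokenization tip above can be sketched as follows, under some loudly stated assumptions. This version derives a deterministic keyed token with HMAC-SHA256, which is one-way pseudonymization; vault-based tokenization that maps tokens back to the original values is a different design and is not shown. The field names in `SENSITIVE_FIELDS`, the `tok_` prefix and the function names are hypothetical.

```python
import hashlib
import hmac

SENSITIVE_FIELDS = {"ssn", "bank_account"}  # hypothetical field names


def tokenize(value: str, key: bytes) -> str:
    """Derive a deterministic keyed token: the same value and key always
    give the same token, so joins on the tokenized field still work."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:32]


def protect_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with sensitive, non-null fields tokenized."""
    return {
        field: (tokenize(str(value), key)
                if field in SENSITIVE_FIELDS and value is not None
                else value)
        for field, value in record.items()
    }
```

Because values such as Social Security numbers come from a small space, the secrecy of the key matters: anyone holding it could rebuild the mapping by brute force. In practice the key would live in a secrets manager rather than alongside the data.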
There are also emerging tactics to consider for unstructured data stored at universities:
- Strive for least-privilege access control at the file level for all university-critical data.
- Leverage AI-based automation to discover data and assess risk.
- Folder-level security isn’t good enough – in our research we’ve found sensitive files in folders accessible broadly across the organization.
- Continuously monitor the situation. Universities create thousands of new files each year and a one-time audit is not going to cut it.
- Look for ways to enlist your entire security stack in the PII risk management effort. With AI, for example, you can now autonomously assess risk and automatically tag files as sensitive. Those tags help data loss prevention solutions do a faster, more accurate job.
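To illustrate the continuous monitoring and tagging ideas in the list above, here is a minimal sketch of an incremental scan: it looks only at files modified since the previous run and emits a tag per file that a downstream tool could consume. The regex-based `classify` function is a deliberate placeholder for the content-aware AI assessment the article describes, and the tag names and the `scan_changed_files` function are assumptions for illustration.

```python
import re
from pathlib import Path

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def classify(text: str) -> str:
    """Placeholder classifier: a single regex stands in for a real
    content-aware model."""
    return "sensitive" if SSN_PATTERN.search(text) else "unclassified"


def scan_changed_files(root: Path, since: float) -> dict:
    """Tag every file under `root` modified after the `since` timestamp
    (seconds since the epoch). Keys are paths relative to `root`."""
    tags = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime > since:
            text = path.read_text(encoding="utf-8", errors="ignore")
            tags[str(path.relative_to(root))] = classify(text)
    return tags
```

Run on a schedule, with `since` set to the time of the previous scan, this keeps the cost of each pass proportional to what changed rather than to the whole file store. Handing the resulting tags to a data loss prevention product depends on that product's own interface, which is outside the scope of this sketch.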
A clear understanding of how to discover, assess and protect university data, and of how those tasks differ between structured and unstructured data, provides a foundation for an effective and manageable program to protect critical PII and regulated data for a safer on-campus network.