DLP with GSuite and GCP, plus those Chromebooks

Data Loss Prevention in the cloud and on devices

Nicholas Parks
Apr 14, 2020
Photo by Kai Wenzel on Unsplash

Let’s start with the disclaimers. I have been using AWS since 2010, I have worked in places that use G Suite, and I also pay for my own Enterprise G Suite. I am writing this on a previous-model-year Chromebook, and I love Spinnaker on GKE. Lastly, I have created production Pivotal environments on Azure. However, I count my AWS experience in production as several times more robust than my Google Cloud Platform (GCP) experience. Hell, I remember that day in 2011 when CloudFormation became real. I am writing this article because there is an exciting story that Google has not figured out how to tell yet.

For almost two years, we have been hearing Thomas Kurian spin this story of Google Cloud becoming the enterprise cloud. Okay, sure, I mean, maybe you should have done enterprise things sooner, such as having premium support before 2020. Anyway, Thomas, I will buy your arguments for a dollar. Many of my colleagues (myself included) do consider GCP a technically superior cloud platform in some areas. However, we have noticed Google’s complete inability to engage enterprises the way an enterprise wants to be engaged, which is almost comical when you observe it up close. Well, Alphabet (Google’s parent company) is an advertising company.

What is the story that Google can tell the enterprise? It is an ever-improving story of security and data protection across GSuite and GCP: the Data Loss Prevention/Protection story. First, let’s look at some of GSuite’s security and data protection capabilities.

There are many security and data controls built into GSuite, but I want to highlight Vault (and not HashiCorp’s Vault). Vault is GSuite’s means of supporting the data retention policies that some organizations have to comply with. It also supports eDiscovery activities. If you have ever been a security architect or had to deal with corporate compliance, you know about eDiscovery. Let’s explore that some more.

To the surprise of no one, Vault works with Gmail. Email is a vector for information leaking. It is also a vector for your organization to receive information it should not possess. As you would expect, the message retention and search capabilities between Vault and Gmail are rather excellent. That’s cool, and I could expend many more words talking about email. However, what I find more compelling than the Gmail integration is Google Drive’s integration with Vault.

Google Drive is the GSuite storage and document sharing solution. It is where all your business productivity apps (Docs/Sheets/Slides/etc.) and other GSuite apps store their relevant data files. I want to call attention to how Vault searches for files.

How Vault Searches for items in Drive

See that first line. If some other person leaked information to (AKA directly shared it with) someone in your organization, you would know about it. This is similar in nature to someone emailing your organization a document it should not have. Many a crime-drama plot depends on some document being miraculously found. You have seen the lawyer shows: “you hid this document containing exculpatory evidence” or “why did we not see this during discovery.” Those are not phrases your legal team wants to hear during a deposition. Thus, Vault is your go-to eDiscovery solution, and not just because the URL for it is https://ediscovery.google.com/! Your legal team will also like that every search happens within the context of a “matter”.

A matter can include (but is not limited to):

  • Who can access the matter
  • Saved search queries
  • Export data sets — the collection of emails or documents matching searches

The Vault interface looks like the Google Groups interface, unloved and not updated. Hopefully, more users of Vault will bump the utilization metrics so that those teams can justify user experience additions.

Google Cloud Platform

Moving on to where the cloud magic happens, GCP (Google Cloud Platform) provides foundations to protect your data. Firstly, GCP uses encryption at rest by default. That’s good. It would be nice if the other cloud providers would do that. With crypto-by-default and various key management capabilities, you can implement crypto-shredding across GCP. There is a good explainer regarding crypto-shredding on Medium. If you want to read about crypto-shredding and Google’s BigQuery you can read that here. However, I don’t want to write about crypto-shredding. Rather, I want to write about Google’s Cloud Data Loss Prevention (DLP) solution. Google Cloud DLP provides several capabilities that help protect your company’s data from accidental leakage and general internal misuse. Let’s first look at classification.

If you don’t use Cloud DLP for anything else, do use it for classifying the types of sensitive data in your cloud environment. As the saying goes, you can’t manage what you can’t measure. With Cloud DLP, the auto-magical ability to classify text allows your enterprise information security teams to identify what type of sensitive material is located where in the cloud. The list of text types that Cloud DLP can identify is extensive. There are the rather obvious ones: credit card numbers, phone numbers, and email addresses. For global enterprises, it can also identify the various national IDs. For example, if you are a global bank that stores various types of national IDs so you can report interest income properly, you want to make sure those national IDs only exist in the correct locations in your cloud environment (or not at all). Since I used a global bank as an example, it can detect International Bank Account Number (IBAN) codes as well. The built-in text classification can also be extended with custom detectors.
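
To make classification a bit more concrete, here is a minimal sketch of the JSON body you might send to the Cloud DLP `content:inspect` REST method. The helper names (`build_inspect_request`, `inspect_url`) and the project ID are hypothetical, and the field and infoType names reflect my understanding of the v2 API, so check them against the current reference before relying on them. The sketch only builds the request; it does not call Google.

```python
import json

# Built-in detector names, as I understand the Cloud DLP infoType list.
DEFAULT_INFO_TYPES = (
    "CREDIT_CARD_NUMBER",
    "PHONE_NUMBER",
    "EMAIL_ADDRESS",
    "IBAN_CODE",
)


def build_inspect_request(text, info_types=DEFAULT_INFO_TYPES,
                          min_likelihood="POSSIBLE"):
    """Build a request body asking DLP to classify a piece of text."""
    return {
        "item": {"value": text},
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_types],
            "minLikelihood": min_likelihood,
            # Ask for the matched text to be echoed back in each finding.
            "includeQuote": True,
        },
    }


def inspect_url(project_id):
    """Return the REST endpoint for inspecting content in a project."""
    return f"https://dlp.googleapis.com/v2/projects/{project_id}/content:inspect"


if __name__ == "__main__":
    # "my-project" is a placeholder, not a real project.
    print(inspect_url("my-project"))
    print(json.dumps(build_inspect_request("Reach me at jane@example.com"), indent=2))
```

You would POST that body with an authenticated client of your choice. Each finding in the response names the infoType that matched and how likely the match is, which is the raw material for a "what sensitive data lives where" report.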

The classification capabilities enable the de-identification and redaction capabilities. Once sensitive information is encountered, you can choose to have the data obfuscated in some fashion. Some of your choices include (but are not limited to):

  • Pseudonymization: many will know this as tokenization
  • Date Shifting: change the dates
  • Redaction: replace the offending text with asterisks
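
If those three terms are new to you, the toy sketch below shows the idea behind each one in plain Python. To be clear, this is not how Cloud DLP implements them; the function names, the email regex, the `tok_` prefix, and the key handling are all my own assumptions for illustration.

```python
import hashlib
import hmac
import re
from datetime import date, timedelta

# Deliberately simple email pattern; real detectors are far more thorough.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text, pattern=EMAIL_RE):
    """Redaction: replace each match with asterisks of the same length."""
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)


def pseudonymize(value, key):
    """Pseudonymization: map a value to a stable token using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def shift_date(d, record_id, key, max_days=30):
    """Date shifting: move a date by an offset derived from the record ID.

    Every date on the same record shifts by the same number of days, so
    the intervals between a record's dates are preserved.
    """
    digest = hmac.new(key, record_id.encode("utf-8"), hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days
    return d + timedelta(days=offset)


if __name__ == "__main__":
    key = b"demo-key-do-not-use"  # placeholder; a real key belongs in a KMS
    print(redact("Contact jane@example.com today"))
    print(pseudonymize("jane@example.com", key))
    print(shift_date(date(2020, 4, 14), "customer-42", key))
```

The keyed hash is what makes pseudonymization useful for analytics: the same input always yields the same token, so you can still join and count without seeing the original value.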

Another bit of Cloud DLP magic involves images. Yes, you can redact images. Cloud DLP can scan images for offending text and place a black box over the offending text in the image. That is awesome, but how can someone use all this magic?

A typical use of the magic would be checking Cloud Storage for data exposure. You may use Google Cloud Storage to share information within the enterprise. You may also use it to share or capture documents to/from customers. You can always configure your applications and cloud storage with the right access to prevent incidents. However, there may be instances where you don’t need to retain sensitive details beyond initial usage. If an application fails to remove sensitive data, Cloud DLP can detect this content and optionally redact it.

There are various other ways you can use DLP. There is an API so application developers can clean/vet data that applications receive. There is, of course, integration with BigQuery. The DLP job reports can be “sunk” into BigQuery, allowing for a whole bunch of fancy analytics. Finally, DLP also works with Data Catalog, giving you additional metadata management magic.
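
As a rough sketch of the Cloud Storage scan with findings sunk into BigQuery, this is roughly what the body of a DLP inspect job looks like when built in Python. The helper name `build_storage_inspect_job` and every bucket, project, dataset, and table name are hypothetical, and the field names reflect my reading of the v2 REST API (`projects.dlpJobs.create`), so verify them against the current reference.

```python
import json


def build_storage_inspect_job(bucket_url, info_types,
                              project_id, dataset_id, table_id):
    """Build a job body that scans a bucket and saves findings to BigQuery."""
    return {
        "inspectJob": {
            # What to scan: every object matching the gs:// URL pattern.
            "storageConfig": {
                "cloudStorageOptions": {"fileSet": {"url": bucket_url}}
            },
            # What to look for.
            "inspectConfig": {
                "infoTypes": [{"name": name} for name in info_types]
            },
            # Where to "sink" the findings for later analytics.
            "actions": [
                {
                    "saveFindings": {
                        "outputConfig": {
                            "table": {
                                "projectId": project_id,
                                "datasetId": dataset_id,
                                "tableId": table_id,
                            }
                        }
                    }
                }
            ],
        }
    }


if __name__ == "__main__":
    # All names below are placeholders.
    job = build_storage_inspect_job(
        "gs://customer-uploads/**",
        ["CREDIT_CARD_NUMBER", "IBAN_CODE"],
        "my-project", "dlp_findings", "uploads_scan",
    )
    print(json.dumps(job, indent=2))
```

Once the findings land in a table, the fancy analytics are just SQL: count findings by infoType, by bucket, or by day, and alert on anything that shows up where it should not.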

Photo by Anete Lūsiņa on Unsplash

Chromebooks and Endpoint Management

Your organization may be using GSuite for business productivity already. Additionally, you are running internal and customer-facing business apps in the cloud and using SaaS offerings. This is all nice but what about all those computing endpoints? You made the cloud “safe(r)” for your data by following security best practices and layering on DLP, but those devices used by your knowledge workers — are those devices “safe”?

In these corona-times, endpoint management is emerging as a critical need for many organizations. Some organizations are semantically operational with everyone at home, but now these organizations are facing a 100% bring your own device (BYOD) computing ecosystem for the first time. Oh, the phishers have been having a field day! This is particularly impacting smaller firms with smaller IT shops that only had to support the endpoints in the office. The abrupt change to 100% BYOD for some has resulted in many compromised devices and possible data breaches within a couple of weeks. This is where Chromebooks and GSuite’s endpoint management features can close the gap.

Endpoint management for Android/ChromeOS allows your security organization to monitor and administer device access to data you use with GSuite. For example, you can require a password policy and encryption across all devices. Additionally, you can allow separate work and personal profiles on devices. You can also wipe an account from a device or even remotely wipe an entire device. What is nice about the “wiping” is you can configure auto-wiping of a corporate-owned Android device if it does not “phone home” for a while.

Beyond wiping, there are other endpoint management features. You can disallow USB storage for Chrome devices, control what apps can be installed, disable Bluetooth, and disallow Linux development features for Chrome devices. Additionally, the reporting capabilities allow you to identify unused devices and who logically owns which devices.

GSuite’s endpoint management features are still geared towards supporting full-fledged Android devices. ChromeOS devices are supported a little differently. If you are using GSuite and mostly SaaS solutions, Chromebooks are a valid option for your enterprise. This allows your information security professionals to have visibility into how corporate data moves within the cloud and on the endpoints.

One can say that these corona times have finally killed off the “perimeter”.

Google’s data protection story is a really strong story as it covers common business productivity and cloud capabilities. I did not even cover many of the other features in GSuite that help an organization protect data. I did not write about the context-aware capabilities of Google Cloud Platform or the details of data encryption. The DLP offering by itself is a differentiator, and that is without even mentioning risk analysis capabilities. The end-user devices (Android/ChromeOS) allow the data protection ecosystem to extend from the cloud directly into the hands of your knowledge workers, creating a closed data ecosystem.

When you describe these capabilities in person you can see the light bulb turn on. Maybe all those GCP sales guys they are hiring should get a primer — Google is leaving money on the table otherwise.
