Organizations, their data, and their digital assets are under attack now more than ever. Although hacks and breaches originating outside the enterprise receive more visibility, insider attacks can be equally damaging, if not more so. According to a 2016 survey by Accenture and HfS Research [1,2], the prevalence of insider attacks is growing. Their research shows that 69% of the surveyed enterprise security executives reported insider attacks in their organization in the preceding 12 months, in contrast to 57% who reported external attacks during the same time period the previous year.

Typical insider attacks involve a current or former employee abusing access to the organization’s data and systems, or authorized third-party personnel (e.g., a contractor) acting with malicious intent. In the context of data exfiltration, insider threats are particularly difficult to catch for two reasons: first, malicious activity can be indistinguishable from the user’s authorized usage, and second, users typically have access to more sensitive data than they should. A Ponemon Institute survey reported that 62% of business users have had access to data that they should not have had [3]. It therefore isn’t surprising that one-third of large enterprises do not have the tools or capability to detect insider threats, and only 9% of those that do find them effective [4].

Man in the cloud (MITC) attacks

Man in the cloud (MITC) attacks [5] are the latest breed of threats affecting sensitive enterprise data stored in cloud services, particularly cloud storage services, which often have a sync client. Using any available malware delivery method, MITC attackers place malware on a user’s machine that is designed to switch the sync token, either to direct future sync activities to an account controlled by the attacker or to social-engineer sync or sharing permissions with the user’s account. Given how an MITC attack is perpetrated, the only indicator left behind is the malware drop, which makes detecting MITC attacks extremely difficult.

As the InfoSec Institute called out [6], detection methods are limited to catching the process that delivered the malware or performing malware analysis on the endpoint. MITC and insider attacks have one thing in common: both aim to access data that does not belong to the attacker. Although MITC attacks do not have distinct tell-tale signs, they can be detected using data access patterns within the cloud service, akin to those used to prevent data exfiltration.

DLP to help detect attacks?

Data Loss Prevention (DLP) is an important tool in the enterprise security and compliance stack, and it is a popular remedy for both malicious and unintentional attempts to exfiltrate data. However, a DLP system’s effectiveness is bounded by the breadth of the policies that a user defines, irrespective of the sophistication underlying the DLP system. Defining and curating DLP policies for a large enterprise with billions of files and digital assets is not only tedious and error-prone but also requires reliable data classification. The problem of data classification in large enterprises, albeit old, remains unsolved (or at least difficult to manage). Classic approaches to data classification involve delegating the task to data owners (an approach riddled with bias and inconsistencies), deploying classification policies (an approach that requires a reliable source of knowledge about the content, making it a circular problem), or using a dedicated tool (which streamlines the task but does not reduce its burden).

The science of detecting insider threats in the cloud

Skyhigh’s proprietary insider threat detection algorithm is designed to detect data exfiltration by an insider (or an MITC attacker masquerading as an authorized user) by contrasting data access patterns with historical use and content ownership, without the need for extensive data classification. The conventional practice is to first identify sensitive data, then create access policies, and then monitor usage. In contrast, Skyhigh’s Insider Threat Detection uses historical content and data access patterns to automatically construct user-content groups, which simultaneously identify the data and the users who own it.

Insider threat detection is then a matter of raising an alert when a user who does not belong to a user-content group accesses its data. The algorithm uses a graph to represent user and content interactions (gathered from historical usage data). Users, data, and actions are represented as independent nodes, and an edge is created between the corresponding nodes for every action. Edge weights represent additional metrics that quantify the interactions, so that an edge’s weight correlates positively with the likelihood of observing the corresponding node tuple.
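The graph representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Skyhigh's implementation: the log format, the sample records, the function name `build_interaction_graph`, and the choice of a simple observation count as the edge weight are all hypothetical.

```python
from collections import defaultdict

# Hypothetical access-log records in the form (user, action, content).
EVENTS = [
    ("alice", "download", "roadmap.docx"),
    ("alice", "edit", "roadmap.docx"),
    ("alice", "download", "roadmap.docx"),
    ("bob", "share", "budget.xlsx"),
]

def build_interaction_graph(events):
    """Return a dict mapping an edge (pair of typed nodes) to its weight.

    Assumption: the weight is the number of times the pair was observed,
    so heavier edges correspond to more likely node tuples.
    """
    weights = defaultdict(int)
    for user, action, content in events:
        # Three node types, tagged so that names cannot collide across types.
        u = ("user", user)
        a = ("action", action)
        c = ("content", content)
        # Each event links every pair of nodes it touches.
        for edge in ((u, a), (a, c), (u, c)):
            weights[edge] += 1
    return dict(weights)

graph = build_interaction_graph(EVENTS)
for (source, target), weight in sorted(graph.items()):
    print(source, target, weight)
```

In this toy log, the edge between alice and roadmap.docx ends up with the highest weight, reflecting her repeated interaction with that file.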


Figure: User-content interactions with three node types

Fast, scalable graph partitioning methods are used to identify densely connected sub-graphs, which in turn helps in community identification. Depending on the content associated with a community, it can represent a local user-content group such as a team that collaborates in the enterprise, a large department, or even a major subdivision of the enterprise. Various metrics that measure sub-graph structure are used to hierarchically partition the graph. Subsequent annotation provides a simple way to align the organizational structure with the detected community structures.
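The article does not name the partitioning method, so the following sketch substitutes a deliberately simple stand-in: drop edges below a weight threshold and take the connected components of what remains. Raising the threshold splits coarse communities into finer ones, which gives a rough hierarchical partition. The edge data, the `components` helper, and the thresholds are hypothetical, and a production system would more likely use a dedicated community detection algorithm.

```python
from collections import defaultdict

def components(edges, min_weight):
    """Connected components using only edges with weight >= min_weight."""
    adjacency = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(adjacency[node] - seen)
        groups.append(group)
    return groups

# Hypothetical user-content edges weighted by interaction count.
EDGES = {
    ("alice", "roadmap.docx"): 5,
    ("bob", "roadmap.docx"): 4,
    ("carol", "budget.xlsx"): 6,
    ("dave", "budget.xlsx"): 3,
    ("bob", "budget.xlsx"): 1,  # weak cross-team link
}

coarse = components(EDGES, min_weight=1)  # one division-level community
fine = components(EDGES, min_weight=2)    # two team-level communities
print(len(coarse), len(fine))
```

With the weak link between bob and budget.xlsx kept, everything forms one community; once it is filtered out, the graph separates into a roadmap team and a budget team.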

Figure: Graph for an Organizational Structure

Figure: Identifying an Embedded Community Structure

Variations in data access patterns validate the detected communities and justify treating their behaviors independently. For example, certain user groups tend to use some service actions more often than others, some user groups interact with one kind of data more than the others, and some groups interact only with the service actions within their own community.
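One way to make these variations concrete is to compute, for each community, the share of events that falls on each service action. The sketch below assumes community membership is already known; the membership map, the event log, and the function name `action_profiles` are hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict

# Hypothetical community assignment and (user, action) event log.
MEMBERSHIP = {"alice": "engineering", "bob": "engineering", "carol": "finance"}
EVENTS = [
    ("alice", "edit"), ("alice", "edit"), ("bob", "edit"), ("bob", "download"),
    ("carol", "share"), ("carol", "download"),
]

def action_profiles(events, membership):
    """Return, per community, the fraction of its events taken by each action."""
    counts = defaultdict(Counter)
    for user, action in events:
        if user in membership:  # ignore users with no known community
            counts[membership[user]][action] += 1
    profiles = {}
    for community, counter in counts.items():
        total = sum(counter.values())
        profiles[community] = {a: n / total for a, n in counter.items()}
    return profiles

profiles = action_profiles(EVENTS, MEMBERSHIP)
for community, profile in sorted(profiles.items()):
    print(community, profile)
```

Here the engineering group is dominated by edits while the finance group splits evenly between sharing and downloading, which is the kind of per-group bias the text describes.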

Figure: Bias and variations across user-content groups

Insider threat detection then becomes a function of identifying abnormal changes in usage patterns. These may include users starting to use service actions heavily in a new community, an increase in the number of super users, a sudden increase in download actions within a community, or even an overall change in how often different communities occur.
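As an illustration of one such signal, a sudden increase in downloads within a community can be flagged by comparing the current count against that community's history. The article does not specify the statistical test, so this sketch assumes a simple z-score rule; the function name, the threshold of 3.0, and the sample counts are hypothetical.

```python
from statistics import mean, stdev

def is_download_spike(history, current, threshold=3.0):
    """Flag `current` if it exceeds the historical mean by more than
    `threshold` sample standard deviations (an assumed rule, not Skyhigh's)."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current > mu  # any increase over a perfectly flat history
    return (current - mu) / sigma > threshold

# Hypothetical daily download counts for one community.
daily_downloads = [12, 15, 11, 14, 13, 12, 16]

print(is_download_spike(daily_downloads, 14))  # within the usual range
print(is_download_spike(daily_downloads, 90))  # far above the usual range
```

The same comparison could be applied to the other signals listed above, such as the number of super users or the volume of actions a user performs in a community that is new to them.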

Figure: Anomalous departure from a community pattern

Leveraging this research, Skyhigh has been able to develop algorithms that have proved extremely effective at accurately detecting insider threats attempting to steal data from cloud-based systems of record, including Office 365, Box, Google Drive, and Salesforce.