Extract from Aidan Randle-Conde’s article “Continued Progress on Hanzo’s Innovate UK Grant: Identifying Data Leakage on Slack”
For the last several months, the Hanzo team has been building artificial intelligence models using grant funds that we received from Innovate UK’s Sustainable Innovation Fund. The grant was designed to help companies recover from the COVID-19 pandemic. We’ve been looking into ways to extend Hanzo Hold for Slack, our purpose-built Slack ediscovery tool, to address the new workplace risks caused by the abrupt transition to remote work. A little while ago, I wrote an update about the model we’re building to detect human resources risks like discrimination, threatening language, and bullying on the Slack platform.
Today, I want to shift focus a bit and talk about a completely separate model we’re building to detect data leakage. This model is geared toward identifying disclosures of personal data (names, Social Security numbers, and so on) and of organisational intellectual property, such as patent applications, so that organisations are alerted early and can remediate proactively. The model doesn’t care whether those disclosures are accidental or intentional; it’s just looking for information that might be problematic. Let’s take a closer look at the risk we’re addressing, and then I’ll walk you through our current solution.
Defining the Risk: Data Leakage on Collaboration Platforms
One of the significant challenges modern companies face is identifying and mitigating data leakage: the unauthorised sharing of sensitive information.