This U.S.-based healthcare organization was migrating more than 40 terabytes of legacy content from SharePoint on-premises to SharePoint Online. It needed to ensure that all private patient data and other sensitive information was protected in accordance with HIPAA guidelines. Although migration was a top priority, the organization recognized the importance of compliance, information governance, and cleaning up legacy content. The final objective was to classify the cleansed content against the industry-defined Medical Subject Headings (MeSH) taxonomy.
This organization chose Microsoft for its ability to address all of its requirements and provide both a short-term and a long-term strategy for managing content.
This healthcare organization had more than 40 terabytes of legacy content residing in file shares and was evaluating options to migrate to SharePoint Online and take advantage of OneDrive for Business. As part of that process, it needed to improve knowledge management for business users, implement concept-based search, and enforce information governance initiatives. Those initiatives covered data security, the protection and management of personally identifiable information (PII), protected health information (PHI), and other sensitive information, and compliance with regulatory guidelines.
The customer faced the following challenges that prompted them to move to Office 365:
- Inability to determine what should be deleted, saved, or archived from legacy content
- Inability to identify privacy or sensitive information that needed special processing
- Requirement to improve search results, information transparency, and knowledge management
- Requirement for one tool that could be used throughout the organization by subject-matter experts to manage content
To migrate to Office 365, the organization partnered with Netwrix. To deal with legacy content and protect sensitive health information, Netwrix helped the organization implement the conceptClassifier Platform and conceptClassifier for SharePoint Online. Going beyond basic cleanup, the solution identifies duplicates, versions, and redundant, outdated, or trivial (ROT) data. It also identifies data privacy or organizationally defined sensitive information, undeclared or erroneously tagged records, and noncompliance exceptions, enabling the organization to identify sources of risk and significantly reduce the amount of content to be migrated.
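As a rough illustration of the duplicate-identification step, the sketch below groups files by a hash of their content. It is not how conceptClassifier works internally, which the source does not describe. The function name `find_duplicates` and the in-memory `documents` mapping are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def find_duplicates(documents: dict[str, bytes]) -> list[list[str]]:
    """Group document paths whose byte content is identical.

    `documents` is a hypothetical mapping of path -> raw file bytes.
    Returns one sorted list of paths per group of exact duplicates.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for path, content in documents.items():
        # Identical content always yields the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(path)
    # Keep only digests shared by more than one path.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Exact hashing only catches byte-for-byte copies. Near-duplicates and earlier versions of a file would need a similarity measure, which this sketch does not attempt.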
The solution also protects this information in real time, as it is created or ingested. The standard product comes with more than 80 rules to address compliance requirements such as HIPAA regulations. Content that contains privacy vulnerabilities is automatically moved to a secure repository, which prevents download, and notifications are sent to the appropriate personnel for disposition.
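The rule-driven routing described above can be pictured with a minimal sketch. It assumes each rule is a named regular expression, and that any match sends the content to a secure repository instead of the migration target. The two rules, the `MRN` label format, and the `route` function are hypothetical and are not taken from the product's rule set.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern


# Hypothetical rules for illustration only.
RULES = [
    Rule("us_ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    Rule("medical_record_number", re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE)),
]


def matched_rules(text: str) -> list[str]:
    """Return the names of all rules whose pattern occurs in the text."""
    return [rule.name for rule in RULES if rule.pattern.search(text)]


def route(text: str) -> tuple[str, list[str]]:
    """Decide where content goes: 'quarantine' if any rule matches, else 'migrate'."""
    hits = matched_rules(text)
    return ("quarantine" if hits else "migrate", hits)
```

In a real deployment the "quarantine" outcome would also trigger the move to the secure repository and the notification to the responsible personnel. Both are left out here because the source does not describe how they are implemented.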
In addition, it was critical for this organization that subject-matter experts could use the solution. The organization implemented conceptTaxonomyManager to automatically create an initial set of classification clues for the taxonomy. Its administrators tuned the classification clues, created new ones on the fly, and ran test scenarios against their live content prior to deployment. Each node of the taxonomy represents a type of content and is associated with system-generated clues and clues entered by administrators. A clue can consist of a single word or a string of words, entities, acronyms, synonyms, or keywords. This enabled administrators to test the clues in real time and define a threshold for classification, based on clue and concept matching. This functionality was extremely important to the organization, as it delivered one tool that could be used to manage content enterprise-wide after the migration.
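One way to picture threshold-based classification from clues is the sketch below. It assumes each taxonomy node carries weighted clues and that a document is assigned to a node when the summed weight of matching clues reaches the threshold. The weights, the substring matching, and the names `Clue`, `score`, and `classify` are illustrative assumptions, not conceptTaxonomyManager's actual scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clue:
    phrase: str    # single word or multi-word term
    weight: float  # hypothetical contribution to the node's score


def score(text: str, clues: list[Clue]) -> float:
    """Sum the weights of clues whose phrase occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(clue.weight for clue in clues if clue.phrase.lower() in lowered)


def classify(text: str, taxonomy: dict[str, list[Clue]], threshold: float) -> list[str]:
    """Return the taxonomy nodes whose clue score meets the threshold, sorted by name."""
    return sorted(
        node for node, clues in taxonomy.items() if score(text, clues) >= threshold
    )


# Hypothetical nodes and clues, loosely styled on MeSH headings.
TAXONOMY = {
    "Diabetes Mellitus": [Clue("diabetes", 0.6), Clue("insulin", 0.3), Clue("blood glucose", 0.3)],
    "Hypertension": [Clue("hypertension", 0.6), Clue("blood pressure", 0.4)],
}
```

Raising the threshold makes classification stricter, and lowering it admits documents that match only weaker clues. That is the kind of trade-off administrators could explore when testing clues against live content.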
After the migration to SharePoint Online, users were able to search on phrases, concepts, and multi-word terms to retrieve highly accurate results. The organization’s new insight engine identifies content with similar concepts, subjects, and topics, even if the exact search words are never used.