Posted in Compliance, Conservation and preservation, Electronic records, Governance, Information Management, Information Security, Legal, Records management, Retention and disposal, Security

Destroying digital records – are they really destroyed?

Most people should be aware that pressing the ‘delete’ option for a file stored on a computer doesn’t actually delete the item; it only makes the file ‘invisible’. The actual file is still on the disk and can be retrieved, relatively easily or with forensic tools, until the space it was stored on is overwritten.

Traditional legacy electronic document and records management (EDRM) systems have two components:

  • A database (e.g., SQL, Oracle) where the metadata about the records are stored
  • A linked file share where the actual objects are stored, most of which are copies of emails or network file share files that remain in their original location.

In most on-premise systems, email mailboxes, network file shares, and the EDRMS database and linked file share are likely to be backed up.

When a digital record comes to the end of its retention and is subject to a ‘destruction’ process, how do you know if the record has actually been destroyed? And even if it is, how can you be sure that the original isn’t still stored in a mailbox, network file share, or a back up?

This post examines what actually happens when a file is ‘deleted’ from a Windows NT File System (NTFS), and questions whether digital records stored in an EDRMS are really destroyed at the end of the retention period.

The Windows NTFS Master File Table (MFT)

Details of every file stored on a computer drive will be found in the NTFS Master File Table (MFT).

In some ways, the MFT operates like a traditional electronic document management system – it is a kind of database that records metadata about the attributes of the digital objects stored on the drive. These attributes include the following:

[Image: attriblist]

As noted in the diagram above, the details stored by the MFT include the $File_Name and $Data attributes.

  • The $File_Name attributes include the actual name of the file as well as when it was created and modified, and its size. This is the information that can be seen via File Explorer and is often copied to the EDRMS metadata.
  • The $Data attribute contains details of where the actual data in the file is stored on the disk (in 0s and 1s) or the complete data if the file is small enough to fit in the MFT record.

If the MFT record has many attributes or the file data is stored in multiple fragments on a disk (for example as a file is being edited), additional MFT ‘extension’ records may be created.

When a file is deleted, the MFT records the deletion.

  • If the file is simply deleted, the record will remain on the disk and can be recovered from the Recycle Bin.
  • If the file is deleted through SHIFT-DEL or emptying the Recycle Bin, the MFT record is updated to the ‘Deleted’ state and the cluster bitmap is updated to mark the file’s clusters (where the data is stored) as free for reuse. The MFT record remains until it is re-used or the data clusters are allocated in whole or part to another file.

So, in summary, ‘deleting’ a file does not actually delete it. It may either:

  • Store the file in the Recycle Bin, making it relatively easy to recover, or
  • Change the MFT record to show the file as being deleted but leave the file data on the disk until it is overwritten.
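
The behaviour summarised above can be illustrated with a toy model. This is only a sketch: the ToyVolume and MftRecord classes are hypothetical stand-ins invented for illustration, not the real NTFS on-disk structures, and each file is assumed to occupy a single cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MftRecord:
    """Toy stand-in for an MFT record (not the real on-disk layout)."""
    file_name: str
    clusters: List[int]
    in_use: bool = True


@dataclass
class ToyVolume:
    """Hypothetical volume: raw clusters, an allocation bitmap, a file table."""
    clusters: Dict[int, bytes] = field(default_factory=dict)
    allocated: Set[int] = field(default_factory=set)
    mft: List[MftRecord] = field(default_factory=list)

    def write_file(self, name: str, data: bytes) -> MftRecord:
        # Use the lowest-numbered free cluster; one cluster per file.
        cluster = 0
        while cluster in self.allocated:
            cluster += 1
        self.clusters[cluster] = data
        self.allocated.add(cluster)
        record = MftRecord(name, [cluster])
        self.mft.append(record)
        return record

    def delete_file(self, record: MftRecord) -> None:
        # Deletion flips the flag and frees the clusters; the data is untouched.
        record.in_use = False
        self.allocated.difference_update(record.clusters)

    def recover(self, record: MftRecord) -> bytes:
        # Forensic-style read: ignore the flag and read the clusters directly.
        return b"".join(self.clusters[c] for c in record.clusters)


volume = ToyVolume()
memo = volume.write_file("memo.docx", b"confidential memo")
volume.delete_file(memo)
recovered_before = volume.recover(memo)  # data is still in the freed cluster

# A later write reuses the freed cluster and overwrites the old data.
volume.write_file("new.txt", b"something else")
recovered_after = volume.recover(memo)
```

In this model the ‘deleted’ memo is fully readable until another file happens to be written over the same cluster, which is the window that forensic recovery tools rely on.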

How does an EDRMS store and manage files?

The following summary relates to a well-known Electronic Document and Records Management System (EDRMS). Other systems may work differently but the point is that records managers should understand exactly how they work and what happens when electronic files are destroyed at the end of a retention period.

Most EDRM systems are made up of two parts:

  • A database (SQL, Oracle etc) to store the metadata about the record.
  • An attached file store that stores the actual digital objects.

When EDRM systems are used to register paper or physical records (files and boxes), only the database is used.

When digital records are uploaded to the EDRMS:

  • The metadata in the original file, including the file type, original file name, date created, date modified and author are ‘captured’ by the system and recorded in the new database record.
  • Additional metadata may be added, including a content or record ‘type’.
  • The record will usually be associated with a ‘container’ (e.g., ‘file’). This containment makes the record appear to be ‘contained’ within that container, whereas in fact it is simply a metadata record of an object stored elsewhere.
  • The original record filename is changed to random characters (to make it harder to find, in theory) and then stored on the attached (usually Windows NTFS) file store, often in a series of folders.
  • A link is made between the database record and the record object stored in the file store (the MFT record).

When the end-user opens the EDRMS, they can search for or navigate to containers/files and see what appears to be the digital objects ‘stored’ in that container/file. In reality, they are seeing a link to the object stored (randomly) in the file store.
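
The two-part model can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product works: the table layout, the register and flag_destroyed functions, and the use of a temporary folder as the ‘file store’ are all hypothetical.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path

# Hypothetical file store: a temporary folder standing in for the NTFS share.
file_store = Path(tempfile.mkdtemp())

# Hypothetical metadata database with a single table.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY, title TEXT, container TEXT, "
    "stored_name TEXT, destroyed INTEGER DEFAULT 0)"
)


def register(title: str, container: str, content: bytes) -> int:
    """Store the object under a random name and record its metadata."""
    stored_name = uuid.uuid4().hex  # the original file name is not used on disk
    (file_store / stored_name).write_bytes(content)
    cursor = db.execute(
        "INSERT INTO records (title, container, stored_name) VALUES (?, ?, ?)",
        (title, container, stored_name),
    )
    return cursor.lastrowid


def flag_destroyed(record_id: int) -> None:
    """Mark the record as destroyed in the database only (assumed behaviour)."""
    db.execute("UPDATE records SET destroyed = 1 WHERE id = ?", (record_id,))


record_id = register("Board minutes.docx", "FILE-2020/001", b"minutes text")
flag_destroyed(record_id)

stored_name, destroyed = db.execute(
    "SELECT stored_name, destroyed FROM records WHERE id = ?", (record_id,)
).fetchone()
object_still_on_disk = (file_store / stored_name).exists()
```

The ‘container’ is only a column in the database, and a system that merely sets a flag would leave the object sitting in the file store. Whether a given EDRMS also removes or overwrites the stored object is exactly the question to put to the vendor.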

What happens when an EDRMS record is destroyed?

If there is no requirement to extend their retention, or keep them on a legal hold, records may be destroyed at the conclusion of a retention period.

For physical records, this usually means destroying the physical objects so they cannot be recovered, a process that may include bulk shredding or pulping.

For digital records, however, there may be less certainty about the outcome of the destruction. While the EDRMS may flag the record as being ‘destroyed’, it is not completely clear whether the destruction process has actually destroyed the records and overwritten them in a way that ensures their destruction to the same level as destroyed paper files.

Also:

  • If the original associated NTFS file share becomes full and a new one is used, the original is likely to be made read only.
  • There is likely to be a backup of the EDRMS.
  • The original records uploaded to the EDRMS probably continue to exist on network file shares, in email, or on backup tapes.
  • Digital forensics can be used to recover ‘deleted’ files from the associated file share.

Consider this scenario:

  • An email containing evidence of something is saved to a container in an EDRMS.
  • The container of records is ‘destroyed’ after the retention period expires.
  • A legal case arises after the container is ‘destroyed’.
  • A subpoena is made for all records, including those specific records.
  • Has the record actually been destroyed, or could it still be recoverable, including from backups or the digital originals?

Is it really possible to destroy digital records, and does it matter?

Yes, records can be destroyed by overwriting the cluster where the record is kept, and some EDRM systems may offer this option.

But:

  • Do EDRM systems overwrite the cluster when a digital record is destroyed in line with your records retention and disposal authorities, or simply mark the record as being deleted, when it is still technically recoverable?
  • Could the record still exist in the network file shares or email, or in backups of these or the EDRMS?
  • Might it be possible to recover the record with digital forensics tools?
  • Does it matter?

It might be worth asking IT and your EDRMS vendor.
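
For a sense of what ‘overwriting’ means at the file level, here is a minimal Python sketch. The overwrite_in_place function is a hypothetical helper, not something an EDRMS is known to provide, and it comes with a large caveat: on SSDs, on copy-on-write or journaling file systems, or where shadow copies and backups exist, writing zeros to a file does not guarantee that the original clusters are overwritten.

```python
import os
import tempfile


def overwrite_in_place(path: str) -> None:
    """Replace a file's contents with zero bytes of the same length."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        handle.write(b"\x00" * size)  # same length, so the file does not grow
        handle.flush()
        os.fsync(handle.fileno())     # ask the OS to push the write to the device


# Demonstration on a throwaway temporary file (fine for small files only,
# because the zero buffer is built in memory).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as handle:
    handle.write(b"record content")

overwrite_in_place(demo_path)
with open(demo_path, "rb") as handle:
    content_after_overwrite = handle.read()

os.remove(demo_path)  # only now remove the directory entry
file_exists_after = os.path.exists(demo_path)
```

The order matters: overwrite first, delete second. Deleting first simply hands the clusters back to the file system with the data intact.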

Posted in Compliance, Electronic records, Exchange Online, Information Management, Microsoft Teams, Office 365 Groups, Products and applications, Records management, Retention and disposal, SharePoint Online

Understanding and applying retention policies to content in MS Teams

This post highlights the need to understand how retention works in MS Teams, why it may be related to how long you keep emails (including for backup purposes), and why you need to consider all the elements that make up an Office 365 Group when considering how – and how long – to retain content in MS Teams.

Overview of retention in MS Teams

If you are unfamiliar with how retention works with MS Teams, these two related sites provide very useful detail.

[Image: overview of security and compliance in Microsoft Teams]
Image from the first link above – Security Compliance Overview

The quote below from the second link is relevant to this post:

‘Teams chats are stored in a hidden SubstrateHolds folder in the mailbox of each user in the chat, and Teams channel messages are stored in a hidden SubstrateHolds folder in the group mailbox for a team. Teams uses an Azure-powered chat service that also stores this data, and by default this service stores the data forever. With a Teams retention policy, when you delete data, the data is permanently deleted from both the Exchange mailboxes and the underlying chat service.’

and

‘Teams chats and channel messages aren’t affected by retention policies applied to user or group mailboxes in the Exchange email or Office 365 groups locations. Even though Teams chats and channel messages are stored in Exchange, they’re only affected by retention policies applied to the Teams locations.’

In summary:

  • One-to-one chat in MS Teams is stored in a hidden folder of the mailbox of each user in the chat. Documents shared in those chats are stored in the OneDrive for Business of the person who shared them.
  • Group chat in Team channels is stored in a hidden folder of the mailbox of the associated Office 365 Group – and also in an Azure chat service. Documents are stored in the Office 365 Group’s SharePoint site (other SharePoint site libraries may also be linked in a channel).

Another quote from the same post:

‘In many cases, organizations consider private chat data as more of a liability than channel messages, which are typically more project-related conversations.’

Teams content is kept in mailboxes, retention may be similar

In the on-premise past, organisations typically backed up their Exchange mailboxes (and possibly also enabled journaling, to capture emails) for disaster recovery, ‘archiving’ and investigations. Unless a decision is made to invest in cloud backups, Office 365 retention policies may also be applied to Exchange mailboxes, effectively replacing the need to back them up. Note, however, that retention policies applied to Exchange mailboxes don’t affect the Teams chat folder.

Organisations should probably apply the same retention period to both emails and Teams chats as they do to email mailbox backups now. That is, if mailboxes are typically kept for 7 – 10 years after the person leaves the organisation, then keep the Teams chats for the same period.
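
As a small worked example of that rule, the earliest disposal date can be calculated from the date the person left. The function name and the seven-year figure are illustrative assumptions only; the actual period comes from your retention schedule.

```python
from datetime import date


def earliest_disposal_date(departure: date, retention_years: int) -> date:
    """Return the date on which the retention period ends."""
    target_year = departure.year + retention_years
    try:
        return departure.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return departure.replace(year=target_year, day=28)


left_on = date(2020, 6, 30)
keep_until = earliest_disposal_date(left_on, 7)
```

The same date would then apply to both the mailbox and the Teams chats of that person, so the two are disposed of together.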

Note that, even if a poster deletes an item (if that option is enabled), it will still be retained if there is a retention policy.

Suggestions for retention in MS Teams

As there can be different retention requirements, depending on the subject matter, here are some suggestions for retention:

  • One-to-one chat is like email: you will never know everything that is being said or sent. So a single retention policy that mirrors email would be appropriate.
  • Teams chat is more likely to be about the subject of the Team, which is based on an Office 365 Group that has its own mailbox and a SharePoint site. In this case, you could consider a retention policy applied to all Office 365 Groups or specific Groups – for example ‘Project Groups’, then ensure that the retention policy or policies cover all aspects of the Office 365 Group (mailbox, team chat, SharePoint).
  • If all the records relating to a particular subject matter (including email, chat and documents) must be retained for 25 years, then you need to understand all the options.

It underscores the need to plan carefully for retention management for all the key workloads in Office 365.

Posted in Classification, Compliance, Electronic records, Governance, Information Management, Office 365, Products and applications, Records management, Retention and disposal, SharePoint Online

Shifting the paradigm for managing records – from EDRMS to Office 365

Computer systems used to manage electronic documents and records, commonly known as ‘EDRMS’, have been around for at least 20 years.

Many (but not all) of these systems developed from electronic databases originally used to register and manage only paper records, replacing the old paper registers (hence ‘Registries’).

How does an EDRMS work?

A common theme with most EDRM systems is that they describe (via metadata) and provide some kind of visual ‘file’ or ‘folder’ structure for digital objects, almost always stored in a linked network file store.

To store records in this way, EDRM systems required end-users to upload a copy of a digital object (document, email, photograph) to a pre-defined digital container, corresponding to a ‘file’ or ‘folder’. The digital file might have been assigned a range of metadata including the classification (business function and activity) or file plan details, title, business owner or area, and retention information.

Once an object was uploaded, end users were required to add metadata about the object, including the object ‘title’ (if it didn’t copy the original title). Additional metadata fields, for example ‘Document Type’, might also be required.

The system recorded the date and time the object was uploaded and who uploaded it. As noted, the system might copy some of the uploaded object’s metadata, for example the default title, date created and author.

The uploaded document then ‘became’ a record, visible ‘within’ a digital container (‘file’) along with other records.

[Image: EDRMModel2]

EDRM systems had (at least) three weaknesses:

  • End-users were required to upload the records to the EDRMS, and to one correct container (file/folder)
  • The EDRMS contained a copy of a digital object that almost always remained in the original storage location (email, network file share)
  • The EDRMS tended to be based on records as documents (including emails, and sometimes photos). Newly evolving forms of record such as text messages, social media posts and new digital forms were difficult to upload without costly add-ons that didn’t necessarily capture everything

These weaknesses meant that:

  • End users avoided uploading records because it was extra work (uploading and then adding metadata)
  • The EDRMS contained only a percentage of all potential records stored in any location
  • The original copies of records remained in email and network file shares

There were exceptions to this situation, but most of them (very much a minority in terms of total volume) involved a requirement to meet compliance obligations to capture certain types of records.

The Office 365 model

Microsoft took a different approach to records management in Office 365.

Instead of centralising the storage of records in one system or location (with the weaknesses described above), records in the Office 365 environment generally remain in their original location (Exchange Online, SharePoint Online, OneDrive for Business, MS Teams), where they are covered by an overarching records management framework.

[Image: O365RMModel]
The Office 365 model for records management

What this means is that records can be stored in any of the above locations and managed in those locations through (among other things):

  • User types, licences and roles set in the Office 365 admin portal
  • Retention and other controls set in the Office 365 Security and Compliance admin portal/s (the two were split in early January 2020).

How the paradigm shifted

The paradigm has shifted from (a) an attempt to manage records in a single system where not everything is captured and originals remain in place in email and network file shares, to (b) the distributed management of records where originals remain in place (assuming SharePoint and OneDrive are used instead of network file shares and personal drives, and email remains in Exchange) and records are managed through ‘global’ settings.

The new paradigm does not exclude the ability to store (or aim to store) digital records in a single location – SharePoint Online (including for specific compliance reasons), but it provides the opportunity to manage records wherever they are and use a range of additional tools to manage content from creation through to disposal.

Why the new paradigm matters

The new paradigm is likely to be counter-intuitive to many records (and other information) managers. Records management training for many years has been focused on the idea of storing and managing records in a central location with specific controls (classification, metadata and retention).

But the reality is that there are now too many digital records, and too many types of digital records, to ever expect these to be all stored in an EDRMS. And, even if only some are, what about all the others? Has a legal subpoena ever been focused only on records stored in the EDRMS?

Plan to manage records

Many organisations have acquired and are implementing Office 365, sometimes at the expense of the traditional EDRMS. It doesn’t take long for end-users to adopt the new technology because it is so easy to use.

Any suggestion that specific records now need to be copied to the EDRMS seems to be counter-intuitive. And yet, that is how some records managers continue to see Office 365 – as yet another source of records to be uploaded to the EDRMS. It is not a viable plan.

Records managers need to be at the forefront of planning for Office 365, in particular managing content across the four primary workloads. Records managers should be able to provide advice on:

  • The architecture of SharePoint Online
  • Controls around the creation of sites, including naming conventions and the ongoing management of sites
  • The structure of SharePoint Online sites, document libraries and metadata in particular
  • The retention model for Exchange Online, SharePoint Online, OneDrive for Business, MS Teams. This includes understanding existing disaster recovery arrangements and potentially replacing them with retention policies.
  • Disposal actions
  • Other compliance obligations

Plan for change

Moving away from the centralised management of records in an EDRMS to a less visible (for end-users) decentralised model, or even implementing Office 365 without any other previous document and records management system, requires careful change management.

End users (and records managers) used to the idea of uploading records to a central EDRMS may find the new ‘invisible’ and decentralised model of recordkeeping unusually simple (to the point of disbelief).

Consequently, additional reassurance, training and awareness sessions may be required to demonstrate and confirm how the records are managed in the new environment. There is potential for some ‘push back’ as, although it requires very little end-user effort, it manages more records than ever before, including in ‘personal’ spaces such as mailboxes and personal drives.

IT will also need to be involved as disaster recovery processes, such as backing up email and network file shares, may no longer be required.

For end users who have never had to use an EDRMS, change management activities might focus more on improving awareness and knowledge about how records will be managed in the future, including in ‘personal’ spaces.

 

Posted in Classification, Compliance, Electronic records, Governance, Information Management, Legal, Office 365, Office 365 Groups, Products and applications, Records management, SharePoint Online, Training and education

AI curated chaos or control – the equally valid but opposite ends of the SharePoint spectrum

There are, broadly speaking, two ‘bookend’ options when it comes to creating new SharePoint Online sites and the document libraries in those sites:

  • ‘Controlled’ model: The creation of new sites is restricted to a small group of individuals with admin rights, who also oversee the creation of document libraries and application of metadata. A combination of controlled and manually applied classification and metadata and retention policies are used to access and manage content over time. Artificial intelligence (AI) tools can also be used to manage content.
  • ‘Chaos/uncontrolled’ model: The creation of new sites, including the creation of document libraries is not restricted. AI tools (including auto-classification) and auto-applied retention policies are used to classify, access and manage content over time. This model assumes that any form of random categorisation applied by end users (e.g., library names, metadata) is mostly ignored by AI tools.

From a traditional information governance and records management (ISO 15489/ISO 16175) point of view, the second ‘chaos’ or uncontrolled model option seems to run counter to conventional wisdom and agreed standards.

From a practical point of view, the first ‘control’ model option seems to run counter to common sense given the volume and range of digital information and the difficulty of classifying or categorising information and records correctly.

Which option is better?

Confusingly, perhaps, the answer may be a combination of both.

  • Certain types of more formal records, such as those required for corporate compliance, formal policies, staff files, accounting information not stored in a finance system, property information, and/or product information, are almost certainly going to be better off in controlled SharePoint sites with pre-defined libraries and metadata. These types of documents are more likely to be subject to records retention requirements and may well be subject to eDiscovery and legal holds.
  • Other types of less formal records, including ‘working’ documents, chats and conversations may be better off stored in uncontrolled SharePoint sites, including SharePoint sites linked with Office 365 Groups and Teams, and in MS Teams/Outlook. These types of records are less likely to be subject to records retention requirements but may be subject to eDiscovery and legal holds.

Ultimately, the way the organisation implements Office 365, including SharePoint Online, and applies retention policies and other options will depend on its need to comply with oversight and legal requirements (including minimum retention periods), and/or its tolerance for risk.

How does this work in Office 365/SharePoint Online?

Organisations need to make a conscious decision to allow both options, and be prepared to manage both.

The key features of Office 365 and SharePoint to allow both options are listed below:

  • Office 365 retention policies apply to all of Exchange Online, all OneDrive for Business accounts, entire sites (invisible to users) or parts of sites (visible to users).
  • Some retention policies may be applied based on the auto-classification of records, subject to review.
  • The creation of SharePoint sites is either controlled (requested and provisioned) or uncontrolled (created by end users) via either (a) ‘Create sites’ in the end-user SharePoint portal or (b) when a new Team is created in MS Teams.
  • All sites, including Office 365 Group/Team sites, are reviewed regularly for activity, and inactive sites with no content of value are deleted.
  • All controlled sites are assigned either an invisible retention policy or individual visible retention policies (with disposal review), depending on their content.
  • All uncontrolled sites are assigned an invisible retention policy. Uncontrolled and inactive sites with content are also made read only.
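
The site-level rules in the list above can be written out as a simple decision function. This is only a sketch of the logic: the Site class, the plan_for function and the returned labels are hypothetical, and ‘content of value’ is reduced to a yes/no flag that a person would set after review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str
    controlled: bool         # provisioned on request, rather than user-created
    active: bool             # recent activity found at the regular review
    has_valued_content: bool


@dataclass
class Plan:
    action: str
    retention: Optional[str]
    read_only: bool


def plan_for(site: Site) -> Plan:
    """Return a hypothetical plan for one site, following the listed rules."""
    if not site.active and not site.has_valued_content:
        return Plan(action="delete", retention=None, read_only=False)
    if site.controlled:
        # Site-wide invisible policy or visible policies with disposal
        # review; the choice depends on content and is not modelled here.
        return Plan("keep", "invisible or visible policy", read_only=False)
    # Uncontrolled sites: invisible policy, read only once inactive.
    return Plan("keep", "invisible policy", read_only=not site.active)


abandoned = plan_for(Site("Test123", False, False, False))
dormant = plan_for(Site("Project X", False, False, True))
formal = plan_for(Site("Policies", True, True, True))
```

Writing the rules down this way makes the gaps visible, for example what should happen to a controlled site that becomes inactive but still holds records.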

Features of controlled and uncontrolled SharePoint sites

SharePoint Online is quite different from older versions of the application and those who dismiss it based on previous experience should consider having another look as a lot has changed in the past couple of years.

SharePoint Online allows the creation of sites that contain important content that needs to be controlled or managed as records, as well as sites created and managed entirely by end-users. And, as an added bonus, all the content is stored in the one place, not in multiple locations (network drives, email servers, EDRM system, etc).

The elements that make up both types of sites, as well as ‘informational’ sites, are described below:

  • Controlled sites
    • Where the organisation’s official records are stored and managed.
    • Created by SharePoint Administrators.
    • More formal in nature, containing the official records.
    • Structure decided by business areas – for example, document libraries using agreed naming conventions.
    • Use of Content Types and site column or local library metadata to define the content.
    • Application of Office 365 retention policies to entire sites or individual document libraries, with disposal reviews. Auto-classification is less likely to be required as the content has already been structured as required.
  • Uncontrolled sites
    • Usually based on end-user created Office 365 Groups or MS Teams.
    • Where ‘working documents’ are created and managed, with the emphasis on allowing end-users to collaborate and communicate easily and effectively – and move content to formal sites when required.
    • Created by end-users but naming monitored by SharePoint administrators (or using rules).
    • Informal in nature, used for working documents (effectively replacing personal and network file shares, and other unapproved systems).
    • A fluid structure for document libraries, driven by end-user requirements (not imposed by others).
    • Little if any use of Content Types or metadata.
    • Retention based on Group activity (E5 licences), otherwise based on Office 365 site retention policies and/or auto-classification options.
    • No disposal reviews – content is deleted after a given period of time.
  • Informative
    • Communication sites (e.g., ‘intranet’)
    • Used to publish information to the organisation

Things to watch out for

It is largely true that if you give people an option, someone is bound to try it, sooner or later, especially if it says ‘Create site’, ‘Create team’, or ‘Create group’. Early adopters learn quickly and can just as quickly abandon something that provides no benefit. 

In a ‘free for all’ SharePoint environment, where end-users can create new sites, teams or groups (both of the latter have a SharePoint site), the most likely issues will include:

  • Sites with names that are very similar to ones that already exist, created because the end-user didn’t know another existed (it may not be obvious) or didn’t like the name.
  • Sites with names that make no sense (including common acronyms) or are just ‘wrong’ or contrary to preferred naming conventions.
  • Sites used to create and store content that really should be stored in a more formal site or, conversely, doesn’t belong in the organisation’s official information systems (e.g., photos of someone’s wedding).

All of these issues require some general rules about the creation of new sites (or Office 365 Groups or Teams or Yammer Groups), including suggested naming.

Global and SharePoint admins can monitor the environment and fix issues when they arise rather than wielding a big stick.

What’s great about it

You can have the best of both worlds with SharePoint Online.

  • Keep formal official records in ‘formal’ sites with controlled structures and metadata.
  • Allow end-users to get on with creating, collaborating, sharing (one copy, not attachments), chatting, on any device.

If your communications and change management are good, end-users will soon learn how much fun it can be to use Teams, or access their content from File Explorer (or both!), without having to be trained how to save records. All they need to know is how to use the ‘Move’ option to move the final version of records to a formal site.

The foundation of any compliance program is knowing where all of your data lives and then classifying, labeling, and governing it appropriately.

Posted in Compliance, Electronic records, Governance, Information Management, Office 365, Products and applications, Records management, Retention and disposal, SharePoint Online

Planning for records retention in Office 365

Almost ten years ago, in January 2010, I published a post here about the origins of the statute of limitations. The post had the following introduction:

The retention – and eventual disposal – of records is a common business practice, despite occasional concerns about what gets destroyed. Justice Scalia, in Arthur Andersen LLP v United States (No. 04-368, 2004) said as much about the destruction of records relating to Enron by Arthur Andersen:

‘… we all know that what are euphemistically termed “record-retention programs” are, in fact, record-destruction programs, and that one of the purposes of the destruction is to eliminate from the files information that private individuals can use for lawsuits and that Government investigators can use for investigations.’

Almost ten years on, it still seems an appropriate way to introduce a post about the retention of records, this time in relation to records stored in Office 365.

Bottom line – you need to plan for it, and make sure your legal team is consulted.

Blatant plug for a great book

Before you read further, I recommend you have a look at the comprehensive e-book ‘Office 365 for IT Pros’. This 1000+ page e-book includes, in Chapter 19 – Office 365 Data Governance, a comprehensive description of how to create, apply and manage retention policies in Office 365.

What do we mean by retention?

The retention of records is generally based on business, legal or regulatory requirements to keep certain records for a minimum period of time. In the case of government records, there may also be an archival requirement.

The retention period may relate to or be based on a statute of limitations that governs when potential legal actions expire. For example, simple contracts generally need to be kept for a minimum of six or seven years (depending on jurisdiction) after they expire. More complex contracts (including deeds) may need to be kept for much longer.

Records that need to be retained should not be deleted and must remain accessible for the period of time during which the integrity, authenticity and reliability of – and often the context for – the retained records must remain inviolate.

Retention is not IT backup or (or for) disaster recovery

Retention management is not about IT backups or ‘archiving’, or disaster recovery programs. These activities are focused more on the ability to recover data and records.

On-premise to online – a different paradigm

Many organisations have (or should have) records retention schedules, also known as disposal or disposition authorities. Records retention schedules and authorities define what needs to be kept and for how long.

Most records are managed in similar ways:

  • As paper (usually printed from digital records), stored in files and/or boxes. These records may be tracked in a database.
  • As digital records, uploaded to a third-party electronic document management system (EDMS), while leaving the original records stored in Exchange mailboxes or File Explorer.
  • As entire business systems (with little thought given to individual records).

[Image: Goonellabah_2.jpg]
An example of old-style paper file storage. These records are around 20 years old and are well overdue for disposal.

Into the on-premise mix:

  • MRM policies may be enabled on Exchange mailboxes, allowing end users to apply retention tags to emails. An archiving policy may be in place as well.
  • Individual user mailboxes and ‘home drives’ may be retained for a period of time after the user account is deactivated.
  • There is a backup regime.

Many of the options above will not, or may not, exist (at least in the same way) in Office 365.

On the other hand, Office 365 now includes a range of records retention options that can be used to better manage retention.

Why you need a plan for retention in Office 365

Implementing Office 365 retention policies without a good plan for the transition from the on-premise environment risks failure, confusion, uncertainty, and legal exposure.

To quote from page 882 of Office 365 for IT Pros:

‘… it is wise to take time to chart out how retention will work across the tenant for all workloads before you create any policies. Fools rush to implement retention without thought!’

A good starting point is to contact the records management team and get a copy of the organisation’s records retention schedules or authorities to understand what needs to be kept and for how long and also – importantly – where the records are currently stored.

Retention options in Office 365

In simple terms there are two types of retention that can be applied to records in Office 365. The following paraphrases parts of chapter 19 of the book Office 365 for IT Pros.

Explicit (visible) retention policies

This option involves (a) creating retention labels that define a retention period (and whether a disposition review is required), (b) publishing the labels as retention policies to specific Office 365 workloads, and (c) applying them manually (including via PowerShell scripts) to content that needs to be retained.

Retention labels published as explicit retention policies can be applied (including automatically, in certain circumstances and/or with an E5 licence) to the following workloads:

  • Exchange email (all/select)
  • SharePoint Online (SPO) sites (all/select)
  • OneDrive for Business (ODfB) accounts (all/select)
  • Office 365 (O365) groups (all/select)

[Image: Explicit retention policy]

Implicit (invisible) retention policies

This option involves (a) creating retention policies (that do not include a disposition review) and then (b) applying them to all or specific Office 365 workloads.

Implicit retention policies can be applied to:

  • Exchange email (all/select)
  • SPO sites, including O365 Group sites (all/select)
  • ODfB accounts (all/select)
  • O365 groups (all/select)
  • Skype for Business (specific users)
  • Exchange public folders (all)
  • Teams channel messages (specific teams)
  • Teams chats (specific users)

The first four workloads are the same for both types. Which one should you choose – explicit or implicit?

[Image: Implicit retention policy]

Keep in mind that explicit policies take priority over implicit policies.

Retention policy limits

Both explicit and implicit retention policies have specific limits. You can create:

  • Up to 10 organisation-wide retention policies. For example, all mailboxes, all OneDrive accounts, all SharePoint sites, all Office 365 Groups. 
  • Up to 1000 narrow/specific retention policies. Each of these can point to up to 100 sites, 100 ODfB/O365 group accounts, and 1000 mailboxes. 

These limits, and the difference between explicit and implicit, show why planning for retention is essential.
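
To show how the limits above might feed into planning, here is a minimal sketch that checks a proposed set of policies against them. The numbers are simply those quoted in this post and may change over time. The function name, the dictionary keys and the shape of the input are all hypothetical; nothing here calls Office 365.

```python
from typing import Dict, List

# Limits as quoted in this post (assumed current; verify before relying on them).
MAX_ORG_WIDE_POLICIES = 10
MAX_SPECIFIC_POLICIES = 1000
MAX_TARGETS_PER_POLICY = {
    "sites": 100,
    "odfb_or_group_accounts": 100,
    "mailboxes": 1000,
}


def check_limits(org_wide_count: int,
                 specific_policies: List[Dict[str, int]]) -> List[str]:
    """Return a list of problems; an empty list means the plan fits the limits.

    Each item in specific_policies maps a target kind to the number of
    locations that one narrow policy points to.
    """
    problems = []
    if org_wide_count > MAX_ORG_WIDE_POLICIES:
        problems.append(
            f"{org_wide_count} organisation-wide policies exceeds the limit "
            f"of {MAX_ORG_WIDE_POLICIES}")
    if len(specific_policies) > MAX_SPECIFIC_POLICIES:
        problems.append(
            f"{len(specific_policies)} specific policies exceeds the limit "
            f"of {MAX_SPECIFIC_POLICIES}")
    for index, targets in enumerate(specific_policies):
        for kind, limit in MAX_TARGETS_PER_POLICY.items():
            count = targets.get(kind, 0)
            if count > limit:
                problems.append(
                    f"policy {index}: {count} {kind} exceeds the limit of {limit}")
    return problems


# One narrow policy aimed at 250 project sites will not fit in a single policy.
print(check_limits(1, [{"sites": 250}]))
```

A check like this makes the trade-off visible early: 250 sites would need at least three narrow policies, which is one reason to decide on broad versus narrow scope before creating anything.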

Questions you might want to ask

Some questions to consider as part of the planning process:

  • For Exchange mailboxes, is it better to have (a) a single implicit policy to keep all mailboxes for x years, or (b) a single implicit policy that targets only certain mailboxes (e.g., senior managers only), or (c) multiple explicit policies that end-users can apply? What do you do now with Exchange on-premise? Do you journal emails? How will you do that if you go completely online? Could a single implicit policy achieve the same outcome as backing up mailboxes?
  • For OneDrive accounts, is it better to (a) have a single implicit policy to keep all ODfB accounts for x years, or (b) rely on the ODfB admin storage setting to keep the content after an account becomes inactive (default is 30 days), or (c) have an explicit policy that end users can apply themselves?
  • For SharePoint sites, is it better to (a) have a single implicit policy to keep all SPO sites for x years, or (b) have a single implicit policy applied to up to 100 sites at a time, or (c) create multiple explicit policies that end-users can apply?
  • For Office 365 Groups, is it better to (a) have a single implicit policy, or (b) focus instead on MS Teams channel retention and/or (c) a retention policy for the associated SharePoint sites? Do you have Azure AD Premium, where Group expiry can be implemented? If yes, should you enable or disable it?

And for all of the above – how will you keep track of what has been applied where?

Recommendations

My recommendations would be:

  • Exchange mailboxes. If the organisation keeps backups of on-premise mailboxes so that these can be recovered after a period of time, remove the default MRM policies and create a single organisation-wide implicit retention policy.
  • OneDrive for Business. For similar reasons to the Exchange mailboxes, create a single organisation-wide implicit retention policy to retain content in user accounts for a given period of time (say, 7 years). Also change the default storage period from 30 days to the same period of time.
  • SharePoint Online. My sense is that a single implicit retention policy for all SharePoint sites is unwise. Retention options should be considered when sites are created. For example, try not to mix (or allow users to mix) content that may have different retention requirements in the same document library – aim to apply retention at the highest level of ‘aggregation’ – site or library. Create labels for and publish a small number of specific explicit retention policies (mapped to the organisation’s records retention schedule). Create one or more implicit policies for specific groups of sites (for example, all inactive project sites). Whatever model you implement, ensure you know whether you need to record what was destroyed, including unique metadata from document libraries.
  • Office 365 Groups. The SPO part of Groups can be covered by the SPO retention policy, the email by the Exchange mailbox policy. Consider whether you also need to create Teams chat or channel retention policies.

What happens at the end of the retention period?

So far this post has raised questions about the type of policy you might apply to different workloads.

But what happens when the retention period ends? Should you allow records to be deleted without any kind of disposition review, or check first? This single point may be a key factor in your decision around what type of policy to create and implement, and where.

Some questions to ask:

  • Is it appropriate in your environment for end users to destroy business records by applying a policy?
  • Should someone review the content before it is disposed of?
  • Do you need to keep the metadata associated with content stored in SharePoint document libraries? If yes, where is it going to be stored?
  • Do you need a permanent record of what was destroyed?
  • What do you do if you dispose of something that should not be disposed of?

The answers to these questions may differ depending on the workload to which the retention policy has been applied and whether the retention policy is explicit or implicit.

What does a retention plan look like?

Pages 880 and 881 of Office 365 for IT Pros have an excellent model plan for retention in Office 365. The following (slightly edited) points should be documented for every retention policy that is created and published.

  • Name: It seems obvious, but naming can be important especially for explicit policies. Consider, for explicit policies, adding the retention period to the name, e.g., ‘Company Financial Records – 7 years’.
  • Purpose: What is the purpose of the policy?
  • Retention settings: How long should the content be kept for? Should this be based on when it was created, modified, or when the label was applied? Should there be a disposition review? Who will review the content when it is due for disposition?
  • File Plan: Explicit retention policies can be mapped to a file plan and thereby linked to the retention schedules. If they are, what part of the File Plan should the policy map to?
  • Type: Will the policy be implicit (invisible to users) or explicit (visible to users)?
  • Scope: What Office 365 workloads will this policy cover?
  • Broad or Narrow: Will this policy be applied across an entire workload or to specific mailboxes, sites, accounts? If an explicit policy uses retention labels, what are those labels?
  • End of retention: What should happen when the retention period expires? For example, does the metadata need to be kept, should a record be kept of what was destroyed?
  • Lock (optional): Is the content that comes under the scope of the policy considered to be a formal record for the company and if so, is a preservation lock needed? (This option requires additional PowerShell work)
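
One way to keep these points consistent across policies, and to help answer the earlier question of what has been applied where, is to hold each plan entry as a structured record. The sketch below is a minimal Python illustration and is not taken from the book. The class name, field names and the gaps() check are all hypothetical, and the checks only reflect the distinctions described in this post (implicit policies have no disposition review, explicit policies rely on labels).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class PolicyType(Enum):
    IMPLICIT = "implicit"  # invisible to users
    EXPLICIT = "explicit"  # visible to users


class RetentionTrigger(Enum):
    CREATED = "created"
    MODIFIED = "modified"
    LABELLED = "labelled"


@dataclass
class RetentionPlanEntry:
    """One documented retention policy (hypothetical structure)."""
    name: str
    purpose: str
    retention_years: int
    trigger: RetentionTrigger
    policy_type: PolicyType
    workloads: List[str]          # scope, e.g. ["SharePoint Online"]
    broad: bool                   # True = whole workload, False = specific targets
    end_of_retention: str         # what happens when the period expires
    disposition_review: bool = False
    reviewer: Optional[str] = None
    file_plan_reference: Optional[str] = None
    labels: List[str] = field(default_factory=list)
    preservation_lock: bool = False

    def gaps(self) -> List[str]:
        """List the planning questions this entry leaves unanswered."""
        problems = []
        if self.disposition_review and not self.reviewer:
            problems.append("disposition review requested but no reviewer named")
        if self.policy_type is PolicyType.IMPLICIT and self.disposition_review:
            problems.append("implicit policies do not include a disposition review")
        if self.policy_type is PolicyType.EXPLICIT and not self.labels:
            problems.append("explicit policy has no retention labels")
        if not self.workloads:
            problems.append("no workloads in scope")
        return problems


entry = RetentionPlanEntry(
    name="Company Financial Records – 7 years",
    purpose="Keep financial records for the statutory period",
    retention_years=7,
    trigger=RetentionTrigger.CREATED,
    policy_type=PolicyType.EXPLICIT,
    workloads=["SharePoint Online"],
    broad=False,
    end_of_retention="Disposition review, then keep a record of what was destroyed",
    disposition_review=True,
    reviewer="Records manager",
    labels=["Company Financial Records – 7 years"],
)
print(entry.gaps())  # [] when every question has an answer
```

A list of such entries, kept alongside the retention schedule, doubles as the register of which policy covers which workload.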

Joint planning is a must

Your organisation is almost certainly going to have business, legal or regulatory requirements for keeping records.

Records managers know how to interpret and apply these to records, and what to do when records reach the end of their retention period. It’s a good idea to consult with these experts, and with your legal team.

It would be a very brave IT shop that unilaterally applies retention policies to records stored in Office 365 without reference to or consultation with records managers, legal teams, or records retention schedules.