Most people should be aware that pressing the ‘delete’ option for a file stored on a computer doesn’t actually delete the item; it only makes the file ‘invisible’. The actual file is still on the disk and can be retrieved, either relatively easily or with forensic tools, until the space it was stored on is overwritten.
Traditional legacy electronic document and records management (EDRM) systems have two components:
A database (e.g., SQL, Oracle) where the metadata about the records are stored
A linked file share where the actual objects are stored, most of which are copies of emails or network file share files that remain in their original location.
In most on-premise systems, email mailboxes, network file shares, and the EDRMS database and linked file share are likely to be backed up.
When a digital record comes to the end of its retention and is subject to a ‘destruction’ process, how do you know if the record has actually been destroyed? And even if it has, how can you be sure that the original isn’t still stored in a mailbox, network file share, or backup?
This post examines what actually happens when a file is ‘deleted’ from a Windows NT File System (NTFS), and questions whether digital records stored in an EDRMS are really destroyed at the end of the retention period.
The Windows NTFS Master File Table (MFT)
Details of every file stored on a computer drive will be found in the NTFS Master File Table (MFT).
In some ways, the MFT operates like a traditional electronic document management system: it is a kind of database that records metadata about the attributes of the digital objects stored on the drive. These attributes include the following:
As noted in the diagram above, the details stored by the MFT include the $File_Name and $Data attributes.
The $File_Name attributes include the actual name of the file as well as when it was created and modified, and its size. This is the information that can be seen via File Explorer and is often copied to the EDRMS metadata.
The $Data attribute contains details of where the actual data in the file is stored on the disk (in 0s and 1s) or the complete data if the file is small enough to fit in the MFT record.
If the MFT record has many attributes or the file data is stored in multiple fragments on a disk (for example as a file is being edited), additional MFT ‘extension’ records may be created.
When a file is deleted, the MFT records the deletion.
If the file is simply deleted, the record will remain on the disk and can be recovered from the Recycle Bin.
If the file is deleted through SHIFT-DEL or by emptying the Recycle Bin, the MFT record is marked as ‘Deleted’ and the cluster bitmap is updated to show the file’s clusters (where the data is stored) as free for reuse. The MFT record remains until it is re-used or the data clusters are allocated in whole or part to another file.
So, in summary, ‘deleting’ a file does not actually delete it. It may either:
Store the file in the Recycle Bin, making it relatively easy to recover, or
Change the MFT record to show the file as being deleted but leave the file data on the disk until it is overwritten.
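The behaviour summarised above can be illustrated with a toy model. This is only a sketch: the class and method names (ToyVolume, MftRecord, recover) are invented for illustration and bear no relation to the real NTFS on-disk structures. It shows that ‘deleting’ only flips a flag and frees the clusters in the bitmap, while the data stays readable until another file overwrites it.

```python
from dataclasses import dataclass, field


@dataclass
class MftRecord:
    """Simplified stand-in for an MFT record (not the real on-disk layout)."""
    file_name: str
    clusters: list          # indexes of the clusters holding the file data
    in_use: bool = True     # False plays the role of the 'Deleted' state


@dataclass
class ToyVolume:
    """Toy volume: cluster contents, a cluster bitmap and a list of records."""
    cluster_count: int = 8
    clusters: list = field(init=False)
    bitmap: list = field(init=False)    # True = allocated, False = free
    mft: list = field(init=False)

    def __post_init__(self):
        self.clusters = [b""] * self.cluster_count
        self.bitmap = [False] * self.cluster_count
        self.mft = []

    def write_file(self, name, chunks):
        """Store each chunk in the first free clusters and add an MFT record."""
        free = [i for i, used in enumerate(self.bitmap) if not used]
        if len(free) < len(chunks):
            raise OSError("volume full")
        used = free[:len(chunks)]
        for index, chunk in zip(used, chunks):
            self.clusters[index] = chunk
            self.bitmap[index] = True
        record = MftRecord(name, used)
        self.mft.append(record)
        return record

    def delete_file(self, record):
        """Mark the record as deleted and free its clusters; data is untouched."""
        record.in_use = False
        for index in record.clusters:
            self.bitmap[index] = False

    def recover(self, record):
        """Read back whatever currently sits in the record's old clusters."""
        return b"".join(self.clusters[i] for i in record.clusters)
```

In this model, recover() returns the full original content straight after delete_file(), and only returns damaged content once a later write_file() call has reused one of the freed clusters.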
How does an EDRMS store and manage files?
The following summary relates to a well-known Electronic Document and Records Management System (EDRMS). Other systems may work differently but the point is that records managers should understand exactly how they work and what happens when electronic files are destroyed at the end of a retention period.
Most EDRM systems are made up of two parts:
A database (SQL, Oracle etc) to store the metadata about the record.
An attached file store that stores the actual digital objects.
When EDRM systems are used to register paper or physical records (files and boxes), only the database is used.
When digital records are uploaded to the EDRMS:
The metadata in the original file, including the file type, original file name, date created, date modified and author are ‘captured’ by the system and recorded in the new database record.
Additional metadata may be added, including a content or record ‘type’.
The record will usually be associated with a ‘container’ (e.g., ‘file’). This containment makes the record appear to be ‘contained’ within that container, whereas in fact it is simply a metadata record of an object stored elsewhere.
The original record filename is changed to random characters (to make it harder to find, in theory) and then stored on the attached (usually Windows NTFS) file store, often in a series of folders.
A link is made between the database record and the record object stored in the file store (the MFT record).
When the end-user opens the EDRMS, they can search for or navigate to containers/files and see what appears to be the digital objects ‘stored’ in that container/file. In reality, they are seeing a link to the object stored (randomly) in the file store.
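The upload steps described above can be sketched in a few lines. This is a hypothetical illustration, not how any particular EDRMS product is implemented: the table layout, the function names (create_register, register_record) and the two-character subfolder scheme are all assumptions. It uses SQLite as a stand-in for the SQL or Oracle database and an ordinary folder as the file store.

```python
import shutil
import sqlite3
import uuid
from pathlib import Path


def create_register(db_path=":memory:"):
    """Create the (hypothetical) metadata database with a single table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        " id INTEGER PRIMARY KEY,"
        " original_name TEXT, container TEXT, record_type TEXT,"
        " modified REAL, stored_path TEXT)"
    )
    return conn


def register_record(conn, source, store, container, record_type="Document"):
    """Copy a file into the store under a random name and record its metadata."""
    source = Path(source)
    stored_name = uuid.uuid4().hex + source.suffix    # random file name
    subfolder = Path(store) / stored_name[:2]         # spread across folders
    subfolder.mkdir(parents=True, exist_ok=True)
    stored_path = subfolder / stored_name
    shutil.copy2(source, stored_path)                 # the original stays put
    cursor = conn.execute(
        "INSERT INTO records"
        " (original_name, container, record_type, modified, stored_path)"
        " VALUES (?, ?, ?, ?, ?)",
        (source.name, container, record_type,
         source.stat().st_mtime, str(stored_path)),
    )
    conn.commit()
    return cursor.lastrowid                           # the database record ID
```

Note that the ‘container’ is nothing more than a value in a database column, and that the source file is copied rather than moved, which is why the original can remain in a mailbox or network file share.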
What happens when an EDRMS record is destroyed?
If there is no requirement to extend their retention, or keep them on a legal hold, records may be destroyed at the conclusion of a retention period.
For physical records, this usually means destroying the physical objects so they cannot be recovered, a process that may include bulk shredding or pulping.
For digital records, however, there may be less certainty about the outcome of the destruction. While the EDRMS may flag the record as being ‘destroyed’, it is not always clear whether the destruction process has actually overwritten the digital record in a way that ensures its destruction to the same level as destroyed paper files.
Also:
If the original associated NTFS file share becomes full and a new one is used, the original is likely to be made read only.
There is likely to be a backup of the EDRMS.
The original records uploaded to the EDRMS probably continue to exist on network file shares, in email, or on backup tapes.
Digital forensics can be used to recover ‘deleted’ files from the associated file share.
Consider this scenario:
An email containing evidence of something is saved to a container in an EDRMS.
The container of records is ‘destroyed’ after the retention period expires.
A legal case arises after the container is ‘destroyed’.
A subpoena is made for all records, including those specific records.
Has the record actually been destroyed, or could it still be recoverable, including from backups or the digital originals?
Is it really possible to destroy digital records, and does it matter?
Yes, records can be destroyed by overwriting the cluster where the record is kept, and some EDRM systems may offer this option.
But:
Do EDRM systems overwrite the cluster when a digital record is destroyed in line with your records retention and disposal authorities, or simply mark the record as being deleted, when it is still technically recoverable?
Could the record still exist in the network file shares or email, or in backups of these or the EDRMS?
Might it be possible to recover the record with digital forensics tools?
Does it matter?
It might be worth asking IT and your EDRMS vendor.
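To make the difference between ‘marking as deleted’ and ‘overwriting’ concrete, here is a minimal sketch of the overwrite-then-delete idea. The function names are invented for illustration and this is not how any EDRMS is known to work. It is also a best-effort approach only: on SSDs, copy-on-write or journaling file systems, and for very small files held inside the MFT record, the new bytes may not land on the same physical location as the old ones, and it does nothing about copies in backups, mailboxes or file shares.

```python
import os
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # write in 1 MB pieces to limit memory use


def overwrite_in_place(path, passes=1):
    """Overwrite the file's existing bytes with random data, keeping its size."""
    path = Path(path)
    size = path.stat().st_size
    with open(path, "r+b") as handle:       # open for update without truncating
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                chunk = min(remaining, CHUNK_SIZE)
                handle.write(os.urandom(chunk))
                remaining -= chunk
            handle.flush()
            os.fsync(handle.fileno())        # ask the OS to push data to disk


def overwrite_and_delete(path, passes=1):
    """Overwrite the content first, then remove the file entry."""
    overwrite_in_place(path, passes)
    Path(path).unlink()
```

Whether a destruction process does something like this, or simply removes the database record and the file entry, is exactly the kind of question to put to IT and the vendor.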
The SharePoint Online admin centre contains a number of configuration options and settings. Most of these settings relate to the administration of SharePoint as a service and are not described further unless they relate to the management of records.
Active sites
The section named ‘Active sites’ lists all active sites, including details of storage used and when it was last modified. The list can be exported as a csv file.
The records management team should have a retention plan for every SharePoint site, including Office 365 Group-based sites and communication sites. The SharePoint Admin and the Records Manager/s should review the list from time to time to check where content is stored and whether any sites could potentially be deleted.
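As a rough illustration of that periodic review, the exported csv file could be checked for sites that have not been modified for some time. This is a sketch only: the column names ‘URL’ and ‘Last modified’, the date format and the function name stale_sites are assumptions, so check them against the headings in your own export before using anything like this.

```python
import csv
import io
from datetime import date, datetime, timedelta


def stale_sites(csv_text, today, max_idle_days=365):
    """Return the URLs of sites not modified within the last max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed column name and date format (YYYY-MM-DD).
        last_modified = datetime.strptime(row["Last modified"], "%Y-%m-%d").date()
        if last_modified < cutoff:
            stale.append(row["URL"])
    return stale
```

The output is only a list of candidates for discussion; whether a quiet site can be deleted still depends on the retention plan for that site.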
Creating new sites
As noted in the screenshot above, the SharePoint Admin can create a new site directly from this portal, or it may be scripted.
Organisations that are new to Office 365, and especially larger organisations that want to manage corporate records in SharePoint, might consider restricting – at least initially – the ability for end users to create new SharePoint sites, as well as new Teams in MS Teams and Groups in Outlook, which also create SharePoint sites via the Office 365 Group.
If there is no control over the creation (at least initially) of SharePoint sites, the number of sites could grow exponentially with no regard to corporate recordkeeping requirements. Sites holding important records may be abandoned or forgotten, or be invisible to people who need to see them.
As soon as there is sufficient critical mass in terms of SharePoint sites for business areas, and training and awareness for end users, these controls may be loosened.
There are three options to create new sites from this portal:
Team site. These create an Office 365 Group with Members who become the Members (add/edit) of the SharePoint site. It is recommended that an Office 365 Group is created first to ensure consistency in Group naming and controls. These types of sites, with a Team in MS Teams, may work better for smaller business units or project teams with fewer than 30 staff. They are also more likely to contain ‘working documents’ or have content (including the connected mailbox) that can be covered by a single retention policy.
Communication site.
Other options (sites). The options here are Team Sites, Document Center, Enterprise Wiki, Publishing Site. Team sites created here are best for large departmental or divisional sites where access can be controlled through AD Security Groups. These types of sites are more likely to last for several years, contain formal, final versions of records stored in controlled and well-named document libraries, and be subject to more than one retention policy (including both site and library policies).
All new sites must be provisioned, which is described further below.
Admins
The SharePoint admin can only assign, from the admin portal, Site (Collection) Administrator permissions for individual SPO sites. Site Owners, Members and Visitors are assigned in the individual sites once they are created.
Generally speaking, Site Owners should work in the business unit that ‘owns’ the SharePoint site. Site Owners should not be the head of the business unit unless they are prepared to manage the SharePoint site.
Site Administrators are the Site Collection Administrators found in that section of the permissions ribbon menu for the site, under ‘Advanced permissions settings’.
Generally speaking:
All SharePoint Admins should be Site Collection Admins
Site Collection Admins should be grouped in a Security Group (so each site doesn’t have to be modified every time, only the SG)
If the SharePoint Admin is not listed in the Site Collection Admin group (including via the recommended SG), they may get ‘access denied’ if they try to open the site directly. They can, however, still see the site and modify the admins from the SP Admin portal.
The Primary Admin is by default ‘Company Administrator’. It is good practice to: (a) create a single SG for SharePoint Admins, and (b) remove Company Administrator as it doesn’t really need to be there – GAs can access the SP Admin portal anyway.
It is recommended that a key or senior records or information manager be added to the Site Collection Administrator Security Group that is added to all SharePoint sites, to provide access to all of the content if required. This access can be removed on a case by case basis if there are concerns about the security of the content in those sites.
External Sharing
External sharing is disabled by default for each site, even if it is enabled globally. A decision must be made for each site to allow external sharing.
External sharing allows records to be shared directly with external parties, rather than being attached to emails. This provides better security for those records, as the ability to prevent the record from being downloaded can also be applied.
Hub sites (or sub-sites?)
Hub sites (top level and ‘subsidiary’ sites) are effectively the replacement for sub-sites in SharePoint. See below regarding the architecture of SharePoint sites.
More features – Records Management
The SharePoint admin portal has a ‘classic’ setting under ‘More features’ called ‘Records Management’. This is not what it appears to be – it is in fact a way to set up ‘send to connections’ to ‘send’ (actually copy) content to a Records Center.
There are a number of problems with this (one of which is that it copies the most recent version and re-creates it in a new library) and it is not recommended for the management of records.
OneDrive Admin
The OneDrive Admin portal includes a ‘Storage’ section that defines how much storage users will get as well as a setting for how long the content will be retained.
Records managers should be involved in discussions around the retention of OneDrive for Business content both while the account is active (via an Office 365 retention policy) and after the account is de-activated (via the setting here).
There are only two options after the tenant name element:
/teams/. This would normally be used for logical organisational business units and projects. It includes team sites created for Office 365 Groups.
/sites/. This would normally be used for cross organisational business units or subject areas and communication sites.
Site name and naming conventions
The site name comes after the URL path option (teams or sites).
The URL name for the site should have no spaces (otherwise these are changed to ‘%20’), and should be limited to 14 characters. Common or not obvious acronyms should always be avoided. For example, the name ‘Business Development Management’ should not be ‘BDM’ but could be abbreviated to ‘BusDevMgt’.
The display name for the site can have spaces, added back after the site is created. For example, ‘Business Development Management’.
The URL name and the site display name should always bear some similarity to each other.
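The naming rules above could be checked with a small helper before a site is created. This is a sketch under stated assumptions: the function name check_site_url_name is invented, and the ‘bare acronym’ test (all capitals, four characters or fewer) is only a rough heuristic for names like ‘BDM’, not a rule from SharePoint itself.

```python
MAX_URL_NAME_LENGTH = 14  # the limit suggested in this post


def check_site_url_name(url_name):
    """Return a list of problems with a proposed site URL name (empty if none)."""
    problems = []
    if " " in url_name:
        problems.append("contains spaces")
    if len(url_name) > MAX_URL_NAME_LENGTH:
        problems.append(f"longer than {MAX_URL_NAME_LENGTH} characters")
    # Heuristic only: short, all-capital names are probably acronyms.
    if url_name.isupper() and len(url_name) <= 4:
        problems.append("looks like a bare acronym")
    return problems
```

For example, ‘BusDevMgt’ passes all three checks, ‘BDM’ is flagged as a likely acronym, and ‘Business Development Management’ is flagged for both spaces and length.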
Type of site
Three types can be created (as noted in the previous post):
Modern. These are directly associated with an Office 365 Group which can be linked to a Team in MS Teams.
Communication. These are the replacement for sites that were created using the ‘Publishing’ features. Generally speaking these sites store fewer records and contain or display information of an ‘informative’ nature, like the intranet.
Standard. This option provides the ability to create several types of sites, but the most common to use here will be a standard team site.
The type of site may be influenced by the type of content and records stored in it. Generally speaking:
Higher level business units (department, division, with more than 30 staff) may be better as a standard site where formal records are stored. Membership of these sites can be via AD Security Groups.
Lower level business units, often with fewer than 30 staff, may work better as modern team sites based on Office 365 Groups where all the members of the business units are Members of the Office 365 Group (and Team), rather than using AD Security Groups. These sites are more likely to contain content that is subject to change or be considered ‘working documents’.
Additional standard sites may be created ‘between’ these two levels, to meet the requirements of the organisational business unit hierarchy. The number of people in each business unit, and their need to collaborate via MS Teams, may provide a guide to the best type of site.
Keep in mind however that AD Security Groups are usually maintained by IT, while Office 365 Groups (that provide the same type of access control as SGs) are maintained by the O365 Group Owner.
Higher level standard team sites can have hyperlinks to lower level business units on the left hand navigation or in a links web part on the page. Sub-sites (except perhaps to control access to a smaller group, such as ‘Leadership’ or ‘Management’ of larger groups) should be avoided.
New site provisioning
All new sites need to be provisioned before they are made available.
This usually means doing the following after they are created, noting that SPO sites that are created from Office 365 Groups may have fewer options initially in the ‘Site Settings’ area.
Remove any left hand navigation options that are not required or could cause confusion via the ‘Edit’ menu. For example, consider removing ‘Notebook’ and ‘Pages’. Suggest you don’t remove Site Content or Recycle Bin unless really necessary – the end user can still access these via the cog/gear icon on the top right.
Remove or add any web parts on the main page. To do this, click on Edit and use the web part menus that appear on the left of each.
Users and Permissions. For standard sites (non-Office 365 Group-based), go to Site Information – View all Site Settings, and click on ‘Site permissions’. This is where you add Site Owners, Members and Visitors. You can also add Site Collection Administrators here, if you forgot to do it from the admin portal. For O365 Group-based sites, click on Site Permissions from the cog/gear icon. You will see that the O365 Group Owners are now the Site Owners, and the O365 Group Members are the site Members. You can add anyone else via the ‘Invite people’ option, which lets you invite them to the Group or just to the site. You can get to the same settings as for normal sites by clicking on ‘Advanced permission settings’.
Look and feel. For normal sites, click here to change the display title and logo. You won’t normally change anything else. On O365 Group-based sites you can change the display name from the cog icon menu – Site Information.
Web Designer Galleries (both types). This is where you set up Site columns and Site content types. There are other options in normal sites.
Site Collection Administration (do this before Site Actions). The only two things you will modify here are the Site collection features and the Document ID settings (when they appear, after enabling a feature). The features to enable are:
Document ID Service, Document Sets (optional), Open Documents in Client Application by Default, Reporting, SharePoint Server Enterprise Site Collection features, SharePoint Service Standard Site Collection Features, Three-state workflow (optional), Video and Rich Media (optional), Workflows (optional if you use SharePoint Designer to create workflows).
Once you have enabled Document IDs, you will see that option in the settings; open this to change the default DocID prefix to be the same as the URL of the site or something very similar, to ensure it is unique (12 characters only). Document IDs are PREFIX-LibraryNumber-DocumentNumber. DO NOT enable the Site Publishing features; this is an old option that was for on-prem sites such as the intranet. It has been replaced by communication sites.
Site Actions. Click on Manage site features and enable the following (you will see that a couple are already enabled):
SharePoint Service Enterprise site features; SharePoint Service Standard Site Features.
Site Administration. Click on Regional settings to ensure the settings there are correct; they usually need to be changed for O365 Group-based sites.
Search. There is usually no need to change anything here.
Site libraries and metadata
Every new SharePoint site has a default ‘Documents’ library.
Except perhaps for lower-level sites where the content is mostly working documents of little value, this library should be hidden or deleted and replaced with document libraries that have (a) more appropriate names, (b) metadata and if required, (c) either folders or document sets.
Document library names
As with site URL and display names, every document library has both a URL name and a display name. The URL name is the name given when the library is first created (from the gear/cog icon – ‘Add an app’).
The URL name should ideally have fewer than 20 characters and no spaces (each space will be replaced by the three characters ‘%20’). Consider including the year in the URL name instead of creating folders for each year, especially on team sites that may continue for many years (with a new library for each year). Separate year-based libraries may be easier to manage when it comes to retention, and they won’t accumulate an excessive number of items after several years, which would make it harder to isolate content. The URL name should also not repeat elements already in the site name, for example: https://tenantname.sharepoint.com/teams/tenantnamemeetings/
The display name can be changed via the Library Settings after it is created. The display name should bear some similarity to the library’s URL name. If the library display name needs to change, consider creating a new library with the new name and moving the content to that new library.
Standard and added metadata (columns)
Every new SharePoint site comes with a very rich set of metadata columns that can be added to each document library.
The standard set of columns available on every library are listed below:
Type (icon linked to document)
Name
Modified
Modified By
Label applied by
Label setting
Retention label
Retention label Applied
Sensitivity
App Created By
App Modified By
Check In Comment
Checked Out To
Comment count
Compliance Asset Id
Content Type
Copy Source
Created
Created By
File Size
Folder Child Count
ID
Item Child Count
Item is a Record
Like count
Title
Version
The unique ‘Document ID’ metadata column will only appear when that feature is enabled from the Site Collection Administration section of the site. Document IDs have the format PREFIX-L-NNNNNN:
Prefix. This should be changed in the Document ID set up to be the same as the site URL name, so any documents produced on that site have the same or similar name. Otherwise, the default prefix is a random set of 12 characters.
The library ID (L). These IDs are sequential. As the ‘Documents’ library already exists, the next library will usually be ‘2’.
The document ID (N). These IDs are also sequential and never re-used. If a document is deleted, the document ID is also removed from use. Accordingly, a gap in document IDs indicates that documents have been deleted.
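Because document numbers are sequential and never re-used, a list of Document IDs can be checked for gaps. The sketch below is an illustration only: the function names are invented, it assumes the numbering runs separately within each library, and it can only see gaps between the lowest and highest numbers in the list it is given.

```python
def parse_document_id(doc_id):
    """Split 'PREFIX-L-NNNNNN' into (prefix, library number, document number)."""
    # Split from the right so a prefix containing hyphens is kept whole.
    prefix, library, number = doc_id.rsplit("-", 2)
    return prefix, int(library), int(number)


def missing_document_numbers(doc_ids):
    """Return {library number: [missing document numbers]} for each library."""
    by_library = {}
    for doc_id in doc_ids:
        _, library, number = parse_document_id(doc_id)
        by_library.setdefault(library, set()).add(number)
    gaps = {}
    for library, numbers in by_library.items():
        expected = set(range(min(numbers), max(numbers) + 1))
        missing = sorted(expected - numbers)
        if missing:
            gaps[library] = missing
    return gaps
```

For example, a library containing BUSDEVMGT-2-1, BUSDEVMGT-2-2 and BUSDEVMGT-2-5 would be reported as missing documents 3 and 4, which suggests two documents have been deleted (or moved elsewhere).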
Added or new metadata columns may also be created as either (a) site columns, which can be added to any library on the site, or (b) ‘local’ library columns. The advantage of using site columns is that the same column can be added to any library or list, ensuring consistency across the site.
Every new metadata column can be any one of the following:
Single line of text
Multiple lines of text
Choice (menu to choose from)
Number (1, 1.0, 100)
Currency ($, ¥, €)
Date and Time
Lookup (information already on this site)
Yes/No (check box)
Person or Group
Hyperlink or Picture
Calculated (calculation based on other columns)
Task Outcome
Full HTML content with formatting and constraints for publishing
Image with formatting and constraints for publishing
Hyperlink with formatting and constraints for publishing
Summary Links data
Rich media data for publishing
Managed Metadata (open the Term Store Managed Metadata service)
Each of the above:
Can have a description
Is optional but can be mandatory
Can be used to enforce unique values
Can have a maximum length
Can have a default value
Can use validation formulas (for example, must have a certain form or content or an error will be displayed)
Site columns are added to document libraries manually from the Library settings page.
Metadata columns can be used to group content in a library instead of using folders or document sets, but this concept can be quite alien to users who generally prefer to work with folders.
Note – If metadata columns are made mandatory and a library is synced to File Explorer, the synced library will become read only in File Explorer.
Note also – Only the standard File Explorer metadata will appear in synced document libraries. If there is a requirement to see or enter more metadata, the end-user will need to work directly with the browser version.
Site Content Types
Site Content Types, from a recordkeeping point of view, allow the organisation to define a type of content that will be stored on a site. They usually have added metadata.
The main types of Content Type that are likely to be used from a recordkeeping point of view are document and document sets.
Document content types may also have a standard template attached.
SharePoint also includes the ability to use a Document Set content type. A document set is a type of folder which, as the name suggests, is useful for grouping a common set of documents where additional metadata is required for the ‘folder’.
Document sets must be enabled as a feature from the Site Collection Administration section of the Site Settings, and then can be added ‘as is’, or new document sets created (from the ‘Site Content Types’ section of the Site Settings) based on the basic template.
An example of their use would be to store a set of documents relating to an object where there is a requirement to describe that object in the folder. For example:
Employee. Metadata might be: name, employee ID, business area.
Note – While Content Types can be very useful to ensure additional metadata is added or to group content (in document sets), end users will be required to choose the correct content type (for example, document type) and then add metadata as necessary.
This process can be off-putting and should, accordingly, only be used when there is a genuine requirement and no other obvious way to identify the ‘type’. For example, instead of forcing users to choose between two content types consider either:
Including a metadata choice option (with a default setting) to select the option.
Separating the library into the separate ‘types’, so that two different document types (especially with different metadata requirements) aren’t stored in the same library.
Using file naming conventions to identify the type, e.g., ‘Contract-WilsonBrosFeb2020Final.docx’.
Library views
Standard ‘out of the box’ library list view
Every new document library displays content against a set of default columns:
Type (icon)
Name
Modified (date)
Modified by
Each item in the library has a Share option (if sharing is not disabled) as well as a three dot ellipsis menu (see below).
The list view may be modified (via All documents – Edit current view) to display more columns from the standard list as well as the document ID and any other added columns.
Views may be sorted, grouped, and filtered as required. SharePoint also includes a handy option to display all contents without folders.
Multiple views may be saved, each with a URL that can then be added as a link on the left navigation, on the page, or even sent to others. For example, a view might show a filtered view of all contracts due to expire in the coming six months.
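The logic behind a filtered view like the contracts example can be sketched as follows. This is not SharePoint code: the function name expiring_soon and the ‘Expiry’ column are hypothetical, and six months is approximated as 183 days. In SharePoint itself the same result is achieved through the view’s filter settings.

```python
from datetime import date, timedelta


def expiring_soon(items, today, window_days=183):
    """Return items whose 'Expiry' date falls within the coming window, soonest first."""
    horizon = today + timedelta(days=window_days)
    due = [item for item in items if today <= item["Expiry"] <= horizon]
    return sorted(due, key=lambda item: item["Expiry"])
```

Items that have already expired, or that expire beyond the window, are left out, which mirrors a view filtered on a date column.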
Libraries, and views of libraries, can be embedded on the site pages. This could be useful if, for example, there is a need to see the most recent documents at a glance, or documents with an expiry date, etc.
Other document and records management functionality
All SharePoint document libraries include a range of document and records management functionality accessible from the three dot ellipsis menu, including:
Permission/access control
Sharing
Copy to/Move to
Check out/in
Version history and the ability to restore versions
Alerts
Every SharePoint site also includes a Recycle Bin that stores deleted content for 90 days. The Site Collection Administrator can review and restore documents deleted by anyone on the site for the previous 90 days. If there is concern about documents being deleted, any user with access can easily set an alert on the library for any changes that are made.
Retention policies
Retention policies, created in the Compliance Admin portal, work in two different ways in SharePoint document libraries.
Explicit (visible) retention labels
These are published as retention policies and must be applied to each individual library.
It is not possible to delete content in libraries where this type of retention policy has been applied.
If the retention policy has the disposition review option enabled, the person or role nominated will be advised that the records are due for review; they will remain in place until a decision is made to delete them.
Implicit (invisible) retention policies
These are applied at the site level and may delete content when the retention period has ended.
There is no disposition review process; the documents are deleted by ‘System Account’ when the retention period has expired and the 90 day Recycle Bin period has finished.
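The timing described above can be worked through as a simple date calculation. This is a sketch based only on the description in this post: the function names are invented, the retention period is assumed to be a whole number of years from a known start date, and the actual timing in the service may differ.

```python
from datetime import date, timedelta

RECYCLE_BIN_DAYS = 90  # the Recycle Bin period mentioned in this post


def add_years(start, years):
    """Add whole years to a date, moving 29 February to 28 February if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return start.replace(year=start.year + years, day=28)


def earliest_permanent_deletion(retention_start, retention_years):
    """Retention end date plus the Recycle Bin period."""
    retention_end = add_years(retention_start, retention_years)
    return retention_end + timedelta(days=RECYCLE_BIN_DAYS)
```

For a document with a seven year retention period starting on 1 January 2013, the retention period ends on 1 January 2020 and the document would not leave the Recycle Bin until 31 March 2020.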
Going live
If you have managed to reach the end of these three posts – well done! There is a lot to take in. Hopefully this information will help you go live with a new SharePoint implementation, a migration from SharePoint on-premise, or a clean up of either.
If you’d like me to write about other aspects of managing records in SharePoint, let me know via the feedback option here.
The main elements that impact on the management of records in Office 365 are Users (for licences), Roles and Groups, as can be seen in the screenshot.
Users – licensing and applications
Organisations that acquire Office 365 will generally have the relevant licences required (a) to set up and administer SharePoint Online, and (b) for users to use it (and OneDrive for Business).
This post assumes that organisations will have at least an E3 licence which includes SharePoint for end users, visible as an app when they log on to https://office.com, along with all other applications included in the licence, for example Exchange/Outlook, OneDrive for Business, MS Teams and so on. End users with access to these items will also be able to download and use the equivalent mobile device apps.
Roles
The three key roles that impact on the management of records in SharePoint are as follows:
Global Admin (GA)
Global Admins:
Are responsible for managing the entire Office 365 environment. This includes creating new Groups (Security Groups, Distribution Lists and Office 365 Groups).
Are responsible for assigning key roles, including the SharePoint Administrator and Compliance Administrator (and other roles).
May have responsibility for, and/or the skills and knowledge required to set up and administer SharePoint Online and create new sites for the organisation.
May also be able to create and publish retention policies in the Compliance admin portal.
Note – Organisations that outsource the administration of Office 365 should always have at least one GA account to access the tenant if ever required. If they don’t have a log on, they should have or acquire a very good understanding of the access and privileges afforded to the outsourced company.
SharePoint Administrator (SP Admin)
The SP Admin role will usually be a ‘system’ role that is responsible for managing the SharePoint environment, including OneDrive for Business. As noted above, a GA with the right skills can also manage the SharePoint environment.
Generally speaking, SharePoint Administrators will focus on the technical and configuration aspects of SharePoint. They are not usually responsible for configuring SharePoint to manage records, managing records, or creating and publishing retention policies. This role usually falls to either the GA or Compliance Administrator.
Compliance Administrator
The Compliance Admin role is responsible, among other things, for the creation and publishing of retention labels and policies in the Compliance Admin portal. A GA can perform this role (along with all other roles) if required.
Compliance Admins will usually be responsible for disposition reviews linked with retention labels, and be involved in eDiscovery cases.
The Compliance Admin can search and view the audit logs for all activity across Office 365 and can carry out broad content searches with the ability to export the content of those searches. As this role is relatively powerful, it should be limited to key senior individuals in the organisation.
Office 365 and Security Groups
Office 365 Groups are Azure/Exchange objects just like Security Groups and Distribution Lists. Accordingly, there should be controls around their creation, including naming conventions.
As every Office 365 Group has an associated SharePoint site, organisations should consider restricting the ability for end users to create Office 365 Groups, and only allowing Global Admins and members of a Security Group to do this. Neither SharePoint Admins nor Compliance Admins would normally create AD Groups.
If the ability to create Office 365 Groups is not restricted, an Office 365 Group will be created with an associated SharePoint site whenever:
A new Team is created in MS Teams.
A new Group is created from Outlook.
A new Yammer Group/Community is created.
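The naming controls mentioned above can be partly automated. The Python sketch below checks a proposed Group name against a naming convention. The convention itself (an approved prefix, a hyphen, then a short description) and the prefixes are assumptions made for illustration; they are not something Office 365 enforces or that this post prescribes.

```python
import re

# Hypothetical convention: an approved prefix, a hyphen, then a short description.
ALLOWED_PREFIXES = ("PRJ", "TEAM", "COMM")
NAME_PATTERN = re.compile(
    r"^(?P<prefix>[A-Z]+)-(?P<label>[A-Za-z0-9][A-Za-z0-9 ]{2,40})$"
)


def follows_convention(group_name: str) -> bool:
    """Return True when the name matches the assumed naming convention."""
    match = NAME_PATTERN.match(group_name)
    return bool(match) and match.group("prefix") in ALLOWED_PREFIXES
```

Under this assumed convention, a name such as ‘PRJ-Office Refit’ passes, while an ad hoc name with no approved prefix is rejected.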
External sharing
The ability to share content externally from SharePoint and OneDrive for Business is controlled from the Office 365 Admin portal. This is a global setting that can be disabled by the Global Admins if required.
It is assumed, for the purpose of this post, that this setting is enabled to allow external sharing.
Note that enabling external sharing at the global level does not enable it globally for all SharePoint sites; sites must be individually modified to allow it.
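The two-level control described above can be expressed as a simple rule. This is a minimal sketch of the logic only; the function and parameter names are hypothetical and do not correspond to any Office 365 setting name.

```python
def site_can_share_externally(tenant_setting_enabled: bool,
                              site_setting_enabled: bool) -> bool:
    """A SharePoint site can share externally only if both levels allow it."""
    return tenant_setting_enabled and site_setting_enabled
```

Enabling the global setting on its own therefore changes nothing for a site until that site is also modified.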
Compliance Admin
The Compliance admin portal can be accessed by the GAs and also the Compliance Admins (and some other roles). It is where retention labels and policies are created (in line with the corporate file plan/BCS) and published, and disposition reviews are undertaken, so records managers need access.
Other options in this section that relate to the management of records include the audit logs, content search and eDiscovery.
Retention policies
Retention policies may be applied to all the key workloads in Office 365 where records are stored:
Exchange Online
SharePoint Online
OneDrive for Business
MS Teams
Office 365 Groups
Retention labels published as retention policies are visible to and can be applied by end-users. Generally these are more likely to be applied at the document library level rather than to individual records, or in mailboxes or OneDrive for Business.
Retention policies that are not based on labels may be applied to all, or parts of, the workloads listed above. For example, they may be applied to all, or a sub-set of, Exchange mailboxes, OneDrive for Business accounts, or SharePoint sites. Retention policies may also be applied to individual or team chats in MS Teams.
Organisations seeking to use retention policies in Office 365 should understand how these work, have a plan for their implementation, and keep track of what has been applied where.
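One way to keep track of what has been applied where is a simple register. The sketch below is illustrative only: the policy names, scopes and retention periods are made-up examples, and the structure is an assumption rather than an export from the Compliance admin portal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppliedPolicy:
    name: str
    workload: str       # e.g. "Exchange Online", "SharePoint Online"
    scope: str          # where in the workload the policy applies
    retain_years: int
    label_based: bool   # True for retention labels published as policies


# Example entries only; every value here is hypothetical.
REGISTER = [
    AppliedPolicy("All mailboxes 7y", "Exchange Online", "All mailboxes", 7, False),
    AppliedPolicy("All ODfB 7y", "OneDrive for Business", "All accounts", 7, False),
    AppliedPolicy("Employee records", "SharePoint Online", "HR site library", 75, True),
    AppliedPolicy("Project sites 10y", "SharePoint Online", "Project sites", 10, False),
]


def policies_for(workload: str) -> list[AppliedPolicy]:
    """List every registered policy applied to one workload."""
    return [p for p in REGISTER if p.workload == workload]


def workloads_without_policy(workloads: list[str]) -> list[str]:
    """Return the workloads that have no registered policy at all."""
    covered = {p.workload for p in REGISTER}
    return [w for w in workloads if w not in covered]
```

A register like this makes gaps visible: with the example entries, MS Teams would be reported as having no policy.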
Retention policies for all mailboxes or all ODfB accounts may replace previous on-premise backup options for those workloads. It is unlikely that end-users will (or will want to) apply retention labels published as policies to individual emails or folders in mailboxes or OneDrive.
SharePoint sites are likely to have explicit retention policies, implicit/invisible retention policies, or a combination of both. Implicit, single period retention policies may be more suitable for entire smaller, short-lived SharePoint sites. Explicit retention policies may be more suitable for the diverse range of content on more complex and long-lasting sites. Some sites may be created and populated around the need to keep a particular type of record for a long period of time – for example, employee records.
Audit logs
The Office 365 audit logs are found in the Compliance admin portal. For an E3 licence, the content in the logs is stored for 90 days.
As audit logs are an important element in keeping records, organisations may need to consider ways to retain this content for a longer period.
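To show what the 90-day window means in practice, the sketch below works out the oldest day still covered on a given date. It assumes the window is counted in whole calendar days; how the service treats the exact boundary is not covered here.

```python
from datetime import date, timedelta

AUDIT_RETENTION_DAYS = 90  # E3 figure quoted in this post


def oldest_searchable_day(today: date) -> date:
    """Oldest day still inside the audit log window (assumed whole days)."""
    return today - timedelta(days=AUDIT_RETENTION_DAYS)


def still_in_log(event_day: date, today: date) -> bool:
    """True if an event on event_day would still fall inside the window."""
    return event_day >= oldest_searchable_day(today)
```

On 30 June 2020, for example, this calculation gives 1 April 2020 as the oldest day still covered, so anything earlier would need to have been retained some other way.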
Note – SharePoint document libraries record the name of anyone who edited a document (and also previous versions), but they don’t record the name of anyone who simply viewed it. SharePoint lists also include audit trails, making it possible to track changes in individual rows of a list.
Content searches and eDiscovery
The Compliance admin portal provides two similar options to search for content across Office 365. Both the Content Search and eDiscovery options provide the ability to establish a ‘case’ that can be run more than once.
The eDiscovery option provides the added ability to put content on Legal Hold. Advanced eDiscovery is available with a higher licence.
Next
Click on the links below to read the next two posts:
SharePoint Online Admin centre configuration.
SharePoint site collection provisioning and configuration to manage records.
This post highlights the need to understand how retention works in MS Teams, why it may be related to how long you keep emails (including for backup purposes), and why you need to consider all the elements that make up an Office 365 Group when considering how – and how long – to retain content in MS Teams.
Overview of retention in MS Teams
If you are unfamiliar with how retention works with MS Teams, these two related sites provide very useful detail.
Image from the first link above – Security Compliance Overview
The quote below from the second link is relevant to this post:
‘Teams chats are stored in a hidden SubstrateHolds folder in the mailbox of each user in the chat, and Teams channel messages are stored in a hidden SubstrateHolds folder in the group mailbox for a team. Teams uses an Azure-powered chat service that also stores this data, and by default this service stores the data forever. With a Teams retention policy, when you delete data, the data is permanently deleted from both the Exchange mailboxes and the underlying chat service.’
and
‘Teams chats and channel messages aren’t affected by retention policies applied to user or group mailboxes in the Exchange email or Office 365 groups locations. Even though Teams chats and channel messages are stored in Exchange, they’re only affected by retention policies applied to the Teams locations.’
In summary:
One-to-one chat in MS Teams is stored in a hidden folder of the mailbox of each user in the chat. Documents shared in those chats are stored in the OneDrive for Business of the person who shared it.
Group chat in Team channels is stored in a hidden folder of the mailbox of the associated Office 365 Group – and also in an Azure chat service. Documents are stored in the Office 365 Group’s SharePoint site (other SharePoint site libraries may also be linked in a channel).
Another quote from the same post:
‘In many cases, organizations consider private chat data as more of a liability than channel messages, which are typically more project-related conversations.’
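The storage locations summarised above can be written down as a lookup, which helps when working out which retention locations a policy needs to cover. This is a sketch only: the content-type names and location labels are descriptive labels chosen for this example, not the exact names used in the Compliance admin portal.

```python
# content type -> (where it is stored, retention location that covers it)
TEAMS_CONTENT = {
    "one-to-one chat message": (
        "hidden folder in each participant's mailbox", "Teams chats"),
    "channel message": (
        "hidden folder in the Office 365 Group mailbox", "Teams channel messages"),
    "file shared in a chat": (
        "OneDrive for Business of the person who shared it", "OneDrive for Business"),
    "file shared in a channel": (
        "SharePoint site of the Office 365 Group", "SharePoint Online"),
}


def locations_needed(content_types: list[str]) -> list[str]:
    """Retention locations a policy must cover to reach all the given content."""
    return sorted({TEAMS_CONTENT[c][1] for c in content_types})
```

For a Team's channel, for example, both the channel messages location and the SharePoint location are needed; an Exchange mailbox policy does not appear in the lookup at all, which matches the second quote above.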
Teams content is kept in mailboxes, retention may be similar
Typically, in the on-premise past, organisations will have backed up their Exchange mailboxes (and possibly also enabled journaling, to capture emails), for disaster recovery, ‘archiving’ and investigations. Unless a decision is made to invest in cloud back-ups, Office 365 retention policies may also be applied to Exchange mailboxes, effectively replacing the need to back them up. Retention policies applied to Exchange mailboxes don’t affect the Teams chat folder.
Organisations should probably apply the same retention period to both emails and Teams chats as they do to email mailbox backups now. That is, if mailboxes are typically kept for 7 – 10 years after the person leaves the organisation, then keep the Teams chats for the same period.
Note that, even if a poster deletes an item (if that option is enabled), it will still be retained if there is a retention policy.
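A retention period expressed as ‘years after the person leaves’ can be turned into a date as follows. This is a minimal sketch: the function name is hypothetical, and moving a 29 February departure to 28 February is an assumption of this example, not a rule from Office 365.

```python
from datetime import date


def retention_end(departure: date, years: int) -> date:
    """Date on which a period of `years` after departure expires."""
    try:
        return departure.replace(year=departure.year + years)
    except ValueError:
        # Departure was 29 February and the target year is not a leap year.
        return departure.replace(year=departure.year + years, day=28)
```

Applying the same calculation to both mailboxes and Teams chats keeps the two in step.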
Suggestions for retention in MS Teams
As there can be different retention requirements, depending on the subject matter, here are some suggestions for retention:
One-to-one chat is like email: you will never know everything that is being said or sent. A single retention policy that mirrors email would therefore be appropriate.
Teams chat is more likely to be about the subject of the Team, which is based on an Office 365 Group that has its own mailbox and a SharePoint site. In this case, you could consider a retention policy applied to all Office 365 Groups or to specific Groups (for example, ‘Project Groups’), then ensure that the retention policy or policies cover all aspects of the Office 365 Group (mailbox, team chat, SharePoint).
If all the records relating to a particular subject matter (including email, chat and documents) must be retained for 25 years, then you need to understand all the options.
It underscores the need to plan carefully for retention management for all the key workloads in Office 365.
In recent weeks a number of organisations with ‘default’ Office 365 configuration settings have told me they are not using SharePoint but they are using MS Teams, and have even created new Teams.
Every new Team in MS Teams creates a linked SharePoint site via the Office 365 Group that is created when the Team is created. If the ability to create Office 365 Groups is not restricted the following is likely to happen:
Naming conventions go out the window. New Teams and SharePoint sites will probably be created with random names (eg ‘Andrews Team’, ‘Footy tipping’).
The SharePoint environment will ‘go feral’; new sites will not be provisioned according to business requirements.
This post describes what happens when a Team is created and recommends the creation of new Teams by creating an Office 365 Group.
What happens when a Team is created
At the bottom left of the MS Teams client is the option to ‘Join or create a team’. This option will be visible even if the ability to create Teams is not enabled for end users (because the control is on the creation of Office 365 Groups).
The dialogue box that opens gives the option to ‘Create Team’.
The user now has the choice to build a new team from scratch or create it from an existing Office 365 group or team. For the purposes of this post, we will assume the user chooses the first option.
The user is then asked if the team should be private, public or organisation wide. The options will affect the visibility of the Team to others. For the purpose of this post, the new Team is ‘Private’.
The next option is to name the site (‘Footy Tipping’) and give it a description.
The user is then prompted to add members (people who have edit rights) to the new Team. They may add individuals by name, a distribution list, or a security group. If external access is allowed, they may also add people outside the organization as guests. People or groups that are added are made ‘Members’ by default but this may be changed to ‘Owners’.
A key point here is who will have access to the Team if there is a single Owner. What if that person leaves the organisation?
The new Team has been created with a ‘General’ channel. The three dots to the right of the name allow the Owner to modify the members of the Team, add channels, get a link to the Team (to send to others) and delete the Team.
Along the top of the new Team are three default tabs: Posts, Files, Wiki.
The ‘Files’ tab appears (for those who are new to this) to allow documents to be uploaded to the Team, Synced to their File Explorer and so on. This is actually the default Documents library of the SharePoint site that is created when the Office 365 Group is created when the Team is created.
What happens in Office 365 Groups
The end user is not likely to care much about what happens anywhere else; they have a new Team and can start chatting.
Meanwhile, in the Groups area of the Office 365 Admin portal, a new Office 365 Group appears. The Global Administrator should be keeping an eye on the creation of new Groups, if they are not controlled, especially if there is a requirement to adhere to naming conventions for all AD Groups (Distribution Lists, Security Groups, and Office 365 Groups).
The Group name has had the space removed in the Group’s email address (and, as we will see, in the SharePoint site). The Global Admin can review and change the Members.
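The effect on the name can be illustrated with a short sketch. It only removes whitespace, as observed above; any other rules the service may apply (for example, handling special characters or duplicate names) are not modelled, and the domain is a placeholder.

```python
def group_alias(display_name: str) -> str:
    """Remove all whitespace from the Team or Group display name."""
    return "".join(display_name.split())


def group_address(display_name: str, domain: str = "example.com") -> str:
    """Build an illustrative email address from the alias (placeholder domain)."""
    return f"{group_alias(display_name)}@{domain}"
```

So a Team named ‘Footy Tipping’ ends up with ‘FootyTipping’ in its address, and the same space-free form is used for the SharePoint site.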
The Global Admin may also change the settings to allow external senders to email the Group and to send copies of Group conversations (in Outlook, see below) and events to Group members. (The Microsoft Teams setting takes the Global Admin to the MS Teams Admin portal).
So, an end user has ‘simply’ created a Team, but now there is a new Office 365 Group with a mailbox (not visible but can receive emails) and a SharePoint site.
What happens in Outlook
Every new Office 365 Group has an Exchange mailbox, similar to a shared mailbox, but when a new Team is created from MS Teams, the mailbox is not visible in Outlook. If the Global admin enables the ability to ‘send copies of group conversations and events to group members’, the group members may use that Group’s mailbox address.
The mailbox is visible when a Group is created first, which is a good reason to create a new Team by creating the Office 365 Group first.
Channel chat messages are stored in a hidden folder in the Group’s mailbox, where they are subject to any retention policy applied to the chat messages, separate from any retention policy applied to the mailbox.
What happens in SharePoint
As noted already, every new Team gets a SharePoint site because the Team has created an Office 365 Group.
The SharePoint Admin will see the new site in the SharePoint admin portal:
The SharePoint Admin may, via the ‘Permissions’ section, view and update the Group Owner/s and also may add additional ‘Admins’. They may make the site a Hub site and decide whether the site can be shared externally or not (the default is not shared externally).
The SharePoint admin may also delete the site – but consider that it is not now just a site but a Team and also an Office 365 Group. Some care needs to be taken here – which should be deleted first, and what happens if a retention policy has been applied to the Teams channel or the Office 365 Group?
If the SharePoint admin opens the site they will see a standard ‘modern’ team site with a single default document library. This is the ‘Files’ library that appears as a tab in the Teams General channel.
In the Permissions section of the site, the Site Owners show as the Team owners group, and the Site members (add/edit rights) show as the Team members group. There are no site visitors.
If the SharePoint admin goes to Advanced permissions settings and clicks on Site Collection Administrators they will see that only the Footy Tipping Owners are in this section. Organisations should consider adding to this section a Security Group that includes any records or information managers. Otherwise, any records will be more difficult to manage and the records managers will need to request access from the SharePoint admin.
Two important points that are sometimes missed:
Aside from the Global and SharePoint admin, only the Team Owners and Members can access the SharePoint site.
The SharePoint site may be shared with another person (or Group) and given Member or Visitor access but this does NOT give them access to the Team channel. They need to be added to the Team Owners or Members to have access to the Team channel.
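The difference between the two kinds of access can be modelled in a few lines. This is a simplified sketch with hypothetical names; it leaves out the Global and SharePoint admins and does not distinguish Member from Visitor rights on the site.

```python
from dataclasses import dataclass, field


@dataclass
class TeamWorkspace:
    owners: set[str]
    members: set[str]
    # People given access to the SharePoint site only, via sharing.
    site_shared_with: set[str] = field(default_factory=set)

    def can_open_site(self, user: str) -> bool:
        """Team owners, Team members and anyone the site was shared with."""
        return user in self.owners | self.members | self.site_shared_with

    def can_open_channel(self, user: str) -> bool:
        """Only Team owners and members; site sharing does not count."""
        return user in self.owners | self.members
```

Someone added through site sharing passes the first check but not the second, which is the point that is sometimes missed.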
Summary
Allowing end users to create a Team in MS Teams has a flow-on effect:
It creates an Office 365 Group with an associated SharePoint site
It creates an Exchange mailbox
It will (initially, unless this is changed) make the SharePoint site inaccessible to records managers.
It gets complicated if it is decided to delete the Team, SharePoint site, or Office 365 Group.
It is recommended, in organisations rolling out MS Teams to end users, that the ability to create Office 365 Groups is disabled except for Global Admins, and any new Team is created from a new Office 365 Group that includes the option to ‘Add Microsoft Teams to your group’, as shown below:
This will result in the following outcomes:
Controlled creation of Office 365 Groups, SharePoint sites and Teams, with appropriate naming conventions.
A new and visible mailbox for both the Group and the Team.
Stop SharePoint from ‘going feral’ and becoming uncontrolled.
Establish better governance controls for recordkeeping.
On-premise versions of SharePoint were standalone systems, usually administered by a trained and qualified SharePoint Administrator. Records managers may or may not have had access to or a role in that environment.
Generally, the only other group that would typically have access to SharePoint on-premise were the DBAs who managed the (SQL) database.
SharePoint Online is no longer a standalone system but a core part of the Office 365 ecosystem.
This post describes, for records and information managers, how SharePoint Online needs to be understood in the context of the broader Office 365 administration, and how other admin roles can configure or change settings that can affect SharePoint and OneDrive for Business.
The highest level admin role in Office 365 is the Global Admin (GA).
To protect the security of Office 365, there should be a very small number of GAs. GAs should have unique cloud-only log ons preferably using multi-factor authentication for added security. End-user accounts should NEVER be assigned the GA role.
GAs can access everything across Office 365, including the content of emails, SharePoint, OneDrive for Business and MS Teams. All activity carried out by GAs (and anyone else) is recorded in the audit logs.
Organisations that outsource the GA role to third-party companies need to be aware of the capability of the GA role and, ideally, also have at least one GA log-on account so they can, among other things, access the tenant and review the audit logs if required.
The key activities that GAs are responsible for, that impact on the management of SharePoint, are as follows.
Assigning licences. Licences (e.g., E3) provide user access to the various applications in Office 365, including Exchange, SharePoint, OneDrive for Business, MS Teams and Office (via http://www.office.com). Generally speaking it is inadvisable to remove individual options from licences. Note that the SharePoint licence gives access to use the application, it is not the admin role (next point).
Assigning roles. Roles provide admin access to the core applications (listed in the previous point) and to a range of activities (for example, Billing, Compliance, Security, User Admin). Office 365 Admin roles should always be cloud only and never assigned to normal end-user accounts. This ensures that the person logs on to perform an admin activity, as opposed to a general end-user activity. It is common (and good) practice for users to be logged on to two separate accounts at the same time.
Creating Groups. Groups are Azure/Exchange objects. The three main types of groups are: (a) Security Groups that control access to resources but are not email enabled; (b) Distribution Lists that provide the ability to email multiple people but don’t control access to resources; and (c) Office 365 Groups that are a cross between Security Groups and Distribution Lists with much more capability. Office 365 Groups are a core element across Office 365. Every O365 Group has (a) an email mailbox and (b) a SharePoint site. If the ability to create these types of groups is not controlled, every new Team in MS Teams will create an O365 Group with a SharePoint site (with no controls on naming). Accordingly, there needs to be close cooperation between the GA, the SharePoint admin and/or the records/information manager in relation to the creation of O365 Groups.
Enabling external access for SharePoint. This setting allows the GA to determine whether SharePoint sites and OneDrive for Business, and the content in them, can be shared externally. The setting only makes the option available for SharePoint sites but allows ODfB content to be shared externally. Individual sites must still be enabled (by the SharePoint admin) for external access.
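The three group types described in the list above can be compared side by side. The sketch below records only the capabilities stated in this post; treating an Office 365 Group as both access-controlling and email-enabled is this example's reading of ‘a cross between’ the other two types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupType:
    name: str
    controls_access: bool
    email_enabled: bool
    has_sharepoint_site: bool


SECURITY_GROUP = GroupType("Security Group", True, False, False)
DISTRIBUTION_LIST = GroupType("Distribution List", False, True, False)
OFFICE_365_GROUP = GroupType("Office 365 Group", True, True, True)


def creates_sharepoint_site(group: GroupType) -> bool:
    """Only Office 365 Groups bring a SharePoint site with them."""
    return group.has_sharepoint_site
```

This is why the creation of Office 365 Groups, rather than of the other two types, matters most to whoever manages the SharePoint environment.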
SharePoint/OneDrive for Business Admin
The SharePoint Admin will normally be a qualified SharePoint administrator and may have administered earlier versions of SharePoint. They will also generally be the OneDrive for Business admin (as OneDrive is a SharePoint service).
The SharePoint Online admin role is much less complex in Office 365 than it was in the on-premise version. Records managers who currently manage an EDRMS could potentially become a SharePoint admin, with some training.
Additional training is required only if the organisation wishes to do additional customisation or development work, integration, or has third-party applications.
The SharePoint Admin has a number of roles:
Configuring SharePoint settings in the admin portal. This is usually a one-off activity that may be reviewed from time to time. Configuration settings should be documented.
Creating new SharePoint standard and communication sites – but NOT ‘modern’ team sites that are based on Office 365 Groups, as noted above. These should be created by the GAs who will need to be advised about (a) preferred naming conventions (if any) and (b) Group ownership and membership (which flows through to SharePoint site ownership and membership).
Provisioning new sites. This activity involves changing site collection features and site features to enable things like Document IDs and Document Sets. It also includes assigning the initial Site Collection Admin and Site Owner permissions. It may also include some basic additional options such as a new document library or list.
Assigning access and permissions. Records managers who have responsibility for managing records in SharePoint should be added to the Site Collection Admin section, ideally as part of a Security Group. This ensures that records managers can access all SharePoint sites as required (including the Preservation Hold library on sites where implicit retention policies have been applied) and, if they have the responsibility to do so, create and configure new document libraries to manage records. Both Site Collection Admins and Site Owners can apply explicit (visible) retention policies to document libraries and lists, if used.
Monitoring and managing the SharePoint environment, including resolving issues and working with Site Owners.
Managing the OneDrive for Business admin portal, including setting (a) the size of the ODfB storage and (b) the retention period for ODfB accounts after an end-user leaves.
Providing training to Site Owners, if no other training is provided.
The relationship between the various Office 365 admin elements, SharePoint admin, and the end-user experience is described in the graphic below.
SharePoint admins access the SharePoint admin portal by logging on to http://www.office.com, clicking on the ‘Admin’ option, then the SharePoint admin portal (or directly to that admin portal if they save it as a favorite).
End-users access SharePoint by logging on to http://www.office.com and clicking on the ‘SharePoint’ app, or via the mobile app.
Exchange Online Admins
The primary role of the Exchange Online (EXO) admin is to manage that application. The EXO admin may also be the MS Teams admin – see below.
If the creation of Office 365 Groups is not controlled as noted above, both EXO admins and end users can create a new Office 365 Group from Exchange or Outlook which in turn creates a new SharePoint site.
The Outlook menu with the New Group option
While emails can be copied from Exchange to SharePoint, Microsoft’s model assumes that the vast majority of emails will remain in end-user mailboxes.
Records managers need to work closely with the EXO admin/s and the Compliance admin/s (see below) to ensure that an appropriate Office 365 retention policy is applied to the content of the mailboxes. There may also be a requirement to remove the default MRM policies.
An Office 365 retention policy may initially appear to conflict with previous backup strategies deployed to recover mailboxes in case of disaster or for investigation purposes, but it can support and replace them. This means that a single retention policy that keeps all emails for a specific period of time will be applied to all mailboxes.
MS Teams Admins
The role of the MS Teams admin is to configure and manage the MS Teams environment. As noted above, the EXO admin may also be assigned the role of MS Teams admin.
MS Teams includes two main component parts:
Chat. One-to-one chats are stored in a hidden folder in the Exchange mailboxes of individual users. Channel chats are stored in a hidden folder in the mailbox of the linked Office 365 Group. These hidden folders are not subject to a retention policy applied to the rest of the mailbox.
Documents. These are stored in either (a) OneDrive for Business for one-to-one chat, or (b) in the linked SharePoint site for Teams channels.
Records managers should work with the MS Teams admin and the Compliance Admin to identify how retention policies will be applied to both the chat and SharePoint content in MS Teams.
Microsoft separated the Security and Compliance portals in early 2020. Consequently, there may be an admin to manage each component part – one for Security and one for Compliance.
The Compliance admin portal includes a range of actions relating to the management of information. These actions include:
Data classification. This option is still in preview but, for E5 licence holders, will allow data to be classified automatically and retention policies applied to that content as an alternative to pre-defined (SharePoint site/library) ‘classification’.
Setting and monitoring alerts.
Viewing reports on various compliance matters, including the status of retention policies.
Creating and monitoring retention labels and policies. This includes retention policies for Exchange mailboxes, SharePoint Online, OneDrive for Business, and MS Teams.
Creating and monitoring data loss prevention policies.
Assigning permissions to individuals.
Managing GDPR data subject requests.
Searching audit logs (90 days of history only).
Searching for content across all of Office 365.
Reviewing disposition for records covered by explicit retention label policies, where this option is enabled.
Some or all of these roles may be performed by senior records or information managers.
The Security admin portal provides access to the following actions, some of which may impact on SharePoint (sensitivity labels in particular).
Reviewing alerts
Reviewing security related reports
Creating and managing sensitivity labels and information types (and also creating and publishing retention labels)
Creating a range of security-related policies, including for devices and threat protection
Assigning permissions
Conclusions
SharePoint on-premise was a standalone system that generally did not interact or integrate much with other systems.
SharePoint Online is a core part of the broader Office 365 ecosystem. A range of roles and configuration settings set across that ecosystem have – or can have – a direct impact on SharePoint Online.
Records managers who are involved with SharePoint Online need to understand this crucial difference and either learn or seek to be assigned key roles that impact on the management of records across the Office 365 ecosystem, not just in SharePoint.