Managing the outcomes of records retention in Office 365

August 12, 2019

The retention of records in Exchange Online (EXO), SharePoint Online (SPO), OneDrive for Business (ODfB) and Office 365 (O365) groups can be achieved through the application of retention labels published in the O365 Security and Compliance admin portal.

This post describes:

  • How retention labels work (in summary), including the ‘per record’ rather than the container/aggregation retention model.
  • What happens to content in Office 365 when a retention period expires.
  • The options and actions that may influence the way retention labels/policies are configured, where and how they are applied, and the outcomes required.

The post highlights the need for information and records managers to be involved in all aspects of governance, site architecture and design, and decisions around specific settings and configuration, as well as being assigned specific roles, when Office 365 is implemented.

A quick summary of how O365 retention labels work

Records retention policies in O365 are based on ‘retention labels’ that are created in the O365 Security and Compliance admin portal under the ‘Classifications’ section. Multiple labels can be included in a single policy.

  • Click this link to read Microsoft’s detailed guidance on retention labels.
  • Click this link to read Microsoft’s detailed guidance on retention policies.

Retention outcomes

Each retention label defines one of three potential outcomes at the end of the retention period, if retention is enabled, ‘keep forever’ is not selected, and the label is not used to classify the content as a record*:

  • The content will be automatically deleted. If the content is in SharePoint, it will first be sent to the Recycle Bin, from which it can be recovered within 90 days.
    • This option may be suitable for certain types of low value records.
  • A disposition review will be triggered to notify specific people. As with the previous point, SharePoint content will be sent to the Recycle Bin if a decision is made to delete it.
    • This option will require additional, human-intervention actions, as described below, if standard records management disposal review processes are followed.
  • Nothing.

Triggers

The date when the above action will occur is based on one of four triggers:

  • Date created
  • Date last modified
  • When the label was applied
  • An event. The ‘out of the box’ (OOTB) event types are:
    • Employee activity. (Processes related to hiring, performance and termination of an employee.)
    • Expiration or termination of contracts and agreements.
    • Product lifetime. (Processes relating to the last manufacturing date of products.)
    • A new event can also be added.
    • See this post for Microsoft guidance on event-driven retention.

An additional alternative option is available: ‘Don’t retain the content, just delete it if it’s older than n days/months/years.’ This is similar to the automatic deletion option above and may be suitable for certain types of records. 
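The relationship between trigger, retention period and outcome can be sketched as a small date calculation. The Python below is illustrative only: the `RetentionLabel` class, its field names and the `add_period` helper are hypothetical and do not correspond to any Microsoft API. It simply shows how an end-of-retention date follows from the trigger date plus n days/months/years.

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RetentionLabel:
    """Hypothetical model of a retention label (not a Microsoft API object)."""
    name: str
    period: int   # the "n" in n days/months/years
    unit: str     # "days", "months" or "years"
    trigger: str  # "created", "modified", "labelled" or "event"
    outcome: str  # "delete", "review" or "nothing"


def add_period(start: date, n: int, unit: str) -> date:
    """Return start + n days/months/years, clamping to the end of short months."""
    if unit == "days":
        return start + timedelta(days=n)
    if unit not in ("months", "years"):
        raise ValueError(f"unknown unit: {unit}")
    months = n * 12 if unit == "years" else n
    total = start.year * 12 + (start.month - 1) + months
    year, month = divmod(total, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def retention_end(label: RetentionLabel, dates: dict) -> Optional[date]:
    """Date the label's outcome falls due, or None if the trigger has not occurred yet."""
    trigger_date = dates.get(label.trigger)
    if trigger_date is None:
        return None  # e.g. an event-based label whose event has not happened
    return add_period(trigger_date, label.period, label.unit)


label = RetentionLabel("Company records - 7 years", 7, "years", "modified", "review")
item_dates = {
    "created": date(2012, 2, 29),
    "modified": date(2013, 6, 30),
    "labelled": date(2019, 8, 1),
}
print(retention_end(label, item_dates))  # 2020-06-30
```

Changing the trigger from "modified" to "labelled" in this sketch moves the due date to 2026, which is why the choice of trigger matters as much as the length of the period.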

Declaring content as records

* The option to classify or ‘declare’ content as a record is not discussed further as it relates to the way records are managed in the US. Microsoft’s guidance on labels notes that: ‘At a high level, records management means that: (a) Important content is classified as a record by users. (b) A record can’t be modified or deleted. (c) Records are finally disposed of after their stated lifetime is past.’ The standard on records management, ISO 15489, defines a record as ‘evidence of business activities, often (but not exclusively) in the form of a document or object, in any form’. This means that anything can be a record. The record may continue to be modified throughout its life.

When do retention labels become active?

Retention labels become active only when they are published. As part of the publishing process, a decision must be made whether the label will apply to all (a single option) or selected parts of the O365 ecosystem:

  • The Exchange Online (EXO) mailboxes of all or specific recipients, or excluding specific recipients.
  • All or specific SharePoint Online (SPO) sites, or excluding specific sites.
  • All or specific OneDrive for Business (ODfB) accounts, or excluding specific accounts.
  • All or specific O365 Groups, or excluding specific groups. Note that content in Microsoft Teams (MS Teams) is included in the O365 Groups options that include both the SharePoint content and email/Teams chat content.
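The ‘all, specific, or excluding specific’ choice for each location can be pictured as a simple include/exclude test. The sketch below is a simplification under stated assumptions: `LocationScope`, `label_published_to`, the workload keys and the example addresses are all hypothetical, and the real publishing wizard may not allow includes and excludes to be combined for the same location.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class LocationScope:
    """Hypothetical include/exclude scope for one location type."""
    enabled: bool = True
    include: Set[str] = field(default_factory=set)  # empty set means "all"
    exclude: Set[str] = field(default_factory=set)

    def covers(self, target: str) -> bool:
        # A disabled location or an explicit exclusion always wins.
        if not self.enabled or target in self.exclude:
            return False
        return not self.include or target in self.include


def label_published_to(policy: Dict[str, LocationScope], workload: str, target: str) -> bool:
    """True if the (hypothetical) policy publishes its labels to this mailbox, site, account or group."""
    scope = policy.get(workload)
    return scope is not None and scope.covers(target)


policy = {
    "exchange": LocationScope(exclude={"ceo@example.com"}),
    "sharepoint": LocationScope(include={"https://tenantname.sharepoint.com/teams/finance"}),
    "onedrive": LocationScope(enabled=False),
    "groups": LocationScope(),
}

print(label_published_to(policy, "exchange", "user@example.com"))  # True
print(label_published_to(policy, "exchange", "ceo@example.com"))   # False
```

Writing the intended scope down in this form before publishing can make it easier to check that no mailbox, site or group has been left out unintentionally.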

Auto-applying retention labels

Both the retention label and policy sections include the ability to auto-apply a retention policy if one of the following conditions is met:

  • Sensitive information types. These are the same types that appear in the Data Loss Prevention (DLP) section, for example ‘Financial data’ or ‘Privacy data’.
  • Specific keywords.
  • Content types and metadata (E5 licences only). See this post by Joanne Klein for a description of these options.

The ability of the first two options to accurately identify content and apply a retention policy should be investigated before they are relied on.
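One way to investigate accuracy is to run a candidate rule over a small, hand-classified sample and count the hits and misses. The sketch below is only an approximation of that idea: `keyword_rule` is a hypothetical stand-in for an auto-apply condition (it does not reproduce how Office 365 matches content), and the sample texts are invented for illustration.

```python
import re
from typing import Callable, Iterable, List, Tuple


def keyword_rule(keywords: Iterable[str]) -> Callable[[str], bool]:
    """Build a case-insensitive whole-word matcher (a stand-in for an auto-apply condition)."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return lambda text: bool(pattern.search(text))


def precision_recall(rule: Callable[[str], bool], sample: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """Compare the rule's matches with a manual classification of the same items."""
    tp = sum(1 for text, wanted in sample if rule(text) and wanted)
    fp = sum(1 for text, wanted in sample if rule(text) and not wanted)
    fn = sum(1 for text, wanted in sample if not rule(text) and wanted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Each entry: (text, should this item receive the 'Financial Records' label?)
sample = [
    ("Invoice 1042 for consulting services", True),
    ("Please approve the attached invoice", True),
    ("Remittance advice for March", True),                         # missed by the rule
    ("Lunch on Friday? I'll invoice you for the coffee", False),   # wrongly matched
    ("Team meeting agenda", False),
]

rule = keyword_rule(["invoice", "purchase order"])
precision, recall = precision_recall(rule, sample)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision means content is being labelled (and eventually disposed of) that should not be; low recall means records are escaping the label altogether.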

When do retention policies start working?

According to Microsoft’s guidance Overview of Retention Labels:

If you publish retention labels to SharePoint or OneDrive, it can take one day for those retention labels to appear for end users. In addition, if you publish retention labels to Exchange, it can take 7 days for those retention labels to appear for end users, and the mailbox needs to contain at least 10 MB of data.

  • In EXO, the default MRM policy needs to be removed before the new policy applies.
  • In ODfB, the policy is available to be manually applied on folders or documents. It does not automatically apply to content.
  • In SPO, the policy can be applied to document libraries or documents. To avoid preventing users from deleting documents that they legitimately need to delete in an active library, it is recommended to apply the policy after the document library has ceased to be active.
  • Content in Office 365 Groups is covered by either the EXO policy (for email/Teams chat content) or the SPO policy (applied to libraries).

Retention labels apply to individual records within aggregations

Retention labels can be applied to aggregations of records (an entire email mailbox or folder, a SharePoint library or list, an ODfB account, O365 Groups) or individual records. However, the disposal process targets individual records (e.g., individual emails, single documents in SharePoint libraries, individual list items).

That is, even when all the individual records are disposed of, the parent aggregation remains in place without any indication (sometimes known as a ‘stub’) that the records previously stored in it have been destroyed.

This outcome has implications for the way the outcome of a retention label is set. It requires a choice between (a) delete automatically without review or (b) review before delete.

The latter option is made complicated by the requirement to review individual documents, including potentially in the original container (document library in SPO) and export metadata relating to those records if a record of the deletion is to be retained.

What happens when records reach the end of their retention period

As noted above, the outcome at the end of the retention period (trigger date + n days/months/years) will depend on the settings on the label, in particular:

  • Where the label was applied (EXO mailbox, SPO library or list, ODfB folder or document, O365 Group)
  • Whether the records would be deleted automatically or be subject to a disposition review.

If the records are to be deleted automatically:

  • SPO and ODfB records will be sent to the site/ODfB Recycle Bin for 90 days.
  • EXO emails will be moved to a ‘Cleanup’ area for 14 days, before permanent deletion.
  • Aside from the audit logs (which by default only go back 90 days), no other record will be kept of the destroyed records.
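The holding periods above translate into a short timeline for each automatically deleted item. The following sketch uses the figures quoted in this post (90 days, 14 days, and the 90-day default audit log window) as constants; they should be treated as assumptions to verify against the current tenant settings, and the function and key names are hypothetical.

```python
from datetime import date, timedelta

# Figures quoted in this post; verify against current tenant settings before relying on them.
HOLDING_DAYS = {"sharepoint": 90, "onedrive": 90, "exchange": 14}
AUDIT_LOG_DAYS = 90


def deletion_timeline(workload: str, retention_end: date) -> dict:
    """Key dates following automatic deletion of an item in the given workload."""
    return {
        "deleted_on": retention_end,
        # Last day the item sits in the Recycle Bin / 'Cleanup' area.
        "recoverable_until": retention_end + timedelta(days=HOLDING_DAYS[workload]),
        # After this date the default audit log no longer reaches back to the deletion.
        "audit_trail_until": retention_end + timedelta(days=AUDIT_LOG_DAYS),
    }


for workload in ("sharepoint", "exchange"):
    print(workload, deletion_timeline(workload, date(2019, 8, 12)))
```

For an item whose retention ends on 12 August 2019, the sketch gives a recovery window ending on 10 November 2019 in SharePoint but 26 August 2019 in Exchange.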

If the records are subject to a disposition review, an email is sent to the person nominated. When that person clicks on the link in the email they are taken directly to the ‘Dispositions’ sub-section of the Records Management section of the O365 Security and Compliance centre.

It is arguable that retention policies with disposition review should not be applied to ODfB content as this will require the reviewer to review all the content that has been labelled by a user in their ODfB account.

  • For more information about this subject see the Microsoft page ‘Overview of disposition reviews’. Microsoft note on that page: ‘To get access to the Disposition page, reviewers must be members of the Disposition Management role and the View-Only Audit Logs role. We recommend creating a new role group called Disposition Reviewers, adding these two roles to that role group, and then adding members to the role group.’

The dispositions dashboard shows the number of records that are pending disposition against each retention policy label:

O365_RM_DispositionsDashboard

Pending disposition tab

When the reviewer clicks on one of the retention policies listed, the following view opens for records ‘Pending disposition’:

O365_Dispositions_Pending

An important point to note here is that records are listed individually, not in logical aggregations or collections. It is possible, however, to use the Search option on the left to filter by author (emails) or SharePoint site and/or site library. It is also possible to export the details, although the export does not include any unique metadata applied to documents in SharePoint libraries.

All the records displayed may then be selected and a ‘Finalise decision’ dialogue box appears with the following options:

  • Dispose of the records.
  • Extend the retention.
  • Re-label the records.

Disposed items tab

The Dispositions dashboard includes a ‘Disposed items’ tab.

Microsoft note that this tab ‘… shows dispositions [that] were approved for deletion during a disposition review and are now in the process of being permanently deleted. Items that had a different retention label applied or their retention period extended as part of a review won’t appear here.’

Importantly, once records are permanently deleted, they no longer appear in the ‘Disposed Items’ tab. This means that no record will be kept of the records that were destroyed.

Shortcomings of the O365 dispositions/disposal model for records stored in SPO

Only individual records appear, not all the items in a document library

If the retention outcome is based on the ‘created’ or ‘last modified’ date, individual records in SPO document libraries will start to appear as soon as they reach the retention end date. The reviewer may need (or want) to view the original library, which they can identify from the link in the dispositions review pane.

Retention policies prevent deletion

As a retention label prevents the deletion of content by users, and this may put them off using SharePoint, it is recommended that retention in SPO document libraries be based on when the label was applied NOT when it was created or last modified. This will help to ensure that all documents appear in the disposition review area at the same time.

Event based triggers may not be suitable for disposition review

If the retention outcome is based on an event, or is auto-applied, and a disposition review is required, those records will appear at unpredictable times as each event is triggered. It could be difficult for records managers to decide the disposal outcome in this way without referring back to the library.

The dispositions review pane does not display the original metadata

The dispositions review pane displays only very basic metadata from the original library. Again, the reviewer may need to view the original library, export the metadata and store that in a secure location. Note that the exported metadata includes the URL of each original record including the library name.

The document library remains even when all contained records are destroyed

If the reviewer chooses to dispose of the records listed, only the content of the library (the individual documents stored in it) is deleted, not the actual library itself. No record (e.g., a ‘stub’ of the deleted item) is kept in the library of the deleted content.

The ‘Disposed items’ tab only shows records being destroyed

The ‘Disposed items’ tab only shows records in the process of being destroyed. It does not keep a record of what was destroyed. Records managers will need to retain the metadata of what was destroyed, when, based on what disposal authority, and with whose approval.
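A disposal register of this kind could be built from the details exported before the disposal decision is finalised. The sketch below is a minimal illustration only: the column names in `EXPORT` are hypothetical (the real export may use different headings), and the disposal authority reference and approver are invented placeholders.

```python
import csv
import io
from datetime import date

# Hypothetical export; the real column names in the Dispositions export may differ.
EXPORT = """Title,Location,Label,Expired
Contract-A.docx,https://tenantname.sharepoint.com/teams/finance/Contracts,Company records - 7 years,2019-08-01
Contract-B.docx,https://tenantname.sharepoint.com/teams/finance/Contracts,Company records - 7 years,2019-08-03
"""


def build_register(export_csv: str, authority: str, approver: str, destroyed_on: date) -> str:
    """Append what/when/authority/approver columns to each exported row and return CSV text."""
    reader = csv.DictReader(io.StringIO(export_csv))
    fields = list(reader.fieldnames) + ["DisposalAuthority", "ApprovedBy", "DestroyedOn"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row.update(
            DisposalAuthority=authority,
            ApprovedBy=approver,
            DestroyedOn=destroyed_on.isoformat(),
        )
        writer.writerow(row)
    return out.getvalue()


register = build_register(EXPORT, "RDA-2019-01 (hypothetical)", "Records Manager", date(2019, 8, 12))
print(register)
```

The resulting file would need to be stored somewhere that is itself subject to retention, for example a library with a ‘Never Delete’ label.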

Dispositions really only provides a ‘heads up’ for further action

The Dispositions process may be instead used as a form of ‘heads up’ that records are starting to be due for disposal in a document library. This would allow the records managers (who should be Site Collection administrators) to review the library, export the complete set of metadata, and decide if the entire library can be deleted since it is no longer required.

Conclusions

Retention labels in O365 are an effective way of managing the retention and disposal of records in that environment, subject to the following points.

Email

Emails will likely continue to be managed as complete aggregations of records – the mailbox. Users cannot be expected to create logical groupings and apply individual retention labels to those records.

Organisational records policies may mandate specific timeframes for the retention of email (e.g., 1 year), while HR/IT security policies may mandate that whole mailboxes are retained for a period of time after employees leave. It is important to understand the difference between these two models.

It may be possible to automatically transfer emails to SharePoint document libraries via rules using Flow, but this relies on individual users to set up the rules.

Consideration should instead be given to using O365 Group mailboxes, rather than individual personal mailboxes, for specific work related matters. For example, ‘Customer Complaints’, or ‘XYZ Project’.

OneDrive for Business Accounts

ODfB accounts may be covered by two forms of retention:

  • Retention labels that apply to all ODfB accounts while the account is active. These must be manually applied by users.
  • A separate retention period set for ODfB accounts after a user leaves the organisation.

If there is a requirement to prevent the deletion of content by a user from their ODfB account, the better way to achieve this is using an eDiscovery case with Legal Hold applied.

SharePoint Online

As most records will be stored in SharePoint document libraries (including Office 365 Group-based SP libraries), multiple retention labels will be required to address different types of content or retention requirements.

Careful consideration should be given to whether records can be deleted automatically at the end of the retention period or should be subject to disposition review, noting that the automatic deletion provides no opportunity to capture the metadata of the records.

The ‘auto-apply’ or event-based retention option should be used sparingly to avoid a trickle of records for disposal – unless there is enough trust that these can be accurately marked and deleted without review.

Shortcomings in the disposition review process support the following decisions for SharePoint Online content:

  • The number of retention labels should be minimised to avoid a very long drop-down menu when a label is applied. If current record retention or disposal authorities contain a lot of classes, some of these could potentially be combined into a single class (e.g., ‘Company Records – 7 years’), while the site name and document library name should provide some context to the content to ‘map’ back to the original classes.
  • Retention labels should be applied when document libraries (or lists) become inactive as this will avoid conflict with users who want to delete content and also ensure that documents are ready for disposition review at the same time.
  • Retention labels applied to SPO document libraries should include the disposition review option unless a ‘delete only’ label is considered suitable for certain document libraries that clearly contain working documents or Redundant, Outdated and Trivial (ROT) content.
  • Records managers should review the content of all or most original SPO document libraries, and export the metadata of those libraries for storage in a separate location (such as an ‘archives’ site), or in the original library with the retention label changed to ‘Never Delete’. The original document library can then be deleted.

Records retention in Exchange Online

August 9, 2019

Retention policies created as labels in the classification section of the Office 365 (O365) Security & Compliance admin centre can be applied to content in Exchange Online (EXO) mailboxes.

It may not be possible to apply more than one Office 365 retention policy to EXO mailboxes because, unless the mailbox is dedicated to a specific subject (for example, ‘Customer Complaints’) or is a dedicated Office 365 Group mailbox:

  • Emails generally contain content about multiple subjects.
  • The way the content is categorised in mailboxes, including through the use of rules and/or folders, varies between users.
  • The retention and disposal of records relies largely on the ability to assign retention policies to categories or groups of records, not individual records.
  • Organisational policies may require all user emails to be kept (‘archived’) for a period of time after they leave the organisation.

Unless emails are moved to a different storage location such as SharePoint, it may be necessary to continue to apply a single, but shorter, O365 retention policy to mailboxes.

Exchange Messaging Records Management (MRM) policies

Until Office 365 retention policies appeared as an option, MRM policies applied in EXO were likely based on an organisational business requirement to keep the mailboxes (and other content) of departed users for potential legal or compliance reasons.

MRM policies in EXO are found under the ‘Compliance Management’ section of the EXO admin portal.

EXO_Compliance_RetentionMenu.JPG

When this section is opened, the following message may appear:

EXO_Retention_Message

The default MRM policy has the following options. These may be modified, or additional retention tags created, as required.

EXO_Default_MRMPolicySettingsA.JPG

If the default MRM policies have not been changed (by the Exchange administrator), the default policy/ies will apply. This means that users can use the ‘Assign Policy’ option on folders and emails to decide how long they should be kept.

Emails that are deleted before a backup is made may not be retained.

Some organisations may decide to retain all emails and the mailboxes of departed users ‘forever’. They can do this by removing all the options except ‘Never Delete’.

How O365 Retention Policies are applied to Exchange

Retention labels created in Office 365 can be used to manage the retention of emails, including (to some degree) emails that have content that meets certain pre-defined conditions.

Retention labels are created in the Office 365 Security and Compliance admin portal under the ‘Classifications’ section. This section has three options:

  • Labels. This section is used to create both ‘Sensitivity’ and ‘Retention’ labels. There is also an ‘Auto-apply’ option in the Retention section.
  • Label policies. This section partially duplicates the options in the previous option (except the ‘Create’ option), and lists the labels that have been published.
  • Sensitive info types.

Auto-apply, as its name suggests, auto-applies an existing label based on certain conditions. The conditions are as follows:

  • Apply label to content that contains sensitive info. The sensitive info types are pre-defined options for (a) Financial data (e.g., credit card numbers), (b) Medical and Health (e.g., predefined health records), (c) Privacy (e.g., personal and sensitive information). There is also the option to create a Custom setting.
  • Apply label to content that contains specific words or phrases, or properties. This option works by looking for specific words or phrases.

New labels must be published before they appear or apply anywhere in Office 365.

During the publish process, policies must specify where (in the ‘Locations’ section) the policy is to be applied.

The default option is ‘All locations. Includes content in Exchange email, Office 365 groups, OneDrive and SharePoint documents.’ Alternatively, the policy may be set to specific locations including:

  • The Exchange mailboxes of all or specific recipients, or excluding specific recipients.
  • All or specific SharePoint sites, or excluding specific sites.
  • All or specific OneDrive accounts, or excluding specific accounts.
  • All or specific Office 365 Groups, or excluding specific groups.

Note that content in Microsoft Teams is included in the Office 365 Groups options which includes both the SharePoint content and email/Teams chat content.

Mixing MRM and O365 retention policies – maybe not a good idea

If the default MRM policies are not removed, any O365 retention policy that is applied to EXO will appear in the list of retention tags under the default MRM policy, as can be seen in the screenshot below which shows three options in addition to the original MRM policies: ‘Temporary records – 7 days’, ‘Financial Records’, and ‘Company records – 7 years’. If nothing is changed in the environment, these policies can be applied by users to folders and emails. 

O365_MRM_ExchangeRetentionoptions

If the organisation has decided to remove all retention tags except ‘Never Delete’ and a new O365 retention policy is applied to EXO, the ‘Never Delete’ option will prevail and the O365 policy will not work.

Accordingly, careful consideration needs to be given to the creation of O365 retention policies that may be applied to EXO records.

Should user mailboxes be kept ‘forever’?

Many IT departments keep user mailboxes of departed staff (and most other content on the network) for a long time, usually on backups, ‘just in case’ they may be required for legal or compliance requirements, including investigations into misconduct.

Recent personal experience with subpoenas for mailboxes of departed staff indicates that 10 years is likely to be the maximum retention requirement for these types of records. There may be a case to keep certain individual mailboxes for much longer, which the O365 policy allows for.

 

What happens when emails reach the end of their O365 retention policy period?

O365 retention policies define how long records are to be retained before they are either deleted or ready for review (via the Records Management – Dispositions section of the O365 Security and Compliance admin portal).

The following options define what happens when retention is enabled:

  • Retain the content (a) for a specific period (n days/months/years) or (b) forever. Option (b) is the same as the MRM policy ‘Never Delete’.
    • Action to be taken at the end of the period (except ‘forever’): (a) Delete the content automatically, (b) Trigger a disposition review (i.e., notify specific people), or (c) Do nothing, leave the content as is.
  • Don’t retain the content, just delete it if it’s older than n days/months/years.
  • Retain or delete the content based on: (a) When it was created, (b) When it was last modified, (c) When the label was applied, (d) based on an event.

The three actions above define the options for records managers:

  • Allow the emails to be deleted automatically. This is possibly the easiest and most efficient option but it will result in the deletion of any emails when they reach the end of the retention period – if they are kept in Exchange. Importantly, if a specific period of time (e.g., 7 years) is set for email retention, this could start to delete the emails of users who are still with the organisation after that period expires. This fact may affect the retention period that is set. 
  • Trigger a disposition review – see below. This option would be onerous to implement; it would take a lot of effort to review the individual emails of a departed user as part of a disposition review. It would, however, allow for selective review by using the ‘filter’ option in the Dispositions area.
  • Do nothing. This option may be useful for specific types of records, but not emails.

Disposition Reviews

Emails that are subject to a disposition review will appear in the Records Management – Dispositions section of the O365 Security and Compliance centre. Note that the ‘Type’ must be changed from ‘Documents’ to ‘Emails’ to see the emails that are due for disposal. As noted above, while it is possible to filter by user to review the emails, this process could be quite onerous.

O365_EXO_DispositionsA.JPG

Summary

The nature of email makes it almost impossible to sort emails into categories that map to different retention and disposal policies.

Most mailboxes will be subject to a single retention policy.

Office 365 retention policies can and probably should replace the default EXO MRM policies that govern the retention of emails.

Retaining emails in the mailboxes they are stored in ‘forever’ is not a practical retention model. 10 years is a reasonable maximum period, but exceptions may be required.

If O365 retention policies replace EXO MRM policies, records managers need to specify (a) how long emails need to be kept for and (b) whether they can simply be deleted when they reach the end of the retention period or need to be reviewed before deletion.

References

‘Overview of Retention Policies’ https://docs.microsoft.com/en-au/office365/securitycompliance/retention-policies (accessed 9 August 2019)

‘Set up an archive and deletion policy for mailboxes in your Office 365 organization’
https://docs.microsoft.com/en-au/office365/securitycompliance/set-up-an-archive-and-deletion-policy-for-mailboxes (accessed 6 August 2019)

SharePoint Online – records management options and settings

August 6, 2019

This post summarises the primary records management options, settings and ideas that can be applied in SharePoint Online to manage records.

This post should be read as the second part of my previous post on the records management options and settings available in the Office 365 admin and security and compliance portals. Some of these settings will be referred to in this post.

The options and settings described in this post should ideally form part of your SharePoint governance documentation.

SharePoint Governance

We have already seen in the previous post that Office 365 Global Admins (GAs) have access to all parts of the Office 365 ecosystem. But they should rarely be solely responsible for SharePoint Online (SPO).

Some form of governance arrangement is necessary for SPO, especially if you plan to manage records in that application.

Some of the key considerations are as follows.

  • Who is responsible for ‘marketing’ or promoting SharePoint in the organisation, and making sure it is used correctly? The area responsible in IT for change management should probably take the lead on this as SPO is only one part of the O365 ecosystem. Records managers should have a role too, or be consulted.
  • SharePoint Administrator. You should already have a SharePoint Administrator and that person (or persons) is likely to be sitting in your IT department. Records managers will rarely also be SharePoint administrators; the two need to work closely together.
  • Who is responsible for training people to use SharePoint, especially to highlight the recordkeeping aspects of the application?
  • Who are the Site Collection Administrators? See next point.
  • Who are the Site Owners?
  • Who can create Office 365 Groups?

Answers to these questions should all be documented in your governance documentation.

SharePoint Online Admin Portal

SharePoint Online customised administrator

The SPO administrator role, a ‘customised administrator’ set in the Office 365 (O365) portal, should normally have a log on that is separate from that person’s O365 user log on. The SPO administrator account should not be a generic one (and generic accounts should generally be avoided).

The SPO administrator accesses the SPO admin portal from the Office 365 admin portal. They will also have access to the O365 Message Centre and Service Health sections.

SharePoint Online Architecture

Why a design model is good to have

Organisations should have some sort of design model for their SPO architecture. Most records will be kept in document libraries in SharePoint team sites under the /teams path but some could also be under the /sites path.

The design model should include naming conventions for sites to avoid site names that have unknown acronyms or complex names. Site names form part of the total 400 characters allowed from https to the document suffix (e.g., .docx) so site names should ideally be no longer than around 16 characters. For example:

https://tenantname.sharepoint.com/teams/sitename
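A naming convention like this can be checked mechanically when new sites are requested. The sketch below is a minimal illustration that uses the two figures given in this post (400 characters in total, around 16 characters for the site name) as constants; the function names are hypothetical, and the 16-character figure is a rule of thumb rather than a platform limit.

```python
MAX_URL_LENGTH = 400      # total length quoted in this post
SUGGESTED_SITE_NAME = 16  # rule of thumb from this post, not a platform limit


def check_site_name(site_name: str) -> list:
    """Return a list of problems with a proposed site name (empty if none)."""
    problems = []
    if len(site_name) > SUGGESTED_SITE_NAME:
        problems.append(
            f"site name is {len(site_name)} characters "
            f"(suggested maximum {SUGGESTED_SITE_NAME})"
        )
    return problems


def remaining_characters(tenant: str, site_name: str, managed_path: str = "teams") -> int:
    """Characters left for library, folder and file names once the site URL is accounted for."""
    base = f"https://{tenant}.sharepoint.com/{managed_path}/{site_name}/"
    return MAX_URL_LENGTH - len(base)


print(check_site_name("sitename"))                     # []
print(remaining_characters("tenantname", "sitename"))  # 351
print(check_site_name("FinanceAccountsPayable"))
```

Every character saved in the site name is available to the library, folder and file names beneath it, which is where users tend to be most verbose.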

Records managers should be involved in designing this architecture model and could also be part of any approval process for new sites, to ensure the proposed names are suitable.

The names of SPO sites should generally map to business functions. Where the main function is very large (e.g., Financial Management), you may decide to create sites based on the ‘sub-function’. That is, instead of a single, broad Financial Management (or simply ‘Finance’) site, you could have a separate site for Finance AP and another for Finance AR. These can be linked to a hub site that could be the ‘parent’ function site.

Don’t mix functions (such as personnel and IT) in the same site, if only because such a site is likely to become very large.

Try to aim for team site coverage of all business areas as all areas are likely to create or maintain records.

One relatively easy way to do this is to consult with the business area and understand how they use their current Network File Share location. This has the additional benefits of ‘mapping’ their SPO site to their existing NFS structure (generally or very specific) so it is familiar to them, and assisting with the migration of NFS to SPO later on.

Creating new Site Collections

Generally speaking, there are now three types of SPO site:

  • A team site not linked to an O365 Group (but can be retrospectively linked)
  • A team site linked to an O365 Group
  • A communication site

Again, generally speaking:

  • SPO team sites (linked or not with O365 Groups) are the functional replacement for network file shares and, accordingly, contain most of the ‘document’ type records.
  • SPO communication sites are used for publishing purposes, including the intranet. They may contain documents in document libraries that, again, replace network file shares previously used for this purpose.

New sites can be created:

  • Directly from the SPO admin portal.
  • Via the ‘Create Site’ dialogue available in each user’s SharePoint portal, when this option is enabled. When this option is enabled, users can create either a Team site (linked with an O365 Group), a Communication site, or a ‘classic’ site (not linked with an O365 Group).
  • When a new Office 365 Group is created. This includes, if enabled, when a new Team in MS Teams is created, a Yammer group is enabled, or a person chooses to create a new group from Outlook. If this option is allowed, whoever creates the O365 Group becomes the Site Collection Administrator and the SharePoint admin will be unable to access the site. For this reason, organisations that want to control their SPO environment may wish to limit who can create Office 365 Groups.
  • Via a PowerShell script.

SharePoint Sites

Site Collection Administrators (SCA)

Every SPO site has Site Collection Administrators. To ensure that records managers can access every site to manage records, it may be useful to add them to the membership of a Security Group that is in turn added to every site’s Site Collection Administrators after it is created.

Site Collection Administrators are added and managed in Advanced Permission Settings.

SPO_AdvancedPermissions

When you click on Site Collection Administrators, this dialogue appears:

SPO_SCAs.JPG

As noted above, if the ability to create O365 Groups is not controlled, the person who creates the O365 Group (as noted in the screenshot above) will become the SCA. The SharePoint administrator will be able to see the site in the SPO admin portal but may not be able to change the SCA settings. They may need to ask a Global Admin to do this.

Being a Site Owner only is not sufficient for records managers. A Site Owner should be someone in the business area who ‘owns’ the SPO site and will manage it on a day-to-day basis.

Site collection features – document IDs and Document Sets

Site collection features are only accessible to Site Collection Administrators. The list below expands as new features are activated; as can be seen, the ‘Document ID service’ feature has been enabled on this site. (Note, ‘Site features’ are activated from the Site Administration section, see below).

SPO_SiteCollectionAdminOptions.JPG

The Document ID feature is required for recordkeeping purposes as it assigns a unique Document ID to every object (including document sets but not folders) stored in a library.

If Document Sets are to be used in the site, the Document Set feature is also enabled in the Site collection features section.

SPO_DocIDDocSetSiteCollectionFeatures.JPG

After the Document ID service is enabled a new option appears in the Site Collection Administration section called ‘Document ID Settings’ (as noted above).

As can be seen in the screenshot below, all Document IDs begin with a unique prefix of up to 12 characters. Ideally, the Site name should be used as the prefix, as this immediately identifies the site that a document came from.

SPO_DocIDSettings.JPG

Document IDs take the form:

  • Prefix (e.g., ‘SITENAME’)
  • Library number. This is a unique and un-modifiable number of the library where the document is stored. It is not based on the library GUID.
  • Next sequential number.

If a document is deleted or moved from the library, the document ID (the sequential number) is not re-used.

Note that Document Sets use the same Document IDs. These cannot be separately modified.
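
The three-part structure described above can be illustrated with a short Python sketch. It assumes the three parts are joined with hyphens (for example, SITENAME-12-345); the separator, the helper names and the sample values are assumptions made for illustration and are not taken from any SharePoint API.

```python
import re
from typing import NamedTuple


class DocumentId(NamedTuple):
    prefix: str          # site-level prefix set in Document ID Settings
    library_number: int  # fixed number of the library holding the document
    sequence: int        # next sequential number within that library


# Assumed layout: PREFIX-LIBRARY-SEQUENCE, with a prefix of up to 12 characters.
_DOC_ID_PATTERN = re.compile(r"^([A-Za-z0-9]{1,12})-(\d+)-(\d+)$")


def parse_document_id(value: str) -> DocumentId:
    """Split a Document ID string into its three parts."""
    match = _DOC_ID_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"Not a recognised Document ID: {value!r}")
    prefix, library_number, sequence = match.groups()
    return DocumentId(prefix, int(library_number), int(sequence))


def same_library(a: str, b: str) -> bool:
    """True when two Document IDs share the same prefix and library number."""
    left, right = parse_document_id(a), parse_document_id(b)
    return (left.prefix, left.library_number) == (right.prefix, right.library_number)
```

A helper like this may be handy when reviewing exported library metadata, for example to group records by the library they were created in.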

Site collection features – Site Audit logs

The option ‘Site collection audit settings’ will already be visible in the Site Collection Administration section of all new sites. However, (a) the options in the audit settings need to be enabled and (b) the ‘Reporting’ Site collection feature must be activated before Site Audit Logs can be produced.

Note, the Site collection audit settings page states: ‘If you’d like to keep audit data for longer than this, please specify a document library where we can store audit reports before trimming occurs’. The default storage location is /_catalogs/MaintenanceLogs. However, the various options shown below must be selected for anything to be saved.

SPO_SiteCollectionAuditOptions.JPG

Enabling ‘Reporting’ adds a new option, ‘Audit log reports’, to the Site collection administration section. This option allows the Site Collection Administrators to create one-off audit log reports for a range of activity on the site, going back 90 days:

  • Content Activity
    • Content viewing
    • Content modifications
    • Deletion
    • Content type and list modifications
  • Information Management
    • Policy modifications
    • Expiration and Disposition
  • Security and Site Settings
    • Auditing settings
    • Security settings
  • Custom
    • Run a custom report

The 90 day time period is the same as the O365 audit logs accessible from the Security and Compliance ‘Search’ section. If audit logs are required for longer periods, an add on may be required.

Metadata – Site columns or the Managed Metadata Service

The architecture model and/or business requirements may require the use of specific metadata across your environment. Metadata may be set in three ways.

Managed Metadata Service. This option is effective if you need to use the same metadata columns on multiple sites. Experience suggests that this option will be used selectively.

Site columns. These are in addition to the many columns that already exist by default on every site. This option is very effective if the same metadata needs to be used in multiple document libraries or lists on the same site. It is not accessible on any other site. In document libraries or lists, it must be added as an existing site column (i.e., not via the Create new column option).

Library columns. These columns are created in individual libraries or lists and are not accessible on any other library or list.

All new Site and Library columns have the following options:

SPO_NewColumnOptions.JPG

Each new column may be created in an existing or new group. It may also be (a) made mandatory and/or (b) set to enforce unique values. Note that making a Site Column mandatory and adding it to a document library will make the library read only in File Explorer if it is synced there.

Columns may have default values and may also include JSON formatting codes.

When Site columns are added to a document library, including via a Content Type (see next section), users may be required to fill in the required metadata (especially if it is mandatory).

Site Content Types

Site Content Types are a way to define metadata requirements for different types of documents, using Site columns. The default ‘document’ Content Type on every new SPO site is simply ‘Document’; all new document-based Content Types will be created using that one as a template.

Site Content Types may also incorporate standard document templates (via the ‘Advanced’ section). These templates can be auto-populated using the library metadata. In any case, all metadata in a document library is added to any Office document as its metadata ‘payload’.

Once created, Site Content Types must be added to each individual library where they are to be used. To do this, the individual library must have the setting ‘Allow management of content types’ enabled in the ‘Advanced’ section of the document library settings.

SPO_DocLib_ContentTypesYesNo.JPG

When Content Types are enabled in this way, some other drop down features in the ‘+ New’ option on the library disappear, such as the ability to create Word, Excel or PowerPoint documents as can be seen below (the option on the right shows when Content Types are allowed).

Aggregations, containers, ‘files’ – Site Libraries

SharePoint document libraries are the container, aggregation or ‘file’ (if you will) in which records are stored. They are the functional replacement for network file shares. You may end up migrating from those NFS to SharePoint.

Naming conventions for new document libraries are useful to have but the extent to which you require people to follow them (if Site Owners create them) may differ between organisations.

Document libraries ideally should contain only a year of content; including the year in the library name is a good way to maintain year-based content, which in turn makes it easier to manage at the end of the record’s life.

Avoid using the generic ‘Documents’ library that comes with every new site because users will create folders with uncontrolled names and content.

All SPO document libraries and lists have default views of the metadata. These views can be modified as required (via the option on the top right of the menu bar) with a range of additional metadata that is by default hidden from view. Multiple views can be created; pre-defined views may sometimes be easier than expecting users to depend on searches.

Document libraries include all the usual and expected document management functions including check out/in, copy to or move to and versioning.

SPO_DocumentOptionsincCheckOut.JPG

Users with Contribute or Edit permissions can view and restore versions.

SPO_DocLib_VersionHistory.JPG

If there is a requirement to know who modified what part of a document, it is recommended to enable track changes on that document.

Note, with co-authoring now available, the last person to edit the document will create the last version.

Folders and document sets

Folders should be seen as visual ‘dividers’ within a file, not as ‘hard-coded’ structures as they are in file shares.

Document sets can include additional metadata (including a document ID), making them suitable for use in breaking down a document library. However, for most of the time, folders are a more logical ‘divider’ for users.

Note that both document sets and folders look the same in a synced library.

Both folders and document sets can have unique permissions.

Create and capture records

One of the best reasons for using SharePoint is the ability to create a single source of truth. That is, a single record stored on a library that multiple people can access and work on at the same time.

Having a single source of truth avoids the need to (a) create an initial copy on a personal drive or network file share, and (b) attach that copy to an email and send it to multiple people, who are all likely to save it somewhere and also send back a changed version.

In SPO, users can create a new record directly within a document library (or in the synced library on a drive). Anyone with access to that library can access it; alternatively the document can be shared. Co-authoring means that anyone with edit access can edit the document. Every time it is edited and closed a new version is created.

If it is necessary to refer to the original from another library, the ‘Link’ option can be used.

Access controls and permissions

All SharePoint sites contain three default permission groups. Individuals will usually be added to one of these groups only, depending on their access requirements:

  • Site Owner – Full Control across the site but cannot see the Site Collection Administration section (shown above). There will normally be only two to four Site Owners. Site Owners are responsible for managing their sites.
  • Site Member – Update and edit.
  • Site Visitor – Read only.

All content on a SharePoint site inherits the default permissions above. However, at any point the default permission inheritance can be broken and unique permissions applied. This is a manual process for document libraries (via Advanced Permissions) but happens automatically if a folder or document is shared with someone who is not in a default permission group.

Note, one of the leading support issues in SharePoint is understanding and unravelling complex permissions, especially when applied to individual documents that are placed ‘under’ folders with unique permissions, in libraries with unique permissions.
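
The inheritance behaviour described in this section can be sketched in a few lines of Python. This is a simplified model for illustration only, not SharePoint's actual permission engine; the class name, the group names and the permission labels are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Securable:
    """A site, library, folder or document in a simplified permission tree."""
    name: str
    parent: Optional["Securable"] = None
    # None means "inherit from the parent"; a dict means inheritance was broken.
    unique_permissions: Optional[Dict[str, str]] = None

    def effective_permissions(self) -> Dict[str, str]:
        """Walk up the tree until an object with its own permissions is found."""
        node: Optional["Securable"] = self
        while node is not None:
            if node.unique_permissions is not None:
                return node.unique_permissions
            node = node.parent
        return {}


# Illustrative tree: a folder with unique permissions inside an inheriting library.
site = Securable("Site", unique_permissions={
    "Owners": "Full Control", "Members": "Edit", "Visitors": "Read"})
library = Securable("Meetings 2019", parent=site)
folder = Securable("Board", parent=library,
                   unique_permissions={"Owners": "Full Control"})
document = Securable("Minutes.docx", parent=folder)
```

In this model the document takes its permissions from the folder rather than from the site, which is the kind of layering that becomes hard to unravel once it is repeated across many folders and documents.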

Retention and disposal

Generally speaking, a SPO site collection will consist of multiple libraries, each (ideally) containing content that is specific to the activity that it relates to. For example, a library for ‘Meetings’ and one for local forms. Consequently, the records stored in document libraries may require different retention.

If O365 Classification labels are used for retention then, depending on how these are configured, they must be applied per library; the individual documents stored in the library, not the entire library as such, are then governed by the retention requirement.

It is also possible to apply a policy to an entire site collection via site policies. This option will only be useful if the entire site can be subject to a single retention requirement – for example, inactive old sites that have a range of content all likely to be covered by the same retention period, or project sites.

Once retention policies are applied to a library, users cannot delete any content in the library, so it may be prudent to apply them when the libraries are no longer used instead. Hopefully you will have implemented year-based libraries, which will facilitate this. Alternatively, the retention period trigger can start when the actual policy is applied.
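
For planning purposes, the two timing options above can be compared with some simple date arithmetic. The Python sketch below is an illustration only: Office 365 calculates the actual expiry itself, and counting from the last day of a library’s year is an assumption made here, as are the function names.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def due_from_library_year(library_year: int, retention_years: int) -> date:
    """Estimate the disposal date for a year-based library.

    Assumes retention is counted from 31 December of the library's year.
    """
    return add_years(date(library_year, 12, 31), retention_years)


def due_from_label_date(label_applied: date, retention_years: int) -> date:
    """Estimate the disposal date when retention starts on the day the label is applied."""
    return add_years(label_applied, retention_years)
```

For example, under these assumptions a ‘Meetings 2019’ library with a 7 year retention would fall due at the end of 2026, whereas a label applied on 12 August 2019 would fall due on 12 August 2026.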

It may be useful for the records manager to review the content of document libraries, and perhaps export the metadata of the library, before the content is disposed of via the O365 Security and Compliance area as any unique metadata is not visible in the Dispositions area.

When records are due to be disposed of, an email is automatically sent to whoever is in the Records Management role in the O365 Security and Compliance admin portal. The activity of reviewing and approving disposals happens in the same portal.

It may be useful to set up an ‘Archives’ SPO site to keep records of all disposal activities, including metadata from document libraries.

Note the library will remain even after the documents are destroyed. An alternative and perhaps better disposal model would be to use the notifications to alert the records manager to the records due for disposal; the records manager may then export and save the metadata in a SharePoint archives site, and then delete the library entirely.

Note that the retention of records in Exchange Online mailboxes and OneDrive may be managed differently by the organisation.

Minimising duplication of content

SharePoint allows organisations to have a single source of truth, to avoid the duplication of using NFS and then uploading to a document management system.

Users can create the record within a SharePoint library, upload it there, or use the ‘save as’ option (where you will see all your SPO sites to choose from).

The ability to share with external users (when this is enabled) also helps to reduce duplication and email attachments.

Hybrid records

As noted above, links can be created in any SPO document library to point to resources in a different location. If paper records are managed in a SPO list, the document library can include a link to that SPO list.

Syncing document libraries to File Explorer

Users with an O365 licence and Windows 10 may use the ‘sync’ option available on the ribbon menu of every library. This option syncs the document library to the user’s File Explorer from where they may continue to access and work on the documents.

Note, as discussed in this post, if there is any mandatory metadata in the library, the synced library will become read only.

End users like using the sync option as, although it doesn’t (yet) display any unique metadata on the library, it allows them to work the way they have always worked and they get the added bonus of being able to do it on any device.

eDiscovery

eDiscovery cases are created in the O365 Security and Compliance portal. Essentially, an eDiscovery case uses search and other options to find records. Once found, these records can be placed on Legal Hold, which prevents their disposal.

If a document library has no retention label applied, but some or all of its content is covered by an eDiscovery case with a Legal Hold, a record deleted by a user is kept in a hidden library that remains visible to the eDiscovery case manager. Once the Legal Hold is lifted, the record resumes the 90 day deletion process, after which it is no longer available.

Search

Search in SPO, and across all of Office 365, is very powerful. A single click in the Search box in the user’s SPO portal will result in suggestions before anything is entered.

Searches will return anything the user has access to. The access limits plus the Artificial Intelligence (AI) engine will return different search results for different users.

Users may also take advantage of the Office Graph-powered Delve (E3 licences and above) or the Discovery option in OneDrive to see information that may be of interest to the user. This works on the basis of the various ‘signals’ between users and objects, as depicted in the graphic below.

microsoft-graph_hero-image.png

 

Office 365 admin and Security and Compliance portals – records management options and settings

August 1, 2019

If you plan, or want to understand how, to manage records ‘out of the box’ in the Office 365 ecosystem including in SharePoint Online, Exchange Online and MS Teams, you will need to know the available options and settings. These would normally be set by the Office 365 Global Admins (GAs) or, in some cases, devolved to Customised Administrators. GAs have access to all parts of the Office 365 environment including SharePoint Online, Exchange, OneDrive and Microsoft Teams.

See the next post for a list of the options and settings available in SharePoint Online to manage records.

Note, the description below is for a typical E3 licenced level organisation. E5 licences provide additional capability some of which is referenced below with a comment.

Office 365 admin portal options and settings

The options and settings in the Office 365 admin portal required to manage records are listed below.

Customised administrator

In addition to the GAs, the Office 365 admin portal is where customised administrators are set up. Typically these admins will have log ons that are different from their normal user log on and will not need the full range of licence options. The SharePoint Admin role is a customised administrator.

Records managers could potentially be SharePoint Admins if they are suitably skilled. Otherwise, at the very least they should be Site Collection Administrators and work closely with the SharePoint Admins to ensure that SharePoint Online (SPO) is configured correctly.

Office 365 Groups

Records managers need to understand how Office 365 Groups work.

Most people know that Distribution Lists (DL) are used to send emails to multiple people. However, DLs cannot be used to control access to IT resources; this is achieved by using Security Groups (SG). SGs, on the other hand, are not email enabled.

Office 365 (O365) Groups are ‘kind of’ a mix of DL and SG functionality in that they can be used to control access to certain resources in Office 365 (including SPO) AND they can be used to contact all members of the Group.

But O365 Groups are much more. They are in many respects central to Office 365.

  • Every new O365 Group creates a SharePoint site (this is not optional).
  • If the creation of O365 Groups is not controlled, every new Team in MS Teams creates an O365 Group that in turn creates a SPO site.
  • If you use Yammer, every new Yammer group also creates an O365 Group that creates a SPO site.
  • Again, if not controlled, any user can create a new O365 Group from Outlook.

In short, you need to either allow their creation and expect to see multiple uncontrolled SPO sites, or control their creation. There is no middle path.

Additionally, if the creation of O365 Groups is not controlled, the Owners of the new O365 Group (usually the person who created it and anyone else they invite) will become the Site Collection Administrators, locking the SharePoint Admins out of the site. They will need to call on the O365 GAs to give them access to the site.

External Sharing for SharePoint and O365 Groups

Although it relates more to security, external sharing is an option and setting that may require input from the information or records management area. External sharing is initially enabled in the O365 Admin portal in the Settings – Services and Add-ins section.

O365_Admin_SPOExternalSharing

Note, even if this setting is enabled, SPO sites don’t have this enabled by default. The setting is controlled from the SharePoint Admin portal.

External access for Office 365 Groups is set in the following setting:

O365_O365GroupsExternalSharing.JPG

Office 365 Security and Compliance admin portal options and settings

The options and settings in the Office 365 Security and Compliance admin portal required to manage records are listed below.

Permissions – Roles – Records Management (and others)

The Security and Compliance admin centre includes several roles in the ‘Permissions’ section that may be required by records and/or information management staff, especially to establish records retention schedules, manage dispositions, check audit logs and manage eDiscovery cases and legal holds.

Classification – Labels (Records Retention labels)

Records retention policies in O365 are set in the O365 Security and Compliance Portal in the Classifications section. These retention policies may be applied across SPO, Exchange Online and Teams.

Some thought needs to go into this including potentially grouping policies that have the same retention requirement (e.g., 7 years), or using the File Plan (see below) and other options now available to group them. This requires records management input.

O365_FilePlanDescriptors

Classification policies used for records retention will be applied across all of the O365 environment, not just SPO. However, your IT department may want to implement different rules for Exchange (e.g., using the default MRM policy to keep all emails ‘forever’) or OneDrive (e.g., a 7 year retention for everyone’s content after they leave).

Click this link for more details about Retention Policies.

Records Management – Dispositions

The O365 Security and Compliance Centre includes a ‘Records Management’ section that has three options: File Plan, Events, Dispositions. Records Managers should have access to these areas; this is achieved by them having the ‘Records Management Role’ in the ‘Permissions’ section.

The ‘File Plan’ section displays a list of retention policies (labels) with any details added to the ‘File Plan’ section (shown above), thereby providing the records manager with a view of all labels and any added details, for example by numbering, citation and so on.

The ‘Events’ section shows any events that have been defined for use in retention policies.

The Dispositions section has two parts, a basic dashboard that shows all retention policies and the number of records covered by those policies:

O365_RM_DispositionsDashboard

If the records manager clicks on any of the policies it displays the records due for disposal and provides the various options for disposal. It also shows records that have been disposed on a separate tab.

O365_Dispositions_Pending

Search

The search section has two options: Content search, and Audit log Search. Access to both may be controlled, but records managers may need to be able to request information from either via the GAs.

eDiscovery

The eDiscovery section is where eDiscovery cases are established. Cases are a form of content search that, once completed, puts any retention policies on hold (legal hold) until the case has been removed.

eDiscovery cases may include searches across all of Office 365 (Exchange email, O365 Group email, Teams messages, To-Do, Sway, Forms, SPO, OneDrive, O365 Group SPO sites, Teams sites, Exchange public folders) or selected parts only. They may also be used to search mailboxes for specific individuals or selected SPO sites.

Governance

All of the above (and all other settings) should form part of a governance document that details the O365 environment. Settings should only be changed with agreement of everyone in a governance team.

How SharePoint column settings can affect libraries synced with File Explorer

July 23, 2019

In my previous post I described how end users might adopt SharePoint (Online) more quickly if they can work the way they always have – in File Explorer. And this is certainly the case if there are no mandatory columns in the source SharePoint library.

However, the presence of mandatory columns (‘Required’ in Content Type settings) may result in different outcomes, in both the synced library in File Explorer and also in the SharePoint Online library. This post describes the various options.

Thanks to Francis Laurin for pointing out a particular issue, described below.

Synced library outcomes in File Explorer

There are five potential outcomes when a SharePoint Online document library is synced with File Explorer via the newest version of the OneDrive sync tool and Windows 10.

The first option is when there are no mandatory columns in the SharePoint document library, including when the document library contains a Content Type that contains a Site Column that is not set as mandatory in the Site Column settings.

In this option, when the document library is synced, the documents appear with a cloud icon under the Status column. If the document is edited, the Status icon changes to a green circle with a tick.

O365_SyncedLibrary_nonMand1.JPG

The second option is when the SPO document library has a ‘local’ column that is mandatory. When this happens, the OneDrive for Business Sync (ODBS) client alerts the end user that the synced library is now read only. No documents can be added or edited via File Explorer.

O365_SyncedLibrary_LocalMand1

The third option is when (a) a Site Content Type (CT) is added to the library (after enabling the management of content types via the Advanced Library settings), (b) that CT has a site column that is made mandatory via the Site Columns settings, and (c) the CT is not the default or only CT.

In this case, the synced library in File Explorer is made read only (same as the second option above).

The fourth option is when (a) a Site Content Type (CT) is added to the library, (b) that CT has a site column that is made ‘Required’ via the CT settings, and (c) the CT is not the default or only CT.

In this case, the synced library in File Explorer is the same as the second option above. That is, the library in File Explorer is locked so end users can NOT upload documents or edit existing ones.

The fifth option is when (a) a Site Content Type (CT) is added to the library, (b) that CT has a site column that is made ‘Required’ via the CT settings, and (c) the CT is the default or only CT.

In this case, when documents are uploaded to the synced document library in File Explorer, the document is added but is automatically checked out in the SPO document library and ONLY visible to the person who uploaded it.

  • The end-user can see their uploaded document in File Explorer but anyone else who syncs (or has synced) the library cannot see the newly uploaded documents.
  • The end user can also see the document in the SharePoint library, which now has the ‘checked out’ and ‘required info’ missing notifications (see screenshot below). No-one else can see the document, which now sits in the ‘Checked out documents’ part of the library.

O365_SyncedLibraryCheckedOutDocSPO

It is quite likely that an end user will not know that the document they have uploaded is only visible to themselves and checked out in the SPO document library. They will also not know, when they upload a document, that a special CT may exist. Accordingly, care must be taken with the fifth option.
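
The five outcomes above can be summarised as a small decision function. The Python sketch below only encodes the combinations described in this post; the function, the parameter names and the enum are assumptions made for illustration, and any combination the post does not cover is reported as such rather than guessed.

```python
from enum import Enum


class SyncOutcome(Enum):
    READ_WRITE = "documents can be added and edited in File Explorer"
    READ_ONLY = "synced library is read only in File Explorer"
    CHECKED_OUT = "uploads are checked out and visible only to the uploader"
    NOT_DESCRIBED = "combination not covered by the post"


def sync_outcome(local_mandatory: bool,
                 site_column_mandatory: bool,
                 required_in_content_type: bool,
                 content_type_is_default: bool) -> SyncOutcome:
    """Map a library's column configuration to the outcome described above."""
    # Second option: a mandatory 'local' library column locks the synced library.
    if local_mandatory:
        return SyncOutcome.READ_ONLY
    # First option: nothing is mandatory anywhere.
    if not site_column_mandatory and not required_in_content_type:
        return SyncOutcome.READ_WRITE
    # Third and fourth options: mandatory or Required column in a CT
    # that is not the default or only CT.
    if not content_type_is_default:
        return SyncOutcome.READ_ONLY
    # Fifth option: 'Required' via the CT settings on the default or only CT.
    if required_in_content_type and not site_column_mandatory:
        return SyncOutcome.CHECKED_OUT
    return SyncOutcome.NOT_DESCRIBED
```

A check like this could be run against a list of planned library configurations before recommending the sync option to end users.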

Recommendations for the use of File Explorer

If you are planning to suggest that users sync their document libraries to File Explorer, try to ensure:

  • There are no mandatory columns in the library, including Site Columns that are made mandatory, including in added Content Types. This will lock the library in File Explorer.
  • There are no columns in a Content Type that have been changed from Optional to Required AND where that CT is the only or default CT for the library. Although this option allows end users to upload documents, it will cause documents to be automatically checked out when they are uploaded. Consider instead leaving the default ‘Document’ CT in place or using non-mandatory columns.

 

Using the Sync option to work smarter and reduce duplication, and increase end user acceptance of SharePoint

July 18, 2019

Note: A correction was made to this post on 20 July 2019, relating to if a document library contains mandatory metadata.

Perhaps the single most common complaint about using electronic document management (EDM) systems over the last two decades has been the requirement to save a copy of a record stored on a network file share to the EDM system.

Network file shares are littered with documents, many of them duplicated in other locations, on personal drives (and removable drives), and attached to email messages. Some of these documents may also have been saved in the EDM system. 

It is a known fact that legal discovery activities rarely focus solely on the records in an EDM system, no matter how good that system may be. As long as network file shares (and personal drives) have existed (and continue to exist) alongside EDM systems, the latter has always been the poorer sibling in terms of information value.

Various attempts over the years by EDM vendors to ‘integrate’ their products with network file shares (often via WebDAV – see below) have rarely been successful not the least because the folder structure of the network file share is inevitably more useful and flexible than the often rigid structure of the EDM.

*WebDAV, or ‘Web Distributed Authoring and Versioning’ (RFC 4918) is ‘an extension to HTTP, the protocol that web-browsers and web servers use to communicate with each other’. WebDAV facilitates collaborative authoring, editing and file management. The most common usage of WebDAV is to map cloud storage as a network drive. (Source: WebDAV: What it is, where it turns up, and its alternatives, retrieved 18 July 2019)

The old ‘Groove-y’ way

Microsoft Office Groove 2007, or ‘Groove’, was a Microsoft Office component that used WebDAV to synchronise with a SharePoint library, allowing the library to be opened from Windows Explorer. (Source: Understanding and troubleshooting the SharePoint Files tool in Groove 2007, retrieved 18 July 2019)

While this method worked, it was clumsy and difficult to use. Duplication on network file shares continued.

2018 – The new OneDrive for Business sync client

The previous Groove OneDrive for Business sync client (Groove.exe) was included with the Windows 10 Operating System that was released in mid 2015.

The new SharePoint Online became widely available from 2016 and has continued to evolve. Initially, it was only possible to synchronise a SharePoint Online document library using WebDAV methods.

The new OneDrive sync client (OneDrive.exe), also known as the Next Generation Sync Client (NGSC), appeared in early 2018. The new sync client allowed users (with Windows 10 devices) to sync their SharePoint document libraries to File Explorer.

A mostly unnoticed but significant change

The sync option on SharePoint document libraries (in addition to OneDrive and OneDrive for Business) is possibly one of the least noticed changes that has the potential to have – ironically – both a major and also minor impact on the way people work.

It is a minor impact because – provided the synced document library does not have mandatory metadata (see below) – the change effectively allows users to continue working the way they always have, in File Explorer, going only to SharePoint Online when they need to.

It is a major impact because, coupled with the ability to ‘share’ content easily (directly from File Explorer), the potential for duplication – except for the duplication between ‘work’ and ‘personal’ spaces – has been removed. Everyone with access to it can sync the same document library and multiple people can work on documents in the library at the same time.

Instead of creating a ‘working’ document on a drive and perhaps emailing it to everyone, there now only needs to be a single copy that multiple people can access – via File Explorer, at the same time. Everyone with access can see when any other person is editing.

That is, end users can continue to work in File Explorer, the way they have always done. In that sense, the ability to sync document libraries makes redundant the need to open a browser and access SharePoint that way. (This in turn impacts on the way change is managed and perhaps how each SharePoint site might be configured).

How it works

As a start it should be emphasized that this works best with Windows 10 as Windows 7 devices may still have the old ‘Groove’ client installed.

Please note also that this only works if there is no mandatory metadata on the document library. If there is, the users will be unable to add new content to the synced library, or edit existing documents. See below for more information.

Users need to go to the SharePoint site first and click on the library they want to sync. Users need to have edit rights on the library to sync it.

They should then see the Sync option:

O365_SyncRibbon

The OneDrive for Business client notifies the user that the library will be synced.

O365_Sync_ODfBClientB

The library is then synced to the user’s File Explorer.

Note: If the document library has any mandatory metadata, the user will be notified via a pop-up that the library has been synced in ‘read only’ mode.

A new icon (with the Office 365 tenant name) appears on the left, and each document library that is synced is shown as a folder beneath it.

If the document library has any mandatory metadata columns OR the library requires check out (via Versioning settings), an additional ‘lock’ appears to the right of the sync status. This means the documents cannot be edited and new documents cannot be added. (Source: Sync SharePoint files with the new OneDrive sync client)

SPO_FileExplorerLockMandatoryMetadata

If neither condition applies, end users can work directly in the synced document library in File Explorer, including adding new folders and documents.

End users may also select which folders they wish to sync either by opening a folder in SharePoint and syncing from there, or by right clicking on the folder that was synced, clicking on ‘Settings’ and removing any unwanted folders. This, of course, could mean that users don’t see new folders they really should see and may as a result attempt to create one with the same name (which will be rejected).

Documents are not downloaded to the user’s computer until they open them. This can be seen below in the first document with a circle/tick icon (downloaded) and the three others with cloud icons (not downloaded).

O365_Sync_FileExp_Docs.JPG

The user can right-click and use the Share option (the same as in SharePoint Online) to share the document with colleagues which (as long as the person sharing has the permission to do so) gives the other person access if they didn’t have it before. The three dots at the top right of the dialogue box provide the option to manage access to the document.

O365_Sync_FileExpl_ShareOption

Note: End users cannot copy and paste a link to sync a library; the sync runs from a user’s computer and is personal to their log on and their device.

End user reactions

Personal experience supporting thousands of end-users with access to SharePoint Online indicated that this was perhaps one of the most useful features ever released.

Several people noted that they regarded the sync option as a ‘cloud-based backup’. Some indicated that they rarely returned to the browser version of SharePoint for their key document libraries (which may be a problem).

What about metadata and content types?

Presently, document libraries synced to File Explorer do not display any metadata associated with the document or document library, only the icon, name, date, type and size.

However, Microsoft Office documents (Word, Excel, and PowerPoint) retain any original metadata in the document properties (the ‘metadata payload’) and these properties may be changed on the document itself via the ‘File’ option.

Any metadata columns are also ignored; a user may add a document directly to the synced document library in File Explorer without having to add metadata. Note that this is the same behavior in SharePoint Online; if a document is added to a library with a metadata column, a warning appears (see screenshot below) but the document can still be uploaded. (This paragraph was corrected on 20 July 2019 to remove reference to mandatory columns, which make the synced library read only).

SPO_ExampleMissingMetadata

Note also that new options coming soon to SharePoint Online, which will also be seen via the ‘Share’ option in File Explorer, include the ability to set restrictions (such as on printing or downloading) and expiry dates.

The new way of working

The old way of working was to create and manage documents on network file shares and personal drives, emailing copies as required. Adding documents to EDM systems was an additional and disliked step that in most cases created a copy of a document that still remained on a drive somewhere. (And, in many cases, the EDM system had a linked file share where the documents were stored).

The new way of working minimises the need for duplication.

  • Users create a new Office document (including directly from OneDrive or SharePoint, where it is automatically saved in the library from which it was created)
    • If the document was not created from OneDrive or SharePoint, the ‘save’ dialogue presents the following locations by default: OneDrive (personal); SharePoint (any SharePoint site the user has access to – including the synced document library on File Explorer); or ‘browse’ to another location.
    • If the document is saved to the synced document library in File Explorer, it is then automatically copied to the SharePoint Online document library (and a green circle and tick appears).
    • If the document is saved to a SharePoint Online library directly, it will appear in a synced folder in File Explorer initially with a cloud icon.
  • The document may then be shared, either from File Explorer or in SharePoint Online (the same Share dialogue on both).
  • The recipient of the Share invitation can then open the document directly and edit it (if given those rights).
  • Any edits of the document will be recorded in the version history of the document. Other actions (e.g., changes to security) will be recorded in the audit logs.

However, if the library contains any mandatory metadata, the synced library will be read only.

One document, stored in a single location, accessed by many. A new, much smarter, way of working.

Office 365 – Security and Compliance – Records Management section

May 30, 2019

Microsoft have introduced a new ‘Records Management’ section in the Security and Compliance portal of Office 365. In many respects, this is simply a re-ordering of what was already in place; however, it keeps logical elements together, including the new ‘File Plan’ option.

O365_CompliancePortal_RecordsManagementetc

File Plan

The new File Plan option appears when a new retention label is created as shown below.

O365_Classifications_Labels_FilePlan1

Editing the file plan descriptors section brings up the following options, which allow organisations to ‘map’ a File Plan (or BCS) to retention and disposal policies.

O365_Classifications_Labels_FilePlan2

Each of these sections allows you to choose from an existing option or add a new one.

If retention policies have been mapped to a file plan, these mapped policies can be viewed when clicking on the ‘File Plan’ section under Records Management, which displays the File Plan according to the Label, allowing the records manager to view these labels as part of a File Plan …

O365_Compliance_RecordsManagement_FilePlan1

… or just the Policies:

O365_Compliance_RecordsManagement_FilePlan2

Events

The Events section will only have content if any of the retention policies is linked with a pre-defined event. These events will be listed in this section.

O365_Compliance_RecordsManagement_Events

Office 365 Records Management update

May 30, 2019

In my post Applying retention periods to SharePoint document libraries and disposal/disposition actions I included a series of screenshots including one that showed the list of records due for disposal and an option to filter this by site URL.

The site URL filter option has been replaced with ‘Type’ (documents or emails) and ‘Search’ options as shown in the screenshot below. To filter by the site URL, simply enter all or part of that URL in the search option as shown. Actions can then be taken on all documents in that site library.

O365_CompliancePortal_RecordsManagement_DispositionListf

Note that you can Export this data, but also note that the ‘Pending disposition’ section does not display any additional metadata that may have been associated with the documents. Accordingly, it may still be necessary to return to the original library, export all the metadata, and then save that manually to keep a record of what was destroyed.

The ‘Disposed items’ section shows a list of records that have been disposed of. It is not yet clear how long this information will remain in this area. Also note that the Disposed items section does not include the ability to search, so the list cannot be refined to a specific site or library.

O365_CompliancePortal_RecordsManagement_DispositionListe

Metadata Payloads in the Digital World

March 19, 2019

For at least twenty years, a core tenet of both document and records management has been the metadata that defined records. A number of metadata schema were developed over the years, including the well-known Dublin Core (http://dublincore.org/documents/dces/) that defined 15 core metadata elements for digital content:

  • Contributor
  • Coverage
  • Creator
  • Date
  • Description
  • Format
  • Identifier
  • Language
  • Publisher
  • Relation
  • Rights
  • Source
  • Subject
  • Title
  • Type

Introduction of XML based documents

Parallel with the development of metadata schema, the introduction of XML-based documents (e.g., .docx, odb) from the early 2000s introduced a new way of both structuring and describing documents. Instead of being external to the document, metadata could be embedded within the document, making it effectively a type of ‘metadata payload’.

Around the same time that XML-based documents were introduced, I wrote about the ‘Semantic Office’. The Semantic Office drew on the same ideas developed and implemented for the ‘Semantic Web’. Conceptually, the idea was quite simple – just as web pages would contain their own embedded metadata in the form of Resource Description Framework (RDF) triples (subject – predicate – object, e.g., sky – is – blue), common office documents such as Outlook, Word and Excel could carry their own embedded metadata ‘payload’.

Some of this metadata is visible in the Properties pane of a record, but only as descriptive terms, not as metadata defined against a specific schema.

The (mostly overlooked and under-reported) outcome of the introduction of XML-based documents was that a document could be stored anywhere and be found again based on the embedded metadata – as opposed to finding it through metadata that was created and managed separately from the record (for example, in a document management system). For some reason, however, the predominant and persistent model for document management has been to store metadata about a document separately from the document.

In most document and records management systems since the late 1990s, digital records (emails included, if they are saved to the DRMS) were/are stored in secure file shares while the metadata about the record (including its ‘file’ or ‘container’ identifier) was stored in a separate database. Visually this gives the user the illusion that the records are stored ‘in’ a container even though they are actually stored in a network file share.

This pervasive document management model is conceptually similar to the way computers record metadata about documents stored in a Windows NT File System (NTFS) in the Windows Master File Table (MFT). MFT entries include details of the size, time and date stamps, permissions, and so on. It assumes that the actual location of the record is recorded in the metadata.

How XML-based documents embed metadata

XML-based Office documents (as well as PDFs and image files), however, retain core metadata information within the document itself. The information is accessible regardless of where the document is stored.

Ironically (perhaps) it may be different from any external metadata used to describe the document.

To view the embedded metadata in a Word document you only need to rename it to .zip and then unzip it. Extracting a zipped Word document reveals (in most cases) several folders and one XML file:

  • [trash] – contains ‘dat’ files (may not be present in all documents)
  • _rels – contains the ‘.rels’ XML document
  • customXml – contains a number of ‘item’ and ‘itemProps’ XML documents
  • docProps – contains three very small files: app.xml, core.xml, custom.xml
  • word – contains a range of XML files and additional folders with other XML files.
  • [Content_Types].xml
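
The rename-and-unzip step can also be done in code, because a .docx file is already a zip archive. The following is a minimal Python sketch using only the standard library; the function names `list_package_parts` and `top_level_entries` are hypothetical, chosen for illustration, and the exact parts found will vary from document to document.

```python
import zipfile


def list_package_parts(docx_file):
    """Return the sorted names of every part inside a .docx package.

    docx_file may be a file path or a binary file-like object. The file
    does not need to be renamed to .zip before it is opened.
    """
    with zipfile.ZipFile(docx_file) as package:
        return sorted(package.namelist())


def top_level_entries(part_names):
    """Collapse part names to their top-level folder or file name,
    e.g. 'docProps/core.xml' becomes 'docProps'."""
    return sorted({name.split("/", 1)[0] for name in part_names})
```

Calling `top_level_entries(list_package_parts("example.docx"))` on a document (the file name here is a placeholder) should return a list comparable to the folders and file described above.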

In one example Word document downloaded from a SharePoint library, the file ‘item4.xml’ in the ‘customXml’ folder contained both XML namespace (xmlns) information as well as the embedded document management elements (highlighted in bold):

A separate xml document also located in the ‘customXML’ folder contained the following core properties, including most of the Dublin Core elements listed above (but note that they are all blank).

Arguably, the body of the record is also a form of metadata, enclosed by the terms <body>text</body>. In the example document downloaded from SharePoint, the body of the document is contained in the file ‘document.xml’ under the ‘word’ folder of the package.

  • xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se wp14">
  • <w:body>
  • <w:p w14:paraId="195D8795" w14:textId="77777777" w:rsidR="0001502C" w:rsidRDefault="00880316">
  • <w:r>
  • <w:t>Test document</w:t>
  • </w:r>
  • </w:p>
  • <w:p w14:paraId="195D8796" w14:textId="77D86E32" w:rsidR="006832E2" w:rsidRDefault="006832E2" w:rsidP="006832E2">
  • <w:r>
  • <w:t>Lorem ipsum (and the rest of the text, deleted for brevity)</w:t>
  • </w:r>
  • <w:bookmarkStart w:id="0" w:name="_GoBack"/><w:bookmarkEnd w:id="0"/>
  • </w:p><w:sectPr w:rsidR="006832E2">
  • <w:pgSz w:w="11906" w:h="16838"/>
  • <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0"/>
  • <w:cols w:space="708"/>
  • <w:docGrid w:linePitch="360"/>
  • </w:sectPr>
  • </w:body>
  • </w:document>
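
To illustrate how the body text sits inside the `<w:t>` elements of ‘document.xml’, here is a minimal Python sketch that reads the paragraphs back out of a .docx file. It uses only the standard library and the published WordprocessingML namespace; the function name `paragraph_texts` is hypothetical, and the sketch ignores tables, headers and other structures that a full parser would need to handle.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace bound to the 'w:' prefix in WordprocessingML documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(docx_file):
    """Return the text of each <w:p> paragraph in word/document.xml.

    Text is stored in <w:t> elements inside runs (<w:r>); a paragraph
    may contain several runs, so their text is joined together.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```

For a document like the example above, the first item returned would be the ‘Test document’ line.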

Other core metadata elements are contained in the ‘core.xml’ file:
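
Because ‘core.xml’ is plain XML, the Dublin Core elements it carries can be read directly from the package. The following is a minimal Python sketch under that assumption, using only the standard library; the function name `dublin_core_properties` is hypothetical, and it reads only elements in the Dublin Core elements namespace (for example title, creator, subject), not the other properties stored in the same file.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, bound to 'dc:' in core.xml.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_properties(docx_file):
    """Return {element name: text} for Dublin Core elements in docProps/core.xml.

    Elements that are present but empty are returned as empty strings,
    which makes blank metadata easy to spot.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    properties = {}
    for child in root:
        if child.tag.startswith(f"{{{DC_NS}}}"):
            name = child.tag.split("}", 1)[1]
            properties[name] = (child.text or "").strip()
    return properties
```

The same values travel with the file wherever it is stored, which is the point of the ‘metadata payload’ idea.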

Why is this important?

The existence of – and ability to make use of – embedded metadata seems to have been overlooked since the introduction of these types of records over 15 years ago. This may have been primarily because no-one had a system in place to access or use that data in any meaningful way.

Instead, most records continued to be defined by metadata that is created or captured and managed separately from the record itself.

The problems with storing metadata separately from the record are that: (a) the external metadata may be different from the embedded metadata, and (b) the external metadata may unnecessarily limit or restrict the ability to see the record in different contexts.

For example, one person may assign a specific metadata term, such as a function from the Business Classification Scheme (BCS) to the digital record, or assign it to a specific ‘container’. Some time later, another person may try to find the same record but discover it is not in the same file, or assigned to the same function term. They are likely to be looking for the record in or from a completely different context.

The only way they may be able to find it is by doing a general search that includes the body or content of the records, something I found to be the case in real life scenarios where users couldn’t find the records they were looking for based on metadata searches.

Of course, metadata is still important, but my point is the difference between embedded metadata that can be added when the document is saved to a document library, and external metadata that is stored separately from the digital record.

Being able to leverage the metadata embedded in records, wherever they are stored, is a much more powerful way to find and connect this information, similar to the way the application of metadata to web pages facilitates access.

Resource Description Framework

A core part of the world wide web is the application of metadata to web pages to facilitate their discovery in a highly connected world. The core elements of this metadata are defined in the World Wide Web Consortium (W3C)’s Resource Description Framework, or RDF.

To quote the World Wide Web Consortium (W3C):

‘RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.’ (Source: https://www.w3.org/RDF/)
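
The triple model can be sketched in a few lines of Python. This is a simplified illustration only: real RDF names subjects and predicates with URIs, whereas the sketch below uses plain strings, and the document names, people and the `outgoing` and `subjects_with` helpers are all hypothetical.

```python
from collections import namedtuple

# A triple is one labeled edge in the graph: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical sample data (plain strings stand in for URIs).
triples = [
    Triple("sky", "is", "blue"),
    Triple("report.docx", "creator", "alice"),
    Triple("minutes.docx", "creator", "alice"),
    Triple("alice", "memberOf", "records-team"),
]


def outgoing(graph, subject):
    """Return the (predicate, object) edges that leave a given node."""
    return [(t.predicate, t.object) for t in graph if t.subject == subject]


def subjects_with(graph, predicate, obj):
    """Return every subject linked to obj by the named relationship."""
    return sorted(t.subject for t in graph
                  if t.predicate == predicate and t.object == obj)
```

Here `subjects_with(triples, "creator", "alice")` finds both documents through their shared relationship, regardless of where either document is stored.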

It is perhaps not surprising that Microsoft named the analytic engine behind Office 365 the Microsoft Graph.

According to Microsoft:

‘Microsoft Graph is made up of resources connected by relationships. For example, a user can be connected to a group through a memberOf relationship, and to another user through a manager relationship. Your app can traverse these relationships to access these connected resources and perform actions on them through the API. You can also get valuable insights and intelligence about the data from Microsoft Graph. For example, you can get the popular files trending around a particular user, or get the most relevant people around a user.’ (Source: https://developer.microsoft.com/en-us/graph/docs/concepts/overview)

microsoft_graph

The RDF model is also used in knowledge management applications such as Protégé, which supports the creation and use of RDF/XML ontologies.

Implications

In my opinion, the implications of XML-based office content (which has been around for over 10 years now) are quite important for records management theory and practice.

While, like traditional EDRM systems, documents are visually displayed ‘in’ the document library, each document retains its own originally assigned metadata even if it is downloaded – unless the user uses the ‘Check for Issues’ – ‘Inspect Document’ option from the Info panel to remove them.

The ability to store metadata properties directly in the document facilitates the ability to locate and retrieve documents that have the same, similar or related properties via the Microsoft Graph. In the same way that web pages use RDF triples, this allows otherwise unconnected resources to be linked and presented to the user (subject to any security controls) automatically based on their specific context.

In other words, instead of records being locked to a specific container based on their metadata being stored in a database, records could be discovered and linked wherever they are located based on their embedded metadata.
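
As a rough illustration of that idea (not a description of how Office 365 or the Microsoft Graph actually implements it), the following Python sketch groups documents by an embedded property value rather than by the container they sit in. The locations, property names and the `index_by_property` function are hypothetical.

```python
from collections import defaultdict


def index_by_property(documents, property_name):
    """Group document locations by the value of one embedded property.

    documents maps a storage location to the metadata embedded in the
    document found there. Documents with no value for the property are
    left out of the index.
    """
    index = defaultdict(list)
    for location, properties in documents.items():
        value = properties.get(property_name)
        if value:
            index[value].append(location)
    return {value: sorted(paths) for value, paths in index.items()}


# Hypothetical documents stored in three unrelated places.
documents = {
    "sharepoint/finance/budget.docx": {"subject": "Budget 2019"},
    "onedrive/alice/notes.docx": {"subject": "Budget 2019"},
    "fileshare/old/plan.docx": {"subject": ""},
}
```

With this sample data, `index_by_property(documents, "subject")` links the SharePoint and OneDrive documents under ‘Budget 2019’ even though they were never filed in the same container.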

Relevance of W3 XML schema to Office 365 content

The use of RDF-based metadata embedded in Office documents in Office 365 means that this data can be used to link resources in a way that supports the discovery of the resources. It allows for cross-linking of information. Documents with metadata payloads are one of the many resources that can be connected in this way.

For example, ‘… a user can be connected to a group through a ‘memberOf’ relationship, and to another user through a manager relationship. Your app can traverse these relationships to access these connected resources and perform actions on them through the API. You can also get valuable insights and intelligence about the data from Microsoft Graph. For example, you can get the popular files trending around a particular user, or get the most relevant people around a user.’ (Source: https://developer.microsoft.com/en-us/graph/docs/concepts/overview)

‘Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.’ (Source: https://www.w3.org/RDF/)

Four observations about Office 365/SharePoint Online and records management

March 19, 2019

The following is a slightly modified version of four points I made recently to a records management professional, responding to the point that ‘many CIOs are rolling out Office 365 and SharePoint Online to replace traditional recordkeeping systems such as TRIM/CM etc’.

First, generally speaking, records managers have traditionally not had a strong technical knowledge and/or weren’t close to the IT team.

Even if they managed TRIM/CM/other EDRM it was usually as the front end admin, not the back end technical IT admin, which remained with IT. Conversely, IT people have generally never had much knowledge of how to manage records (it is not usually part of their skill set).

There was almost always a gap (technical, organisational, communication etc) between the records area and IT; consequently, IT departments have rolled out SharePoint and more recently Office 365 without reference to (or the feeling they even needed to refer to) records managers, and often without a solid architecture and planning for implementing and managing SharePoint (or Office 365).

Into the space between IT and records (but usually closer to IT) have stepped various vendors offering products that, they say, do the records management that (they claim) SharePoint does not do.

This by the way is not a criticism of those vendors as such, but there has been a tendency to buy their products without really understanding what the base product can do. This has almost always been the case for many IT products – back in 2006/7 I was part of a team looking to acquire a major ECM product and was a trained system administrator. The product itself could do exactly what was required without any modifications; the problem was the client (the company I worked for) wanted modifications that required consulting work. Close to a million dollars later in consulting fees, the product was still unused.

I’m also concerned at the way some vendors pitch the suitability or ‘compliance’ of their products in relation to add-ons to SharePoint for managing records. I had one telling me in all sincerity that their product ‘complied with ISO 15489’, which was interesting to hear since there is no compliance framework. The same vendor’s salesman was not aware of ISO 16175 when I asked about it.

Second, from SharePoint 2010 onwards, Microsoft implemented a range of new records management functionality to meet minimum (mostly corporate rather than government) requirements for managing records.

That new functionality included a great deal more features than most people knew about. One Australian consultant (John Wise) identified that SharePoint 2010 met 88% of the requirements of the then ICA standard that became ISO 16175 Part 2. For most non-government organisations that didn’t need the level of information security found in government, it was closer to 95%, and the 5% remaining was not particularly important for most organisations. With the introduction of both retention/disposal policy management, and information security classifications, via the Security and Compliance Centre in the Office 365 admin portal, SharePoint meets almost all requirements listed in ISO 16175 that do not refer to legacy systems.

In many respects, by ignoring ‘traditional’ ways that other EDRM systems have managed records, Microsoft introduced a brand new paradigm for managing records, underlined by the idea that digital records do not work the same way as paper records.

In my view, many older EDRM products failed to adapt to the new digital world and continued to enforce the concept that records must be ‘moved’ (saved to) a container in the recordkeeping system just as paper records had to be saved onto a single subject file. As long as Exchange and network file shares remained completely separate, this meant (and continues to mean) that the original versions of those records always remained in Exchange/network file shares even after they were copied to the EDRM.

A much smarter model, which SharePoint Online offers via both the create and save processes, is to allow people to save non-email records directly to SharePoint, including in synchronised document libraries in File Explorer; the document libraries can have default metadata applied to content types, and retention policies can be applied to those libraries. Emails can be moved automatically via Flow, or retained in the mailboxes with Office 365 retention policies applied. Recordkeeping happens in the background; people don’t have to fill in a form every time they want to save a record to the system.

Microsoft have centralised records management across the Office 365 environment. For example, the creation and management of records disposal/retention classes (called ‘classification policies’) is now carried out in the Security and Compliance Admin centre of the Office 365 portal. Records managers need to be assigned specific roles to do what they need to do (and I would argue, the corporate records managers should also be Site Collection Administrators on every site, preferably via a Security Group).

It doesn’t matter if the record is in Exchange or in SharePoint (or some of the other Office 365 applications), a classification policy can be applied wherever it is. When implemented correctly (based on a good architecture model), classification policies can provide the recordkeeping context required to link records over time.

Third, just like a home subscription to Office 365 with cloud storage is more cost effective than buying the product as before, most IT organisations have seen the benefits of moving their enterprise agreement licensing from per-device licence (where the licence is based on the computer) to a per-user licence (where the user can use the product on multiple machines including mobile devices or from home). This has also allowed them to shift storage (and the costs of maintaining servers, including technical staff) from their own or hosted data centres to the Microsoft cloud (which, ironically, may be in the same hosted data centre).

One large organisation that I’m familiar with had around 30TB of storage in the data centre; by acquiring Office 365 E3/E1 licences, they had 45TB – PLUS, 1TB for each user’s OneDrive. I suspect this point is not known to most records managers (first point above), who simply see the CIOs introducing or rolling out Office 365 for no obvious reason.

Fourth, SharePoint has traditionally been many things to different people because it has always had a dual nature – publishing/intranet and team sites.

This is no different in SharePoint Online but the options to customise are now fewer (thankfully). Communication sites are a simple and elegant way to publish information, while team sites (including Office 365 Group-based team sites) are more or less the functional replacement for network drives (OneDrive for Business replaces personal drives).

In my opinion, it is important for anyone getting involved with SharePoint to understand this – that SharePoint Online is NOT the same as the ‘old’ SharePoint on-premise that could be customised to do just about anything.

Keep it simple, using the very rich ‘out of the box’ options, and it begins to make more sense. Plus, as noted already, users can synchronise SharePoint document libraries to File Explorer and work from there, so their experience can be more or less exactly what it is now using network drives.

Can you manage records in SharePoint Online? Absolutely, keeping in mind that SharePoint Online is very much a part of the Office 365 ecosystem and should not be considered a standalone application as it was when installed in an on-premise server.

Records managers need to get up to speed (quickly, in my opinion, although I’ve been saying it for years) with the recordkeeping functionality already in SharePoint Online, and should be SharePoint System Administrators (to give them access to the SharePoint Admin portal) and Site Collection Administrators. They also really need to understand the Office 365 portal and the relevant parts of the Security and Compliance Admin Centre, including classification policies, ediscovery options and audit options.