Archive for the ‘Governance’ Category

Applying (new) Retention Policies to Office 365 Content

April 30, 2017

From time to time I’m asked about the way records retention policies ‘work’ in SharePoint. A common criticism has been that SharePoint’s retention model is based on applying retention policies to individual records (e.g., documents in a library or individual emails) rather than to aggregations of records, the most obvious of which is a document library.

The idea of storing and managing related records together in a single aggregation derives from the management of paper records – in files, boxes, and series. This model (of aggregations containing all records relating to a given subject) was largely replicated in electronic document management systems (EDMS – many of which were used to register paper files and boxes previously) when they appeared or were modified to manage digital records in the late 1990s.

In fact, many EDM systems did not actually manage records in an aggregation; the actual digital records were stored in a secure network file store, and presented in the EDMS user interface through a common ‘file number’ (or similar) ID.

In any case, the ability to store all digital records on the same subject together in the one system (e.g., EDMS) was always hampered by the fact that (a) email and documents were created by different systems, (b) stored in different locations (servers), and (c) use of network file shares continued more or less unabated.

The increasing complexity and variety of digital records underline the difficulty of ever storing them in a single aggregation, let alone managing them or applying retention and disposal actions to them there.

Until recently, Microsoft’s retention and disposal options reflected the fact that applications used to create digital records stored them in different locations (servers) – Exchange and SharePoint. Retention policies targeted individual records stored in those applications, rather than aggregations.

In March 2017, Microsoft introduced a new, single central way to create and apply retention and disposal policies to most Office 365 content, wherever it was stored – Exchange, SharePoint, OneDrive for Business, Office 365 Groups, and Skype for Business.

This post:

  • Summarizes the existing ‘out of the box’ retention and disposal options in SharePoint, but not Exchange (see my earlier post on this subject).
  • Discusses issues with existing retention and disposal options in SharePoint.
  • Describes how the new centrally-managed retention policies and labels can be applied to most content in Office 365.
  • Discusses why applying retention policies to individual records rather than aggregations may be a better option in the digital world.

Records managers working in organisations that use Office 365 to manage records should familiarize themselves with the way these new retention policies work.

Note: The details in this post are based on the Australian recordkeeping context, which may be different from your specific location.

SharePoint out of the box (OOTB) retention and disposal options

Until recently, the only available OOTB options to apply retention and disposal actions to SharePoint were to:

  • Apply an information management policy to an entire site via the Site Collection Settings. This option is suitable for short-lived sites such as project sites or closed, archived sites, but less suitable for long-lived team sites which might have a range of different content.
  • Create a retention policy using the information management policy settings in Content Types. This option applies the policy to individual records. Content Types also include the ability to ‘transfer’ (actually copy) records after a defined period to another location, such as a Records Center.
  • Use a folder-based information management policy. This option requires the default Content Type-based policy on a document library to be changed via Library Settings – Information Management Policy Settings, to Library and Folders.

Another option was to adopt a form of ‘retention in place’ and regard each library as a logical aggregation of records, the equivalent of a ‘file’, and manage retention and disposal manually or using PowerShell scripts to identify libraries for potential disposal based on the last modified date of the records. Some vendors have developed a similar model to manage retention policies on libraries using a central ‘console’.
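
The ‘retention in place’ approach described above can be sketched in a few lines. This is only an illustration in Python, not one of the PowerShell scripts mentioned: the `Library` structure, the sample library names and the dates are all hypothetical, and a real script would need to read the last modified dates from SharePoint itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Library:
    """A document library treated as one aggregation (hypothetical structure)."""
    name: str
    last_modified: date  # most recent modification of any record in the library


def disposal_candidates(libraries, retention_days, today):
    """Return the names of libraries unchanged for at least retention_days."""
    cutoff = today - timedelta(days=retention_days)
    return [lib.name for lib in libraries if lib.last_modified <= cutoff]


libraries = [
    Library("Project Alpha", date(2008, 6, 30)),
    Library("Finance 2016", date(2016, 12, 15)),
]

# Roughly seven years, ignoring leap days for simplicity.
candidates = disposal_candidates(libraries, 7 * 365, today=date(2017, 4, 30))
print(candidates)
```

The output is only a list of candidates; review and approval before disposal would still be a manual step.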

Applying retention and disposal actions to individual records

Both the Content Type and folder-based options noted above apply the retention policy to individual records in the library, not the library (aggregation/container) as a whole.

That is, disposal was based on a time period after each individual record was created, modified, or declared a record. The logic behind this model appears to be that a document library may store multiple record types, each with different retention requirements. This is not true of every document library, but it is true of many.

Applying automated disposal actions on individual records (rather than an aggregation of records) is probably counter-intuitive for most records managers. The main concerns, from a recordkeeping (and possibly also archival) point of view, are the absence of (a) a documented review and approval process before the records are destroyed, and (b) a metadata record of what was destroyed. That is, the records simply disappear from the document library, removing records that may be relevant to the context of the original aggregation. This, of course, assumes that all records relating to the subject were stored in a single aggregation which, as noted above, may not always be the case.

Global Retention Policies and Labels in Office 365

In March 2017, Microsoft introduced two new ‘global’ retention options – retention policies and labels – to Office 365. The two options allow organisations to centrally set and apply retention policies to the same type of record, in whatever form and wherever it is stored – emails in Exchange, documents and lists in SharePoint, conversations (in Office 365 Groups and Skype).

Examples of ‘types’ of information could include:

  • Corporate records that must be kept for the life of the company.
  • Financial records that need to be kept for 7 years.
  • ‘Working records’ that could be deleted after a minimum period of time.
  • Personnel records or staff files that had to be kept indefinitely.

As Tony Redmond noted in this recent article, these new retention policies build on the type of retention policies first released in Exchange 2010 using folder, system, personal and default tags. The article suggests that organisations that have applied Exchange retention policies may need to consider the impact of these new types of policies. In particular, the ability to move email to archive mailboxes is lost, replaced with a retention policy.

How Retention Policies work

Retention policies in Office 365 are created by authorized users (ideally, records managers) in the Retention section of the Security and Compliance Center.

Creating a new retention policy

Each policy has the following options: Name, Settings, Locations and Preservation Lock.

Name

The name of the retention policy should reflect the class name or number in the records retention schedules so that it can be easily identified and applied to content wherever it can be applied in Office 365 (see below for ‘Locations’).

Settings

The two Settings options are based on two questions:

  • Do you want to retain the content? 
    • If ‘Yes, I want to retain it’ is selected, the choices are either ‘Forever’ or a configurable ‘n days/months/years’ (e.g. 7 years). The administrator must then decide if, once it reaches that point, the record should be deleted or not. If ‘Yes’ is selected, the content will be deleted from where it is currently stored as described in the next two points.
    • For SharePoint content there are two options when the retention period expires. (1) If the record has not been modified or deleted it will be deleted from the original library where it was stored, and then remain in the two-stage Recycle Bin for up to 90 days. (2) If the content has been modified or deleted, it is transferred to the hidden Preservation Hold library that is created when the retention policy is applied to a SharePoint site and deleted from that library. In this case, the administrator has only 7 days to recover the content before it is deleted permanently.
    • For Exchange content there are also two options. (1) If the item is modified or permanently deleted by the user during the retention period, the item is copied (if modified) or moved (if deleted) to the Recoverable Items folder. The retention policy process identifies and deletes items whose retention period has expired within 14 to 30 (configurable) days of the end of the retention period. (2) If the item is not modified or deleted during the retention period, the same process runs on all folders in the mailbox and identifies items whose retention period has expired. These items are also permanently deleted within 14 to 30 days of the end of the retention period. (Note: If a user leaves the organization, and their mailbox is included in a retention policy, the mailbox becomes an inactive mailbox. ‘The contents of an inactive mailbox are still subject to any retention policy that was placed on the mailbox before it was made inactive.’)
    • If ‘No’ is selected, the content will be left in place and must be manually deleted at some point.
  • No, just delete the content that’s older than … The options are to delete: (a) after ‘n days/months/years’, and (b) based on when it was created or modified.

The (subtle) difference between these two options is that the first option (Yes) ensures that records are not permanently deleted before the end of the retention period, while the second option (No) just deletes records permanently at the end of the retention period.
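
The difference can be expressed as a small rule. The sketch below is my own simplification in Python, not how Office 365 implements it; the `Setting` names and the function are hypothetical, and periods are counted in days only.

```python
from enum import Enum, auto


class Setting(Enum):
    RETAIN_THEN_DELETE = auto()  # 'Yes, I want to retain it', then delete
    DELETE_ONLY = auto()         # 'No, just delete the content that's older than'


def permanent_deletion_possible(setting, age_days, period_days):
    """Can a record of this age be permanently removed under the given setting?"""
    if age_days >= period_days:
        return True  # the period has expired under either setting
    # Before expiry, only the 'retain' setting prevents permanent deletion.
    return setting is Setting.DELETE_ONLY


seven_years = 7 * 365  # approximate, ignoring leap days
print(permanent_deletion_possible(Setting.RETAIN_THEN_DELETE, 100, seven_years))
print(permanent_deletion_possible(Setting.DELETE_ONLY, 100, seven_years))
```

Under the first setting a record deleted by a user before expiry is still preserved behind the scenes; under the second nothing stops it being removed early.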

Advanced retention settings are also available. These allow the administrator to create a search query with specific words or phrases, or link the policy with the same sensitive information options found under DLP policies, e.g., financial, medical and health, privacy, and custom.

Locations

The Locations section sets where the policy will be applied. By default this is all locations across Office 365, including content in Exchange, SharePoint, OneDrive, Office 365 Groups and Skype for Business.

  • Office 365 has a limit of 10 organisation-wide policies and entire-location policies combined per tenant. Therefore, careful consideration should be given to what specific types of record need a global policy, especially given that not all types of records will be found globally across the organisation.

The alternative option is to apply the policy only to specific locations or users. In most cases this is likely to be Exchange and SharePoint where the majority of key records are created and stored.

  • A retention policy that includes or excludes over 1,000 specific users can contain no more than 1,000 mailboxes and 100 sites. A tenant can contain no more than 1,000 such retention policies. According to Microsoft ‘… you can get over these limits by applying either an org-wide policy or a policy that applies to entire locations’.
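
When mapping a retention schedule to policies, it may help to check a planned set of policies against these limits before creating anything. The following Python sketch uses the figures quoted above; the function name, the plan structure and the sample policies are hypothetical, and the limits themselves may change over time.

```python
MAX_ORG_WIDE_POLICIES = 10       # org-wide and entire-location policies combined
MAX_MAILBOXES_PER_POLICY = 1000  # for a policy scoped to specific users
MAX_SITES_PER_POLICY = 100       # for a policy scoped to specific sites
MAX_SCOPED_POLICIES = 1000       # scoped policies per tenant


def check_policy_plan(org_wide_count, scoped_policies):
    """Return a list of problems; an empty list means the plan fits the limits.

    scoped_policies maps a policy name to a (mailbox_count, site_count) tuple.
    """
    problems = []
    if org_wide_count > MAX_ORG_WIDE_POLICIES:
        problems.append(f"{org_wide_count} org-wide policies exceeds {MAX_ORG_WIDE_POLICIES}")
    if len(scoped_policies) > MAX_SCOPED_POLICIES:
        problems.append(f"{len(scoped_policies)} scoped policies exceeds {MAX_SCOPED_POLICIES}")
    for name, (mailboxes, sites) in scoped_policies.items():
        if mailboxes > MAX_MAILBOXES_PER_POLICY:
            problems.append(f"{name}: {mailboxes} mailboxes exceeds {MAX_MAILBOXES_PER_POLICY}")
        if sites > MAX_SITES_PER_POLICY:
            problems.append(f"{name}: {sites} sites exceeds {MAX_SITES_PER_POLICY}")
    return problems


plan = {"Financial records": (250, 40), "Personnel records": (1200, 12)}
problems = check_policy_plan(org_wide_count=3, scoped_policies=plan)
print(problems)
```

A policy that fails the check is a candidate for an org-wide or entire-location policy instead, as Microsoft suggests.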

Retention policies applied to a SharePoint site or OneDrive account result in the creation of a hidden Preservation Hold library as noted above.

Retention policies applied to Exchange user mailboxes apply the policy to the mailbox. For public folders, the retention policy is applied at the folder level.

Preservation Lock

Finally, the administrator has the option to apply a Preservation Lock, which prevents anyone from changing or deleting the policy after it is turned on. This option should only be applied in specific circumstances as it cannot be turned off or made less restrictive (by anyone, including the administrator) after it has been applied.

Review and save

Finally, the new retention policy should be reviewed. It can then be saved for later or published.

Labels

A separate option for managing retention and disposal is to use (retention) labels, which should not be confused with security labels. This option is designed to replace the following:

  • Exchange Online retention tags and retention policies, also known as messaging records management (MRM).
  • In SharePoint Online and OneDrive for Business: (a) in-place records management, (b) the Records Center, and (c) information management policies.

Labels are used to manage retention policies for specific types of content across the Office 365 environment. Labels can be applied automatically to content if it matches certain conditions or keywords (E5 licence only), or manually by users to emails, documents, or Office 365 Group conversations.

See below for the relationship and priority between retention policies and labels.

Who can create labels

Labels are created by individuals (ideally records managers or similar) assigned to a compliance role in the Security and Compliance Admin portal in Office 365.

Creating Labels

Labels are created in the Security and Compliance Admin Portal under ‘Classifications’. Labels may also be created without having an associated retention policy; that is, a label can be created and applied to content as no more than a visual ‘tag’. A policy can be added to it at a later stage.

If the ‘Retention’ option is enabled for labels (on/off switch), a new section appears titled ‘When users apply this label to content’. This section is where the retention policy is defined with two options:

  • Retain the content. The choices are either ‘Forever’ or ‘n days/months/years’ (e.g., 7 years). The administrator must decide if, once it reaches that point, the labelled record should be deleted or not. The ‘Yes’ and ‘No’ options are the same as for retention policies, described above.
    • If ‘Yes’ is selected, the record will be deleted from where it is stored. Administrators have 93 days to recover records that have not been edited or deleted, or 7 days to recover records that have been edited or deleted (and moved to the Preservation Hold library).
    • If ‘No’ is selected, the content will be left in place and must be manually deleted.
  • Don’t retain the content. The choices are to delete (a) after ‘n days/months/years’, and (b) based on when the record was created, modified, or labelled.

If the first option (‘Retain the content’) above is selected a check box option allows the administrator to use the label to classify content as a record. If the content is classified as a record, users are unable to change or delete the content or change or remove the label. They may still, however, edit the metadata.

The final step in the process is to review the settings. Once created, the administrator is returned to the main Labels screen which displays the label that has been created, allowing the administrator to then publish it.

Label limitations when used on a SharePoint document library

There are some limitations to applying a default label to a SharePoint document library:

  • It applies the label to all records except those that already have a label and those contained in document sets.
  • If the default label is removed, it removes the label from all records except those that have a label and those contained in document sets.
  • Labels cannot be applied to folders in SharePoint or OneDrive (but can be applied to folders in Exchange).
  • If the record is moved to a different library that has a different default label, it will inherit that label. Conversely, if it is moved to a library with no label, the existing label will be removed.
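
The first limitation can be pictured with a short Python sketch. It models only the application of a default label; the `Item` structure, the label names and the file names are hypothetical, and the behaviour on removal or on moving between libraries is not modelled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Item:
    """A record in a document library (hypothetical structure)."""
    name: str
    label: Optional[str] = None
    in_document_set: bool = False


def apply_default_label(items, default_label):
    """Label every item except those already labelled or held in a document set."""
    return [
        item if item.label is not None or item.in_document_set
        else replace(item, label=default_label)
        for item in items
    ]


library = [
    Item("minutes.docx"),
    Item("invoice.pdf", label="Financial records"),
    Item("contract.docx", in_document_set=True),
]
labelled = apply_default_label(library, "Working records")
print([(item.name, item.label) for item in labelled])
```

Only the unlabelled item outside a document set picks up the default label; the other two are left as they were.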

Note: When labels are published to an Office 365 group, the labels appear in both the group site and group mailbox in Outlook on the web. The experience of applying a label to content is identical to that shown above for email and documents.

What about legal holds?

eDiscovery in Office 365 is based around the creation of ‘cases’ in a SharePoint eDiscovery site. Cases are generally established in response to litigation (or potential litigation) and can be used to search across a range of sources. Once found, the information that forms part of the case can then be placed on hold, overriding any retention policy. However, once the hold is released, retention policies on records continue.

For more information on this subject, see:

https://support.office.com/en-gb/article/Add-content-to-a-case-and-place-sources-on-hold-in-the-eDiscovery-Center-54d70de9-1ec2-4325-84f3-aeb588554479?ui=en-US&rs=en-GB&ad=GB

What’s the relationship between retention policies and labels?

Retention policies and labels do the same thing but the former is more likely to be set centrally, while the latter is set by the end user. This means that a record could have more than one retention policy applied to it.

According to Microsoft’s documentation (link below), a record will be retained until the end of the longest retention period applied to it, regardless of whether that period comes from a retention policy or a label.
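
In other words, the longest period wins. A minimal Python sketch of that rule follows; the function name, the policy and label names, and the dates are hypothetical.

```python
from datetime import date


def effective_retain_until(retain_until):
    """Return the latest end date among all policies and labels on a record.

    retain_until maps the name of each policy or label to its end date.
    Returns None when nothing applies to the record.
    """
    return max(retain_until.values(), default=None)


applied = {
    "policy: Working records": date(2019, 4, 30),
    "label: Financial records": date(2024, 4, 30),
}
print(effective_retain_until(applied))
```

Here the record is kept until the label's later date, even though the centrally set policy would have allowed earlier disposal.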

Are retention policies and labels better than previous retention options?

One of the primary benefits of the new retention policy regime in Office 365 is that it enables organisations to apply retention policies centrally rather than do this separately for each application (e.g., Exchange, SharePoint) as was the case until recently. It also allows end users to apply retention policies via labels.

Retention and disposal continue to be based on the individual record, or type of record (as defined by the policy or label), not logical aggregations or containers of records such as a document library.

As noted above, the concept of an aggregation that contains all the records on a given subject is ill-suited to the digital world. The reality is that records may be created using different applications (e.g., email in Exchange, document, list item or page in SharePoint, conversation in Groups, discussions in Skype etc) and stored in multiple application locations (e.g. in Exchange folders, SharePoint libraries, etc).

The dilemma for many records managers using Office 365 is how to store or manage records together in context, including based on the organisation’s File Plan or Business Classification Scheme (BCS) terms. The need to keep records together has been the driver behind the integration of EDRM systems with email applications, allowing email to be ‘captured’ in the EDRM along with other types of documents. This has rarely been successful in practice and, in most cases, emails are duplicated and remain stored in the email server.

The new Office 365 retention policies, including those applied as labels to specific types of content, may well be the answer to this dilemma. Rather than try to capture all types of records (e.g., document, email, list item, conversation) in a single aggregation or container, Office 365 allows the option for them to be stored wherever the user prefers, with the same retention policy applied.

If necessary, all records with the same label can then be found using a content search in the ‘Search and Investigation’ section of Office 365.

In my view, there are still some shortcomings in basing retention policies on individual record types:

  • Individual documents, rather than logical aggregations of documents, will continue to be subject to disposal actions.
  • Records that may provide context to other records (including those stored in different locations) may be destroyed.
  • Appraisal options may be limited and appropriate review and approval steps before disposal may not be possible.
  • Disposal actions may be automatic and unrecoverable.
  • There may be no record kept, including the metadata, of the individual records that were destroyed.
  • It is not known how courts might view the automatic disposal of records without prior review and approval.

Final thoughts

The new Office 365 records retention policy and label options centralise the management of retention and disposal for most types of records across Office 365, reducing complexity.

Retention and disposal continue to be based on individual records rather than aggregations, but this may be better suited to the digital world in which aggregations of records may not always be achievable.

Records managers working in organisations using Office 365 need to understand and provide guidance to IT on how records retention schedules can be applied as retention policies, and how they can be directly involved in decisions regarding the new options.

For more information:

https://support.office.com/en-us/article/Overview-of-retention-policies-5e377752-700d-4870-9b6d-12bfc12d2423

https://support.office.com/en-us/article/Overview-of-labels-af398293-c69d-465e-a249-d74561552d30

 

Applying information security and protection capabilities in Office 365 & SharePoint Online

March 12, 2017

Office 365 includes a range of information security and protection capabilities. These capabilities are first set in Azure and then applied across the Office 365 environment, including in Exchange and SharePoint Online. This post focuses on the application of these capabilities and settings to SharePoint Online.

Enterprise E3 and E4 plans include the ability to protect information in Office 365 (Microsoft Exchange Online, Microsoft SharePoint Online, and Microsoft OneDrive for Business). If you don’t have one of those plans you will need a subscription to Microsoft Azure Rights Management.

Enabling Information Protection in Azure

The following steps must be carried out the first time Information Protection is enabled on Azure:

  • Log on to Azure (as a Global Administrator).
  • On the hub menu, click New. From the MARKETPLACE list, select Security + Identity.
  • In the Security + Identity section, in the FEATURED APPS list, select Azure Information Protection.
  • In the Azure Information Protection section, click Create.

This creates the Azure Information Protection section so that the next time you sign in to the portal, you can select the service from the hub ‘More Services’ list.

Default Azure Information Protection policies

There are four default levels in Azure Information Protection:

  • Public
  • Internal
  • Confidential
  • Secret

Once set, these levels can be applied as labels to information content. Sub-labels and new labels may also be created, as necessary via the ‘+ Add a new label’ option.

The configuration settings are shown below:

[Image: AzureInfoProtClassPortal.png]

Each of these label/level settings may:

  • Be enabled or disabled
  • Be colour-coded
  • Include visual markings (the ‘Marking’ column)
  • Include conditions
  • Include additional protection settings.

Each includes a suggested colour and recommended tip, which are accessed via the three dot menu to the right of each label.

Markings

When selected, this option places the label text as a watermark on any document to which the label is applied.

Conditions

Conditions may be applied, for example, if credit card numbers are detected in the text. This option allows the organisation to define which conditions apply, how often they must be met (Occurrences), and whether the label is applied automatically or is just a recommended option.
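
The logic of a condition can be illustrated with a small Python sketch. It is a deliberate simplification: the pattern only looks for 16 digits grouped in fours, which is my own assumption and not the detection Azure Information Protection actually uses, and the function and sample text are hypothetical.

```python
import re

# Simplified pattern: 16 digits, optionally grouped in fours by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def evaluate_condition(text, min_occurrences, automatic):
    """Return 'apply', 'recommend', or None when the condition is not met."""
    hits = len(CARD_PATTERN.findall(text))
    if hits < min_occurrences:
        return None
    return "apply" if automatic else "recommend"


sample = "Paid with 4111 1111 1111 1111, refunded to 5500-0000-0000-0004."
print(evaluate_condition(sample, min_occurrences=2, automatic=True))
print(evaluate_condition(sample, min_occurrences=1, automatic=False))
```

The occurrences threshold decides whether the condition fires at all; the automatic setting only decides what happens once it has.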

Global Policy Settings

In addition to the settings per level, there are three global policy settings:

  • All documents and emails must have a label (applied automatically or by users): Off/On
    • When set to On, all saved documents and sent emails must have a label applied. The label might be assigned manually by a user, automatically as a result of a condition, or by default (by setting the ‘Select the default label’ option).
  • Select the default label:
    • This option allows the organisation to select the default label to be assigned to documents and emails that do not have a label.
    • Note: A label with sub-labels cannot be set as the default.
  • Users must provide justification to set a lower classification label, remove a label, or remove protection: Off/On [Not applicable to sub-labels]
    • This option allows you to request user justification to set a lower classification level, remove a label, or remove protection. The action and the user’s justification are logged in their local Windows event log: Application > Microsoft Azure Information Protection.
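
The justification setting amounts to a simple comparison between the current and the new label. The Python sketch below assumes the four default levels are ordered from least to most sensitive in the order listed earlier; that ordering and the function name are my assumptions, and sub-labels and removal of protection are not covered.

```python
# Default levels, assumed to run from least to most sensitive.
LEVELS = ["Public", "Internal", "Confidential", "Secret"]


def needs_justification(current, new):
    """True when a change lowers the classification or removes the label.

    Use None for 'no label'.
    """
    if current is None:
        return False  # nothing to lower or remove
    if new is None:
        return True   # removing a label
    return LEVELS.index(new) < LEVELS.index(current)


print(needs_justification("Confidential", "Internal"))
print(needs_justification("Internal", "Secret"))
```

Raising a classification never asks for a reason; lowering or removing one does.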

Custom Site

A custom site may be set up for the Azure Information Protection client ‘Tell me more’ web page.

Unique ‘Scoped’ Policies

In addition to the default policies listed above, a unique policy may be created. These are called Scoped Policies.

Enabling (and Disabling) Azure Information Protection

The steps above are used to set up the labels. They must then be enabled to provide protection. The steps below also allow protection to be removed.

From the Azure Information Protection section, click on the label to be set, then click on Protect. This action opens the Permission settings section.

Select Azure RMS and ‘Select template’, and then click the drop down box and select the default label template. This will probably show as, e.g., ‘(Your Company Name) – Confidential’.

Click ‘Done’ to enable this label and repeat for the others.

Note: If a new template is created after the Label section is opened, you will need to close this section and return to step 2 (to select the label to change), so that the newly created template is retrieved from Azure.

Removing Protection

Users must have the appropriate permissions to remove Rights Management protection to apply a label that has this option. This option requires them to have the Export (for Office documents) or Full Control usage right, or be the Rights Management owner (which automatically grants the Full Control usage right), or be a super user for Azure Rights Management. The default rights management templates do not include the usage rights that let users remove protection.

If users do not have permissions to remove Rights Management protection and select this label with the Remove Protection option, they see the following message: Azure Information Protection cannot apply this label. If this problem persists, contact your administrator.

Additional notes

If a departmental template is selected, or if onboarding controls have been configured:

  • Users who are outside the configured scope of the template or who are excluded from applying Azure Rights Management protection will still see the label but cannot apply it. If they select the label, they see the following message: Azure Information Protection cannot apply this label. If this problem persists, contact your administrator.
  • All templates are always shown, even if only a scoped policy is configured. For example, when configuring a scoped policy for the Marketing group, the Azure RMS templates that can be selected are not restricted to templates scoped to the Marketing group, so it is possible to select a departmental template that the selected users cannot use. To help troubleshoot issues later on, it is a good idea to name departmental templates to match the labels in the scoped policy.

Once these settings are made, they need to be published (via the ‘Publish’ option) to become active.

Enabling Information Protection in Office 365

Activating Information Protection in the Office 365 Admin Portal

Once the labels and settings have been configured and published, it is then necessary to enable the required settings in the Office 365 Admin Portal (Settings > Services & add-ins > Microsoft Azure Information Protection).

To do this, log on to the Office 365 Admin Portal (as a Global Administrator) then click on ‘Services & add-ins’ under Settings. Click ‘Activate’ to activate the service.

Activating Information Protection for Exchange and SharePoint Online

Once the service is activated for Office 365, it can then be activated in the Exchange and SharePoint Admin Centres. In SharePoint Online this is done via the Admin Center section ‘Settings’ and ‘Information Rights Management (IRM)’.

Configuring SharePoint and SharePoint Libraries for IRM

As at 12 March 2017, it is only possible to link Azure Information Protection classification policies with SharePoint Online if a new site is created via the SharePoint end user portal, where the classification appears as an option when enabled. Sites created via the SharePoint Admin Portal do not (yet) include the option to apply a protection classification.

If the creation of sites via the SharePoint end user portal is enabled, users with appropriate permissions (e.g., Owners with Full Control) can apply Information Rights Management to SharePoint libraries in their sites.

IRM is enabled on each individual library or list where the settings will be applied via Library Settings > Information Rights Management, under Permissions and Management.

Check the box to ‘Restrict permissions on this library on download’. Only one policy can be set per library.

Assigning Information Protection labels to Office documents

[NOTE: for clients that have installed versions of Office, the Azure Information Protection client needs to be installed on the desktop. See this site for more information: https://docs.microsoft.com/en-us/information-protection/get-started/infoprotect-tutorial-step3]

When labels are configured and enabled, they can then be automatically assigned to a document or email. Alternatively, you can prompt users to select the label that you recommend:

  • Automatic classification applies to Word, Excel, and PowerPoint when files are saved, and to Outlook when emails are sent. It is not possible to use automatic classification for files that were previously manually labeled.
  • Recommended classification applies to Word, Excel, and PowerPoint when files are saved.

Applying the policies to Exchange and Office

The site below describes how to apply these policies to Exchange and Office applications. These are not discussed further here.

https://github.com/Microsoft/Azure-RMSDocs/blob/master/Azure-RMSDocs/deploy-use/configure-applications.md

How Office 365 challenges traditional records management practices

September 27, 2016

If your organisation is using SharePoint on-premises now, or is just starting out with Office 365, it is important to understand how the Office 365 ecosystem will challenge traditional records management practices while at the same time delivering a transformational all-digital experience for end users.

SharePoint On-Premises

When configured well, SharePoint on-premises (e.g. versions up to SharePoint 2016) allowed organisations to manage unstructured (i.e., document-based) content through a hierarchy of site collections – sites/sub-sites – document libraries – (folders/document sets) – documents.

In on-premise SharePoint environments, document libraries could be used to store and manage records, thereby becoming the logical containers or aggregations of records, similar to ‘files’ in traditional EDRM systems.

The Office 365 ecosystem

Office 365 changes and challenges the on-premises model of SharePoint by adding new ways of working to standard SharePoint team and publishing sites. These new ways of working include:

  • Office 365 Groups, each of which has a dedicated SharePoint site
  • OneDrive for Business, a personal version of SharePoint
  • Yammer
  • Skype for Business
  • Delve
  • Planner
  • Sway

Why is this important? 

SharePoint has been clearly positioned as Microsoft’s online document management engine. SharePoint, not network file shares, is the document management future. And so, by extension, it becomes the future location for the management of digital records for any organisation that subscribes to Office 365.

From both the business and end-user points of view, SharePoint provides easy-to-use and more efficient content management and collaboration capabilities allowing users to access and use a range of content anywhere, anytime, on any device. Coupled with collaboration options such as Office 365 Groups, Yammer, and Skype for Business, information is now available across a number of different applications within the same single ecosystem.

From a records management point of view, this new way of working challenges the idea that information can be stored in the context of a single function, activity or transaction that created it. Instead, it supports the concept that digital information cannot truly be assigned to a single function or context; its context may also depend on the context of the person seeking to access it.

That is, how one person stores information is not necessarily how others may expect to find or use it. Think of the parallels with eBay, Facebook, LinkedIn and similar products – algorithms present information to you, often in a ‘feed’, based on what the application knows about you, not how other people store that information.

‘Modern’ Team Sites

The most striking change with ‘modern’ team sites in SharePoint Online (compared with SharePoint 2013 and earlier) is the disappearance of the ribbon menu and the simplification of the user-experience to be more or less identical with OneDrive for Business.

When any library is selected (and before a document is selected), the user is presented with the common options: New (Folder, Word, Excel, PowerPoint, OneNote, Link), Upload, Quick Edit, and Sync.

When a document is selected, the user is presented with a context-specific menu offering again commonly used options: Open, Share, Get a link, Download, Delete, Pin to top, Move to, Copy to, Rename, Version History, Alert me, and Check out.

The familiar Library Settings, previously located on the ribbon menu, are now found via the Office 365 settings ‘cog’.

Microsoft have also changed the look of SharePoint Online sites and provided a new ‘SharePoint’ landing page to help users access all the sites they are following, and also present suggestions for sites to follow. In other words, the system understands the user’s context and presents content suggestions, the same way Facebook users are invited to befriend people.

From a records management point of view, little has changed with document libraries in team sites. SharePoint Online continues to offer all the same features as before:

  • Almost unlimited metadata options allowing multiple metadata-based views to be set up
  • Unique, persistent document IDs
  • Folders and document sets (although the latter are even harder to set up than they were)
  • Versioning (and more efficient storage of versions)
  • Popularity trends and per-document views
  • Detailed audit trails
  • Access/permission controls
  • Legal compliance/retention and disposal
  • Powerful search
  • Full integration with Office, now allowing users to save directly to SharePoint and OneDrive by default
  • Hyperlinkable documents
  • Easy sharing

While it is still possible in SharePoint Online to manage records out of the box, the other elements that make up the Office 365 ecosystem provide a much broader and more complex environment for the storage and management of records. SharePoint Online is just one component of this environment.

Office 365 Groups

Office 365 Groups provide a way for a group of people within the organisation – as well as external users – to discuss and share information.

  • They are similar to Active Directory (AD) Distribution Groups in the sense that they are a pre-defined organisational group designed to receive information.
  • They are different in that, instead of being just the recipient of information, users (and people who join the group at a later date) can see all discussions that have been sent to all members and access any Group documents.

Office 365 Groups are made up of two main content elements: ‘Conversations’ (email-based threads) and ‘Files’.

  • Conversation threads are based on simple email exchanges presented in Outlook – currently it is not possible to create folders in the group.
  • The Files option in Office 365 Groups is a SharePoint site that allows the group to store, share and collaborate on any unstructured content.

Groups also include a calendar and a group Notebook (which opens OneNote Online in the Group SharePoint site).

Office 365 Groups content is stored either within the context of the Group’s email-based conversations or in unstructured content stored in an associated SharePoint site.

Office 365 Groups SharePoint sites are visible in the user’s list of SharePoint sites, making it easy to get back to the Group’s site or its conversations.

OneDrive for Business

OneDrive for Business is built on the SharePoint engine. The consumer version of OneDrive has been around for a few years and is a direct competitor to the likes of Google Drive, iDrive, DropBox, Box and so on.

OneDrive for Business, the online replacement for ‘personal’ network drives, allows users to store, synchronise and share ‘personal’ work information through an interface that in Office 365 is now almost identical with modern team sites (less the Library Settings).

As with personal network drives, content stored by users on OneDrive for Business is inaccessible to others unless it is shared. When a user ceases to be an employee, organisations have only 30 days by default to do something about that user’s OneDrive for Business content before it is deleted.

Options to manage the otherwise hidden content of a departed user’s OneDrive for Business account include allowing the user’s manager to review and if necessary move or delete it, allowing an authorised person in IT to review it, and/or backing it up to other storage so it is not deleted.

Yammer

While the long-term future of Yammer is unclear in the face of Office 365 Groups, Yammer may still exist and capture information and records for some time to come.

Skype for Business

In addition to Yammer and the conversation options provided through Office 365 Groups, Skype for Business provides yet another option to discuss and share information including via voice and/or video calls.

Delve

All the options described above provide a function-rich environment to store and manage unstructured content and collaborate with other people both within and external to the organisation. But how to make sense of all this information?

Depending on licensing, Delve provides a way to find content that may be relevant to the user.

Delve suggests a range of content that may be of interest (based on the user’s profile, connections and content created or accessed), and provides an analysis of the user’s activity as recorded in Outlook, the calendar and other actions.

Challenges with managing records in Office 365

While Office 365 provides a transformative digital experience for end users, managing the records created and stored in various parts of Office 365 presents new challenges for records managers.

For example, there is far less ability to control the way content is stored or described in specific, pre-defined and/or metadata-driven aggregations and contexts. Users are likely to use whatever application is the most appropriate or convenient. For example, they may use OneDrive for Business to create and store large volumes of content, hidden away from corporate view. They may even share content from this application, including with external users.

The default settings in SharePoint, if not disabled, provide end-users with considerable latitude to create new SharePoint sites and Office 365 Groups, in addition to their personal OneDrive for Business sites, to store, manage and share rich digital content including with external users. In reality, these settings probably need to be disabled to prevent uncontrolled growth in the environment.

Even if records managers (as Site Collection Administrators) have oversight and control of the creation of SharePoint Online team sites, some questions arise:

  • How will they extend this control to SharePoint sites created to support Office 365 Groups, or the conversations that take place within those groups?
  • What about content stored in and shared from OneDrive for Business?
  • How will it be possible in the future to bring together all information about a given function/activity for disposal or disposition actions, especially if it’s not all stored in the one aggregation?

Good SharePoint (and Office 365) governance requires a good balance of control. Too much control and users will be put off using and benefiting from the ecosystem. Too little and the ecosystem may become uncontrollable but possibly very ‘lively’ in terms of content profusion.

Ideally, users should feel that they have the ability to manage their information within a lightly controlled environment – for example, SharePoint site owners cannot create new Sites (to prevent the massive proliferation of sites) but they can create document libraries (thereby reducing IT administrative controls).

Can analytics help with managing records?

Analytics via the Office Graph may provide a way to bring together information and records in context, a context (or contexts) which may be unforeseen by the person who created the content in the first instance. For example, a user may store information in a document library, unaware of its relevance or similarity to others in the organisation. Analytics may be able to connect the two, or the different people doing similar things.

At this stage, Analytics does not seem to provide the ability to bring together all information about a given subject. The model, instead, appears to be about presenting or making information accessible in any context at any time to users depending on their context at the time.

eDiscovery?

eDiscovery, a feature available from SharePoint 2013, has the potential ability to bring together all information about a given subject from across the Office 365 ecosystem. However, the primary purpose of eDiscovery is to support legal processes, not records management.

New ways of thinking are necessary

Records managers need to think differently about how they will approach the management of all types of digital records and other content (conversations, discussions, photographs, videos, Sway presentations) created and stored by users across the complex ecosystem that is Office 365.

It will no longer be possible to assume that all records relating to a given function/activity pair, subject, or context can or will be stored in the same aggregation of records. Instead, records managers need to find other ways to manage digital content, including to manage disposition activities.

Artificial Intelligence (AI) may provide the clue to this. Microsoft CEO Satya Nadella made this very clear in a keynote presentation to the Microsoft Ignite conference on 26 September 2016, where he noted that AI would be able “… to reason over large amounts of data and convert that into intelligence”. He also noted that Microsoft’s ambition is to create an intelligent assistant that “… can take text input, can take speech input, that knows you deeply. It knows your context, your family, your work. It knows about the world.”

Nadella also noted that: “The most profound shift is in the fact that the data underneath the applications of Office 365 is exposed in a graph structure. And in a trusted, privacy-preserving way, we can reason over this data and create intelligence. That’s really the profound shift in Office 365.” (Source: https://techcrunch.com/2016/09/26/microsoft-ceo-satya-nadella-on-how-ai-will-transform-his-company/)

(Note, the last two paragraphs were added on 29 September to include comments made by Satya Nadella about Microsoft’s AI ambitions).

 

 

Information Security in SharePoint Online

May 24, 2016

Until now, the security of information stored in SharePoint on-premises implementations was largely based on access control groups that gave or restricted access to the content on the site. Access to the content, and the ability to do anything with it (e.g., edit, read), depended on what group you belonged to. The five main access control groups are:

  • SharePoint Administrator/s: Access to everything.
  • Site Collection Administrator: (Usually) access to everything, but this can be disabled.
  • Site Owners: ‘Full Control’ access to everything (except for the Site Collection Administration elements in Site Settings).
  • Site Members: ‘Contribute’ or add/edit access.
  • Site Visitors: Read only.

Other groups such as Designer and Reader existed for specific purposes.

At any point from the top level Site Collection downwards through all the content, these inherited permissions could be stopped and unique permissions – including for both individuals and new access groups – could be created and applied to control access to content.

Audit logs supplemented access controls by providing details of who did (including changing security permissions) or accessed what, and when. While the SharePoint Administrator and Site Collection Administrator’s names are not visible to Site Owners, Members or Visitors, they appear in the audit logs if any activity is recorded. System account activity is also recorded in the logs.

New Security Controls in SharePoint Online

SharePoint Online brings a range of new options to protect the security of information, in addition to access controls. These options, some of which are included with SharePoint 2013 and onwards, are:

  • Information security classifications
  • Data Loss Prevention (DLP)
  • Audited sharing
  • Information Rights Management (IRM)
  • Shredded storage (new from SP 2013)

Two of these options can be seen in the following Microsoft diagram:

mt718319.001.png

Source: ‘Monitoring and protecting sensitive data in Office 365’ https://msdn.microsoft.com/en-us/library/mt718319.aspx

Information Security Classifications

According to a number of online sources, from at least March 2011, Microsoft has classified its own information into three categories: High Business Impact (HBI), Moderate Business Impact (MBI), and Low Business Impact (LBI).

  • High Business Impact (HBI): Authentication / authorization credentials (i.e., usernames and passwords, private cryptography keys, PIN’s, and hardware or software tokens), and highly sensitive personally identifiable information (PII) including government provided credentials (i.e. passport, social security, or driver’s license numbers), financial data such as credit card information, credit reports, or personal income statements, and medical information such as records and biometric identifiers.
  • Moderate Business Impact (MBI): Includes all personally identifiable information (PII) that is not classified as HBI such as: Information that can be used to contact an individual such as name, address, e-mail address, fax number, phone number, IP address, etc; Information regarding an individual’s race, ethnic origin, political opinions, religious beliefs, trade union membership, physical or mental health, sexual orientation, commission or alleged commission of offenses and court proceedings.
  • Low Business Impact (LBI): Includes all other information that does not fall into the HBI or MBI categories.

Source: ‘Microsoft Vendor Data Privacy – Part 1’ (March 2011) https://www.auditwest.com/microsoft-vendor-data-privacy/
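
The three tiers described above can be pictured as a simple lookup in which the most sensitive category present decides the classification. The following is a minimal Python sketch of that idea only; it is not Microsoft's implementation, and the category labels and the `classify` function are hypothetical names chosen for illustration.

```python
from enum import Enum


class Impact(Enum):
    HBI = "High Business Impact"
    MBI = "Moderate Business Impact"
    LBI = "Low Business Impact"


# Hypothetical category labels, loosely based on the descriptions above.
HBI_CATEGORIES = {"password", "private_key", "passport_number",
                  "credit_card", "medical_record"}
MBI_CATEGORIES = {"name", "address", "email_address",
                  "phone_number", "ip_address"}


def classify(categories):
    """Return the highest impact level among the categories present."""
    found = set(categories)
    if found & HBI_CATEGORIES:
        return Impact.HBI
    if found & MBI_CATEGORIES:
        return Impact.MBI
    # Anything that is neither HBI nor MBI falls through to LBI.
    return Impact.LBI
```

For example, a site holding both names and credit card details would be treated as HBI, because the highest category present wins.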

Microsoft released code (via Github) to apply these classifications to SharePoint on-premise deployments in 2014.

Source: https://github.com/OfficeDev/PnP/tree/master/Solutions/Governance.TimerJobs

In 2016 Microsoft released a Technical Case Study highlighting how it migrated all its SharePoint content to SharePoint Online – and how information classification formed part of that process.

Source: ‘SharePoint to the Cloud – Learn how Microsoft ran its own migration’ (Case Study – 2016)  https://msdn.microsoft.com/en-us/library/mt668814.aspx

In May 2016, Microsoft announced that this form of classification would be added to new SharePoint Online site collections during 2016.

The application of security classifications to SharePoint Online sites has two elements:

  • Security and compliance policies, set by the SharePoint Administrator via either the ‘Security policies’ or ‘Data management’ section of the Office 365 Security & Compliance Center. [As of 23 May 2016 the only policies are ‘Device management’ and ‘Data Loss Prevention’. While the DLP policies appear to allow the inclusion of security classifications, it is expected that Microsoft will add more options to support the application of security classifications during 2016. See below for more information on DLP.]
  • A new drop-down, three choice (LBI, MBI, HBI) option in the ‘Start a new site’ dialogue box under the question ‘How sensitive is your data?’ The choice of classification invokes the relevant security and compliance policies.

Microsoft provides examples of the types of information that would be covered by each of these at this interactive site: https://www.microsoft.com/security/data/

The application of these policies will enable organisations to control what happens to information stored in sites assigned these classifications. Among other things, this can prevent users from sending (or trying to send) MBI or HBI classified information to people not allowed to receive or view it, including through DLP policies discussed in the next section.

Data Loss Prevention (DLP)

Data Loss Prevention policies allow organisations to:

  • Identify sensitive information across both SharePoint Online and OneDrive for Business sites (and in Exchange, through the same settings).
  • Prevent the accidental sharing of sensitive information, including information classified MBI or HBI.
  • Monitor and protect sensitive information in the desktop versions of Word, Excel and PowerPoint 2016.
  • Help users learn how to stay compliant by providing DLP tips.
  • View reporting on compliance with policies.

 

DLP Conditions

DLP works by giving Site Administrators the ability to create and apply DLP policies in the Security & Compliance Center for SharePoint (which includes OneDrive for Business; there is a separate Center for Exchange). In the Center, the Administrator navigates from ‘Security policies’ to ‘Data loss prevention’.

The DLP policy area includes a range of ‘ready-to-use’, financial, medical and privacy templates for a number of countries including the US, UK and Australia. Examples of pre-defined Australian sensitive information types include: bank account numbers, driver’s licence numbers, medical account numbers, passport numbers, and tax file numbers.

You may also create a custom DLP policy.

Sources: https://technet.microsoft.com/en-us/library/ms.o365.cc.newpolicyfromtemplate.aspx  https://support.office.com/en-gb/article/Send-notifications-and-show-policy-tips-for-DLP-policies-87496bc5-9601-4473-8021-cb05c71369c1

DLP Actions

Specific actions must be set for every DLP policy; that is, what happens if the policy conditions are met. The default actions are:

  • Block access to content (for everyone except its owner, the person who last modified the content, and the owner of the site where the content is stored) AND send a notification by email.
  • Suggest a Policy Tip to users. Options are (a) Use the default Policy Tip or (b) Customise the Policy Tip.
  • Allow override options. There is one main checkable option (‘Allow people who receive this notification to override the actions in this rule’) and two sub options:
    • A business justification is required to override this rule, and
    • A false positive can override this rule.

In addition to these actions, where the DLP policy identifies sensitive content in a document stored in SharePoint Online or OneDrive for Business it displays a small warning ‘stop’ sign icon on the document icon. Hovering over the item displays information about the DLP policy and options to resolve it.
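
The condition-then-action flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how the Office 365 service works internally: the pattern for card-like numbers, the `Decision` type and the `evaluate` function are all hypothetical, and the Luhn checksum is used here simply as a well-known way to reduce false positives when looking for credit card numbers.

```python
import re
from dataclasses import dataclass

# Candidate card numbers: 13-19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in the text that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


@dataclass
class Decision:
    block_access: bool
    show_policy_tip: bool


def evaluate(text: str, override_justification: str = "") -> Decision:
    """Hypothetical rule: block and show a tip unless the user overrides."""
    if not find_card_numbers(text):
        return Decision(block_access=False, show_policy_tip=False)
    # A non-empty business justification lifts the block but keeps the tip.
    overridden = bool(override_justification.strip())
    return Decision(block_access=not overridden, show_policy_tip=True)
```

The override branch mirrors the ‘business justification’ sub-option listed above; the false-positive override could be modelled the same way with a second flag.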

DLP Incident Reports

Incident reports are designed to alert a compliance officer to details of events triggered by the DLP conditions, and provide reporting on those events.

Sources:

https://technet.microsoft.com/en-US/library/ms.o365.cc.DLPLandingPage.aspx

Audited Sharing

Information sharing is a common activity in SharePoint and in SharePoint 2016 and SharePoint Online it is actively encouraged through a new Share option.

In addition to other existing audit options, sharing activity can now be audited in SharePoint Online. The audit logs for Office 365 (which must be enabled) are accessed through the Office 365 Admin Center > Security & Compliance Center > Search & investigation > Audit log search.

Source: https://support.office.com/en-us/article/Use-sharing-auditing-in-the-Office-365-audit-log-50bbf89f-7870-4c2a-ae14-42635e0cfc01?ui=en-US&rs=en-US&ad=US

Information Rights Management (IRM)

Microsoft’s Information Rights Management capability provides an additional layer of protection for a number of document types at the list and library level in SharePoint Online sites.

Supported document types include PDF, the 97-2003 file formats for Word, Excel and PowerPoint (i.e., Office documents without the ‘x’ at the end of the file extension, such as ‘word.doc’), the Office Open XML formats for Word, Excel, and PowerPoint (i.e., with the ‘x’ at the end, such as ‘word.docx’), and the XML Paper Specification (XPS) format.

According to Microsoft, IRM:

‘… enables you to limit the actions that users can take on files that have been downloaded from lists or libraries. IRM encrypts the downloaded files and limits the set of users and programs that are allowed to decrypt these files. IRM can also limit the rights of the users who are allowed to read files, so that they cannot take actions such as print copies of the files or copy text from them.’

IRM is enabled via the Office 365 Admin Center > Admin > SharePoint > Settings > Information Rights Management > ‘Use the IRM service specified in your configuration’ and then ‘Refresh IRM Settings’.

Microsoft_IRM

Image source: ‘Apply IRM to a List or Library’ https://support.office.com/en-us/article/Apply-Information-Rights-Management-to-a-list-or-library-3bdb5c4e-94fc-4741-b02f-4e7cc3c54aa1

 

When IRM is activated on a library, any file that is downloaded is encrypted so that only authorised people can view it. Again, according to Microsoft:

‘Each rights-managed file also contains an issuance license that imposes restrictions on the people who view the file. Typical restrictions include making a file read-only, disabling the copying of text, preventing people from saving a local copy, and preventing people from printing the file. Client programs that can read IRM-supported file types use the issuance license within the rights-managed file to enforce these restrictions. This is how a rights-managed file retains its protection even after it is downloaded.’

Source:

https://support.office.com/en-us/article/Set-up-Information-Rights-Management-IRM-in-SharePoint-admin-center-239ce6eb-4e81-42db-bf86-a01362fed65c

Shredded storage

Shredded storage, as the name suggests, describes the way documents are stored in SharePoint, starting from SharePoint 2013. Instead of storing a document as a single blob, documents are stored in multiple blobs.

This is a more efficient – and possibly more secure – way to manage documents when they are updated by only updating the element/s that were changed. According to a Microsoft presentation on 4 May 2016:

‘… every file stored in SharePoint is broken down into multiple chunks that are individually encrypted. And, the keys are stored separately to keep the data safe. In the future, we would like to give you the ability to manage and bring your own encryption keys that are used to encrypt your data stored in SharePoint. If you want, you can revoke our access to the keys. And we will not be able to access your data in the service’.

Source:

https://blogs.technet.microsoft.com/wbaer/2012/11/12/introduction-to-shredded-storage-in-sharepoint-2013-rtm-update/
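
The general idea of storing a document as several pieces, so that an update only needs to rewrite the pieces that changed, can be sketched as follows. This is a deliberately simplified illustration and not SharePoint's actual algorithm: the fixed chunk size, the use of SHA-256 and the function names are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, for illustration only


def shred(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> list[int]:
    """Return indexes of chunks in `new` that differ from, or are absent in, `old`."""
    old_hashes = [hashlib.sha256(c).hexdigest() for c in shred(old, size)]
    changed = []
    for index, chunk in enumerate(shred(new, size)):
        digest = hashlib.sha256(chunk).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed.append(index)
    return changed
```

With this naive fixed-size scheme, an edit that replaces bytes in place touches only one chunk, whereas an insertion near the start shifts every later chunk; a real system would need to handle that case more cleverly.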

Other Information Security related options

The Microsoft website ‘Monitoring and protecting sensitive data in Office 365’ provides further information about other Information Security options in Office 365, including reporting options to support auditing of activity in the tenant.

Source: https://msdn.microsoft.com/en-us/library/mt718319.aspx

 

Managing documents in SharePoint 2013 and SharePoint Online – what’s changed?

September 20, 2015

We are in the process of upgrading our extensive ‘out of the box’ SharePoint 2010 environment to SharePoint 2013 and, at the same time, setting up a limited SharePoint Online presence. So, what’s changed in relation to managing documents as records?

In SharePoint 2013, the short answer is ‘not much’. However, there are some new things that will change some parts of our technical design model. The things that have remained unchanged include:

  • Document libraries as the primary ‘container’ to store documents, using folders, document sets, or metadata-based categorisation, to ‘group’ documents.
  • Document IDs, set in the Site Collection Administration – Document ID settings section.
  • Document versioning (major/minor, major, or none), set in the Library Settings – Versioning Settings. Other settings in this section, which we generally do not use, have also not changed: Content Approval, Draft Item Security, and Require Check Out.
  • Use of Content Types (and disabling of Folders), enabled via Library Settings – Advanced Settings, and then added in the general Library Settings – Content Types area. All other options in this section, which again we rarely use, are still there: Custom Send to Destination, Search visibility, Offline Client Availability, Site Assets Library, and Dialogs. The old familiar ‘Datasheet view’, which we use to ‘bulk upload’ or update metadata has been renamed ‘Quick Edit’.
  • Almost unlimited metadata options via pre-defined Content Types or library-specific columns, both of which can point to the centrally controlled metadata in the Managed Metadata Service or local ‘look-up’ lists.
  • Multiple list views, each with their own linkable URL.
  • The ability to copy documents, via drag and drop or copy/paste, using ‘Open with Explorer’. This, coupled with the ‘Quick Edit’ option, allows documents to be copied to SharePoint document libraries in bulk and metadata added easily.
  • Access controls that can be applied right down to the document level.
  • The ability, from a document-specific drop down menu, to view or edit the properties of a document, check it out or in, view the version history (and restore versions), run workflows, download a copy, share (the former ‘manage permissions’ – see below), and delete (where enabled).
  • Out of the box simple workflows for Review and Approval.
  • Site collection audit trails, accessed via the Site Collection Administration area. Unlike some other products, SharePoint audit trails are not ‘attached’ to individual documents, but are centralised in one place.

So, all in all, not much change really, except for the ‘Share’ option. In many respects, the way the simple ‘Share’ has been designed is a more intuitive process than ‘Manage Permissions’. When you add a user you decide if that person should have Edit or Read privileges. If you maintain the default Owner, Member and Visitor groups, then those with Edit rights are added to the Member group, those with Read rights are added to the Visitor group.

For more complex permission management, including to stop inheriting the default permissions (or to add them back, which is now called ‘Delete unique permissions’), or creating new groups, you need to select the ‘Shared With’ option from the drop down menu and then select Advanced.

What about disposal/disposition management?

Again, not much has changed in relation to document libraries or lists. Out of the box, your options are as follows:

  • Using centrally defined, document-based, Content Types, using Information management policy settings. Not a bad idea if (a) you have a way to ensure that these Content Types are always added to libraries, and (b) you are happy to manage the disposal of documents one by one.
  • Changing the default Information management policy settings option in document libraries from ‘By Content Types’ to ‘By Libraries and Folders’ and then applying retention policies on the folders you create. The main negatives of this option are that it means you have to use folders, and you have to manage disposal by document.
  • Leaving documents in document libraries, and having a way to review these, across the farm, in a centralised manner. This requires some kind of script to be written to (a) list all libraries across the farm and (b) work on the basis that the ‘Last Modified Date’ is the last action on any document in the library, but it seems the more logical and simplest way to achieve the outcome you seek, and keeps all the documents in the same container.
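
The third option above can be sketched in Python once the list of libraries and their last-modified dates has been obtained. Gathering that list from the farm is not shown here and would need a separate script with appropriate access. The `Library` record, the function names and the seven-year retention period used in the example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Library:
    url: str
    last_modified: date  # most recent modification of any document in the library


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def due_for_review(libraries, retention_years, today):
    """Return libraries whose retention period has elapsed, oldest first."""
    due = [lib for lib in libraries
           if add_years(lib.last_modified, retention_years) <= today]
    return sorted(due, key=lambda lib: lib.last_modified)


# Example with made-up libraries and a hypothetical seven-year retention period.
libraries = [
    Library("/sites/hr/Policies", date(2005, 3, 1)),
    Library("/sites/fin/Invoices", date(2012, 6, 30)),
    Library("/sites/it/Projects", date(2008, 9, 20)),
]
for lib in due_for_review(libraries, 7, date(2015, 9, 20)):
    print(lib.url, lib.last_modified.isoformat())
```

Because the check works on the library as a whole, every document in a library that is due for review stays together in the same container until a disposal decision is made.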

It remains surprising to me that Microsoft does not provide the option to set a disposal period on an entire document library.

Of course, SharePoint 2013 now allows you to set a disposal period on a Site, but this isn’t likely to work for sites that contain a range of diverse content that may be useful over a long period of time.

So, what about SharePoint Online?

The first thing to remember about SharePoint Online (SPO) is that it’s not SharePoint On-Premises. Seems obvious, but the natural instinct is to wonder if or how the two environments can be connected. In most cases they can’t, so it’s not worth thinking along those lines. SPO is a way to manage content in the cloud in addition to, or instead of, on-premises.

SPO has all the same document management features you find in SharePoint On-Premises, described above – document IDs, versioning, content types, metadata, multiple list views, open with Explorer and Quick Edit, access controls, document-specific menu options, simple workflows and audit trails.

The options for disposal/disposition management using SPO ‘out of the box’ (should that be ‘out of the cloud’?) are the same as for the on-premises version.

You didn’t mention Records Centers …

Records Centers (or in Australia, Centres) were in many respects designed to be the ‘send to’ archival repository for other sites. Great idea if it works in your world, it doesn’t work in ours. The main drawbacks are that documents ‘sent’ to a Records Centre are in fact copied. Custom metadata is lost, versions are lost, audit trails are lost. And, you can retrieve the document.

But Records Centres (or in the US, Centers) can be useful on their own as a repository of specific types of records that aren’t accessed too much. We have several Records Centres and we use them in a separate web application for specific purposes, including to store scanned documents. We don’t use them as they were originally intended for the reasons stated above.

And yes, you can create Records Centers in SPO, too!

Understanding and managing access permissions in SharePoint 2010

March 19, 2013

By default, access permissions are set at the Site Collection level and then inherited downwards, to all libraries (and document sets), lists and documents on the main page and to all subsites and their libraries (and document sets), lists, and documents.

The default permissions are usually:

  • Site Owner (full control of the site)
  • Site Members (can add and edit)
  • Site Visitors (can view only)

Breaking the inheritance model of access permissions is relatively simple to do but can create confusion and, if not done correctly, make content completely inaccessible even to the Site Owner. Breaking the inheritance model on documents is particularly dangerous as there is no easy way to identify or manage access restrictions applied across the farm.

Simple access controls via de-inheritance

The simplest way to limit access to a site or the content on a site is to de-inherit the access permissions. To change this on:

  • Sites, go to Site Actions – Site Permissions
  • Libraries/Lists, go to Library/List – Library/List Permissions
  • Document Sets or documents, click on the down arrow next to the name and click on Manage Permissions

… then choose ‘Stop Inheriting Permissions’. If that option is not there, then the Site, Library/List, Document Set or Document may already have permissions on it. (You may see the following statement: ‘Some content on this site has unique permissions which are not controlled from this page. Show me uniquely secured content’).

But there’s a catch, creating the first layer of confusion. When you stop inheriting permissions, the same permission groups remain on the page. But didn’t you just STOP inheriting those permissions?

The reason I think Microsoft left the default permission groups there is so you don’t inadvertently lock yourself out of the Site, Library/List, Document Set or Document – if no group is left and you navigate away from that page, you will almost certainly be denied access. The really good thing to note is that, if you have realised you are about to make something inaccessible (and before you navigate away), you can click on ‘Inherit Permissions’.
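The behaviour described above can be pictured with a small model. This is only a sketch of the idea, not SharePoint’s object model or API: the `Securable` class and its method names are hypothetical, and the permission labels are the plain-English ones from the list above. It assumes that breaking inheritance starts from a copy of whatever was inherited, which is why the same groups are still listed afterwards.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Securable:
    """A site, library/list, document set or document (hypothetical model)."""
    name: str
    parent: Optional["Securable"] = None
    # None means "inheriting from the parent"; a dict means unique permissions.
    own_permissions: Optional[Dict[str, str]] = None

    def effective_permissions(self) -> Dict[str, str]:
        if self.own_permissions is not None:
            return self.own_permissions
        if self.parent is None:
            return {}
        return self.parent.effective_permissions()

    def stop_inheriting(self) -> None:
        # Start from a copy of the inherited groups, so nothing changes
        # until someone deliberately removes or adds a group.
        self.own_permissions = dict(self.effective_permissions())

    def inherit_permissions(self) -> None:
        # The "undo": discard unique permissions and follow the parent again.
        if self.parent is not None:
            self.own_permissions = None


site = Securable("Site", own_permissions={
    "Site Owner": "full control",
    "Site Members": "add and edit",
    "Site Visitors": "view only",
})
library = Securable("Library", parent=site)

library.stop_inheriting()
groups_after_break = sorted(library.effective_permissions())

# Removing every group leaves nobody with access to the library ...
library.own_permissions.clear()
locked_out = not library.effective_permissions()

# ... which re-inheriting (before navigating away) puts right.
library.inherit_permissions()
restored = sorted(library.effective_permissions())
```

In this model the site’s own groups are untouched throughout, because the library was working on its own copy.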

So, after you stop inheriting permissions, the next things you need to do are (a) remove any groups you no longer want to access the site, and then (b) add or create a group you want to access the site. To do that, you click on ‘Grant Permissions’. The dialogue box that appears asks you to select users or groups, and then grant the specific permissions. A group must already exist before you can add it; groups are created at the Site Collection or Site level.

Note that a created group does not on its own have specific permissions; it is only a group of names. You create the permissions when you give that group access to the site, library/list, document set, or document. If you have a group already, you can add new names to that group.

I’d recommend you create a group at the Site Collection level because it will appear there anyway and you need to understand what impact that has – any new group you create will have access to anywhere else in the site by default UNLESS you break the inheritance model.

Slightly complex access controls via de-inheritance and groups

The most common use case for slightly complex access controls is at the library/list, document set or document levels. That is, there is a business requirement to restrict access to one of those, or provide access to a specific document set and nothing else. For the sake of this posting, we will consider the case of a library, in a second level sub-site, that contains multiple document sets, each with multiple documents. The business area wants to restrict access to one of the document sets to a specific group of people.

This is where you need to exercise great care as, without careful planning, you could inadvertently allow all the members of that group to access anything else across the entire site collection where access is inherited. This is because, when you create an access group, the group will appear across the entire site collection.

To allow access to a document set only within a site collection (and assuming there are multiple sub-sites each of which inherit from the top level), you need to first understand access permissions already set.

First, break inheritance on all sub-sites; by default this will leave the default groups plus the new one you have created, so you only need to remove that new group from the access permissions of each sub-site. This will remove the new group from all libraries/lists, document sets and documents on each site.

Second, you need to add the group to the specific document set. To do that, stop inheriting the permissions, which leaves the default access permissions, then add the new group by clicking on Grant Permissions.

Now, if you go to the site permissions, you will see the new group listed (which can be a bit disconcerting), and the statement (against a yellow background): ‘Some content on this site has unique permissions which are not controlled from this page. Show me uniquely secured content’.
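The ‘uniquely secured content’ idea can be sketched as a walk down the site hierarchy that reports everything with broken inheritance. This is an illustration only: the `Node` class, the `uniquely_secured` function and the example names are hypothetical, and SharePoint’s own report will not necessarily work this way.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """A sub-site, library/list, document set or document (hypothetical)."""
    name: str
    unique_permissions: bool = False
    children: List["Node"] = field(default_factory=list)


def uniquely_secured(node: Node, path: str = "") -> Iterator[str]:
    """Yield the path of every descendant that no longer inherits."""
    for child in node.children:
        child_path = f"{path}/{child.name}"
        if child.unique_permissions:
            yield child_path
        yield from uniquely_secured(child, child_path)


site_collection = Node("Top", children=[
    Node("Sub-site A", unique_permissions=True, children=[
        Node("Library", children=[
            Node("Document Set 1", unique_permissions=True),
            Node("Document Set 2"),
        ]),
    ]),
    Node("Sub-site B"),
])

report = list(uniquely_secured(site_collection))
```

Here `report` lists the sub-site and the one document set where inheritance was broken, which is the kind of list you want to have before changing anything further.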

What this means is that members of the group you have added:

  • Cannot access the site, or site collection (they will get an ‘Access Denied’ message).
  • Can see the document set they have been given access to (but no other document set or document in the same library)
  • Can see the site’s libraries and lists but cannot see any content in those lists. This is a good reason for being careful about naming those libraries.

As noted already, access permissions can be very difficult to manage and very easy to get wrong. Careful planning will help to ensure you don’t lock yourself out.

Can predictive coding be used to classify records?

November 6, 2012

A recent legal case in the United States, Plaintiffs v Peck, may set precedents for the way in which documents are categorised for e-Discovery (through ‘predictive coding’), a development that I think could ultimately impact on the way records are classified.

The presiding judge (Andrew Peck, a United States magistrate judge for the Southern District of New York) wrote an article in Law Technology News in October 2011 titled ‘Search, Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?’.

In his article, Peck described the problems associated with the traditional, manual way of document review. He then refers to ‘… two recent research studies that clearly demonstrate that computerised searches are at least as accurate, if not more so, than manual review’. (The details of these studies are included in the article).

Peck discussed the use of keywords applied to electronic documents, and the poor results that often result (‘average recall was just 20% … (a) result (that) has been replicated … over the past few years’). He notes the generally negative judicial reaction to the use of keywords in e-discovery, partially because the manual process is the ‘gold standard’.
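For readers unfamiliar with the term, ‘recall’ is the proportion of the truly relevant documents that a search actually finds (its companion, ‘precision’, is the proportion of what was found that is actually relevant). A minimal sketch of the arithmetic, using made-up document numbers rather than figures from any study:

```python
from typing import Set


def recall(retrieved: Set[int], relevant: Set[int]) -> float:
    """Share of the relevant documents that the search found."""
    if not relevant:
        raise ValueError("no relevant documents to measure against")
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: Set[int], relevant: Set[int]) -> float:
    """Share of the retrieved documents that are actually relevant."""
    if not retrieved:
        raise ValueError("the search retrieved nothing")
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical example: ten documents are relevant, but a keyword
# search returns four documents, only two of which are relevant.
relevant_docs = set(range(1, 11))
keyword_hits = {1, 2, 11, 12}

keyword_recall = recall(keyword_hits, relevant_docs)
keyword_precision = precision(keyword_hits, relevant_docs)
```

In this invented case the search found only two of the ten relevant documents, a recall of 20%, even though half of what it returned was relevant.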

Given the increasingly digital nature of e-discovery, Peck noted the increasing use of ‘computer-assisted coding’, more commonly known as ‘predictive coding’. This methodology is described as ‘tools that use sophisticated algorithms to enable the computer to determine relevance based on interaction with a human reviewer’, using a set of documents to ‘train’ the system.

He stated that, unlike keywords, the (better) results achieved by auto-classification systems (predictive coding) are likely to make the latter approach more appealing in US Courts in the near future.

The specific case referred to (Moore v. Publicis, a ‘high profile employment discrimination case’) involved 3 million electronic documents, and the need to cull them. The plaintiffs objected to the use of predictive coding and criticised ‘… the use of such a novel method of discovery without supporting evidence or procedures for assessing reliability’.

Another Judge was asked to review the case; Judge Carter upheld the decision ‘after finding it to be well reasoned and therefore not subject to reversal.’

For background to the case, see: Plaintiffs v Peck – A Worthy Addition to your Summer Reading List. 10 July 2012. ELLBLOG . http://ellblog.com/?p=2999

Recommind has produced a short booklet titled ‘Predictive Coding for Dummies’, available as a free, 36-page, pdf. The Dummies Guide notes that ‘Real living, breathing legal experts are essential to predictive coding. These experts use built-in search and analytical tools — including keyword, Boolean and concept search, category grouping, and more than 40 other automatically populated filters — collectively referred to as predictive analytics — to identify documents that need to be reviewed and coded.’ Replace ‘legal experts’ with ‘records managers’ and the role of the records manager is clear.

Of course, finding all the correct documents within a classification is only one part of the requirement. The classification needs to be persistent and connected with other recordkeeping requirements including retention management.

I co-wrote an article for the November issue of IQ, the RIMPA industry quarterly, with Umi Mokhtar from Universiti Kebangsaan Malaysia (UKM) in Malaysia asking the question as to whether technology can classify records better than a human can.

The article noted apparent success rates when the technology was used for legal review, and questioned whether the same technology could be used to classify (or apply classification terms) to records instead of expecting: (a) ‘containers’ to have the right classification terms given their content or (b) users ‘filing’ documents against the correct classification.

I was fortunate to have the chance to sit with one of the predictive coding vendors last week to discuss these issues and my concerns about the effectiveness of the technology to classify records correctly.

I had a similar discussion with a reseller of the same product almost 8 years ago, at the height of EDRMS implementations, when this technology was only seen for its value in finding information.

Roll forwards 8 years, with even more massive amounts of digital information being captured and stored, and EDRMS systems capturing only a small fraction of that information, and the technology looks more appealing as a tool that can support digital recordkeeping.

What struck me most about the technology was the way it presents the results to a user. If you didn’t know it was an advanced search and categorisation engine, you might be forgiven for thinking that the screen of results was actually from an EDRMS.

  • On the left hand side is the classification scheme that I could browse.
  • Click on one of the activities, and I was taken to the subject.
  • The list of results shows all or most of the same basic metadata you would expect in your EDRMS; more if it was added when the record was saved.
  • From the results listed I could find similar documents, see similar or related search results, and add public or private tags.

The technology doesn’t just figure out the classification by itself – it has to be trained, and who better to train the system than records managers? Start with the business classification scheme, find 100 records that match the classification, ask the system to find 1000 and confirm (and manage exceptions). And so on, until the technology has classified all the digital records you have allowed it to search (including network drives, email etc).
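The train, suggest, confirm cycle can be sketched in a few lines. To be clear, this is a toy: it scores documents by simple word overlap, which is not how any vendor’s product works, and the class, the function, the sample texts and the two classification terms are all invented for illustration. The `reviewer` function stands in for the records manager confirming or correcting each suggestion.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List


def tokens(text: str) -> List[str]:
    return text.lower().split()


class ToyClassifier:
    """Suggests the classification whose training words overlap most."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokens(text))

    def suggest(self, text: str) -> str:
        words = tokens(text)

        def score(label: str) -> float:
            counts = self.word_counts[label]
            total = sum(counts.values())
            return sum(counts[w] for w in words) / total if total else 0.0

        # sorted() keeps ties deterministic (alphabetical order wins).
        return max(sorted(self.word_counts), key=score)


def review_round(classifier: ToyClassifier, batch: Iterable[str],
                 reviewer: Callable[[str, str], str]) -> int:
    """Suggest a term for each record, let a human confirm or correct it,
    and feed the confirmed answer back in. Returns the exception count."""
    exceptions = 0
    for text in batch:
        suggested = classifier.suggest(text)
        confirmed = reviewer(text, suggested)
        if confirmed != suggested:
            exceptions += 1
        classifier.train(text, confirmed)
    return exceptions


classifier = ToyClassifier()
# Seed set chosen by the records manager.
classifier.train("invoice payment supplier account", "Financial Management")
classifier.train("recruitment interview applicant position", "Personnel")

# The reviewer's decisions, standing in for a person at the screen.
decisions = {
    "supplier invoice overdue payment": "Financial Management",
    "applicant interview notes": "Personnel",
    "leave application form": "Personnel",
}
exceptions = review_round(classifier, decisions, lambda text, _: decisions[text])
later_suggestion = classifier.suggest("leave form")
```

In this run the third record shares no words with the seed set, so the toy guesses wrongly and the reviewer’s correction is counted as an exception; once that correction has been fed back in, a later record about leave is suggested as ‘Personnel’.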

The link with disposal is there too. The technology classifies all the digital records you point it at. If your classification system is linked to your retention schedules, you are presented with all the information that can be then subject to a retention rule. Keeping the metadata about the records that you destroy is then a matter of capturing the metadata found in the search and applying additional metadata about the disposal action.
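A minimal sketch of that link between classification and disposal follows. Everything specific in it is an assumption made for illustration: the retention periods, the use of the last-modified date as the retention trigger, and the field names in the disposal metadata are not taken from any real schedule or product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Hypothetical retention schedule: classification term -> years to keep.
RETENTION_YEARS: Dict[str, int] = {
    "Financial Management": 7,
    "Personnel": 75,
}


@dataclass
class Record:
    title: str
    classification: str
    last_modified: date


def add_years(d: date, years: int) -> date:
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return d.replace(year=d.year + years, day=28)


def due_for_disposal(records: List[Record], today: date) -> List[Record]:
    """Records whose retention period (by classification) has expired."""
    return [
        r for r in records
        if add_years(r.last_modified, RETENTION_YEARS[r.classification]) <= today
    ]


def disposal_metadata(record: Record, today: date, action: str) -> Dict[str, str]:
    """Metadata kept about a record after the record itself is gone."""
    return {
        "title": record.title,
        "classification": record.classification,
        "last_modified": record.last_modified.isoformat(),
        "disposal_action": action,
        "disposal_date": today.isoformat(),
    }


records = [
    Record("Invoice 2004-117", "Financial Management", date(2004, 6, 30)),
    Record("Invoice 2010-042", "Financial Management", date(2010, 1, 15)),
    Record("Personnel file", "Personnel", date(2004, 6, 30)),
]
today = date(2012, 11, 6)
due = due_for_disposal(records, today)
stubs = [disposal_metadata(r, today, "destroyed") for r in due]
```

Only the 2004 invoice is past its seven years on that date, and its metadata stub is what remains after disposal.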

Options for using technology like this might include:

  • Applying it against legacy digital stores, to clean up old digital records.
  • Applying it in conjunction with an EDRMS (or SharePoint), so that the technology assigns the correct classification regardless of where the user puts it (or can file it according to that classification).
  • Applying it against legacy and active digital stores, instead of using an EDRMS.

Criticism of predictive coding

The online legal site law.com published an article on 31 October 2012 ‘Pitting Computers Against Humans in Document Review‘. The article:

  • Examined whether a frequently cited (‘TREC’) study (quoted in the article by Peck above) ‘is sufficient to support the conclusions’ that technology is better than humans.
  • Noted several weaknesses in the original TREC study, pointing out that some of the ‘technology-assisted’ teams actually performed miserably.
  • Noted an ‘inherent flaw’, namely that the TREC study was not a fair comparison of manual reviewers to the technology-assisted teams.

However, the article also concludes that ‘one should not jump to discredit the usefulness of computer assisted methods’; technological solutions depend on the expertise of the people who use them.

Craig Ball offered an interesting reply to this article leading with the comment: ‘Whoever challenges our assumptions and forces us to defend them is performing a valuable service, no matter what their motives’.

Ball noted the ‘sad fact … that human reviewers perform poorly in a consistent fashion, and we needn’t rely upon TREC or the Grossman/Cormack article alone to prove same’. He added ‘the fact is that the errors human reviewers make are rife, even when well trained and -motivated (if we are candid, an all-too-rare and exceptional circumstance)’. And the ‘errors human reviewers make are overwhelmingly not close calls (but) plainly, manifestly errors of the sort that mindless, heartless computers do not make’.

I think it’s possible to replace ‘human reviewers’ in this context with ‘end users’ who don’t understand classification terms (or, generally, recordkeeping).

My own view is that predictive coding technology has the ability to support digital recordkeeping, with the active involvement of records managers, with or without an EDRMS. It has the ability (once trained) to aggregate records by BCS terms and to apply retention rules to those records. (Incidentally, the same concept is used to put records on legal holds to prevent their disposal).

Applying recordkeeping policies to email – Microsoft Messaging Records Management (MRM)

June 1, 2012

The problem

The problem of managing emails as records is summed up in the following statements:

“Many organizations have yet to define an email retention policy. More than one‐quarter of organizations have not yet established any sort of email retention policy despite the fact that there are a growing body of statutory requirements and legal obligations to preserve business records, including those stored in email. Among the nearly three‐quarters of organizations that have established an email retention policy, only two‐thirds of these organizations indicate that their users are fully aware of the policy.” Michael Osterman, “Messaging Archiving and Document Management Markets Trends, 2009-2012”, dated May 2009.

‘Over 40 years after the invention of email, relatively few institutions have developed policies, implementation strategies, procedures, tools and services that support the longterm preservation of records generated via this transformative communication mechanism.’  Christopher J Prom, ‘Preserving Email’, DPC Technology Watch Report 11-01 December 2011. www.dpconline.org/component/docman/doc_download/739-dpctw11-01.pdf

Storing business records in context

Traditional records management theory recommends that there should be a clear relationship between records about a particular subject or issue, regardless of format, and the business context that originated it. (AS ISO 15489-2002: 9.3 Records Capture)

In the paper world, this was achieved by the co-location of related records in a physical file.

In the electronic world, this is usually achieved through the application of metadata. Business classification and naming systems applied to electronic folders generally achieve this; as well, electronic systems also allow for a range of cross-subject metadata that allows records to be organised in different contexts.

Additional, business context-specific metadata can be applied to emails (including from integrated business applications – for example, an email saved to TRIM will show the TRIM record number in its email metadata properties).  However, this ability (as with Properties in Office documents) is rarely enabled or used.

Instead, and as with Office documents, we tend to let users ‘categorise’ their emails (and documents on network shares) through folders – although not all users do this.  (Interestingly, online email systems like Google’s gmail use tags instead of folders).

Are emails documents?

The short answer is yes (in the Australian legal evidence context), but, rather like XML-based Office documents such as docx, they are made up of structured data that displays as a single ‘document’.

Part of the problem with emails as records is the perception (on the part of users who have never had to face court) that they are not documents, but messages.  The ability to use the system to send or receive ‘private’ messages exacerbates this perception.

The problem of storing emails as records

Emails have been a constant problem and challenge for records managers and recordkeeping since they first appeared in the early 1990s.

The three main approaches to keeping emails have been to (a) print to paper, (b) save to a recordkeeping system, and (c) save to a drive.

Print to paper, while relatively common in many organisations even now, is probably the poorest (and some might say ‘silliest’) option in the digital world as (a) it is dependent on users, (b) emails usually lose their message headers, (c) the printed copies cannot be searched the way the electronic originals can, and (d) the emails remain on the Exchange system and are discoverable.

Saving emails to a recordkeeping system, while better than printing, is an inadequate option because (a) it is usually dependent on users to do it, (b) the email still remains in the Exchange system, and (c) it can sometimes result in the email being saved in a different format that is not necessarily suitable for long-term preservation (e.g., TRIM’s .vmbx).  There is also the problem of users saving ‘dumb’ emails with (valuable) attachments, which can make the attachment harder to find, identify or access.  Some systems (such as SharePoint 2010) include email-enabled storage locations.

Chris Prom, in a Practical E-Records blog posting titled ‘Facilitating the Generation of Archives in the Facebook Age’, notes that:

‘…the formal recordkeeping systems previously used by many organizations for electronic records have died or have one foot firmly in the grave.  At the same time, the habits that individuals use in producing, consuming, storing, filing, searching, and interpreting records are themselves undergoing constant change.  People adopt new communication technologies at an ever-quickening pace.   Divergent personal practices, rather than the centralized electronic systems, are the harsh reality that confronts our profession’.

Saving to a drive is also a poor option, and is usually based on user preferences to want to ‘keep’ emails.  Emails saved to drives (a) will still remain in the Exchange system, (b) may lose their header information, and (c) are not necessarily saved in appropriate or accessible formats.

In relation to the last point, Outlook does not make it easy for an end user to decide, with usually five options to choose from – which is the right one?  Users will usually choose whatever is the default (.msg), but this isn’t necessarily the best long-term option (which is MIME or EML – the latter described by the National Archives of Australia (NAA) as ‘an acceptable open file format for long term storage’).

In all cases, keeping these emails in the business context to which they relate has been a constant problem for records managers.  As a consequence, there is a tendency on the part of almost all businesses to leave and manage emails where they are (i.e., in Exchange).

Microsoft Exchange 2010 – Messaging Records Management

To try to address this problem, Microsoft introduced ‘Mailbox Manager Policies’ in Exchange Server 2003.

This was followed by ‘Message Records Management’ with Managed Folders in Exchange Server 2007 (a feature that remains in Exchange 2010).

Exchange Server 2010 includes a new model of managing emails as records, called ‘Messaging Records Management’.  Microsoft describe it as follows:

‘Messaging records management (MRM) is the records management technology in Microsoft Exchange Server 2010 that helps organizations reduce the legal risks associated with e-mail. MRM makes it easier to keep the messages needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value. This is accomplished through the use of retention policies or managed folders’. (Source: http://technet.microsoft.com/en-us/library/dd335093)

As Microsoft notes, however (on the same page), MRM does not prevent users from deleting messages; it is really only designed to remove them at the end of a given period.  Microsoft recommend ‘journaling’ emails where there are specific business reasons to keep them for longer (such as legal proceedings or the need to ensure specific email is kept), or applying the Legal Holds functionality.

The key elements of MRM are Retention Tags and Retention Policies.

There are three types of Retention Tags: (1) Default Policy Tags (DPT), (2) Retention Policy Tags, and (3) Personal  Tags (which are an ‘opt-in’ on the email client).

  • Retention Policy Tags (RPTs) are used on default folders (e.g., inbox, junk mail, sent, deleted). Users cannot change the RPT but can override it with a Personal Tag.
  • Default Policy Tags apply automatically to untagged items, that is, items that have neither an RPT nor a Personal Tag.  A Retention Policy can contain only one default policy tag.
  • Personal Tags can be applied by users to their own custom folders or individual emails.

In most cases, users make the decision, and retention is applied based on where the email is located.  If there is actual or anticipated litigation, a Retention Hold can be applied to the user’s mailbox; however, this does not prevent users deleting emails, it only overrides any retention policies.  The Legal Hold option should be applied to prevent deletion.  Once this option is applied, Legal Hold ‘captures any deleted or edited items into a special folder that’s neither accessible nor changeable by the user’.

All retention tags include: a Tag Name, a Tag Type, an age limit (in days) with an action to take, and comments.

The actions available are:

  • Delete And Allow Recovery – This action will perform a hard delete, sending the message to the dumpster. The user will be able to recover the item using the Recover Deleted Items dialog box in Outlook 2010 or Outlook Web App.
  • Mark As Past Retention Limit – This action will mark an item as past the retention limit, displaying the message using strikethrough text in Outlook 2007, 2010 or Outlook Web App.
  • Move To Archive – This action moves the message to the user’s archive mailbox (see below).
  • Move To Deleted Items – This action will move the message to the Deleted Items folder.
  • Permanently Delete – This action will permanently delete the message and cannot be restored using the Recover Deleted Items dialog box.

Once the tags are created, they can be added to a Retention Policy and this policy, in turn, is then applied to specific mailboxes – one policy per mailbox.
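The relationship between tags, policies and mailbox items can be sketched as a small data model. This is not Exchange’s API or its management cmdlets: the class and function names, the tag names and the age limits are all invented for illustration, and the rules encoded are simply the ones described above (one default policy tag per policy, a Personal Tag overriding a folder’s RPT, and the default tag covering anything untagged).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TagType(Enum):
    DEFAULT = "Default Policy Tag"
    RETENTION = "Retention Policy Tag"
    PERSONAL = "Personal Tag"


@dataclass(frozen=True)
class RetentionTag:
    name: str
    tag_type: TagType
    age_limit_days: int
    action: str
    comment: str = ""


@dataclass
class RetentionPolicy:
    name: str
    tags: List[RetentionTag] = field(default_factory=list)

    def default_tag(self) -> Optional[RetentionTag]:
        for tag in self.tags:
            if tag.tag_type is TagType.DEFAULT:
                return tag
        return None

    def add(self, tag: RetentionTag) -> None:
        # Rule from the text: only one default policy tag per policy.
        if tag.tag_type is TagType.DEFAULT and self.default_tag() is not None:
            raise ValueError("a retention policy can contain only one default policy tag")
        self.tags.append(tag)


def applicable_tag(policy: RetentionPolicy,
                   folder_tag: Optional[RetentionTag] = None,
                   personal_tag: Optional[RetentionTag] = None) -> Optional[RetentionTag]:
    """Personal Tag first, then the folder's RPT, then the policy default."""
    return personal_tag or folder_tag or policy.default_tag()


default = RetentionTag("Default 2 years", TagType.DEFAULT, 730, "Move To Archive")
deleted = RetentionTag("Deleted Items 30 days", TagType.RETENTION, 30, "Permanently Delete")
keep = RetentionTag("Keep 7 years", TagType.PERSONAL, 2555, "Delete And Allow Recovery")

policy = RetentionPolicy("Staff mailboxes")
for tag in (default, deleted, keep):
    policy.add(tag)

untagged_item = applicable_tag(policy)
deleted_item = applicable_tag(policy, folder_tag=deleted)
user_tagged_item = applicable_tag(policy, folder_tag=deleted, personal_tag=keep)
```

The point the model makes is the one that matters for recordkeeping: what happens to a given email depends on where it sits and on what the user chose to tag.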

The ‘auto-tagging’ feature, once 500 items have been tagged, will automatically tag items in a user’s mailbox based on their past tagging activities.

So, is MRM the answer to managing emails as records? 

Yes and no.  From a recordkeeping perspective, MRM:

  • Does nothing to ensure that records are kept in the business activity or functional context to which they relate, unless (of course) the emails are the only form of record that exists for the business activity.
  • Does not stop users from deleting emails.

On a positive note, MRM:

  • Attempts to address the problem of email retention.
  • Allows the application of a retention policy to emails that might be stored in a business context Outlook mailbox or folder. As well, Exchange features like Legal Hold and Journaling allow further controls to be implemented.

Archiving

Exchange 2010 now includes a ‘personal archives feature’, which allows users to save emails to their own archive instead of saving emails to drives or using Personal Storage (.pst). A good article on this subject can be found at this location: http://mohamedridha.com/2011/11/07/exchange-2010-online-archiving-and-retention-tagspolicies-a-practical-example/

Sources (all retrieved 1 June 2012)