Archive for the ‘Information Management’ Category

Using the ‘sync’ option to work smarter and reduce duplication, and increase end user acceptance of SharePoint

July 18, 2019

Perhaps the single most common complaint about using electronic document management (EDM) systems over the last two decades has been the requirement to save a copy of a record stored on a network file share to the EDM system.

Network file shares are littered with documents, many of them duplicated in other locations, on personal drives (and removable drives), and attached to email messages. Some of these documents may also have been saved in the EDM system. 

Legal discovery activities rarely focus solely on the records in an EDM system, no matter how good that system may be. As long as network file shares (and personal drives) have existed (and continue to exist) alongside EDM systems, the latter have always been the poorer sibling in terms of information value.

Various attempts over the years by EDM vendors to ‘integrate’ their products with network file shares (often via WebDAV – see below) have rarely been successful, not least because the folder structure of the network file share is inevitably more useful and flexible than the often rigid structure of the EDM.

*WebDAV, or ‘Web Distributed Authoring and Versioning’ (RFC 4918) is ‘an extension to HTTP, the protocol that web-browsers and web servers use to communicate with each other’. WebDAV facilitates collaborative authoring, editing and file management. The most common usage of WebDAV is to map cloud storage as a network drive. (Source: WebDAV: What it is, where it turns up, and its alternatives, retrieved 18 July 2019)

The old ‘Groove-y’ way

Microsoft Office Groove 2007, or ‘Groove’, was a Microsoft Office component that used WebDAV to synchronise with a SharePoint library, allowing the library to be opened from Windows Explorer. (Source: Understanding and troubleshooting the SharePoint Files tool in Groove 2007, retrieved 18 July 2019)

While this method worked, it was clumsy and difficult to use. Duplication on network file shares continued.

2018 – The new OneDrive for Business sync client

The previous Groove OneDrive for Business sync client (Groove.exe) was included with the Windows 10 Operating System that was released in mid 2015.

The new SharePoint Online became widely available from 2016 and has continued to evolve. Initially, it was only possible to synchronise a SharePoint Online document library using WebDAV methods.

The new OneDrive sync client (OneDrive.exe), also known as the Next Generation Sync Client (NGSC), appeared in early 2018. The new sync client allowed users (with Windows 10 devices) to sync their SharePoint document libraries to File Explorer.

A mostly unnoticed but significant change

The sync option on SharePoint document libraries (in addition to OneDrive and OneDrive for Business) is possibly one of the least noticed changes that has the potential to have – ironically – both a major and also minor impact on the way people work.

It is a minor impact because the change effectively allows users to continue working the way they always have, in File Explorer, going only to SharePoint Online when they need to.

It is a major impact because, coupled with the ability to ‘share’ content easily (directly from File Explorer), the potential for duplication – except for the duplication between ‘work’ and ‘personal’ spaces – has been removed. Everyone with access to it can sync the same document library and multiple people can work on documents in the library at the same time.

Instead of creating a ‘working’ document on a drive and perhaps emailing it to everyone, there now only needs to be a single copy that multiple people can access – via File Explorer, at the same time. Everyone with access can see when any other person is editing.

That is, end users can continue to work in File Explorer, the way they have always done. In that sense, the ability to sync a document library makes redundant the need to open a browser and access SharePoint that way. (This in turn impacts on the way change is managed and perhaps how each SharePoint site might be configured).

How it works

As a start it should be emphasized that this works best with Windows 10 as Windows 7 devices may still have the old ‘Groove’ client installed.

End users need to go to the SharePoint site first and click on the library they want to sync. Users need to have edit rights on the library to sync it.

They should then see the Sync option:

O365_SyncRibbon

The OneDrive for Business client notifies the user that the library will be synced.

O365_Sync_ODfBClientB

The library is then synced to the user’s File Explorer. A new icon (with the Office 365 tenant name) appears on the left, and each document library that is synced is shown as a folder beneath it. End users can now work directly in the synced document library in File Explorer, including adding new folders and documents.

End users may also select which folders they wish to sync either by opening a folder in SharePoint and syncing from there, or by right clicking on the folder that was synced, clicking on ‘Settings’ and removing any unwanted folders. This, of course, could mean that users don’t see new folders they really should see and may as a result attempt to create one with the same name (which will be rejected).

Documents are not downloaded to the user’s computer until they open them. This can be seen below in the first document with a circle/tick icon (downloaded) and the three others with cloud icons (not downloaded).

O365_Sync_FileExp_Docs.JPG

The user can right-click and use the Share option (the same as in SharePoint Online) to share the document with colleagues which (as long as the person sharing has the permission to do so) gives the other person access if they didn’t have it before. The three dots at the top right of the dialogue box provide the option to manage access to the document.

O365_Sync_FileExpl_ShareOption

Note: End users cannot copy and paste a link to sync a library. The sync runs from a user’s computer and is personal to their log on and their device.

End user reactions

Personal experience supporting thousands of end-users with access to SharePoint Online indicated that this was perhaps one of the most useful features ever released.

Several people noted that they regarded the sync option as a ‘cloud-based backup’. Some indicated that they rarely returned to the browser version of SharePoint for their key document libraries (which may be a problem).

What about metadata and content types?

Presently, document libraries synced to File Explorer do not display any metadata associated with the document or document library, only the icon, name, date, type and size.

However, Microsoft Office documents (Word, Excel, and PowerPoint) retain any original metadata in the document properties (the ‘metadata payload’) and these properties may be changed on the document itself via the ‘File’ option.

Any metadata columns that are mandatory are also ignored; a user may add a document directly to the synced document library in File Explorer even if there is a mandatory metadata column. Note that this is the same behavior in SharePoint Online; if a document is added to a library with a mandatory metadata column, a warning appears but the document can still be uploaded.

Note also that a new option coming soon to SharePoint Online, which will also be seen via the ‘Share’ option in File Explorer, is the ability to set restrictions such as blocking printing or downloading, or setting expiry dates.

The new way of working

The old way of working was to create and manage documents on network file shares and personal drives, emailing copies as required. Adding documents to EDM systems was an additional and disliked step that in most cases created a copy of a document that still remained on a drive somewhere. (And, in many cases, the EDM system had a linked file share where the documents were stored).

The new way of working minimises the need for duplication.

  • Users create a new Office document (including directly from OneDrive or SharePoint, where it is automatically saved in the library from which it was created)
    • If the document was not created from OneDrive or SharePoint, the ‘save’ dialogue presents the following locations by default: OneDrive (personal); SharePoint (any SharePoint site the user has access to – including the synced document library on File Explorer); or ‘browse’ to another location.
    • If the document is saved to the synced document library in File Explorer, it is then automatically copied to the SharePoint Online document library (and a green circle and tick appears).
    • If the document is saved to a SharePoint Online library directly, it will appear in a synced folder in File Explorer initially with a cloud icon.
  • The document may then be shared, either from File Explorer or in SharePoint Online (the same Share dialogue on both).
  • The recipient of the Share invitation can then open the document directly and edit it (if given those rights).
  • Any edits of the document will be recorded in the version history of the document. Other actions (e.g., changes to security) will be recorded in the audit logs.

One document, stored in a single location, accessed by many. A new, much smarter, way of working.

Office 365 – Security and Compliance – Records Management section

May 30, 2019

Microsoft have introduced a new ‘Records Management’ section in the Security and Compliance portal of Office 365. In many respects, this is simply a re-ordering of what was already in place; however, it keeps logical elements together, including the new ‘File Plan’ option.

O365_CompliancePortal_RecordsManagementetc

File Plan

The new File Plan option appears when a new retention label is created as shown below.

O365_Classifications_Labels_FilePlan1

Editing the file plan descriptors section brings up the following options, which allow organisations to ‘map’ a File Plan (or BCS) to retention and disposal policies.

O365_Classifications_Labels_FilePlan2

Each of these sections allows you to choose from an existing option or add a new one.

If retention policies have been mapped to a file plan, these mapped policies can be viewed when clicking on the ‘File Plan’ section under Records Management, which displays the File Plan according to the Label, allowing the records manager to view these labels as part of a File Plan …

O365_Compliance_RecordsManagement_FilePlan1

… or just the Policies:

O365_Compliance_RecordsManagement_FilePlan2

Events

The Events section will only have content if any of the retention policies is linked with a pre-defined event. These events will be listed in this section.

O365_Compliance_RecordsManagement_Events

 

Office 365 Records Management update

May 30, 2019

In my post Applying retention periods to SharePoint document libraries and disposal/disposition actions I included a series of screenshots including one that showed the list of records due for disposal and an option to filter this by site URL.

The site URL filter option has been replaced with ‘Type’ (documents or emails) and ‘Search’ options as shown in the screenshot below. To filter by the site URL, simply enter all or part of that URL in the search option as shown. Actions can then be taken on all documents in that site library.

O365_CompliancePortal_RecordsManagement_DispositionListf

Note that you can Export this data, but also note that the ‘Pending disposition’ section does not display any additional metadata that may have been associated with the documents. Accordingly, it may still be necessary to return to the original library, export all the metadata, and then save that manually to keep a record of what was destroyed.

The ‘Disposed items’ section shows a list of records that have been disposed of. It is not yet clear how long this information will remain in this area. Also note that the Disposed items section does not include the ability to search, so the list of documents cannot be refined to a site or library.

O365_CompliancePortal_RecordsManagement_DispositionListe

Metadata Payloads in the Digital World

March 19, 2019

For at least twenty years, a core tenet of both document and records management has been the metadata that defined records. A number of metadata schema were developed over the years, including the well-known Dublin Core (http://dublincore.org/documents/dces/) that defined 15 core metadata elements for digital content:

  • Contributor
  • Coverage
  • Creator
  • Date
  • Description
  • Format
  • Identifier
  • Language
  • Publisher
  • Relation
  • Rights
  • Source
  • Subject
  • Title
  • Type

Introduction of XML based documents

Parallel with the development of metadata schema, the introduction of XML-based documents (e.g., .docx, odb) from the early 2000s introduced a new way of both structuring and describing documents. Instead of being external to the document, metadata could be embedded within the document, making it effectively a type of ‘metadata payload’.

Around the same time that XML-based documents were introduced, I wrote about the ‘Semantic Office’. The Semantic Office drew on the same ideas developed and implemented for the ‘Semantic Web’. Conceptually, the idea was quite simple – just as web pages would contain their own embedded metadata in the form of Resource Description Framework (RDF) triples (subject – predicate – object, e.g., sky – is – blue), common office documents such as Outlook, Word and Excel could carry their own embedded metadata ‘payload’.

Some of this metadata is visible in the Properties pane of a record, but only as descriptive terms, not as metadata defined against a specific schema.

The (mostly overlooked and under-reported) outcome of the introduction of XML-based documents was that a document could be stored anywhere and be found again based on the embedded metadata – as opposed to finding it through  metadata that was created and managed separately from the record (for example, in a document management system). For some reason, however, the predominant and persistent model for document management has been to store metadata about a document separately from the document.

In most document and records management systems since the late 1990s, digital records (emails included, if they are saved to the DRMS) were/are stored in secure file shares while the metadata about the record (including its ‘file’ or ‘container’ identifier) was stored in a separate database. Visually this gives the user the illusion that the records are stored ‘in’ a container even though they are actually stored in a network file share.

This pervasive document management model is conceptually similar to the way computers record metadata about documents stored in a Windows NT File System (NTFS) in the Windows Master File Table (MFT). MFT entries include details of the size, time and date stamps, permissions, and so on. It assumes that the actual location of the record is recorded in the metadata.
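To make the distinction concrete, the short sketch below reads the kind of metadata a file system holds about a file, separately from the file's content. It uses Python's standard `os.stat` call; the helper name `external_metadata` and the temporary demo file are hypothetical and exist only for illustration. On an NTFS volume these values would ultimately come from the MFT, but the code itself is platform-neutral.

```python
import os
import tempfile
from datetime import datetime, timezone


def external_metadata(path):
    """Return metadata the file system holds about a file (not its content)."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "mode": oct(info.st_mode),
    }


# Hypothetical demo file, created only for this illustration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Test document")
    demo_path = handle.name

details = external_metadata(demo_path)
print(details)

# Clean up the demo file.
os.remove(demo_path)
```

None of these values travel with the file if it is copied elsewhere by other means, which is the contrast with embedded metadata discussed next.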

How XML-based documents embed metadata

XML-based Office documents (as well as PDFs and image files), however, retain core metadata information within the document itself. The information is accessible regardless of where the document is stored.

Ironically (perhaps) it may be different from any external metadata used to describe the document.

To view the embedded metadata in a Word document, you only need to change its file extension to .zip and then unzip it. Extracting a zipped Word document reveals (in most cases) several folders and one XML file:

  • [trash] – contains ‘dat’ files (may not be present in all documents)
  • _rels – contains the ‘.rels’ XML document
  • customXml – contains a number of ‘item’ and ‘itemProps’ XML documents
  • docProps – contains three very small files: app.xml, core.xml, custom.xml
  • word – contains a range of XML files and additional folders with other XML files.
  • [Content_Types].xml
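Renaming is not strictly necessary, because the package can also be opened directly as a ZIP archive. The sketch below uses Python's standard `zipfile` module to list the parts and collapse them to the top-level entries described above. The helper names are hypothetical, and the in-memory package is a stand-in that only mimics the layout; it is not a valid Word document.

```python
import io
import zipfile


def list_package_parts(docx_bytes):
    """Return the sorted names of every part inside a .docx (ZIP) package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return sorted(package.namelist())


def top_level_entries(part_names):
    """Collapse part names to their top-level folder or file."""
    return sorted({name.split("/", 1)[0] for name in part_names})


# Stand-in package built in memory for illustration only. It is NOT a valid
# Word document; it just mimics the folder layout listed above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    for name in (
        "[Content_Types].xml",
        "_rels/.rels",
        "customXml/item1.xml",
        "docProps/app.xml",
        "docProps/core.xml",
        "docProps/custom.xml",
        "word/document.xml",
    ):
        package.writestr(name, "<placeholder/>")

parts = list_package_parts(buffer.getvalue())
print(top_level_entries(parts))
```

For a real document, the bytes could be read with `open(path, "rb").read()` and passed to `list_package_parts` in the same way.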

In one example Word document downloaded from a SharePoint library, the file ‘item4.xml’ in the ‘customXml’ folder contained both XML namespace (xmlns) information as well as the embedded document management elements (highlighted in bold):

A separate xml document also located in the ‘customXML’ folder contained the following core properties, including most of the Dublin Core elements listed above (but note that they are all blank).

Arguably, the body of the record is also a form of metadata, enclosed by the terms <body>text</body>. In the example document downloaded from SharePoint, the body of the document is contained in the file ‘document.xml’ under the ‘word’ folder of the package.

    xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se wp14">
      <w:body>
        <w:p w14:paraId="195D8795" w14:textId="77777777" w:rsidR="0001502C" w:rsidRDefault="00880316">
          <w:r>
            <w:t>Test document</w:t>
          </w:r>
        </w:p>
        <w:p w14:paraId="195D8796" w14:textId="77D86E32" w:rsidR="006832E2" w:rsidRDefault="006832E2" w:rsidP="006832E2">
          <w:r>
            <w:t>Lorem ipsum (and the rest of the text, deleted for brevity)</w:t>
          </w:r>
          <w:bookmarkStart w:id="0" w:name="_GoBack"/><w:bookmarkEnd w:id="0"/>
        </w:p>
        <w:sectPr w:rsidR="006832E2">
          <w:pgSz w:w="11906" w:h="16838"/>
          <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0"/>
          <w:cols w:space="708"/>
          <w:docGrid w:linePitch="360"/>
        </w:sectPr>
      </w:body>
    </w:document>

Other core metadata elements are contained in the ‘core.xml’ file:
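As a rough sketch of how those elements could be read programmatically, the example below parses a core properties part with Python's standard `xml.etree.ElementTree`. The namespace URIs are the ones used by the Open Packaging Conventions and Dublin Core; the sample XML, its values, and the helper name `read_core_properties` are hypothetical and are not taken from the document discussed in this post.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical sample content for illustration only.
SAMPLE_CORE_XML = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Test document</dc:title>
  <dc:creator>Example Author</dc:creator>
  <dc:subject></dc:subject>
  <cp:keywords></cp:keywords>
</cp:coreProperties>"""


def read_core_properties(core_xml):
    """Return selected core properties; blank elements become empty strings."""
    root = ET.fromstring(core_xml)
    wanted = ("dc:title", "dc:creator", "dc:subject", "dc:description", "cp:keywords")
    properties = {}
    for tag in wanted:
        element = root.find(tag, NAMESPACES)
        # None means the element is absent; "" means it is present but blank.
        properties[tag] = None if element is None else (element.text or "")
    return properties


props = read_core_properties(SAMPLE_CORE_XML)
for name, value in props.items():
    print(name, "=", repr(value))
```

With a real document, the XML could be obtained with `zipfile.ZipFile(path).read("docProps/core.xml")` and passed to the same function, since `ET.fromstring` also accepts bytes.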

Why is this important?

The existence of – and ability to make use of – embedded metadata seems to have been overlooked since the introduction of these types of records over 15 years ago. This may have been primarily because no-one had a system in place to access or use that data in any meaningful way.

Instead, most records continued to be defined by metadata that is created or captured and managed separately from the record itself.

The problems with storing metadata separately from the record are that: (a) the external metadata may be different from the embedded metadata, and (b) the external metadata may unnecessarily limit or restrict the ability to see the record in different contexts.

For example, one person may assign a specific metadata term, such as a function from the Business Classification Scheme (BCS) to the digital record, or assign it to a specific ‘container’. Some time later, another person may try to find the same record but discover it is not in the same file, or assigned to the same function term. They are likely to be looking for the record in or from a completely different context.

The only way they may be able to find it is by doing a general search that includes the body or content of the records, something I found to be the case in real life scenarios where users couldn’t find the records they were looking for based on metadata searches.

Of course, metadata is still important, but my point is the difference between embedded metadata that can be added when the document is saved to a document library, and external metadata that is stored separately from the digital record.

Being able to leverage the metadata embedded in records, wherever they are stored, provides a much more powerful ability to leverage this information, similar to the way the application of metadata to web pages facilitates access.

Records Description Framework

A core part of the world wide web is the application of metadata to web pages to facilitate their discovery in a highly connected world. The core elements of this metadata are defined in the World Wide Web Consortium (W3C)’s Resource Description Framework, or RDF.

To quote the World Wide Web Consortium (W3C):

‘RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.’ (Source: https://www.w3.org/RDF/)
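The triple model can be sketched in a few lines of plain Python. This is a simplification under stated assumptions: real RDF names subjects and predicates with URIs and is usually handled with a dedicated library, whereas the strings, document names, and helper functions below are hypothetical and chosen only to show the directed, labelled graph idea.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object). The first is the
# 'sky is blue' example; the rest imitate document metadata.
TRIPLES = [
    ("sky", "is", "blue"),
    ("report.docx", "creator", "Alice"),
    ("report.docx", "subject", "Records management"),
    ("minutes.docx", "creator", "Alice"),
]


def build_graph(triples):
    """Index triples as a directed, labelled graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def subjects_with(triples, predicate, obj):
    """Find every subject linked to `obj` by the named relationship."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


graph = build_graph(TRIPLES)
print(graph["report.docx"])
print(subjects_with(TRIPLES, "creator", "Alice"))
```

The second lookup is the interesting one for records management: two documents stored anywhere become linked simply because they share a property value.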

It is perhaps not surprising that Microsoft named the analytic engine behind Office 365 the Microsoft Graph.

According to Microsoft:

‘Microsoft Graph is made up of resources connected by relationships. For example, a user can be connected to a group through a memberOf relationship, and to another user through a manager relationship. Your app can traverse these relationships to access these connected resources and perform actions on them through the API. You can also get valuable insights and intelligence about the data from Microsoft Graph. For example, you can get the popular files trending around a particular user, or get the most relevant people around a user.‘ (Source: https://developer.microsoft.com/en-us/graph/docs/concepts/overview)

microsoft_graph
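The relationships named in the quote map onto REST paths under a user resource in the Microsoft Graph v1.0 API. The sketch below only builds the request URLs and makes no network call; an actual request would also need an access token with suitable permissions. The dictionary keys, the helper name `relationship_url`, and the example user are hypothetical.

```python
GRAPH_ROOT = "https://graph.microsoft.com/v1.0"

# Friendly names (hypothetical) mapped to Graph relationship paths.
RELATIONSHIPS = {
    "groups": "memberOf",
    "manager": "manager",
    "trending_files": "insights/trending",
    "relevant_people": "people",
}


def relationship_url(user_id, relationship):
    """Build the Graph URL that traverses one relationship from a user."""
    if relationship not in RELATIONSHIPS:
        raise ValueError(f"Unknown relationship: {relationship}")
    return f"{GRAPH_ROOT}/users/{user_id}/{RELATIONSHIPS[relationship]}"


# Hypothetical user principal name, for illustration only.
for name in RELATIONSHIPS:
    print(relationship_url("alice@example.com", name))
```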

The RDF model is also used in knowledge management applications such as Protege that supports the creation and use of RDF/XML ontologies.

Implications

In my opinion, the implications of XML-based office content (which has been around for over 10 years now) are quite important for records management theory and practice.

While, like traditional EDRM systems, documents are visually displayed ‘in’ the document library, each document retains its own originally assigned metadata even if it is downloaded – unless the user uses the ‘Check for Issues’ – ‘Inspect Document’ option from the Info panel to remove them.

The ability to store metadata properties directly in the document facilitates locating and retrieving documents that have the same, similar or related properties via the Microsoft Graph. In the same way that web pages use RDF triples, this allows otherwise unconnected resources to be linked and presented to the user (subject to any security controls) automatically based on their specific context.

In other words, instead of records being locked to a specific container based on their metadata being stored in a database, records could be discovered and linked wherever they are located based on their embedded metadata.

Relevance of W3 XML schema to Office 365 content

The use of RDF-based metadata embedded in Office documents in Office 365 means that this data can be used to link resources in a way that supports the discovery of the resources. It allows for cross-linking of information. Documents with metadata payloads are one of the many resources that can be connected in this way.

For example, ‘… a user can be connected to a group through a ‘memberOf’ relationship, and to another user through a manager relationship. Your app can traverse these relationships to access these connected resources and perform actions on them through the API. You can also get valuable insights and intelligence about the data from Microsoft Graph. For example, you can get the popular files trending around a particular user, or get the most relevant people around a user.’ (Source: https://developer.microsoft.com/en-us/graph/docs/concepts/overview)

‘Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.’ (Source: https://www.w3.org/RDF/)

Four observations about Office 365/SharePoint Online and records management

March 19, 2019

The following is a slightly modified version of four points I made recently to a records management professional, responding to the point that ‘many CIOs are rolling out Office 365 and SharePoint Online to replace traditional recordkeeping  systems such as TRIM/CM etc’.

First, generally speaking, records managers have traditionally not had a strong technical knowledge and/or weren’t close to the IT team.

Even if they managed TRIM/CM/other EDRM it was usually as the front end admin, not the back end technical IT admin, which remained with IT. Conversely, IT people have generally never had much knowledge of how to manage records (it is not usually part of their skill set).

There was almost always a gap (technical, organisational, communication etc) between the records area and IT; consequently, IT departments have rolled out SharePoint and more recently Office 365 without reference to (or the feeling they even needed to refer to) records managers, and often without a solid architecture and planning for implementing and managing SharePoint (or Office 365).

Into the space between IT and records (but usually closer to IT) come various vendors who offer products that they say do the records management they claim SharePoint does not do.

This, by the way, is not a criticism of those vendors as such, but there has been a tendency to buy their products without really understanding what the base product can do. This has almost always been the case for many IT products. Back in 2006/7 I was part of a team looking to acquire a major ECM product and was a trained system administrator. The product itself could do exactly what was required without any modifications; the problem was that the client (the company I worked for) wanted modifications that required consulting work. Close to a million dollars later in consulting fees, the product was still unused.

I’m also concerned at the way some vendors pitch the suitability or ‘compliance’ of their products in relation to add-ons to SharePoint for managing records. I had one telling me in all sincerity that their product ‘complied with ISO 15489’, which was interesting to hear since there is no compliance framework. The same vendor’s salesman was not aware of ISO 16175 when I asked about it.

Second, from SharePoint 2010 onwards, Microsoft implemented a range of new records management functionality to meet minimum (mostly corporate rather than government) requirements for managing records.

That new functionality included a great deal more features than most people knew about. One Australian consultant (John Wise) identified that SharePoint 2010 met 88% of the requirements of the then ICA standard that became ISO 16175 Part 2. For most non-government organisations that didn’t need the level of information security found in government, it was closer to 95%, and the 5% remaining was not particularly important for most organisations. With the introduction of both retention/disposal policy management, and information security classifications, via the Security and Compliance Centre in the Office 365 admin portal, SharePoint meets almost all requirements listed in ISO 16175 that do not refer to legacy systems.

In many respects, by ignoring ‘traditional’ ways that other EDRM systems have managed records, Microsoft introduced a brand new paradigm for managing records, underlined by the idea that digital records do not work the same way as paper records.

In my view, many older EDRM products failed to adapt to the new digital world and continued to enforce the concept that records must be ‘moved’ (saved) to a container in the recordkeeping system just as paper records had to be saved onto a single subject file. As long as Exchange and network file shares remained completely separate, this meant (and continues to mean) that the original versions of those records always remained in Exchange/network file shares even after they were copied to the EDRM.

A much smarter model, which SharePoint Online offers via both the create and save processes, is to allow people to save non-email records directly to SharePoint, including in synchronised document libraries in File Explorer; the document libraries can have default metadata applied to content types, and retention policies can be applied to those libraries. Emails can be moved automatically via Flow, or retained in the mailboxes with Office 365 retention policies applied. Recordkeeping happens in the background; people don’t have to fill in a form every time they want to save a record to the system.

Microsoft have centralised records management across the Office 365 environment. For example, the creation and management of records disposal/retention classes (called ‘classification policies’) is now carried out in the Security and Compliance Admin centre of the Office 365 portal. Records managers need to be assigned specific roles to do what they need to do (and I would argue, the corporate records managers should also be Site Collection Administrators on every site, preferably via a Security Group).

It doesn’t matter if the record is in Exchange or in SharePoint (or some of the other Office 365 applications), a classification policy can be applied wherever it is. When implemented correctly (based on a good architecture model), classification policies can provide the recordkeeping context required to link records over time.

Third, just as a home subscription to Office 365 with cloud storage is more cost effective than buying the product as before, most IT organisations have seen the benefits of moving their enterprise agreement licensing from a per-device licence (where the licence is based on the computer) to a per-user licence (where the user can use the product on multiple machines including mobile devices or from home). This has also allowed them to shift storage (and the costs of maintaining servers, including technical staff) from their own or hosted data centres to the Microsoft cloud (which, ironically, may be in the same hosted data centre).

One large organisation that I’m familiar with had around 30TB of storage in the data centre; by acquiring Office 365 E3/E1 licences, they had 45TB, plus 1TB for each user’s OneDrive. I suspect this point is not known to most records managers (first point above), who simply see the CIOs introducing or rolling out Office 365 for no obvious reason.

Fourth, SharePoint has traditionally been many things to different people because it has always had a dual nature – publishing/intranet and team sites.

This is no different in SharePoint Online but the options to customise are now fewer (thankfully). Communication sites are a simple and elegant way to publish information, while team sites (including Office 365 Group-based team sites) are more or less the functional replacement for network drives (OneDrive for Business replaces personal drives).

In my opinion, it is important for anyone getting involved with SharePoint to understand this – that SharePoint Online is NOT the same as the ‘old’ SharePoint on-premise that could be customised to do just about anything.

Keep it simple, using the very rich ‘out of the box’ options, and it begins to make more sense. Plus, as noted already, users can synchronise SharePoint document libraries to File Explorer and work from there, so their experience can be more or less exactly what it is now using network drives.

Can you manage records in SharePoint Online? Absolutely, keeping in mind that SharePoint Online is very much a part of the Office 365 ecosystem and should not be considered a standalone application as it was when installed in an on-premise server.

Records managers need to get up to speed (quickly, in my opinion, although I’ve been saying it for years) with the recordkeeping functionality already in SharePoint Online. They should be SharePoint System Administrators (to give them access to the SharePoint Admin portal) and Site Collection Administrators. They also really need to understand the Office 365 portal and the relevant parts of the Security and Compliance Admin Centre, including classification policies, eDiscovery options and audit options.

Migrating to SharePoint Online – Part 1 (Planning)

August 25, 2018

We implemented SharePoint 2010 in early 2012 and then upgraded to SharePoint 2013 in early 2015. After acquiring Office 365 enterprise licences in April 2016 we began to plan for the migration of our existing on-premise environment to SharePoint Online. After testing the migration process with inactive sites, we started to migrate active sites from early 2018. We expect to complete all the migrations by 31 December 2018.

This post, the first of three, outlines the factors that influenced and guided how we approached the migration. Our approach may not be the same as your approach, but many of the basic principles may be similar.

Overview of our SharePoint environment pre-migration

A key principle for our SharePoint environment since 2012 was to avoid customisation and dependencies, and use the product ‘out of the box’ (OOTB) as much as possible.

  • Customisation would almost always require some degree of development and ongoing maintenance. It also meant that upgrades could be more complex and expensive.
  • Dependencies of any sort – be they integration components or third-party add-ons – could also make upgrades more complex and expensive.

Governance model

We also implemented a ‘balanced’ controlled environment, following the technical design models for SharePoint 2010 described by Microsoft (extract in image below), which recommended that organisations strike a balance across three key governance elements:

SharePoint2010GovernanceBalance

Source: https://docs.microsoft.com/en-us/previous-versions/office/sharepoint-server-2010/cc303422(v%3doffice.14)

  • IT Governance. Centrally managed or locally managed?
  • Information Management. Tightly managed or loosely managed?
  • Application Management. Strictly managed or loosely managed development?

In our environment, the ability to create new SharePoint sites and sub-sites required the completion of a (SharePoint) online form and was restricted to the SharePoint Administrators. This enabled us to prevent uncontrolled growth in the environment and to ensure that all new sites were created within a pre-defined – but not overly strict – architecture design model.

Upgrade to SharePoint 2013 in early 2015

Our SharePoint site collections were created across five web applications: team (approximately 120 sites), project (approx. 120 sites), publication, apps, and intranet. Most of the corporate records were stored in team or project sites, as well as a single ‘apps’ site. (Our apps sites (< 10) were set up to address small business problems that in the past might have been addressed by using Microsoft Access).

Thanks to our OOTB model, we were able to upgrade to SharePoint 2013 over a weekend, with almost no errors. The only site we could not upgrade was the intranet which remains (as at August 2018) in ‘compatibility mode’.

Note: It is not possible to migrate directly from SharePoint 2010 to SharePoint Online. It must be upgraded to SharePoint 2013 or SharePoint 2016 first.

The situation in 2016

In May 2016 we changed our Microsoft Enterprise Agreement to an Office 365 subscription model. Our reasons for going to Office 365 were driven by multiple factors, including the need for mobile access to information.

It is important to remember that SharePoint Online is only one element among many others in Office 365. That is, while it is technically possible to do it, SharePoint would not normally be migrated on its own to SharePoint Online. Any migration must take into account a range of considerations relating to the broader Office 365 environment, including (but not limited to):

  • Office 365 licences (and what this meant for our users with Office installed on existing computers which were being upgraded to new Windows 10-based devices as part of a separate project)
  • Active Directory syncing so users can access the environment.
  • Exchange mailbox migrations so SharePoint-based, email-linked Flow workflows can work.
  • OneDrive for Business, as a SharePoint service to replace ‘personal’ drives on network file shares.
  • Security controls and records retention policies, set from the Office 365 Security and Compliance admin portal, as well as audit logs in that same portal.
  • Office 365 Groups with associated SharePoint sites, Yammer groups (which can be linked with Office 365 Groups) and Microsoft Teams (which can also be linked with Office 365 Groups).
  • ‘Classic’ and modern team sites, Office 365 Group-based sites, and communication sites.
  • The SharePoint user portal.
  • The mobile app, and how sub-sites are accessed.
  • The ever-changing SharePoint Online environment in which anything described as ‘classic’ is likely to be deprecated at some point, and new features appear.

Migrating multiple web applications to one

We needed to plan our migration process, moving away from our five web applications to a new model. We knew that, with the exception of our customised intranet, we would probably be able to migrate almost all of our sites relatively easily because we had always kept to the OOTB model.

Fortunately, Microsoft produced a very useful 12-page document which provided a good overview describing how it ran its own SharePoint migration, and good advice for how we might do our own migration.

SharePoint_to_the_cloud_MSpaper.JPG

Learn how Microsoft ran its own migration

We had a range of factors to take into account.

  • One of our initial decisions was not to migrate any active site until all Exchange mailboxes were migrated (and preferably, end-users had new Windows 10 devices). As it turned out, the decision to migrate mailboxes was delayed and as a result we would end up migrating most sites first.
  • We needed to work out how to migrate our content as it was no longer possible to do a ‘lift and shift’. We investigated the market and made the decision to acquire a migration tool, ShareGate, to do the migrations ourselves. We would later find the same tool useful to migrate personal drives to OneDrive for Business.
  • We identified the likelihood that we would create new SharePoint Online sites in parallel with the migration of on-premise sites; this was partially because some existing on-premise sites with multiple sub-sites would be split into separate sites instead, but also because the new SharePoint was so much more versatile and would likely be popular.

The new architecture model

An important point to note is that the new SharePoint Online architecture model provided the opportunity to re-think our SharePoint model and, to some extent, clean up or leave unwanted SharePoint content behind. To quote the Microsoft site above, ‘the best migration is no migration’.

As noted above, we had five primary web applications in our SharePoint 2013 environment. These had to be migrated (or re-created, in the case of publication sites) under one of only two paths (/teams or /sites) to one of three site options:

  • ‘Classic’ sites (the default for all team and project sites)
  • Office 365 Group-based team sites
  • Communication sites (re-created page-based content)

That is:

  • Migrated team and project sites would become classic team sites under either (a) /teams/sitename path or (b) /teams/prj_sitename path, respectively. There were some exceptions:
    • Some sites with multiple sub-sites would be split up into multiple independent sites (including using the new ‘hub’ sites).
    • A couple of team sites would become communication sites.
    • Team sites that crossed multiple organisational business areas would be created as classic team sites under the /sites/sitename path.
  • Most publication sites that used the publishing features would need to be re-created as communication sites under the /sites/sitename path. There were some exceptions:
    • Some publication sites would become team sites instead.
    • The intranet would be managed separately as, at the very least, it would need to be re-created in SharePoint Online. It could not be migrated ‘as is’.
  • Application sites would become team sites.
  • Some existing sites or sub-sites might be migrated to SharePoint sites linked to Office 365 Groups, with the naming prefix of either GRP_ or PRJ_.

The above ‘mapping’ model was an early decision that did not change.
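The default mapping described above can be sketched as a small lookup function. This is a minimal illustration only: the function name, the `Target` type and the `cross_business` flag are hypothetical, the /teams path for application sites is an assumption (the post does not state it), and the listed exceptions (split sites, Office 365 Group sites, team sites that became communication sites) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    site_type: str  # kind of SharePoint Online site to create
    url_path: str   # path the site is created under


def default_target(web_app: str, site_name: str, cross_business: bool = False) -> Target:
    """Return the default SharePoint Online target for an on-premise site."""
    if web_app == "team":
        if cross_business:
            # Team sites spanning several business areas go under /sites
            return Target("classic team site", f"/sites/{site_name}")
        return Target("classic team site", f"/teams/{site_name}")
    if web_app == "project":
        return Target("classic team site", f"/teams/prj_{site_name}")
    if web_app == "publication":
        # Publishing sites are re-created, not migrated
        return Target("communication site", f"/sites/{site_name}")
    if web_app == "apps":
        # Assumption: application sites use the /teams path like other team sites
        return Target("team site", f"/teams/{site_name}")
    # The intranet was handled separately and has no default mapping
    raise ValueError(f"no default mapping for web application: {web_app}")
```

For example, `default_target("project", "rollout")` gives a classic team site under /teams/prj_rollout.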

Preparatory work

We also commenced work on the following:

  • Reviewing all existing sites to determine which sites would be migrated or discarded – see below.
  • Re-developing our SharePoint Architecture documentation for the Online version.
  • Investigating and documenting all Office 365 admin and Office 365 Security and Compliance admin configuration settings, and determining roles. This process, which required Global Admin access, included establishing records retention policies (from mid 2018) in the Security and Compliance admin portal.
  • Re-developing our existing SharePoint admin documentation for the Online version, including all the configuration settings. We included the OneDrive for Business config settings in this same document as it is a SharePoint service.
  • Understanding how the new environment worked, and would work.
  • Re-establishing our SharePoint Admin and SharePoint User Group sites in SharePoint Online.
  • Creating a range of ‘test’ sites to better understand the new environment.
  • Creating an initial schedule for the migration of sites, targeting inactive sites first.
  • Assigning the initial batches of Office 365 licences.
  • Developing a repeatable process to migrate sites using ShareGate. In our environment, the steps were:
    • Identify need to migrate site
    • Register a new site request in our SharePoint Admin portal.
    • Register the task in our Jira task management system.
    • Create the SharePoint Online site (via a script linked to the request).
    • Migrate the on-premise site, make it read only with a re-direct notice on the front page (and a three month deletion notice*).
    • Prepare the migrated site, including swapping the classic default home page to a modern home page.
    • Hand over the site to the business owners and close the task

* In practice many of these sites still remained after 6 months.
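One way to keep such a process repeatable is to treat the steps as an ordered checklist per site. The sketch below is illustrative only: the step names paraphrase the list above, and `Step` and `next_step` are hypothetical names, not part of ShareGate, Jira or SharePoint.

```python
from enum import IntEnum
from typing import Optional, Set


class Step(IntEnum):
    IDENTIFY_NEED = 1
    REGISTER_SITE_REQUEST = 2
    REGISTER_JIRA_TASK = 3
    CREATE_ONLINE_SITE = 4
    MIGRATE_AND_LOCK_SOURCE = 5   # migrate, then make the old site read only
    PREPARE_MIGRATED_SITE = 6     # e.g. swap to a modern home page
    HAND_OVER_AND_CLOSE = 7


def next_step(completed: Set[Step]) -> Optional[Step]:
    """Return the first step not yet completed, or None when all are done."""
    for step in Step:  # Enum members iterate in definition order
        if step not in completed:
            return step
    return None
```

Recording completed steps against each site makes it easy to see which migrations have stalled part-way through.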

As part of our review process, we identified around a dozen sites that had one or more of the following elements, which meant we had to devote more time to their migration (‘custom workload’ in the Microsoft document above):

  • Complex workflows which would need to be re-created.
  • Integration with other systems (mostly via BizTalk).
  • Links with ETL processes.

We also identified around 50 sites that would not be migrated:

  • Sites that were unused or had no content of value (often because the original was still on a drive).
  • Sites that did not need to be migrated, for example if their content had been migrated to a different business system.
  • Test sites.

Sites that were no longer used but contained records that needed to be kept were to be migrated with the word ‘Archive’ appended to the end of the site URL name, assigned a site retention policy, and then made read only.
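The review outcomes above (migrate, archive or leave behind) can be expressed as a simple decision function. This is a sketch under stated assumptions: the parameter names are hypothetical, and appending ‘Archive’ with no separator is a guess at the naming convention, which the post does not spell out.

```python
def triage(in_use: bool, has_records_to_keep: bool,
           is_test: bool = False, content_moved_elsewhere: bool = False) -> str:
    """Classify an on-premise site as 'migrate', 'archive' or 'discard'."""
    if is_test or content_moved_elsewhere:
        return "discard"
    if in_use:
        return "migrate"
    if has_records_to_keep:
        # Unused, but the records must be retained
        return "archive"
    return "discard"


def archive_url_name(site_name: str) -> str:
    """Build the URL name for an archived site (separator is an assumption)."""
    return f"{site_name}Archive"
```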

By August 2017, we had identified that 250 site collections would be migrated to SharePoint Online. We acquired ShareGate in September 2017 and were ready to start migrating.

In Part 2 of this series of posts I will describe the migration process and the lessons we learned along the way.

Office 365 – Applying retention periods to SharePoint document libraries and disposal/disposition actions

May 19, 2018

Records retention policies are created in the Security and Compliance Admin portal, Classifications section of Office 365, as noted in my previous post of 9 March 2018 on the subject.

This post describes how these are applied to document libraries and what happens when the records reach their disposal/disposition period.

Note: In Australia we refer to the disposal of records. In the US this is called disposition.

Setting up retention policies

Organisations may have complex or quite simple records retention policies. An important point to keep in mind in Office 365 is how many policies should be displayed to the end user to choose from.

Ideally, there should be fewer than a dozen classes so they are easy to choose from (see below). There is nothing stopping you creating 100 or 500 policies, but all of them will appear in the drop down list to choose from. Microsoft say they are working on ‘grouping’ policies, so this may help to fix the issue.

For some organisations, it may be useful to distill or group retention policies down to a smaller number.

  • For example, specific retention policies for certain types of records, and one (or two) for ‘all other’ records. The key, as we will see below, is naming them so they are obvious and easy to apply.

Viewing available retention policies

Retention policies that have been created appear in the Security and Compliance Admin portal, under Classifications > Labels.

O365_Classifications_Labels

Note: Labels must be published before they become visible to end users.

When you click on Labels, you can then see all the retention policies that have been created (but not necessarily published).

The screenshot below shows just the very top policy (a test/demonstration policy with a 7 day retention period) in a list of policies.

O365_Classifications_Labels_List.png

Note: Policies can be auto-applied, provided the policy has sufficient ability to identify what records they should be applied to.

Published policies appear in the Data Governance, Dispositions section:

O365_DataGovernance_Dispositions.png

The Dispositions section displays policies that have been published and are visible to end users in the Office 365 areas selected when the policy was created (e.g., Exchange, SharePoint, OneDrive etc).

O365_DataGovernance_Dispositions_List.png

Applying the policy in a SharePoint document library

To apply the policy to a SharePoint document library, go to the document library, library settings, and you will see the option to add the retention policy: ‘Apply label to items in this list or library’.

O365_RetentionPolicy_LibrarySet1.PNG

The ‘Apply Label’ dialogue shows the option to apply the label to existing items (recommended) and a drop down which shows all the published retention policies.

O365_RetentionPolicy_LibrarySet2.PNG

In this example below, there are four policies including the test policy.

O365_RetentionPolicy_LibrarySet3

The policy now applies to all records stored in that document library.

Managing disposal/disposition

When the records reach the end of the retention period configured in the policy, the person designated to be informed about the retention will receive an email notifying them of the need to review the dispositions.

O365_Dispositions_EmailNotification.png

Note, the person (or mailbox) receiving this email MUST be assigned to the Records Management role in the Security and Compliance Admin portal, Permissions section. No-one else will see the records due for disposal otherwise (not even the Global Admins, unless they have also been delegated to that role).

The records person clicks on the link ‘Go there now’ and it opens the following section in the Office 365, Security and Compliance Admin portal, showing the documents that are pending disposition. A number of options are available to sort by Type, to search, and to filter by several options.

 

O365_Dispositions_DocListing

The following options appear if a single document is selected. Note the option to extend the retention period or apply a different label, as well as the ability to delete the item permanently.

O365_Dispositions_Doc_OneDocument

Filtering options are displayed below.

O365_DataGovernance_Dispositions_Filters

Finally, the records manager can choose all the documents in the list and complete three bulk actions as shown.

O365_DataGovernance_Dispositions_BulkActions.png

Positives and negatives

The positives of this method of disposing of documents are that all records from any location will appear in a single view that can be filtered and actions taken as required.

The negatives are that potentially thousands of documents might appear in this listing every single day, making it difficult to decide what can be deleted or not.

However, as it’s possible to filter by the retention policy, that at least should make it relatively easy to identify what can be destroyed. The more fine-grained the policies, the fewer records should appear.

Organisations that have function-based disposal classes should find that all records relating to the same function appear for disposal under that function.

Another potential negative is that records may not always appear in the same context, whether it be subject- or function-based. For example, a collection of documents (often known as a ‘file’) may not appear in the disposition listing as a collection but as a set of records that are only connected by the disposal policy name. Does this matter?

Recording disposal actions

A key requirement for most organisations is keeping a record of what was destroyed.

At the moment the only apparent option to do this is to apply filters and export the list, using the handy ‘Export’ option to keep a record of what was destroyed. That csv file can then be stored in a control library to ensure a record is kept. This type of action requires a degree of control to ensure it happens every time.
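Once the list has been exported, a short script can summarise it before it is filed in the control library. The sketch below assumes a csv file with a header row; the column names (`Title`, `Location`, `Label`) and the sample rows are hypothetical, so check them against a real export before relying on this.

```python
import csv
import io
from collections import Counter

# Hypothetical sample of an exported disposition list
SAMPLE_EXPORT = (
    "Title,Location,Label\n"
    "Invoice 001.docx,/teams/finance,Financial Records - 7 years\n"
    "Invoice 002.docx,/teams/finance,Financial Records - 7 years\n"
    "Minutes March.docx,/teams/board,Meeting Records - 3 years\n"
)


def summarise_export(csv_text: str, label_column: str = "Label") -> Counter:
    """Count exported items per retention label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)
```

A per-label count of this kind can be stored alongside the csv file as a quick check that the export matches what was reviewed.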

It may also be possible to identify what was destroyed – and by whom – in the audit logs. This is being investigated.

 

Changes to security classification and records retention in Office 365

March 9, 2018

In May 2016, I wrote about the creation of security classification labels in the Azure Information Protection (AIP) portal (old post here). Quite a bit has changed since that post, in particular the naming of policies, away from ‘High’ to ‘Low’ Business Impact (e.g., HBI – LBI) to real-world words such as ‘General’ and ‘Highly Confidential’.

In October 2017, I wrote about the new retention policies that could be applied to all Exchange, SharePoint and OneDrive content in Office 365.

Changes to the Security and Compliance admin portal – Classifications section

On 23 February 2018, Microsoft’s Adam Jung posted a new article to the Microsoft Tech Community titled ‘Consistent labeling and protection policies coming to Office 365 and Azure Information Protection’.

The main outcome of this change is that information security protection and records retention policies, linked with Data Loss Prevention (DLP) policies, are created from a single interface in the Security and Compliance admin centre > Classifications section (Labels). Policies set in Office 365 are then synced to Azure (and vice versa).

To quote the Microsoft blog: ‘The upcoming experience means that the same default labels can be used in both Office 365 and Azure Information Protection, and the labels you create in either of these services will automatically be synchronized across the other service – no need to create labels in two different places!’

This post looks at the changes and some potential issues that may arise.

Security and Compliance Admin Portal – Classifications

Records retention policies for Office 365 content are set as labels in the Security & Compliance Admin portal of Office 365 under Classifications – Labels.

The Classifications area also includes a section for ‘Sensitive Information Types’, which simply lists a range of information types that are also used for DLP policies.

Note: Access to that Admin portal is restricted by default to Global Admins and anyone assigned to a specific security role. Records managers in organisations that have or are deploying Office 365 should have access to this feature.

Setting (Records Retention) Classification Labels

The options for setting a records retention label were described in detail in my post above, but for reference again, they are:

  • Name
  • Label settings
    • Disabled or enabled (off/on)
    • When enabled, the ability to set (a) a retention period, and (b) an action when the period expires.
    • Alternatively, it is possible to just delete content when it’s older than a given time.
    • An option also allows the content to be classified as a ‘record’ when the label is applied, providing further protection against deletion, for example.
  • Review your settings
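In effect, a retention label boils down to a period plus an action taken when that period expires. The sketch below models that idea and works out when an item would fall due. It is illustrative only: `RetentionLabel` and `due_date` are hypothetical names, periods are simplified to whole years, and the start date to count from is left to the caller.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionLabel:
    name: str
    retention_years: int
    action: str  # what happens at expiry, e.g. 'review' or 'delete'


def due_date(start: date, label: RetentionLabel) -> date:
    """Return the date on which the label's action falls due."""
    target_year = start.year + label.retention_years
    try:
        return start.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February
        return start.replace(year=target_year, day=28)
```

For example, a seven-year label applied from 19 May 2018 falls due on 19 May 2025.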

Merging of label options – Retention and Security together in a single label

The primary change to classifications is the inclusion of new options when you choose to ‘Create a Label’.

These options are now:

  • Label name
  • Protection settings (e.g., information security)
  • Retention settings
  • Advanced options settings
  • Review your settings

These options are described below.

O365ClassificationLabelsMar2018.JPG

The ‘Protection settings’ section includes the following options:

  • Enabled or disabled. (If disabled the next check box options do not appear)
  • Block users from sending email messages or sharing documents with this label
  • Show policy tip to users if they send or share labeled content (The text of the policy tip is editable)
  • Send incident reports in email
  • Advanced protection for content with this label (Customise settings option)

The ‘Retention settings’ are identical with the options already described above:

  • Disabled or enabled
  • Various settings when enabled.

The ‘Advanced options settings’ section includes the following options:

  • Enabled or disabled. (If disabled the next check box options do not appear)
  • Add a watermark (text can be customised)
  • Add a header (text can be customised)
  • Add a footer (text can be customised)

The Microsoft article notes: ‘We are building labeling capabilities natively into the core Office apps – including Word, PowerPoint, Excel, and Outlook, and soon there will be no need to download or install any additional plug-ins.’ This comment references the problem of having to download a plug-in for the classification options to appear in installed versions of Office.

Does it make sense to merge security classifications and records retention?

In my opinion, putting information security and records retention policies in the same label doesn’t make sense.

Retention is almost never linked with the confidentiality (or otherwise) of the records but based on government or legislative requirements or business needs.

But that was probably not Microsoft’s intention; it was probably to make it as simple as possible to create and apply these policies.

It would have made more sense to have separate label options for ‘Retention policies’ and ‘Security policies’. This would potentially mean, however, having two labels (if a label is in fact required for retention purposes).

Organisations with complex retention policies might find that the mixing of both policies in the one view makes it harder to find the individual security related policies, and have the potential to cause some confusion.

For example, it could be hard to spot the Highly Confidential label in this listing if there were more than (say) 50 retention classes:

  • Client records – 7 years
  • Confidential
  • Financial Records – 7 years
  • Highly Confidential
  • Internal Use Only
  • Meeting Records – 3 years
  • Working Paper – 1 year

It also raises the question (which I have asked and will update this post if I receive a response) as to whether two policies can (or should) be applied on a document.

If two labels cannot be applied, this could mean that organisations have to have even more labels to take account of the various combinations. For example:

  • General Financial Records – 7 years
  • Confidential Financial Records – 7 years
  • Highly Confidential Financial Records – 7 years

Not to mention the link to DLP policies, although that doesn’t appear as a label.
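The scale of the problem is easy to show: if only one label can be applied per document, every security level has to be combined with every retention class. The sketch below uses the label names from the examples above; `combined_labels` is a hypothetical helper, not an Office 365 feature.

```python
from itertools import product

SECURITY_LEVELS = ["General", "Confidential", "Highly Confidential"]
RETENTION_CLASSES = [
    "Client records – 7 years",
    "Financial Records – 7 years",
    "Meeting Records – 3 years",
    "Working Paper – 1 year",
]


def combined_labels(security, retention):
    """Return one label name per (security level, retention class) pair."""
    return [f"{level} {cls}" for level, cls in product(security, retention)]
```

Three security levels and four retention classes already give twelve labels; with 50 retention classes the list would grow to 150.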

In my opinion, combining these two options, while perhaps making it easier at the ‘front end’, has the potential to create confusion for users, let alone complicate the administration of retention management.

Read the full Microsoft blog article in the link below

https://techcommunity.microsoft.com/t5/Security-Privacy-and-Compliance/Consistent-labeling-and-protection-policies-coming-to-Office-365/ba-p/161553

SharePoint Online and OneDrive for Business – Preventing external sharing of data

October 17, 2017

A recent (September 2017) article suggested that OneDrive for Business (ODfB), a key application in Office 365, was a potential source of data leaks and/or target for hacking attacks. By extension, the same would apply to SharePoint Online (SPO), as ODfB is a SharePoint-based service.

I don’t disagree that, if not configured correctly, any online document management system – not just ODfB/SPO – could be the source of leaks or the target of external attacks. Especially if these systems, and the security controls that can protect the data in them, are not properly configured, governed, administered, and monitored.

But, I would ask, what controls do most organisations have in place now for documents stored in file shares and personal file folders, not to mention USB sticks, and the ability to send documents via Bluetooth to mobile devices or upload corporate data to third-party document storage systems? Probably not many, because users have no other way to access the data out of the office.

As we will see, the controls available in Office 365 are likely to be more than sufficient to allow users to access their documents out of the office, while at the same time reducing (if not eliminating) the sharing of documents with unauthorised users.

How to stop or minimise sharing from OneDrive for Business and SharePoint Online

There is one simple way to prevent the sharing of data stored in SPO and ODfB with external people – don’t allow it.

There are several ways to control what can be shared, each allowing the user a bit more capability. All these options should be based on business requirements and information security risk assessments, and Office 365 configured accordingly.

In this article I will start with no sharing allowed, and then show how the controls can be relaxed as necessary.

External sharing – on or off

This is the primary setting, found in the main Office 365 Admin centre under Settings > Services & add-ins > Sites. If you turn this off, no-one can share anything stored in SPO or ODfB.

The option is shown below:

O365_SC_Sites_SharingOnOff

If you do allow sharing, you need to decide (as shown above) if sharing will be with:

  • Only existing external users
  • New and existing external users [Recommended]
  • Anyone, including anonymous users

The second option is recommended because it doesn’t restrict the ability to share with new users. The last option is unlikely to be used in most organisations and comes with some risks.

The next place to set these options is in the SPO and ODfB Admin centres.

OneDrive admin center

If the previous option is enabled, the following options are available for ODfB. Note that BOTH SharePoint and OneDrive are included here because the latter is a part of the SharePoint environment.

  • Let users share SharePoint content with external users: ON or OFF.
    • NOTE: If this option is turned OFF, all the following options disappear.
  • If sharing with external users is enabled, the following three options are offered:
    • Only existing external users
    • New and existing external users [Recommended]
    • Anyone, including anonymous users
  • Let users share OneDrive content with external users: ON or OFF
    • This setting must be at least as restrictive as the SharePoint setting.
  • If sharing with external users is enabled, the following three options are offered
    • Only existing external users
    • New and existing external users [Recommended]
    • Anyone, including anonymous users
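The relationship between these settings can be modelled as an ordered scale, from no sharing to sharing with anyone. The sketch below is a plain Python model of the rules described above, not a call to any Office 365 API; `Sharing`, `onedrive_setting_is_valid` and `effective_level` are hypothetical names.

```python
from enum import IntEnum


class Sharing(IntEnum):
    # Ordered from most restrictive to most permissive
    OFF = 0
    EXISTING_EXTERNAL = 1
    NEW_AND_EXISTING_EXTERNAL = 2
    ANYONE = 3


def onedrive_setting_is_valid(sharepoint: Sharing, onedrive: Sharing) -> bool:
    """OneDrive must be at least as restrictive as SharePoint."""
    return onedrive <= sharepoint


def effective_level(org_sharing_enabled: bool, service_level: Sharing) -> Sharing:
    """If external sharing is off organisation-wide, nothing can be shared."""
    return service_level if org_sharing_enabled else Sharing.OFF
```

For example, with SharePoint set to ‘New and existing external users’, OneDrive can be set to the same level or lower, but not to ‘Anyone’.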

If sharing is allowed, there are three sharing link options:

  • Direct – only people who already have permission [Recommended]
  • Internal – only people in the organisation
  • Anonymous access – anyone with the link

You can limit external sharing by domain, by allowing or blocking sharing with people on selected domains.
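The effect of a domain restriction can be sketched as follows. This is an illustration of the allow/block idea only: the function names are hypothetical, the space-separated list format follows the SharePoint admin centre setting described later in this post, and exact matching on the domain (no subdomain handling) is an assumption.

```python
from typing import Set


def parse_domains(text: str) -> Set[str]:
    """Split a space-separated domain list into a lower-cased set."""
    return {domain.lower() for domain in text.split()}


def may_share_with(email: str, domains: Set[str], mode: str) -> bool:
    """Check a recipient against an 'allow' list or a 'block' list."""
    domain = email.rsplit("@", 1)[-1].lower()
    if mode == "allow":
        return domain in domains
    if mode == "block":
        return domain not in domains
    raise ValueError("mode must be 'allow' or 'block'")
```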

External users have two options:

  • External users must accept sharing invitations using the same account that the invitations were sent to [Recommended]
  • Let external users share items they don’t own. [This should normally be disabled]

A final ‘Share recipients’ checkbox allows the owners to see who viewed their files.

SharePoint admin center

The SPO admin center (to be upgraded in late 2017) has two options for sharing.

The first option is under the ‘sharing’ section which currently has the following options:

Sharing outside your organization

Control how users share content with people outside your organization.

  • Don’t allow sharing outside your organization
  • Allow sharing only with the external users that already exist in your organization’s directory
  • Allow users to invite and share with authenticated external users [Recommended]
  • Allow sharing to authenticated external users and using anonymous access links

Who can share outside your organization

  • [Checkbox] Let only users in selected security groups share with authenticated external users

Default link type

Choose the type of link that is created by default when users get links.

  • Direct – only people who have permission [Recommended, same as above]
  • Internal – people in the organization only
  • Anonymous Access – anyone with the link

Default link permission

Choose the default permission that is selected when users share. This applies to anonymous access, internal and direct links.

  • View [Recommended]
  • Edit

Additional settings (Checkboxes)

  • Limit external sharing using domains (applies to all future sharing invitations). Separate multiple domains with spaces.
  • Prevent external users from sharing files, folders, and sites that they don’t own [Recommended]
  • External users must accept sharing invitations using the same account that the invitations were sent to [Recommended]

Notifications (Checkboxes)

E-mail OneDrive for Business owners when

  • Other users invite additional external users to shared files [Recommended]
  • External users accept invitations to access files [Recommended]
  • An anonymous access link is created or changed [Recommended]

Sharing via the Site Collections option

In addition to the options above, sharing options for each SharePoint site are set in the ‘site collections’ section as follows. Note that the default is ‘no sharing allowed’. A conscious decision must be taken to allow sharing, and what type of sharing.

O365_SPO_Sharing1

When a site collection name is checked, the following options are displayed.

Sharing outside your company

Control how users invite people outside your organisation to access content

  • Don’t allow sharing outside your organisation (default)
  • Allow sharing only with the external users that already exist in your organization’s directory
  • Allow external users who accept sharing invitations and sign in as authenticated users
  • Allow sharing with all external users, and by using anonymous access links

If anonymous access is not permitted (setting above), a message in red is displayed:

Anonymous access links aren’t allowed in your organization

SharePoint Sharing option

The SharePoint Admin Centre has an additional ‘Sharing’ section with the same settings as shown above for ODfB. It is expected that these multiple options will be merged in the new SharePoint Admin Centre due for release in late 2017.

Additional security controls

In addition to all the above settings, there are a range of additional controls available:

  • All user activities related to SPO and ODfB, including who accessed, viewed, edited, deleted, or shared files, are accessible in the audit logs.
  • SPO and ODfB content may be picked up by Data Loss Prevention (DLP) policies and users prevented from sending them externally. This is of course subject to the DLP policies being able to identify the content correctly.
  • SPO and ODfB content may be subject to records retention policies set by preservation policies. These may impact on the ability to send documents externally.
  • SPO and ODfB content may be subject to an eDiscovery case.
  • Administrators can be notified when users perform specific activities in both SPO and ODfB.
  • Sharing (and access to the documents once shared) may be subject to security controls enforced through Microsoft Information Protection.

Conclusion

In summary, the settings above allow an organisation to strongly control what can be shared. If sharing is allowed, certain additional controls determine whether the sharing is for internal users or for users external to the organisation. If the latter is chosen, there are further controls on what external users can do. Audit controls and policies may also control how users can share information externally.

The key takeaway is that organisations should ensure that the sharing options available in Office 365 are based on the organisation’s business requirements and security risk framework.

Office 365 – new data governance and records retention management features

October 7, 2017

At the September 2017 Ignite conference in Orlando, Florida, Microsoft announced a range of new features coming soon to data governance in Office 365.

These new features build on the options already available in the Security and Compliance section of the Office 365 Admin portal. You can watch the video of the slide presentation here.

Both information technology and records management professionals working in organisations that have Office 365 need to work together to understand these new features and how they will be implemented.

Some of the key catch-phrases to come out of the presentation included ‘keep information in place’, ‘don’t hoard everything’, ‘no more moving everything to one bucket’, ‘three-zone policy’, and ‘defensible deletion process’. The last one is probably the most important.

How do you manage the retention of digital content?

If your organisation is like most others, you will have no effective records retention policy or process for emails or content stored across network file shares and in ‘personal’ drives.

If you have an old-style EDRM system you may have acquired a third-party product and/or tried to encourage users (with some success, perhaps) to store emails in that system, in ‘containers’ set up by records managers.

The problem with most of these traditional methods is that they assume there should be one place to store records relating to a given subject. In reality, attempts to get all related records in the one place conjure up the ‘herding cats’ problem. It’s not easy.

What is Microsoft’s take on this?

For many years now, Microsoft have adopted an alternative approach, one that is not dissimilar to the view taken by eDiscovery vendors such as Recommind. Instead of trying to force users to put records in a single location, it makes more sense to use powerful search and tagging tools to find and manage the retention of records wherever they are stored.

Office 365 already comes with powerful eDiscovery capability, allowing the organisation to search for and put on hold records relating to a given subject, or ‘case’. But it also now has very powerful records retention tools that are about to get even better.

This post extends my previous posting ‘Applying New Retention Policies to Office 365 Content‘, and won’t repeat all of it as a result.

Where do you start?

A standard starting point for the management of the retention and disposal of records is a records retention schedule. These are also known in the Australian recordkeeping context as disposal authorities, general disposal authorities, and records authorities. They may be very granular and contain hundreds of classes, or ‘big bucket’ (for example, Australian Federal government RAs).

Records retention schedules usually describe types of records (sometimes grouped by ‘function’ and ‘activity’, or by business area) and how long they must be retained before they can be disposed of, unless they must be kept for a very long time as archival records.

The classes contained in records retention schedules or similar documents become retention policies in Office 365.

Records retention in Office 365

It is really important to understand that records retention management in Office 365 covers the entire environment – Exchange (EXO), SharePoint (SPO), OneDrive for Business (OD), Office 365 Groups (O365G), Skype for Business. Coverage for Microsoft Teams and OneNote is coming soon. Yammer will not be included until at least the second half of 2018.

That is, records retention is not just about documents stored in SharePoint. It’s everything except as noted.

Records managers working in organisations that have implemented (or are implementing) Office 365 need to be on top of this, to understand this way of approaching and managing the records retention process.

Retention policies in Office 365 are set up in the Security and Compliance Admin Centre, a part of the Office 365 Admin portal. Ideally, records managers should be allocated a role to allow them to access this area.

There are two retention policy subsections:

  • Data Governance > Retention > Policy
  • Classification > Labels > Policy

The two subsections are almost identical, but their settings and purposes differ slightly. Note, however, that all retention policies that are set up are visible in both locations.

The difference between the two options is that:

  • Retention-based policies are (according to Microsoft) meant to be used by IT, mostly for ‘global’ policies. For example, a global policy for the retention of emails not subject to any other retention policy.
  • Label-based policies map to the individual classes in a retention schedule or disposal authority.

Note: Organisations that have many hundreds or even thousands of records retention classes will need to create them using PowerShell.
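Whatever script is used for bulk creation, the first step is usually to turn the retention schedule into a flat list of label definitions. The sketch below shows that step only, in Python rather than PowerShell. The column names (`LabelName`, `RetainYears`, `Action`) and the two example classes are hypothetical, made up for illustration.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names; a real import script would define its own.
FIELDS = ["LabelName", "RetainYears", "Action"]


@dataclass(frozen=True)
class ScheduleClass:
    """One class from a records retention schedule (illustrative fields only)."""
    reference: str
    title: str
    retain_years: int
    action: str = "Delete"


def label_rows(classes):
    """Map each schedule class to one label definition row."""
    return [
        {
            "LabelName": f"{c.reference} {c.title}",
            "RetainYears": c.retain_years,
            "Action": c.action,
        }
        for c in classes
    ]


def to_csv(rows):
    """Serialise the label rows as CSV text, ready for a bulk-creation script."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example classes, not taken from any real disposal authority.
schedule = [
    ScheduleClass("1.1", "Financial records", 7),
    ScheduleClass("2.3", "Staff files", 75, action="Review"),
]
print(to_csv(label_rows(schedule)))
```

Keeping the schedule in a structured form like this means the labels can be regenerated whenever the schedule changes, rather than edited one at a time.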

Creating a retention-based policy

Retention-based policies have the following options:

O365_RetentionLabelSettingsA

Directly underneath this are two options:

  • Find specific types of records based on keyword searches [COMING > also label-based]
  • Find Data Loss Prevention (DLP) sensitive information types. [COMING > label-based DLP-related policies can be auto-applied]

A decision must then be made as to where this policy will be applied – see below.

Creating a label-based policy

To create a classification label manually, click on ‘Create a label’.

O365_CreateClassLabel

Note:

  • Labels are not available until they are published.
  • Labels can be auto-applied

The screenshot below shows the options for creating a new label.

O365_ClassLabelSettingsA

Label-based policies have the following settings:

  • Retain the content for n days/months/years
  • Based on Created or Last Modified [COMING > when labelled, an event*]
  • Then three options: (a) delete it after n days/months/years (b) subject it to a disposition review process (labels only), or (c) don’t delete.

* Such as when certain actions take place on the system.
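To make these settings concrete, here is a minimal sketch of how a label's retention period and start date combine to give the date on which the chosen action falls due. The class and function names are assumptions made for the example, and the rule for month-end dates (clamping to the last day of the target month) is my own choice, not documented Office 365 behaviour.

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Basis(Enum):
    CREATED = "created"
    LAST_MODIFIED = "last modified"


class Action(Enum):
    DELETE = "delete"
    DISPOSITION_REVIEW = "disposition review"
    KEEP = "don't delete"


@dataclass(frozen=True)
class Label:
    name: str
    amount: int   # the 'n' in 'retain for n days/months/years'
    unit: str     # "days", "months" or "years"
    basis: Basis
    action: Action


def add_period(start, amount, unit):
    """Add n days, months or years to a date, clamping to the month's last day."""
    if unit == "days":
        return start + timedelta(days=amount)
    if unit not in ("months", "years"):
        raise ValueError(f"unknown unit: {unit}")
    months = amount * 12 if unit == "years" else amount
    year, month_index = divmod(start.year * 12 + (start.month - 1) + months, 12)
    month = month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def retention_ends(label, created, last_modified):
    """Date the retention period ends, counted from the label's chosen basis."""
    start = created if label.basis is Basis.CREATED else last_modified
    return add_period(start, label.amount, label.unit)


label = Label("Financial records", 7, "years", Basis.LAST_MODIFIED, Action.DISPOSITION_REVIEW)
print(retention_ends(label, date(2015, 3, 1), date(2017, 10, 7)))
```

Note how the choice of Created or Last Modified matters: in the example the same document falls due more than two years later when the period is counted from the last modification.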

 

Applying the policies

Once a policy has been created it can then be applied to the entire Office 365 environment or to only specific elements, for example EXO, SPO, OD, O365G.

  • IT may want to establish a specific global policy
  • Most other policies will be based on the organisation’s records retention schedule

Once they have been published, labels may then be applied automatically or users can have the option to apply them manually.

In EXO, a user may create a folder and apply the policy there. All emails dragged into that folder will be subject to the same policy.

In SPO, retention policies may be applied to a document library and can be applied automatically as the default setting to all new documents. [COMING > also to a folder and a document set]. Adding a label-based policy to a library also creates a new column so the user can easily see what policy the documents are subject to.

Note: Individual documents stored in the library will be subject to disposal, not the library. 

What about Content Types?

Organisations that have used content types to manage groups of records, including for retention management, will be able to continue to do so, but Microsoft appears to take the view (in the presentation above) that this method should probably be replaced by labelling. This point needs further consideration as content types are usually used as a way to apply metadata to records.

Note: If the ability to delete content (emails, documents) is enabled, any deleted content subject to a retention policy will be retained in a hidden location. The option also exists when a label-based policy is created to ‘declare’ records based on the application of a label. 

What happens when records are due for disposal?

Once the records reach the end of their retention period, one of the following will happen:

  • They are deleted
  • They are subject to a new disposition review process [COMING in 2017 – see below]
  • They remain in place (i.e., nothing happens)

In relation to the second option above, a new ‘Disposition’ section under Data Governance will allow the records manager or other authorised person to review records (tagged for Disposition Review) that have become due for disposal.

This is an important point – only records that had a label with the option ‘Disposition Review’ checked will be subject to review. All other records will be destroyed. Therefore, if the organisation needs to keep a record of what was destroyed, then the classification label must have ‘Disposition Review’ selected.

Records that are reviewed and approved to be destroyed are marked as ‘Completed’. This means there is a record of everything (subject to disposition review) that has been destroyed, a key requirement for records managers.
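The difference between the outcomes can be sketched as follows. This is a simplified model, not how Office 365 implements it: the names `DueRecord`, `sort_due_records` and `approve_disposal`, and the shape of the log entry, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DueRecord:
    name: str
    retention_ends: date
    action: str  # "delete", "review" or "keep"


def sort_due_records(records, today):
    """Group records whose retention period has ended by what happens next."""
    outcome = {"delete": [], "review": [], "keep": []}
    for record in records:
        if record.retention_ends <= today:
            outcome[record.action].append(record.name)
    return outcome


def approve_disposal(name, reviewer, log):
    """Record an approved destruction; only reviewed records leave this trail."""
    log.append({"record": name, "reviewer": reviewer, "status": "Completed"})


records = [
    DueRecord("invoice.docx", date(2017, 6, 30), "delete"),
    DueRecord("contract.pdf", date(2017, 6, 30), "review"),
    DueRecord("minutes.docx", date(2030, 1, 1), "review"),  # not yet due
]
due = sort_due_records(records, today=date(2017, 10, 7))

disposition_log = []
for name in due["review"]:
    approve_disposal(name, "records manager", disposition_log)

print(due)
print(disposition_log)
```

In this model the invoice is deleted without leaving any entry in the log, while the contract leaves a ‘Completed’ entry, which is the reason for selecting ‘Disposition Review’ on any label where a record of destruction is required.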

Other new or coming features

A number of other new features, demonstrated at the Ignite conference, are coming.

  • Labels will have a new ‘Advanced’ check box. This option will allow records marked with that label to have any of the following: watermark, header/footer, subject line suffix, colour.
  • Data Governance > Records Management Dashboard. The dashboard will provide an overview of all disposition activity.
  • Data Governance > Access Governance. This dashboard, which supports data leakage controls, will show any items that (a) appear to contain sensitive content and (b) can be accessed by ‘too many’ people.
  • Auto-suggested records retention policies. The system may identify groups of records that do not seem to be subject to a suitable retention policy and make a recommendation to create one.
  • For those parts of the world that need them, new General Data Protection Regulation (GDPR) controls.
  • Microsoft Information Protection, to replace Azure Information Protection and provide a single set of controls over all of Microsoft’s platforms.