The ability to manage records ‘in place’ in SharePoint has existed since around 2013. But this is not the same thing as leaving records where they were created or captured and managing them there – ‘in place’.
This post explains the difference between the two ‘in place’ options. In brief:
The Microsoft ‘in place’ model is based on making the distinction between non-records content and content declared as records (as per DOD 5015.2), that may be stored in the same SharePoint site, or using Exchange in-place options.
The other ‘in place’ model is simply based on leaving records and other content where they were created or captured, and managing it there – including (where necessary) by applying the ‘in place’ options in the previous point.
The Microsoft in-place model
The Microsoft in-place model for managing records in SharePoint is based on the requirement to comply with the US Department of Defense (DOD) standard titled ‘Design Criteria Standard for Electronic Records Management Software Applications’, usually known by its authority number – DOD Directive 5015.2, Department of Defense Records Management Program, originally published on 11 April 1997.
Section C2.2.3 ‘Declaring and Filing Records’ of the standard defines 26 specific requirements for declaring and filing records, including the following points:
The capability to associate the attributes of one or more record folder(s) to a record, or for categories to be managed at the record level, and to provide the capability to associate a record category to a record
Mandatory record metadata.
Restrictions on who can create, edit, and delete record metadata components, and their associated selection lists.
Unique computer-generated record identifiers for each record, regardless of where that record is stored.
The capability to create, view, save, and print the complete record metadata, or user-specified portions thereof, in user-selectable order.
The ability to prevent subsequent changes to electronic records stored in its supported repositories and preserving the content of the record, once filed
Not permitting modification of certain metadata fields.
The capability to support multiple renditions of a record.
The capability to increment versions of records when filing.
Linking the record metadata to the record so that it can be accessed for display, export.
Enforcement of data integrity, referential integrity, and relational integrity.
Microsoft’s initial guidance on configuring in place records management describes how to activate and apply this functionality primarily in SharePoint on-premise. It is still possible to apply this in SharePoint Online (but see below). The SharePoint in place model refers to a mixed content approach where both records and non-records can be managed in the same location (an EDMS with RM capability):
Managing records ‘in place’ also enables these records to be part of a collaborative workspace, living alongside other documents you are working on.
The same link above, however, also refers to newer capability that was introduced with the Microsoft 365 Records Management solution in the Compliance admin portal. This new capability allows organisations to use retention labels instead to declare content as records when the label is applied, which ‘effectively replaces the need to use the Records Center or in-place records management features.’
The guidance also noted that, ‘… moving forward, for the purpose of records management, we recommend using the Compliance Center solution instead of the Records Center.’
A form of in-place management has also been available for Exchange on-premise mailboxes, with in place archiving based on using archive mailboxes – see the Microsoft guidance ‘In-Place Archiving in Exchange Server’.
One drawback of this model is that the (email) records in these mailboxes were not covered by the same DOD 5015.2 rigor as those in SharePoint, but they could at least be isolated and protected against modification or deletion, for retention, control and compliance purposes.
Microsoft Exchange Online Archiving is a Microsoft 365 cloud-based, enterprise-class archiving solution for organizations that have deployed Microsoft Exchange Server 2019, Microsoft Exchange Server 2016, Microsoft Exchange Server 2013, Microsoft Exchange Server 2010 (SP2 and later), or subscribe to certain Exchange Online or Microsoft 365 plans. Exchange Online Archiving assists these organizations with their archiving, compliance, regulatory, and eDiscovery challenges while simplifying on-premises infrastructure, and thereby reducing costs and easing IT burdens.
The new ‘in place’ model
A newer form of in-place records management has appeared with Microsoft 365.
Essentially, the new model simply means leaving records where they were created or captured – in Exchange mailboxes, SharePoint sites, OneDrive or Teams (and so on) – and applying additional controls where they are required.
The newer model of in place records management is based on the assumptions that:
It will never be possible to accurately or consistently identify and/or declare or manage every record that might exist across the Microsoft 365 ecosystem. For example, it is not possible to declare Teams chats or Yammer messages.
Only some records, mostly high value or permanent ones, will be subject to specific additional controls, including records declaration and label-based retention.
The authenticity, integrity and reliability of some records may be based more on system information (event metadata) about their history than on locking a point-in-time version.
Microsoft appear to support this dual in place model with their information governance (broader controls) and records management (specific controls, including declaration) approach to the management of content and records across Microsoft 365, as described in the Microsoft guidance ‘Information Governance in Microsoft 365‘, which includes the graphic below, modified to show the relationship between the two in place concepts.
In place co-existence
Can both in place models exist? Yes. There is nothing to prevent both in place models existing in the same environment, in which some records may need to be better managed or controlled than others, but it is important to understand the distinction between the two when it comes to applying controls.
Image: Quarantine Building, Portsea, Victoria Australia. Andrew Warland 2021
Microsoft 365 has become one of the world’s most accessed products for office collaboration. Jeff Teper, the ‘father of SharePoint’ at Microsoft, tweeted on 27 April 2021 that Teams had 145 million daily active users (as reported by the team at Office365ITPros.com). According to the website ‘Microsoft Office Statistics and Facts (2021) | By the Numbers’, Microsoft Teams usage grew 40% during the COVID lockdowns.
Although organisations create, capture and store a range of records using the various Microsoft 365 applications, most records are likely to be created or captured in Exchange mailboxes or SharePoint (including OneDrive).
The volume and range of records has, in many respects, overwhelmed traditional standards-based models that required records (including emails) to be copied to a central electronic document and records management system (EDRMS) or identified and then ‘declared’ as records.
Given the reality of the new paradigm, organisations have tried various ways to manage records in Microsoft 365, including by retaining the EDRMS (for high value records), acquiring third-party products that promise to address the ‘gap’ in recordkeeping functionality, or working with the ‘out of the box’ capability.
Whichever method is chosen, records managers need to have a very good understanding of where the records are in Microsoft 365 and how they can be managed. In many cases, leaving and managing them where they were created or captured (‘in place’ management), and using new and emerging capability to leverage the power of AI-based tools, is likely to be the future state.
With the above in mind, and regardless of which method you follow, the following are ten points that I think are important to consider when managing records in Microsoft 365.
What are your recordkeeping obligations?
It is perhaps the most obvious question but organisations have sometimes rolled out Microsoft 365 without consideration of their obligations for managing records.
Records provide evidence of business activities and accordingly need to be protected to ensure their authenticity, integrity and reliability as evidence. The most common way of achieving this outcome until now has been to ‘lock’ digital records from change. Is this practical in the digital world? How do you lock Teams chats or stop a new thread in an email exchange? Could the same outcome be achieved by recording all changes and ensuring these are retained?
Records are usually subject to minimum retention requirements and understanding what these are is essential. Where there are minimum retention requirements, records cannot be destroyed before a specific period of time based on a particular trigger or event. These requirements are defined in legislation (sometimes based on statutes of limitation) or, for government agencies, in records disposal authorities or schedules (as shown in the example above).
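As a rough illustration of how a minimum retention period works, the Python sketch below calculates the earliest date on which a record could be destroyed, given a trigger date and a retention period in years. The function names and the seven-year period used in the example are assumptions for illustration only; the actual triggers and periods come from the relevant legislation or disposal authority.

```python
from datetime import date


def earliest_disposal_date(trigger: date, retention_years: int) -> date:
    """Return the first date on which a record may be destroyed.

    trigger is the event date that starts the retention clock
    (for example, the date a contract expired).
    """
    try:
        return trigger.replace(year=trigger.year + retention_years)
    except ValueError:
        # The trigger was 29 February and the target year is not a leap
        # year. Use 1 March, erring on the side of keeping the record
        # slightly longer rather than destroying it early.
        return date(trigger.year + retention_years, 3, 1)


def can_destroy(trigger: date, retention_years: int, today: date) -> bool:
    """True only when the minimum retention period has fully elapsed."""
    return today >= earliest_disposal_date(trigger, retention_years)


# Hypothetical example: a record with a seven-year minimum retention period.
print(earliest_disposal_date(date(2021, 6, 30), 7))
```

Note that this only establishes when destruction becomes permissible. Whether a review is required before destruction is a separate question.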
As a starting point, look at these retention requirements and consider how these will be applied to Teams 1:1 chats, or team chats, or emails still in Exchange mailboxes, or OneDrive content. And then extend this to the content stored in Teams/Group-based SharePoint sites and sites not linked with Microsoft 365 Groups.
Consider also what is required to manage the outcome of retention. Do you need to review records due for destruction? Do you need to keep a record of what was destroyed? For all records?
There may be a requirement to categorise or classify records (especially to group them by context and/or for retention purposes where retention is based on classification). How will this outcome be achieved for Teams chats or emails that remain in Exchange mailboxes, or OneDrive content? What other metadata do you need for records?
Your recordkeeping obligations, in particular records retention requirements, should guide the management of records in Microsoft 365.
Where are the records created or captured in Microsoft 365?
Neither Microsoft 365 nor SharePoint is a dedicated recordkeeping system like an EDRMS (see my post ‘SharePoint is not an EDRMS’). Rather, it is an ecosystem of multiple applications that are used to create, capture, store and manage records.
Most records are likely to be stored in either SharePoint (OneDrive is a SharePoint service) or Exchange mailboxes.
A compliance copy of Teams chats is stored in a hidden folder of Exchange mailboxes. Content stored in the ‘Files’ tab is either stored in SharePoint or (for 1:1 chat) in OneDrive.
Of course, there will be some other records – Yammer conversations, tasks and plans, communication sites, calendar entries, forms and even WhiteBoard sessions. But, more than 95% are stored in Exchange mailboxes or SharePoint/OneDrive.
Knowing your recordkeeping obligations and the location of records are the main starting points. In fact, you can map your recordkeeping obligations (especially metadata and retention) to the location of the records.
Do you control the creation of Teams and SharePoint sites or not?
There are, broadly speaking, two approaches to controlling the creation of Teams and SharePoint sites:
Yes, we have controls – There is some kind of control or decision ‘gate’ for the creation of Teams and SharePoint sites.
No, we don’t have controls – End-users can create Teams and SharePoint sites whenever they want. In this case, the points below may not be of much use. You will likely rely on the built-in ‘records management’ capability to manage the records.
If your answer to the question above was ‘No’, then you will probably need third-party products and/or rely heavily on AI-based solutions to manage the records (which is the default Microsoft position).
Map sites and Teams to business functions – don’t mix content
Almost every organisation has a range of business functions. Some of these are common to all or most organisations (e.g., information technology, human resources, financial management, legal, property, etc) while others are ‘core’ (e.g., engineering, manufacturing, research, sales and marketing, etc).
Many organisations are structured around these business functions, and most records retention schedules are based on function (or business function).
If you can map new Teams and SharePoint sites to these functions, this will facilitate the management of records down the track. Mixing content across multiple functions – except where it makes sense to do this, such as where there are related but smaller numbers of records in project sites – is going to make it harder to manage the records in the longer term, and is more or less the same as letting end-users put whatever they want into a paper box for long-term storage.
A common example where records might be mixed are ‘Corporate Services’ areas that create or capture records across multiple functions, including property, IT, finance and so on. Unless all the records in a Team-based site can be kept for the same period of time, it may be a good idea to separate the records into different sites.
Also keep in mind that some business areas may exist for a long time; having a single (Teams-based) document library that has folders linked to channels may not be the best way to manage records long-term.
The suggestions above don’t take into account Exchange mailboxes, Teams chats or OneDrive accounts because these cannot be mapped to functions.
Naming conventions for sites and teams, and libraries
The main reasons for having naming conventions for SharePoint sites and Teams are:
It is good practice, to avoid acronyms and other less than useful names.
To prevent unnecessarily long names that end up creating very long URLs (e.g., ‘https://tenantname.sharepoint.com/sites/ExecutiveCommittee20202021MeetingsHeldDuringLockdownandrecordedviaMSTeamsSeniorManagersOnly‘.) It is important to know the difference between the URL name and the display name.
Ideally, the original names of Teams and SharePoint sites should be restricted to no more than 14 characters so that Document IDs (that have a 12 character prefix) can be the same as, or very close to, the URL name of the site. For example:
Aside from the default ‘Documents’ library of every Teams-based site, library names should also be subject to naming conventions and restricted to around 20 characters. There are several reasons for this.
The first is how they appear in the left hand navigation of a SharePoint site. There isn’t much point having multiple library names that aren’t easily visible (the two examples below have completely different names after ‘Financial Management’).
The third reason is that it is good practice to have some form of naming conventions.
Ideally, library names should map to the activities that produce the records AND include the year where this is relevant, e.g., ‘Meetings2021’.
There is NO need to repeat words in the tenant or site name – e.g., https://tenantname.sharepoint.com/sites/TenantNameCorporateRecords/TenantNameFinancial%20Management%20%20Accounting%20%20Invoices%202021/Forms/AllItems.aspx
As noted above, this doesn’t apply to the default ‘Documents’ library in Teams-based SharePoint sites (the actual name is ‘Shared%20Documents’).
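The naming suggestions in this section could be checked before a site or library is created. The Python sketch below is one possible way to do that. The function names are hypothetical, and the limits (14 characters for site URL names, 20 for library names) are simply the figures suggested above, not limits enforced by SharePoint.

```python
MAX_SITE_NAME = 14      # suggested limit for site URL names (from the text)
MAX_LIBRARY_NAME = 20   # suggested limit for library names (from the text)


def check_site_name(url_name: str) -> list[str]:
    """Return a list of problems with a proposed site URL name."""
    problems = []
    if len(url_name) > MAX_SITE_NAME:
        problems.append(
            f"site name is {len(url_name)} characters; limit is {MAX_SITE_NAME}"
        )
    return problems


def check_library_name(library: str, tenant: str, site: str) -> list[str]:
    """Return a list of problems with a proposed library name."""
    problems = []
    if len(library) > MAX_LIBRARY_NAME:
        problems.append(
            f"library name is {len(library)} characters; limit is {MAX_LIBRARY_NAME}"
        )
    # Flag library names that repeat the tenant or site name.
    for word in (tenant, site):
        if word and word.lower() in library.lower():
            problems.append(f"library name repeats '{word}'")
    return problems


# Hypothetical example: an activity-plus-year name raises no problems.
print(check_library_name("Meetings2021", "TenantName", "CorporateRecords"))
```

A check like this would sit naturally in whatever control ‘gate’ is used to approve new Teams and sites.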
Metadata and Content Types
For many organisations, the minimum metadata requirements consist of (a) agent (e.g., the person who did something), (b) dates (when they did something) and (c) a unique identifier. That is, who did what and when?
If you need to add more metadata for certain types of records you can really only do this in SharePoint document libraries, including by adding them from the SharePoint Term Store (see below example). It can also be done in Outlook but these metadata terms are not linked with the SharePoint terms.
As for Content Types, do you really need them? SharePoint is made up of multiple Content Types already, including the default ‘Document’ Content Type. It is important to understand how Content Types work in SharePoint before making the assumption that they are required.
In many cases, choice metadata fields can replace the need for Content Types. Custom Content Types may only be needed for specific or high value record types.
Document retention policies and labels
In the first section about recordkeeping obligations, it was noted that most records will be subject to minimum retention requirements. Retention labels and policies are created in the Compliance admin portal of Microsoft 365.
Unfortunately, the current Compliance admin portal provides very little information to show what label or policy was applied where. The only way to do this is to document it yourself. One way to do this is to create a spreadsheet that lists on each row:
The business function and activity from a File Plan or Business Classification Scheme (e.g., Financial Management – Accounting)
Each retention class for that function/activity pair, including the reference number
If that class has been created as a label, what the label name is. If it has been created as a non-label retention policy, what that retention policy name is. (Generally speaking, disposal authority classes don’t refer to Exchange mailboxes, SharePoint sites, MS Teams chats or OneDrive content, so the organisation may need to determine what this minimum retention period should be and how it will manage the retention outcomes).
(Note, the above can be created in the File Plan section of the Records Management part of the Compliance admin portal, E5 licences only. However, it only documents the above information and does not show where the label has been applied.)
Where the label has been applied ‘manually’ – to which SharePoint site/document library, Exchange mailbox or OneDrive account. This point may have multiple location references.
Where the label has been auto-applied through the basic E3 option or, for E5 licences, the document understanding model (DUM) in SharePoint Syntex, or via trainable classifiers.
When the retention will expire.
Retention outcome – whether a disposition review (E5 only) applies, the records will be destroyed automatically without any record kept, or ‘do nothing’. See also below.
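For anyone who prefers to generate this register rather than type it by hand, the columns listed above could be sketched in Python as follows. This is a minimal illustration only: the class name, field names and sample values are all hypothetical, and nothing here reads from or writes to Microsoft 365.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class RetentionRegisterRow:
    function_activity: str     # e.g. "Financial Management - Accounting"
    retention_class: str       # class reference number from the disposal authority
    label_or_policy_name: str  # retention label name, or non-label policy name
    applied_manually_to: str   # sites, libraries, mailboxes or OneDrive accounts
    auto_applied_via: str      # e.g. basic E3 option, Syntex model, trainable classifier
    retention_expires: str     # when the retention will expire
    retention_outcome: str     # disposition review, automatic destruction or do nothing


def write_register(rows, stream) -> None:
    """Write the register rows to an open text stream as CSV."""
    names = [f.name for f in fields(RetentionRegisterRow)]
    writer = csv.DictWriter(stream, fieldnames=names)
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


# Hypothetical sample row, written to an in-memory buffer.
buffer = io.StringIO()
write_register(
    [
        RetentionRegisterRow(
            "Financial Management - Accounting",
            "1234",
            "FIN-ACC-7Y",
            "sites/Finance",
            "",
            "2028-06-30",
            "disposition review",
        )
    ],
    buffer,
)
print(buffer.getvalue())
```

To produce a file that opens in a spreadsheet application, pass a file opened with `open("register.csv", "w", newline="")` instead of the in-memory buffer.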
Remember that retention labels and policies apply to individual items (emails, Teams chat, SharePoint or OneDrive content stored in libraries), not to aggregations (e.g., the entire library or site). The aggregation will continue to exist after the content has been destroyed and no ‘stub’ (a record of what used to exist) will remain.
How will you manage retention outcomes?
Generally speaking, Microsoft 365 retention policies destroy records when they are due for destruction unless they are subject to a label that has the disposition review option enabled or the ‘do nothing’ option has been selected.
Organisations need to understand how they will manage these retention outcomes especially as, in most cases, a review process is required. (See ‘Recordkeeping obligations’ above).
Even when retention labels have the disposition review option enabled (E5 only), there are two points that need to be understood:
The ‘disposition review’ interface presents individual records with no context except for the original site URL name. Some additional (default) metadata is now included (from May 2021) but not any added metadata. In most cases, it will be necessary to return to the original library to view the context of the records presented for disposal, and if there are any others.
If records are destroyed through that review process, only basic metadata is retained about what was destroyed.
Organisations that have an obligation to undertake a full review of records due for disposal will likely need to consider establishing workarounds such as exporting the full set of metadata from a document library and then using that to review whether the content of the library can be destroyed. If approval is granted, that decision should be recorded, along with the metadata extract.
Allow end-users to get on with their work
End-users generally don’t have much interest in the management of records beyond the period of time they are important to them. They want to do whatever they want, whenever they want, using the applications they have available to them.
Collaboration no longer consists of email exchanges and document-based records. Creating control gates for the creation of Teams and sites, and insisting on naming conventions for sites and libraries (and folders) may be interpreted negatively.
There needs to be a fine balance between control and freedom and this can impact the creation, capture and management of records. Some of the ways to minimise the impact of recordkeeping requirements include:
Enabling Document IDs on every site.
Creating custom metadata columns on sites or libraries with default values.
Applying non-label ‘safety net’ retention policies to all workloads. Retention policies (along with the Recycle Bin for 90 days) help with the recovery of accidentally deleted content.
Using various communication methods to highlight useful features including sharing (instead of attaching), the Recycle Bin, versioning in SharePoint/OneDrive, and the ability to have a ‘single source of truth’. These features can be used to ‘soften’ the impact of other recordkeeping obligations in some sites.
Pro-actively monitoring activity across the Microsoft 365 ecosystem, including by monitoring the various dashboards, searches, and audit logs, and responding to events.
Learning more about the Microsoft Graph Explorer and the potential to use AI-based options to manage records.
Use the system for other recordkeeping purposes
The Microsoft 365 environment can be used for other recordkeeping purposes as well. For example:
Managing physical records stored offsite in a SharePoint list.
Keeping a record of records (and SharePoint sites or other systems) that have been destroyed, as well as ongoing destruction review and approval processes.
Publishing policies and procedures (in a SharePoint site, not necessarily a communication site).
Communicating information about managing records (communication site).
Archiving social media content (to Exchange mailboxes).
Searching for content stored in other locations or systems including File Explorer and Line of Business systems (via connectors).
Archiving network file share content, where it can be better protected and then subject to retention and disposal outcomes.
Understanding where records are stored (via dashboards and Power BI reporting).
Photographs (or, for that matter, any visual work) can be more powerful as a record of something than a written record. Some of the iconic events in history are only known from the visual record.
Photos (and videos) are ubiquitous in the digital age. They are captured in multiple ways and stored on a range of media including computers, portable drives, mobile devices and the cloud (including online storage and social media).
Digital photos can be easily changed or ‘photoshopped’, to use the more common term.
But records management requires us to ensure the authenticity, reliability and integrity of records. How can we ensure this for photographic records?
This post examines how SharePoint can be used to manage (some) photos as records.
What is a digital photo?
For the purpose of this post, a digital photo is any image that is captured and stored electronically as bits in a recognised image format. It excludes digital photos (including scanned or digitised paper records) that are embedded in PDF files.
To quote from the Wikipedia article titled ‘Digital Image‘, a digital photo is:
‘… an image composed of picture elements, also known as pixels, each with finite, discrete quantities of numeric representation for its intensity or gray level that is an output from its two-dimensional functions fed as input by its spatial coordinates denoted with x, y on the x-axis and y-axis, respectively. Depending on whether the image resolution is fixed, it may be of vector or raster type. By itself, the term ‘digital image’ usually refers to raster images or bitmapped images (as opposed to vector images).’
According to the definition, a digital photo is a collection of bits set out in a map that a computer interprets to display it on a screen. For more detail on bitmaps, see this Microsoft page ‘Bitmaps‘.
When to use SharePoint to store digital photos
Almost any digital object that can be stored electronically can be captured in SharePoint. But, just because they can doesn’t mean they should.
It is important to keep in mind that SharePoint may not be the most suitable option for storing digital photographs in general. The volume, size and intended usage of the photos are all likely to influence the decision to use SharePoint. Organisations may need to consider other ways to store and manage digital photos (and other digital media) such as dedicated digital asset management systems.
However, SharePoint may be a good option to store digital photos as records if those photos:
Need to be kept as a record of something.
Support or relate to other content in the same document library. For example, photos that relate to or support building plans (which themselves may be scanned images in PDF form) or construction.
Are about a specific subject. For example, photos of damage to a physical object.
Are relatively low in volume and file size.
Factors to consider before storing digital photos in SharePoint
The following points should be considered before storing any digital photo (or digital media) in SharePoint.
Digital photos are usually stored on devices that capture them with meaningless, device-generated names such as ‘20210423_123192321’, ‘DSC_0330’, or ‘n594015825_1706121_4959’. These types of names provide no clue to the content when the image is saved to SharePoint, as shown below.
Ideally, the device-generated name should be replaced with a more meaningful name as shown below.
We can see from the unique Document ID that it is the same record. The version history also tells us that something was changed but the file size hasn’t changed.
Keep in mind that every change to the file name or any other metadata added to the library will create a new copy (‘version’) of the image. In the version history above, there are now four versions each 3 MB in size.
The Activity section of the information panel to the right (which also provides a preview of the image) also provides some key information to support the version history information:
Does changing the name potentially compromise a key element of metadata? Who should change the name, and when?
Organisations should consider establishing naming conventions for photographs to help with findability and context.
Almost any type of digital object can be stored in SharePoint. SharePoint also supports the ability to view almost every type of contemporary digital photo format (as described in this Microsoft page) so format should not be a problem when it comes to viewing the file.
However, if the organisation plans to store digital photos in older or less common formats, it will need to ensure that these can be accessed for as long as required. This will generally mean ensuring that the appropriate software application is available to anyone who needs to access (open/view) the photos.
Alternatively, consideration should be given to saving the photos in different, more common formats.
The size of a digital photograph may affect its usability as a record. The two photos below are of the same scene but the first photo is very low resolution (= small size, 63 KB), while the second is a section of a much larger photo (= large size, 3.7 MB). The second version is clearly a more accurate record.
Digital photos used as records should provide sufficient detail to be useful as records. Accordingly, in most instances, the original photo, not a re-sized version, should be saved.
Reliability as a record
As noted above, photos can be easily modified, sometimes making it difficult to know if they accurately reflect the scene they purport to capture. This is also an issue for any form of visual record, including drawings.
Metadata created when the photo was taken and stored in the photo’s properties (including EXIF metadata) can provide evidence of the reliability of the photo as a record, even when the record was stored in SharePoint. The following are the key metadata properties that are usually created and stored with a digital photo:
(Date) Created. The date and time when the photograph was created as a photo.
Date taken. This is, for most digital photos, the same date and time as the ‘created’ date.
(Date) Modified. This should be very close to the original date and time created on the original photograph, but may be several seconds different. If the photograph has been modified (including simple photo adjustments or ‘photoshopped’), it will show a different date and time which might give a clue to its reliability as a record.
Image dimensions, width, height, horizontal and vertical resolution, bit depth
Camera details including F-stop, exposure time, ISO speed, flash mode
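The comparison of the created and modified dates described above can be expressed as a simple check. The Python sketch below assumes the two timestamps have already been read from the photo’s properties; the function name and the ten-second tolerance are assumptions for illustration (the text only says the two values may differ by several seconds).

```python
from datetime import datetime, timedelta


def timestamps_differ(
    created: datetime,
    modified: datetime,
    tolerance: timedelta = timedelta(seconds=10),
) -> bool:
    """Return True when created and modified differ by more than the tolerance.

    A True result is only a clue that the photo may have been changed
    after capture. It is not proof of tampering.
    """
    return abs(modified - created) > tolerance


# Hypothetical example: a photo modified two days after it was taken.
taken = datetime(2021, 4, 23, 12, 31, 9)
print(timestamps_differ(taken, taken + timedelta(days=2)))
```

A photo flagged in this way is not necessarily unreliable, as a legitimate adjustment will also change the modified date, but it may warrant a closer look before being accepted as a record.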
Storing digital photos in SharePoint Online
Note that SharePoint previously had the option to create a dedicated ‘Picture Library’. This is no longer necessary as any document library can be used to store any digital object, and SharePoint Online document libraries now have additional options to view the photos.
Any digital photo can now be saved to a standard SharePoint document library, including via the ‘Files’ tab in MS Teams.
Unlike some EDRM systems, SharePoint does not ‘capture’ or extract the details of digital objects when they are saved to a document library. Instead, these remain with the original digital object.
If digital photos are to be saved to a SharePoint Online document library, consideration should be given to using existing or additional metadata columns to describe the image, for example, the ‘Image’ column (small version preview that is stored separately in the Site Assets library >’Lists’ folder > GUID-named folder).
The following screenshot shows a preview image of the main photo.
SharePoint includes three options to view the content in a document library list view: List, Compact List, or Tiles.
The following is an example of a single digital photo stored in a document library (same as the one in the other options earlier) set to the ‘Tiles’ view:
Reliability as a record
Once saved to SharePoint, the photo’s reliability as a record can be determined from version history.
Additional protections may include access/permission controls, retention and/or information security labels.
SharePoint can be used to store (some) digital photos as records, but it should not be used as a general storage location for digital photos. Dedicated digital asset management systems may be more suitable for that purpose.
Before storing digital photos in SharePoint, organisations should establish procedural rules or principles including (re-) naming conventions, format and file size requirements.
SharePoint does not extract and store the metadata from digital objects, which means that digital photos should retain metadata that shows when they were created or modified, and other details about the photograph which will provide evidence to support the authenticity of the photo as a record. Some consideration should be given to adding additional metadata to help describe digital photos as records.
A combination of version history, access controls, information protection and retention labels should provide sufficient controls to ensure the reliability, integrity and authenticity of records – or at the very least provide the evidence of changes that may be made.
There are two sides to the question of authenticity, reliability and integrity:
Knowing if the photo that has been uploaded is a correct record of what it purports to be.
Preventing the photo from modification or deletion or tracking any modifications that may occur.
It might seem impossible to know if a photograph is what it purports to be, but its metadata payload may provide the detail required.
There are three main options in Microsoft 365 to apply recordkeeping classification terms to (some) records:
Metadata columns added to SharePoint sites, including those added to Content Types and/or added directly to document libraries.
Taxonomy terms stored in the central Term Store, including those added as site columns and added to site content types and/or added directly to document libraries. The only difference with the first option is that with the Term Store the classification terms are stored and managed centrally and are therefore available to every SharePoint site.
Retention labels that: (a) ‘map’ to classification terms; (b) are linked with a File Plan that includes the classification terms; (c) are either the same as (a) or (b) and are used with a Document Understanding Model in SharePoint Syntex; or (d) are the same as (a) or (b) and are used in conjunction with Trainable Classifiers.
The first two options can only be applied to content stored in SharePoint. Retention labels may be applied to emails and OneDrive content. None of the three options can be applied to Teams chats. Also note that there is no connection between the SharePoint Term Store and the File Plan, both of which can be used to store classification terms.
Defines the meaning of classification from a recordkeeping point of view.
Describes each of the above options and their limits.
Discusses the requirement to classify records and other options in Microsoft 365.
What is classification?
Humans are natural-born classifiers. We see it in the way we store cutlery or linen, or other household items or personal records.
Business records also need some form of classification. But what does that mean? The 2002 version of the records management standard ISO 15489 defines classification as:
‘the systematic identification and arrangement of business activities and/or records into categories according to logically structured conventions, methods and procedural rules represented in a classification system’. (ISO 15489.1 2017 clause 3.5).
The standard also states (4.2.1) that a classification scheme based on business activities, along with a records disposition authority and a security and access classification scheme, were the principal instruments used in records management operations.
The classification of records in business is important to establish their context and to help find them.
Microsoft 365 includes various options to apply classification terms to records.
Metadata columns in SharePoint
The simplest way to classify records stored in SharePoint document libraries is to either create site columns containing the classification terms and add those columns to document libraries, or create them directly in those libraries.
Adding site or library columns is relatively simple. As classification terms are usually in the form of a (hierarchical) list, it is simple to add one choice or lookup column for function and another for activities.
A lookup column can bring across a value from another column when an item is selected; for example, if the look up list places ‘Accounting’ (Activity) in the same list row as ‘Financial Management’, selection of ‘Accounting’ will bring across ‘Financial Management’ as a separate (linked) column.
Default values (or even one value) can be set meaning that records added to a library (that only contains records with those classification terms) can be assigned the same classification terms each time without user intervention.
SharePoint choice or lookup columns do not allow for hierarchical views or values to be displayed from the list view so the context for the classification terms may not be obvious unless both function and activity are listed.
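The lookup behaviour described above, where selecting an activity brings across its function, can be sketched in plain Python. This illustrates the idea only and is not SharePoint code: the `LOOKUP_LIST` rows, the `classify` helper and the ‘Budgeting’ row are hypothetical.

```python
# Hypothetical lookup list: each row pairs an activity with its parent function.
LOOKUP_LIST = [
    {"activity": "Accounting", "function": "Financial Management"},
    {"activity": "Budgeting", "function": "Financial Management"},
]


def classify(activity: str) -> dict:
    """Return the selected activity plus the function found in the same row."""
    for row in LOOKUP_LIST:
        if row["activity"] == activity:
            return {"Activity": row["activity"], "Function": row["function"]}
    raise KeyError(f"Unknown activity: {activity}")


# Selecting 'Accounting' brings across 'Financial Management' as a linked value.
record_columns = classify("Accounting")
print(record_columns)
```

Showing both values side by side is what gives the classification its context, since a choice or lookup column on its own does not display the hierarchy.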
The Term Store
The Term Store, also known as the Managed Metadata Service (MMS), has existed for at least a decade as an option to create and centrally manage classification and taxonomy terms, in SharePoint only.
Organisations can create multiple sets of taxonomies or ‘term groups’ (e.g., ‘BCS’ or ‘People’) within the Term Store. Each Term Group consists of the following:
Term Sets. These generally could map to a business function. Each Term Set has a name and description, and four tabs with the following information: (a) General: Owner, Stakeholders, Contact, Unique ID (GUID); (b) Usage settings: Submission policy, Available for tagging, Sort order; (c) Navigation: Use term set for site navigation or faceted navigation – both disabled by default; (d) Advanced: Translation options, custom properties.
Terms. These generally could map to an activity. Each Term has a name and three tabs: (a) General: Language, translation, synonyms and description; (b) Usage settings: Available for tagging, Member of (Term Set), Unique ID (GUID); (c) Advanced: Shared custom properties, Local custom properties.
In the example below, the Term Set (function) of ‘Community Relations’ has three Terms (activities).
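The Term Group, Term Set and Term hierarchy described above can be modelled as a small data structure. This is a minimal sketch, not the Term Store API: the class names, the `path_for` helper and the three term names are hypothetical, and only a few of the properties listed above are included.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


def new_guid() -> str:
    """Stand-in for the Unique ID (GUID) that the Term Store assigns."""
    return str(uuid.uuid4())


@dataclass
class Term:
    name: str  # generally maps to an activity
    description: str = ""
    available_for_tagging: bool = True
    unique_id: str = field(default_factory=new_guid)


@dataclass
class TermSet:
    name: str  # generally maps to a business function
    description: str = ""
    owner: str = ""
    terms: List[Term] = field(default_factory=list)
    unique_id: str = field(default_factory=new_guid)


@dataclass
class TermGroup:
    name: str
    term_sets: List[TermSet] = field(default_factory=list)

    def path_for(self, term_name: str) -> str:
        """Return 'Group > Term Set > Term' for the first matching term."""
        for term_set in self.term_sets:
            for term in term_set.terms:
                if term.name == term_name:
                    return f"{self.name} > {term_set.name} > {term.name}"
        raise KeyError(f"No term named {term_name!r}")


# The three term names are hypothetical placeholders for the activities.
bcs = TermGroup(
    name="BCS",
    term_sets=[
        TermSet(
            name="Community Relations",
            terms=[Term("Events"), Term("Media Liaison"), Term("Sponsorship")],
        )
    ],
)
print(bcs.path_for("Events"))
```

A helper such as `path_for` shows the kind of lookup needed to present a term in its function/activity context.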
Once they have been created in the Term Store, term sets or terms can be added to a SharePoint site, either as a new site or local library/list column, as shown in the two screenshots below:
Once added as a site column, the new column may be added to a Content Type that is added to a library, or directly on the library or list.
The primary benefit of using the central Term Store terms via a Managed Metadata column is that the Term Store is the ‘master’ classification scheme providing consistency in classification terms for all SharePoint sites.
As we will see below, Term Store terms may be used to help with the application of retention labels (which themselves may ‘map’ to classification terms in a function/activity-based retention schedule).
Using metadata terms from the Term Store is almost identical to using a choice or lookup column. The only real difference is that the Term Store provides a ‘master’ and consistent list of classification (and other) terms.
Term store classification terms, including in Content Types, may only be used on a minority of SharePoint sites.
It is not possible to select a Term Set (e.g., the function level), only a Term within a Term Set.
Only the selected classification Term appears in the library metadata, without the parent Term Set or visual hierarchy reference to that Term Set – see screenshot below. Technically only that Term is searchable. It is not possible to view a global listing of all records classified according to function and activity.
If multiple choices are allowed, a record may be classified according to more than one Term. This may cause issues with grouping, sorting or filtering the content of a library in views.
As we will see below, there is no connection between the classification Terms in Term Sets and the categorisation options available when creating new retention labels via a File Plan. ‘Business Function’ or ‘Category’ choices in the File Plan do not connect with the Term Store.
Term Store terms and Content Types can only be used to classify content stored in SharePoint.
Retention labels in Microsoft 365 can be used in an indirect way to classify records in SharePoint, email and OneDrive because they can be ‘mapped’ to classification elements.
For example, a label may be based on the following elements:
Function: Financial Management
Description: Accounting records
Retention: 7 years
Every retention label contains the following options:
Name. The name can provide simple details of the classification, for example: ‘Financial Management Accounting – 7 years’.
Description for users. This can be the full wording of the retention class.
Description for admins. This can contain details of how to apply or interpret the class, if required.
Retention settings (e.g., 7 years after date created/modified or label applied).
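The way a label’s options can carry the classification elements may be sketched as follows. This is an illustration under stated assumptions, not the Microsoft 365 API: `RetentionLabel`, `build_label` and the field names are hypothetical, and the naming pattern simply follows the example name given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionLabel:
    name: str
    description_for_users: str
    description_for_admins: str
    retention_years: int
    retention_trigger: str  # e.g. "created", "modified" or "label applied"


def build_label(function: str, activity: str, retention_class: str,
                years: int, trigger: str = "created") -> RetentionLabel:
    """Fold the function/activity pair and retention period into the label."""
    return RetentionLabel(
        name=f"{function} {activity} – {years} years",
        description_for_users=retention_class,
        description_for_admins=f"Maps to {function} / {activity}.",
        retention_years=years,
        retention_trigger=trigger,
    )


label = build_label("Financial Management", "Accounting", "Accounting records", 7)
print(label.name)
```

The classification lives only in the label’s name and descriptions here, which mirrors the indirect nature of this option.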
Where the classification terms map to a retention class, the process of applying a retention label to an individual record, email or OneDrive content could potentially be seen as classifying those records against the classification scheme.
The Data Classification section in the Microsoft 365 Compliance portal provides an overview of the volume of records in SharePoint, OneDrive or Exchange that have a specific retention class:
Not every record in every SharePoint document library may be subject to a retention label. Many records (for example in Teams-based SharePoint sites) may be subject to a ‘back end’ retention policy applied to the entire site (which creates a Preservation Hold library).
A retention label applied to a record doesn’t actually add any classification terms to the record.
Retention labels don’t map in any way to Term Store classification terms, except in SharePoint Syntex – see below (but this only applies to SharePoint content).
Retention labels/File Plan combination
The File Plan option (Records Management > File Plan, requires E5 licences) can also be used to add classification terms to a retention label as shown in the screenshot below. Note that there is no link with the Term Store.
Records (including emails) that have been assigned a retention label could, in theory, be regarded as having been classified in this way because the label contains (or references) the classification terms.
When applied to content in SharePoint, OneDrive or Exchange, retention labels linked with the File Plan do not show the File Plan classification terms. It may be possible to write a script that displays all records with the terms from the File Plan, but it may be easier to do this using the Data Classification option described above.
Retention labels/SharePoint Syntex combination
SharePoint Syntex provides a way to apply retention labels to records, stored in SharePoint, that have been identified through the Document Understanding Model process.
As can be seen in the screenshot above, each new DU model allows similar types of records (in the example above, ‘Statements of Work’) to be associated with a new or existing Content Type that can include a Term Store Term – for SharePoint records only – and a retention label. This provides three types of ‘classification’:
Grouping by record type (e.g., Statement of Work, Invoice)
Linking (of sorts) between the records ‘classified’ in this way and a Term Store term added as a metadata column to the Content Type.
Assigning of a retention label. This provides the same form of retention label-based classification described above.
Furthermore, if the Extraction option is also used, data extracted to SharePoint columns can be based on choices listed in the Term Store metadata.
SharePoint Syntex only works for records – and only those records that have some form of consistency – stored in SharePoint.
Trainable classifiers are another way that could be used to identify related records and apply a retention label to those records. Microsoft 365 includes six ‘out of the box’ trainable classifiers that will not be of much value to records managers for the classification of records:
Offensive language (to be deprecated)
The creation of new trainable classifiers requires an E5 licence; they are created through the Data Classification area of the Microsoft 365 Compliance admin portal. Machine Learning is used to identify related records to create the trainable classifiers.
The primary outcome (from a recordkeeping classification point of view) of using trainable classifiers is the application of a retention label to content stored in SharePoint and Exchange mailboxes. It can also be used to apply a sensitivity label to that content.
It is unlikely that every record will be classified according to every classification option.
Trainable classifiers only work with SharePoint and Exchange mailboxes.
Classifying records per workload
The options are summarised below for each main workload:
SharePoint: Use local site or library columns, Term Store terms or retention labels (mapped to a File Plan as necessary), applied manually or automatically, including via SharePoint Syntex or trainable classifiers.
Exchange mailboxes: The only feasible option to classify these records is to manually or auto-apply retention labels that are mapped to a classification, including via a trainable classifier.
OneDrive: Manually or auto-apply retention labels mapped to a classification.
Teams. It is not possible to classify Teams chats with the options available.
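The per-workload summary above can be expressed as a simple lookup table. This is only a restatement of the list in code form; the dictionary and helper names are hypothetical.

```python
# Classification options per workload, as summarised in the list above.
CLASSIFICATION_OPTIONS = {
    "SharePoint": {"metadata columns", "Term Store terms", "retention labels"},
    "Exchange mailboxes": {"retention labels"},
    "OneDrive": {"retention labels"},
    "Teams chats": set(),
}


def workloads_supporting(option: str) -> list:
    """Return the workloads (sorted by name) where the option can be applied."""
    return sorted(
        workload
        for workload, options in CLASSIFICATION_OPTIONS.items()
        if option in options
    )


def unclassifiable_workloads() -> list:
    """Return the workloads for which none of the options is available."""
    return sorted(w for w, options in CLASSIFICATION_OPTIONS.items() if not options)


print(workloads_supporting("retention labels"))
print(unclassifiable_workloads())
```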
Is classification necessary?
The classification model described in ISO 15489 and other standards was based on the idea that records would be stored in a central recordkeeping system where they would be subject to and tagged by the terms contained in a classification scheme, often applied at the aggregation level (e.g., a file).
Microsoft 365 is not a recordkeeping system but a collection of multiple applications that may create or capture records, primarily in Exchange mailboxes, SharePoint, OneDrive and MS Teams (and also Yammer).
There is no central option to classify records in the recordkeeping sense. The closest options are:
The grouping of records in SharePoint sites (and Teams, each of which has a SharePoint site) and libraries that map to business functions and activities.
The use of metadata, either terms set in the central Term Store or created in local sites/libraries, to ‘classify’ individual records (including emails) stored in SharePoint document libraries. Each item in the library might have a default classification, or could be classified differently.
The use of retention labels that ‘map’ to function/activity pairs in a records disposal authority/schedule. These may be applied, manually or automatically, to content stored in SharePoint, OneDrive and Exchange mailboxes.
None of the above may apply, or be applied consistently, to all SharePoint sites, Exchange mailboxes and OneDrive accounts. And none can be applied to Teams chats.
A different approach to this problem is required, one that will likely involve greater use of Artificial Intelligence (AI) and Machine Learning (ML) methods to identify and enable the grouping of records, and provide visualisations of the records so classified.
Image: Werribee Mansion, Victoria, Australia stairwell (Andrew Warland photo)
Unlike Tasks, To Do items are essentially a personal list of things to do that is accessed from the separate To Do section of Outlook. They are not included in the calendar.
There are two ways to create a new item in the To Do list. The first is to click on the ‘My Day’ option in Outlook and then add a task in the To Do section.
The second is to click on the ‘To Do’ option at the bottom left of the Outlook app, which will open the ‘My Day’ section and allow a new (To Do) task to be created.
A bit confusingly, the Planned section of ‘To Do’ displays tasks that are:
Personal tasks, created as Outlook calendar tasks but which don’t appear in the individual’s Outlook calendar, only in the To Do calendar.
Assigned to the individual from a Microsoft 365 Group or Team (including ‘Tasks by Planner’).
The difference between the two can be seen below; the first two are personal tasks from To Do, the second two are Group/Team-based tasks. The ‘Assigned to you’ items are the second two under ‘Later’. Note that the Planned section does not include any simple, non-calendar-based, ‘To Do’ items.
Planner is a task-based service originally linked directly with Office 365 Groups (announced in 2014 ‘Delivering the first chapter of Office 365 Groups‘). It was described as ‘a simple and highly visual way to organize teamwork’ within a team – which meant initially an Office 365 Group. It seemed that Microsoft’s vision was to move the creation of Tasks away from Outlook to Planner.
Initially, when a new Office 365 Group was created (including when a new Team was created in MS Teams), it created a Plan. This connection was later removed and so a new Group or Team no longer creates a new Plan.
The following is an example of an empty plan for an Office 365 Group called ‘SharePoint Admin Group’. All the members of the Group (or Team if one exists) would have access to the plan. Plans contain ‘buckets’ or groupings of tasks. The default bucket is ‘To do’ which, in the example below, contains a single task ‘Create two new tasks’ (which is outstanding). A separate bucket was created named ‘New Sites’, and it has one completed task.
Changes to Planner
Several changes have been made to Planner since 2018:
New Office 365 Groups and Teams did not automatically create a Plan.
Multiple plans could be created for every Office 365 Group or Team.
Tasks by Planner was introduced to Teams (see below) so that every channel can use either the ‘parent’ Office 365 Group Plan or create a new one.
These changes have created some confusing content in the Planner app.
Tasks by Planner in Teams
Microsoft announced in April 2020 that Planner would be renamed Tasks (it is still named Planner a year later).
As noted in the link (by onmsft.com), ‘… the change means that Teams users will soon be able to see their individual tasks and team tasks in a single app from across Teams channels, Planner and Outlook. For mobile users, the change also means that both a list view and a new mobile tasks experience will soon be available in within the Teams app.’
The new Tasks by Planner and To Do was visible from Teams in early 2021 but the relationship between Outlook To Do tasks and Planner Tasks via Teams remained a bit confusing.
When a Team channel is opened, the ‘Tasks by Planner and To Do’ can be added as a tab.
When ‘Tasks by Planner and To Do’ is selected, two options are visible and this is where some of the confusion starts.
Most people are likely to simply click ‘Save’, which creates a ‘Tasks’ tab in the channel and a new Plan. Ideally, they should use an existing plan, if one exists (which it probably won’t). As a result, multiple Plans may be created for each Team and channel.
This is what the new Tasks tab looks like in the Team:
Perhaps it doesn’t matter (for some organisations) how many Plans or Tasks are set up.
We can now see there are three Plans for the SharePoint Admin Group Team, two in the same (General) channel. The end user can also see tasks that were assigned to him/her. If they click on the ‘Tasks’ option they will see the list of personal calendar- and To Do-based tasks as we saw above in Outlook:
Behind the scenes, in Planner, we can see four plans for the SharePoint Admin Group. Three of these map to the three above, but not the one titled ‘Tasks – SharePoint Admin Group’ which has two completed tasks. But where is it?
Here are the two completed tasks in SharePoint Admin Group ‘Tasks’ plan that don’t exist in the main SharePoint Admin Group Plan, or the other two.
Where are these two completed tasks?
Or, more specifically, why do they not appear in many of the Teams Tasks tabs? There are no private channels in this Team, so I know it’s not hidden in one – and, in any case, you cannot create a new Task list in a private channel.
Just to try to work this out, I created a new Task in that list of Tasks, assigned it to myself. The only place I could find it was in both Teams and Outlook in the ‘Assigned to you’ area.
Tasks, whether to remind yourself of things you need to do or to track what others need to do, are probably good for specific purposes or Teams. But the ability to create multiple Task lists in Teams channels is confusing and will likely result in multiple random Tasks/Plans in Planner, even for the same channel.
An article titled ‘Search, Forward’ by Andrew Peck, then a United States magistrate judge, was published in Law Technology News in October 2011. Peck’s article made reference to ‘predictive coding’.
Grossman and Cormack’s article noted that ‘a technology-assisted review process involves the interplay of humans and computers to identify the documents in a collection that are responsive to a production request, or to identify those documents that should be withheld on the basis of privilege’. By contrast, an ‘exhaustive manual review’ required ‘one or more humans to examine each and every document in the collection, and to code them as responsive (or privileged) or not’.
The article noted, somewhat gently, that ‘relevant literature suggests that manual review is far from perfect’.
Peck’s article contained similar conclusions. He also noted how computer-based coding was based on an initial ‘seed set’ of documents identified by a human; the computer then identified the properties of those documents and used that to code other similar documents. ‘As the senior reviewer continues to code more sample documents, the computer predicts the reviewer’s coding’ (hence predictive coding).
By 2011, this new technology was challenging old methods of manual review and classification. Despite some scepticism and slow uptake (for example, see this 2015 IDM article ‘Predictive Coding – What happened to the next big thing?‘), by 2021, it had become an accepted option to support discovery, sometimes involving offshore processing for high volumes of content.
‘… applies machine learning … enabling users to explore large, unstructured sets of data and quickly find what is relevant. It uses advanced text analytics to perform multi-dimensional analyses of data collections, intelligently sorting documents into themes, grouping near-duplicates, isolating unique data, and helping users quickly identify the documents they need. As part of this process, users train the system to identify documents relevant to a particular subject, such as a legal case or investigation. This iterative process is more accurate and cost effective than keyword searches and manual review of vast quantities of documents.’
It added that the product would be deployed in Office 365.
The concept of classification for records was defined in paragraph 7.3 of part 1 of the Australian Standard (AS) 4390, released in 1996. The standard defined classification as:
‘… the process of devising and applying schemes based on the business activities generating records, whereby they are categorised in systematic and consistent ways to facilitate their capture, retrieval, maintenance and disposal. Classification includes the determination of naming conventions, user permissions and security restrictions on records’.
The definition provided a number of examples of how the classification of business activities could act as a ‘powerful tool to assist in many of the processes involved in the management of records, resulting from those activities’. This included ‘determining appropriate retention periods for records’.
The only problem with the concept was the assumption that all records could be classified in this way, in a singular recordkeeping system. Unless they were copied to that system, emails largely escaped classification.
Fast forward to 2020
Managing all digital records according to recordkeeping standards has always been a problem. Electronic records management (ERM) systems managed the records that were copied into them, but a much higher percentage remained outside their control – in email systems, network file shares and, increasingly over the past 10 years, created and captured on a host of alternative systems including third-party and social media platforms.
By the end of 2019, Microsoft had built a comprehensive single ecosystem to create, capture and manage digital content, including most of the records that would have been previously consigned to an ERMS. And then COVID appeared and working from home became common. All of a sudden (almost), it had to be possible to work online. Online meeting and collaboration systems such as Microsoft Teams took off, usually in parallel with email. Anything that required a VPN to access became a problem.
2021 – Automated classification for records (maybe)
The Microsoft 365 ecosystem generated a huge volume of new content scattered across four main workloads – Exchange/Outlook, SharePoint, OneDrive and Teams. A few other systems such as Yammer also added to the mix.
Most of this information was not subject to any form of classification in the recordkeeping sense. The Microsoft 365 platform included the ability to apply retention policies to content but there was a disconnect between classification and retention.
Microsoft announced Project Cortex at Ignite in 2019. According to the announcement, Project Cortex:
Uses advanced AI to deliver insights and expertise in the apps that are used every day, to harness collective knowledge and to empower people and teams to learn, upskill and innovate faster.
Uses AI to reason over content across teams and systems, recognizing content types, extracting important information, and automatically organizing content into shared topics like projects, products, processes and customers.
Creates a knowledge network based on relationships among topics, content, and people.
Project Cortex drew on technological capabilities present in Azure’s Cognitive Services and the Microsoft Graph. It is not known to what extent the Equivio product, acquired in 2015, was integrated with these solutions but, from all the available details, it appears the technology is at least connected in one way or another.
During Ignite 2020, Microsoft announced SharePoint Syntex and trainable classifiers, either of which could be deployed to classify information and apply retention rules.
Trainable classifiers sound very similar to the predictive coding capability that appeared from 2011. However, they:
Use the power of Machine Learning (ML) to identify categories of information. This is achieved by creating an initial ‘seed’ of data in a SharePoint library, creating a new trainable classifier and pointing it at the seed, then reviewing the outcomes. More content is added to ensure accuracy.
Can be used to identify similar content in Exchange mailboxes, SharePoint sites, OneDrive for Business accounts, and Microsoft 365 Groups and apply a pre-defined retention label to that content.
In theory, this means it might be possible to identify a set of similar records – for example, financial documents – and apply the same retention label to them. The Content Explorer in the Compliance admin portal will list the records that are subject to that label.
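The general idea of seed-based classification can be sketched with a very simple word-frequency comparison. This is emphatically not how trainable classifiers work internally (that is not described here); it is a toy stand-in showing the ‘seed, score, apply label’ pattern. The function names, the sample text and the 0.3 threshold are all assumptions.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Count the lower-cased alphabetic words in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(seed_documents: list) -> Counter:
    """Build a combined word profile from the human-selected seed set."""
    profile = Counter()
    for document in seed_documents:
        profile.update(tokens(document))
    return profile


def suggest_label(text: str, profile: Counter, label: str, threshold: float = 0.3):
    """Return the label if the text is similar enough to the seed profile."""
    return label if cosine(tokens(text), profile) >= threshold else None


seed = [
    "Invoice for accounting services, total amount due",
    "Tax invoice: amount payable for accounting period",
]
profile = train(seed)
label = "Financial Management Accounting – 7 years"
matched = suggest_label("Invoice amount due for accounting", profile, label)
unmatched = suggest_label("Minutes of the community picnic committee", profile, label)
```

Even in this toy version the outcome depends entirely on which documents a person chose for the seed.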
SharePoint Syntex was announced at Ignite in September 2020 and made generally available in early 2021.
The original version of Syntex (as part of Project Cortex) was targeted at the ability to extract metadata from forms, a capability that has existed with various other scanning/OCR products for at least a decade. The capability that was released in early 2021 included the base metadata extraction capability as well as a broader capability to classify content and apply a retention label.
Classification. This capability involves the following steps: (a) Creation of (SharePoint site) Content Center; (b) Creation of a Document Understanding Model (DUM) for each ‘type’ of record; the DUM can create a new content type or point to an existing one; the DUM can also link with the retention label to be applied; (c) Creation of an initial seed of records (positives and a couple of negatives); (d) Creation of Explanations that help the model find records by phrase, proximity, or pattern (matching, e.g., dates); (e) Training; (f) Applying the model to SharePoint sites or libraries. The outcome of the classification is that matching records in the location where it is pointed are assigned to the Content Type (replacing any previous one) and tagged with a retention label (also replacing any previous one).
Extraction. This capability has similar steps to the classification option except that the Explanations identify what metadata is to be extracted from where (again based on phrase, proximity or pattern) to what metadata column. The outcome of extraction is that the matching records include the extracted metadata in the library columns (in addition to the Content Type and retention label).
As with trainable classifiers, Syntex uses Machine Learning to classify records, but Syntex also has the ability to extract metadata. Syntex can only classify or extract data from SharePoint libraries.
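The three kinds of Explanation mentioned above (phrase, proximity and pattern) can be illustrated with ordinary string and regular-expression matching. This is a rough sketch of the concepts only, not Syntex itself; the function names, the date format and the sample text are assumptions.

```python
import re

# Pattern explanation: a simple d/m/yyyy-style date (an assumed format).
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def phrase_match(text: str, phrase: str) -> bool:
    """Phrase explanation: the phrase occurs anywhere in the text."""
    return phrase.lower() in text.lower()


def proximity_match(text: str, first: str, second: str, max_words: int = 5) -> bool:
    """Proximity explanation: two words occur within max_words positions."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_words
               for i in first_positions for j in second_positions)


def extract_first_date(text: str):
    """Extraction: return the first date-like value, or None if there is none."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


sample = "Statement of Work. Start date: 01/03/2021. Prepared for the client."
is_statement_of_work = phrase_match(sample, "statement of work")
has_start_date = proximity_match(sample, "start", "date")
start_date_column = extract_first_date(sample)
```

In this sketch the classification step answers ‘is this a Statement of Work?’, while the extraction step supplies a value for a metadata column.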
Trainable classifiers or Syntex?
Both options require the organisation to create an initial seed of content and to use Machine Learning to develop an understanding of the content, in order to classify it.
The models are similar, the primary difference is that trainable classifiers can work on content stored in email, SharePoint and OneDrive, whereas Syntex is currently restricted to SharePoint.
On 18 March 2021, Microsoft announced the pending (April 2021) preview release of an enhanced predictive coding module for advanced eDiscovery in Microsoft 365.
The announcement, pointing to this roadmap item, noted that eDiscovery managers would be able to create and train relevance models within Advanced eDiscovery using as few as 50 documents, to prioritize review.
So, can Microsoft technology classify records better than humans?
In their 1999 book ‘Sorting Things Out: Classification and its Consequences‘ (MIT Press), Geoffrey Bowker and Susan Leigh Star noted that ‘to classify is human’ and that classification was ‘the sleeping beauty of information science’ and ‘the scaffolding of information infrastructures’.
But they also noted how ‘each standard and category valorizes some point of view and silences another. Standards and classifications (can) produce advantage or suffering’ (quote from review in link above).
Technology-based classification is, in theory, impartial. It categorises what it finds through machine learning and algorithms. But technology-based classification requires human review of the initial and subsequent seeds. Accordingly, such classification has the potential to be skewed by the reviewer’s bias or predilections, or by the selection of one set of preferred or ‘matching’ records over another.
Ultimately, a ‘match’ is based on a scoring ‘relevancy’ algorithm. Perhaps the technology can classify better than humans, but whether the classification is accurate may depend on the human to make accurate, consistent and impartial decisions.
Either way, the manual classification of records is likely to go the same way as the manual review of legal documents for discovery.
One of the most confusing aspects of Teams and SharePoint in Microsoft 365 is the relationship between permission groups used to control access to both of these resources. This is especially the case as every Team in MS Teams has an associated SharePoint site (the ‘Files’ tab).
This post explains how permission groups work between MS Teams, Microsoft 365 Groups and SharePoint.
SharePoint permission groups
Before discussing how Teams permissions relate to SharePoint, here is a brief reminder of how SharePoint permissions work.
SharePoint has always had three default permission groups, prefixed by the URL name of the site, as shown in the screenshot below (the name of the site always prefixes the words Owners, Members and Visitors).
People (including in a Group, see below) added to the Owners permission group have full access (full control) to all parts of the site and are usually responsible for managing the SharePoint site. There would normally be two or three site owners.
People (including in a Group, see below) added to the Members permission group have add/edit (contribute) rights.
People added to the Visitors permission group have read-only (view) rights.
These permissions are set at the site level and inherited by everything in the site, unless that inheritance is broken and unique permissions are applied. Additional permission groups can be created as necessary but most SharePoint sites only use the default Owners, Members and Visitors groups.
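The inheritance model described above can be sketched as a simple parent/child lookup. This is a conceptual model only, not the SharePoint API: the `Securable` class, its method names and the ‘Finance’ site are hypothetical, and the permission levels use the wording from the list above.

```python
from typing import Dict, Optional


class Securable:
    """A site, library or folder that inherits or holds unique permissions."""

    def __init__(self, name: str, parent: Optional["Securable"] = None,
                 permissions: Optional[Dict[str, str]] = None):
        self.name = name
        self.parent = parent
        # None means 'inherit from the parent'.
        self.unique_permissions = dict(permissions) if permissions else None

    def break_inheritance(self, permissions: Dict[str, str]) -> None:
        """Stop inheriting and apply unique permissions to this object."""
        self.unique_permissions = dict(permissions)

    def effective_permissions(self) -> Dict[str, str]:
        """Walk up the hierarchy until an object with its own permissions is found."""
        if self.unique_permissions is not None:
            return self.unique_permissions
        if self.parent is None:
            return {}
        return self.parent.effective_permissions()


site = Securable("Finance", permissions={
    "Finance Owners": "full control",
    "Finance Members": "contribute",
    "Finance Visitors": "view",
})
library = Securable("Documents", parent=site)
folder = Securable("Confidential", parent=library)

# The folder inherits from the site until inheritance is broken.
inherited = dict(folder.effective_permissions())
folder.break_inheritance({"Finance Owners": "full control"})
restricted = folder.effective_permissions()
```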
Microsoft 365 Groups
Microsoft 365 Groups were introduced in 2017 and control access to resources, like Security Groups.
However, unlike Security Groups, which usually provide access to individual resources (such as a single SharePoint site, or Line of Business (LOB) system), Microsoft 365 Groups control access to multiple linked Microsoft 365 resources.
Microsoft 365 groups, distribution lists, mail-enabled security groups, and security groups (collectively referred to as Active Directory (AD) groups) are all created in the ‘Groups’ area of the Microsoft 365 Admin portal.
When a new group is created, the following options appear.
As noted above, Microsoft 365 groups are recommended. It is important to understand the relationship between Microsoft 365 groups, Teams and SharePoint.
When a new Microsoft 365 group is created (from the dialogue above), it creates:
At least one Owner must be specified. The Owner/s are responsible for managing the Members group.
An Exchange mailbox with the same email @ name as the Microsoft 365 group. The mailbox is visible in Outlook to the members of the Group.
A SharePoint site with the same URL name as the Microsoft 365 group.
By default (unless the checkbox is unchecked), a new Team is also created in MS Teams.
When a new Team is created from MS Teams, or a new SharePoint Team site is created, it creates:
A Microsoft 365 Group with an Exchange mailbox and a SharePoint site (‘Files’ tab).
The name of the Team becomes the name of the Group and the SharePoint site.
The mailbox is not visible in Outlook and is only used for calendaring and for the storage of Teams chats (in a hidden folder).
Importantly, when a new Microsoft 365 group or Team is created (which creates a Microsoft 365 group), the Group Owners: (a) are the same as the Team Owners and (b) are added to the SharePoint Owners permission group, as explained below.
Group/Team Owners and Members
In other words, the Microsoft 365 group owners (group) is added to the SharePoint site owners permission group – a ‘group within a group’.
That is, the Microsoft 365 group controls access to the Team and the SharePoint site as shown in the diagram below. Security Groups may also be added to the Microsoft 365 Group site, but this does not provide access to the Team.
This ‘group within a group’ model is visible from the ‘Site Permissions’ section of the gear/cog icon as shown below (the name of the Microsoft 365 Group/Team/SharePoint site is ‘SharePoint Admin’). The SharePoint Admin Group Owners (group) is in the SharePoint site owners group, and the SharePoint Admin Group Members (group) is in the Site members group.
If a mouse hovers over the Group ‘icon’ (in the above example, GO or GM), it is possible to view the members of the Group and, for Owners, to modify that list. Confusingly, the ‘GM’ in the SharePoint site permissions group becomes ‘SG’ in the drop down list.
You can also see the ‘group within group’ model from the back-end ‘Advanced permissions’ section of the SharePoint site, but you cannot manage the Microsoft 365 Group members here.
Implementing the model
As with Security Groups, the members of Microsoft 365 Groups will usually be a logical group of people who require access to something, in this case access to the SharePoint site or the Team (for chat, files, or other resources).
The main thing to remember is that membership of the (backend) Microsoft 365 Group provides access to BOTH the Team and the Team’s SharePoint site (the ‘Files’ tab in a Team).
Every Team in MS Teams will usually consist of the members of a logical group with a common interest – a business unit, project team, or with some other work relationship, for example, the members of a committee. The Team Owners are responsible for managing the Team Members.
The Team Owners are the SharePoint site owners and are responsible for managing the site if they decide to access it directly. The Team Members are the SharePoint site members and have the ability to add or edit content, usually via the ‘Files’ tab in Teams.
Note: Security Groups with the same members as Microsoft 365 Groups (and Teams) may already exist. There is no need to add a Security Group if it has the same members as a Microsoft 365 Group.
As noted earlier, a Group/Team does not have visitors with read-only rights. Every Member of the Team has add/edit access to both the Team and its associated SharePoint site.
If there is a requirement to give specific other people either add/edit or read-only access to the SharePoint site, that outcome is achieved by adding people by name, or a Security Group, to either the SharePoint Members or Visitors group.
If there is a requirement to give everyone in the organisation either add/edit rights, or read only access, to the SharePoint site, that outcome is achieved by adding ‘Everyone except external users’ to either the SharePoint Members or Visitors group.
External guests may also be added to the Team and the Team’s SharePoint site.
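The ‘group within a group’ model, and the difference between Team access and SharePoint site access, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration (the user names, the ‘Finance SG’ Security Group, and the helper functions); it models the behaviour described above and does not call any Microsoft API.

```python
# Hypothetical Microsoft 365 Group behind a Team and its SharePoint site.
m365_group = {
    "owners": {"alice"},
    "members": {"bob", "carol"},
}

# Hypothetical Security Group, added only to the SharePoint site.
security_groups = {"Finance SG": {"dave"}}

# Each SharePoint permission group holds principals: a named person, a
# reference to the Microsoft 365 Group's owners or members, or a Security Group.
site_groups = {
    "Owners": [("m365", "owners")],
    "Members": [("m365", "members")],
    "Visitors": [("security_group", "Finance SG"), ("user", "erin")],
}


def expand(principal):
    """Resolve a principal to the set of people it contains."""
    kind, value = principal
    if kind == "user":
        return {value}
    if kind == "m365":
        return set(m365_group[value])
    if kind == "security_group":
        return set(security_groups[value])
    raise ValueError(f"unknown principal type: {kind}")


def site_access(user):
    """Return the strongest SharePoint permission group the user is in, or None."""
    for group in ("Owners", "Members", "Visitors"):
        if any(user in expand(p) for p in site_groups[group]):
            return group
    return None


def team_access(user):
    """Only membership of the Microsoft 365 Group gives access to the Team."""
    return user in (m365_group["owners"] | m365_group["members"])


for person in ("alice", "bob", "dave", "erin"):
    print(person, site_access(person), team_access(person))
```

In this model, people who are added to the site only through a Security Group or by name can reach the SharePoint site but not the Team, which matches the point that adding a Security Group to the site does not provide access to the Team.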
Sources for this information are listed where this is known.
1973 – Plato Notes
A history of ERM and EDM systems must include reference to Lotus Notes.
Lotus Notes began its life in 1973 as Plato Notes, developed by the Computer-based Education Research Laboratory (CERL) at the University of Illinois. Elements of the Plato Notes system would be developed for PC by Ray Ozzie during the late 1970s. This work was picked up by Lotus Development Corporation and in 1984 became Lotus Notes.
An early version of Lotus Notes was released (under contract to Lotus) in 1984. The original vision included on-line discussion, email, phone books and document databases. Eventually the product fell into the ‘groupware’ category. The capability of the product continued to grow and some organisations only used Notes.
Lotus acquired all the rights to Lotus Notes in 1987 and version 1.0 was released on 7 December 1989.
1974 – Compulink Management Center/Laserfiche founded
Compulink Management Center was founded in the US in 1974. It created Laserfiche, the first DOS-based document imaging system, in 1987.
1976 – Micro Focus founded
Micro Focus was founded in the UK in 1976. Its first software product was CIS COBOL, a solution for microcomputers. It entered the EDRM market in 2017, see below.
1981 – Enterprise Informatics founded
Enterprise Informatics, a privately-held software company, was founded in 1981 by early pioneers of the document management industry. (Source: LinkedIn company profile) It would later be acquired by Spescom, a South African company.
1982 – FileNet founded
FileNet was founded in 1982 by Ted Smith, formerly of Basic 4. FileNet’s original focus was the storage and management of scanned images but it also developed workflow software. (Source: Wikipedia article on FileNet)
1983 – GMB/DocFind founded
GMB (named after the original founders, Gillett, Frank McKenna, and Bachmann) was formed in Australia in 1983. In 1984, GMB released DocFind 1.0. DocFind was renamed RecFind in 1986.
Tower Software was founded by Brand Hoff in Canberra in 1985 as a software development company. The company provided and supported enterprise content management software, notably its TRIM (Tower Records and Information Management) product line for electronic records management.
The ‘Tower’ in the company name derives from the telecommunications tower on top of Black Mountain (technically a hill, 812 m high) overlooking Canberra. A graphic of the tower was used in the TRIM logo until the company was acquired by HP’s Software Division in 2008 (see also below).
1986 – Autonomy founded (UK)
Autonomy was founded by Michael Lynch, David Tabizel and Richard Gaunt in Cambridge, UK in 1986 ‘as a spin-off from Cambridge Neurodynamics, a firm specializing in computer-based finger print recognition’.
Before 1987 – Saros Corp
Saros Corp was established in Washington by Mike Kennewick (a former Microsoft employee) before 1987. Saros Corp produced Saros Mezzanine, a client-server document management engine. In 1993, it released Saros Document Manager.
1989 – Ymijs (later Valid Information Systems) founded – R/KYV (UK)
Ymijs was founded in the UK in 1989. It sold the R/KYV software initially as a basic document imaging processing system. The company name was changed to Valid Information Systems and R/KYV was further developed as a compliance and records management system ‘… that is widely used by major corporations as well as central and local government authorities and related governmental agencies’ (in the UK).
1989 – Provenance Systems (later TrueArc) founded (Canada)
Bruce Miller, sometimes noted as ‘the inventor of modern electronic recordkeeping software’, founded Provenance Systems in 1989 where he created ForeMost. The company name was changed to TrueArc. Bruce would go on to found Tarian Software as well in 1999.
TrueArc ForeMost RM would be acquired in 2002 by Documentum (with which it had a long-standing technology partnership).
1990 – Documentum founded (US)
According to this Wikipedia article, Documentum was founded in June 1990 by Howard Shao and John Newton who had previously worked at Ingres (a relational database vendor). They sought to solve the problem of unstructured information.
The first Documentum EDMS was released in 1993. According to the Wikipedia article, ‘This product managed access to unstructured information stored within a shared repository, running on a central server. End users connected to the repository through PC, Macintosh, and Unix Motif desktop client applications.’
1992 – Altris Software (UK)
Altris, established in 1992 (Source: Rob Liddell’s LinkedIn profile; Rob was one of the co-founders of Altris), developed document management systems, including (according to this South African ITWeb post of 26 October 2001) eB, a ‘configuration management’ application.
This article titled ‘The Case for 11g‘ (referring to Oracle’s product, see below) noted that Optika’s original software development focus was Image and Process Management (IPM).
An undated (but likely mid to late 1990s) webpage on the Property and Casualty website titled ‘Optika and Xerox Package FilePower with Document Centre‘ noted that ‘Optika Imaging Systems, Inc. and Xerox announced that the two companies will jointly work to integrate Optika’s FilePower with Document Centre digital systems products from Xerox. The combination of the Document Centre and FilePower will provide a complete solution for capturing, managing and distributing large volumes of documents, increasing users’ productivity and significantly reducing labor and capital costs. Optika’s integrated product suite — FilePower — combines imaging, workflow and COLD technology into a unified software package. The Xerox Document Centre 220ST and 230ST combine network scanning, printing, faxing and copying into one hardware device.’
1993 – Workflow Management Coalition formed
The Workflow Management Coalition (WfMC), ‘a consortium formed to define standards for the interoperability of workflow management systems’, was founded in May 1993. Original members included IBM, Hewlett-Packard, Fujitsu, ICL, Staffware and approximately 300 software and services firms in the business software sector.
The WfMC’s Workflow Reference Model was published first in 1995 and still forms the basis of most BPM and workflow software systems in use today. (Source: Undated Gutenberg article)
1993 – Kainos Meridio (UK)
Meridio was developed in 1993 by Kainos (a Northern Ireland company and joint venture between Fujitsu and Queen’s University Belfast) as an electronic document and records management (EDRM) system based on Microsoft products. It would be acquired by HP Autonomy in 2007.
1993 – Saros (US) Document Manager
Saros Corp released Saros Document Manager in mid 1993. The product was said ‘to act as a front-end to the Bellevue, Washington-based firm’s client-server document management engine, Saros Mezzanine’. (Source: Computer Business Review article ‘Saros Sets Document Manager‘ )
ERM before the mid 1990s
Before the arrival of personal computers in offices in the early 1990s, computer mainframes and databases were regarded by some observers as the only places where electronic ‘records’ (in the form of data in tables) were stored and managed.
A report by the United States General Accounting Office in July 1999 (GAO/GGD-99-94), titled ‘Preserving Electronic Records in an Era of Rapidly Changing Technology’, stated that, historically (as far back as 1972), NARA’s Electronic Records Management (ERM) guidance (GRS 20) was geared towards mainframes and databases, not personal computers.
The GAO report noted that until at least the late 1990s, there was a general expectation that all other electronic records not created or captured in ERM systems would be printed and placed on a paper file or another system. The original (electronic) records could then be destroyed.
Some early ERM (database) systems, such as TRIM from Tower Software in Australia, were originally developed in the mid 1980s to manage paper files and boxes. Similar systems were developed to manage library catalogues and the old card catalogues started to disappear.
But, although some of it was printed and filed, the volume of electronic records in email systems and stored across network file shares continued to grow. Several vendors released systems that could be used to manage electronic documents (EDM) more effectively than network drives but there was no agreed standard for managing that content as records.
1994 – The DLM Forum and MoReq
From the early 1990s, the European Council sought to promote greater cooperation between European governments on the management of archives. One of the outcomes of a meeting in 1996 was the creation of the DLM Forum. DLM is the acronym of the French term ‘Données Lisibles par Machine’, or ‘machine-readable data’.
One of the ten action points arising from the June 1994 DLM meeting was the creation of ‘Model Requirements for the Management of Electronic Records’, or MoReq, first published in 2001 (see below).
According to its website, ‘the InterPARES Project was borne out of previous research carried out at the University of British Columbia’s School of Library, Archival and Information Studies. “The Preservation of the Integrity of Electronic Records” (a.k.a. “The UBC Project”) defined the requirements for creating, handling and preserving reliable and authentic electronic records in active recordkeeping systems.’
‘The UBC Project researchers, Dr. Luciana Duranti and Professor Terry Eastwood, worked in close collaboration with the U.S. Department of Defense Records Management Task Force to identify requirements for Records Management Applications (RMA).’
The work of the UBC team influenced the development of DOD 5015.2 published in 1997 (see below) and the subsequent development of a range of electronic document and records management (EDRM) systems.
Australia – intervention in business applications model
In 1996, the University of Pittsburgh published the ‘Functional Requirements for Evidence in Recordkeeping Project’, led by David Bearman. This work would influence the development of both MoReq2010 and the ICA standards that became ISO 16175-2010, both of which attempted to define a minimum set of functional requirements for a business application to be able to manage its own records. (Lappin)
1995 – IBM Acquires Notes
Lotus Notes was acquired by IBM in July 1995. By December 1996 it had 20 million users. By the end of 1999, Lotus Notes had extensive capability including ERM and EDM.
Lotus Notes continued to retain a strong presence in the market but its dominance began to be reduced by the arrival of Microsoft’s broader capabilities and other EDM solutions.
1995 – Alpharel (US) acquires Trimco (UK)
According to this Computer Business Review article of 23 November 1995, Alpharel Inc, San Diego was expected to acquire Trimco Group Plc of Ealing, London, a supplier of enterprise-wide document management systems.
1995 – FileNet acquires Saros
FileNet acquired Saros Corporation in 1995 to gain its electronic document management capability. FileNet was said to have pioneered ‘integrated document management’ (IDM), through a suite that offered document imaging, electronic document management, COLD and workflow. (Source: Wikipedia article on FileNet)
1996 – Australian Standard AS 4390
In February 1996 Australia issued the world’s first national records management standard, AS4390 ‘Records Management – General‘. The standard provided guidance for the implementation of records management strategies, procedures and practices.
Tower Software, the Canberra-based developers of TRIM, contributed to the development of the standard (according to its Wikipedia entry) although the standard did not prescribe requirements for the management of electronic records.
AS 4390 would become internationalised through ISO 15489 in 2002.
1996 – OpenText Corporation (US) – Livelink
OpenText Corporation was founded in 1991 from OpenText Systems. It released Livelink in 1996.
1996 – EDM solutions (UK listing)
The following is a list of EDM systems taken from the Document Management Resource Guide, 1995/96 Edition, kindly provided by Reynold Lemming in 2021. (^ = Original software author entry, all others are system resellers)
QStar: Axxess / Server / Worksgroup / Enterprise ^
1996 – Various EDM solutions
The March 1996 edition of Engineering Data Management included a number of updates on electronic document management solutions in the market at that time. Note that Trimco and Alpharel are listed separately; this may be because Alpharel’s acquisition of Trimco had not been completed by that time.
Alpharel (San Diego, CA): Document Management solutions – Enabler, FlexFolder, RIPS, Toolkit API. Wisdom, a product that facilitated internet access to participating electronic document vaults.
Auto-trol Technology (Denver, CO): CENTRA 2000, document management, workflow, PDM, change management and messaging.
Cimage Enterprise Systems (Bracknell, UK): Document Manager for Windows.
Documentum, Inc. (Pleasanton, CA): Documentum Accelera for the World Wide Web and Documentum UnaLink for Lotus Notes.
Interleaf (Waltham, MA): Interleaf 6 SGML, a solution for publishing SGML documents. Intellecte/BusinessWeb, a document management solution that allowed organisations to access enterprise document repositories from the internet.
Trimco (Ealing, UK): Document management systems.
Alpharel changed its name to Altris Software (US) in October 1996, according to this Telecompaper article published the same month.
From 1996 – Germany’s DOMEA project
In 1996, the Coordinating and Advising Agency of the Federal Government for Information Technology in the Federal Administration (KBSt) introduced a pilot project named Document Management and Electronic Archiving in computer-assisted business processes (DOMEA).
Under the framework of DOMEA, a project group was set up in 1998 to find solutions for the disposition and archiving of electronic records. The goal was to find a suitable and efficient way for the disposition of electronic records created and maintained in office systems. Its “Concept for the Disposition and Archiving of Electronic Records in Federal Agencies,” containing recommendations for managing electronic records was published in September 1998. (Source: The Free Library article)
Late 1990s – EDMS vs ERMS
Electronic document management systems (EDMS) and electronic records management systems (ERMS) were regarded as separate types of system from the late 1990s until at least 2008.
According to Philip Bantin in August 2002:
An EDMS was said to support day-to-day use of documents for ongoing business. Among other things, this meant that the records stored in the system could continue to be modified and exist in several versions. Records could also be deleted.
An ERMS was designed to provide a secure repository for authentic and reliable business records. Although it contained the same or similar document management functionality as an EDMS, a key difference was that records stored in an ERMS could not be modified or deleted. (The concept of ‘declaring a record’ may be related to this point).
(Source: Presentation by Philip Bantin, University Archivist at Indiana University, dated 18 April 2001)
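The distinction Bantin drew can be sketched as two small Python classes. This is a hypothetical illustration only: the class and method names are invented for the example and do not correspond to any particular EDMS or ERMS product.

```python
class DocumentStore:
    """EDMS-style store: documents can be revised (as versions) and deleted."""

    def __init__(self):
        self._versions = {}

    def save(self, doc_id, content):
        # Each save adds a new version; earlier versions are kept.
        self._versions.setdefault(doc_id, []).append(content)

    def latest(self, doc_id):
        return self._versions[doc_id][-1]

    def version_count(self, doc_id):
        return len(self._versions[doc_id])

    def delete(self, doc_id):
        del self._versions[doc_id]


class RecordStore:
    """ERMS-style store: once filed, a record cannot be modified or deleted."""

    def __init__(self):
        self._records = {}

    def file(self, record_id, content):
        if record_id in self._records:
            raise PermissionError("a filed record cannot be modified")
        self._records[record_id] = content

    def get(self, record_id):
        return self._records[record_id]

    def delete(self, record_id):
        raise PermissionError("a filed record cannot be deleted")


edms = DocumentStore()
edms.save("policy", "draft 1")
edms.save("policy", "draft 2")      # allowed: a second version

erms = RecordStore()
erms.file("policy-final", edms.latest("policy"))   # 'declaring' the record
try:
    erms.file("policy-final", "tampered")
except PermissionError as err:
    print(err)
```

The step of copying the latest version into the record store is one way to picture ‘declaring a record’: from that point the content is fixed, while the working document can keep changing.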
The difference between the two types of system endured for at least a decade. By the end of the 1990s, four main EDRMS options had emerged:
Extending an existing EDM product capability to include ERM.
Extending an existing ERM capability to include EDM.
Creating new ERM products (technically also with some EDM capability).
Integrating separate EDM and ERM products.
1997 – DOD 5015.2
According to the 1999 GAO report quoted above, for several years prior to 1997, NARA worked with the US Department of Defense, considered ‘one of the agencies that is most advanced in its ERM efforts’.
The outcome of this work was the release in November 1997 of the DOD standard titled ‘Design Criteria Standard for Electronic Records Management Software Applications’ usually known by its authority number – DOD Directive 5015.2, Department of Defense Records Management Program, 11 April 1997.
The GAO report stated that ‘ERM information systems that were in place before the approval of this standard must comply with the standard by November 1999’.
It added that US agencies ‘were confronted with many ERM challenges’ from the ever-increasing volume of digital records, including the ability to preserve and access those records over time. The ‘Year 2000 problem’ was drawing attention away from the issue.
Nevertheless, by 2 June 1999, nine companies were certified as compliant with the DOD standard. Some, it noted, were standalone ERM software, while others were an integrated solution.
A small but interesting note on page 11 of the GAO report stated that ‘it is important that ERMS software requires users to make no more than two or three extra keystrokes, and that users realize there is a benefit to this additional “burden”’.
From 1997 – SER eGovernment (Germany)
SER eGovernment was developed for the German/Austrian market following the release of the German eGovernment standard, DOMEA in 1997.
1998 – Documentum goes online
In 1998, Documentum released its Web Application Environment, a set of internet extensions for EDMS, offering web access to documents stored within an EDMS repository. Various additional products were acquired and their functionality added to the Documentum system.
1998 – Optika eMedia released
Optika released eMedia, ‘a software and methodology product designed to manage business transactions within an organization, across extranets, and throughout the supply chain’, in late 1998. (Source ‘Optika Delivers App to Manage Business Transactions‘) Optika eMedia was said to be ‘a workflow enabled replacement for an imaging solution named FilePower’.
1998 – FileNet Panagon suite released
In 1998, FileNet released its Panagon suite of products. This included Panagon Content Services, which was previously Saros Mezzanine. (Source: Wikipedia article on FileNet)
1999 – International differences
The 1999 GAO report noted differences between the US, UK, Australia and Canada on their approach to ‘common ERM challenges’.
Australia was said to have ‘strong central authority (including for compliance audits) and decentralised custody’ (except when the records are transferred to permanent retention).
Canada had ‘vision statements rather than specific policies’ and also had decentralised custody, but agencies could transfer records at any time to the archives.
The UK had broad guidelines put into practice by individual agencies.
1999 – the UK PRO standard released
The UK Public Record Office (PRO, later The National Archives, TNA) released a standard in 1999 designed ‘to provide a tool for benchmarking the ability of government departments to support electronic records management’. This standard would be replaced by TNA 2003. (Source: ‘ERM System Requirements’, published in INFuture, 4-6 November 2009, by Marko Lukicic, Ericsson)
End of the 1900s – XML
By the end of the 20th century it was becoming clear (to some) that XML would likely play a strong role in the standardisation of electronic record formats and their management over time.
XML-based record structures meant that electronic records could contain their own ‘metadata payloads’ rather than being independent objects defined in a separate system (like a library catalogue describes books on shelves).
The establishment of XML-based formats would (after about 20 years) begin to change the way in which records would be managed, although paper records and the paradigm of managing records in pre-defined containers would continue to persist, largely because of the standards developed to manage electronic records – in particular DOD 5015.2.
1999 – EDM/early ERM products
The following is a list of EDM (and related) products collated in November 1999:
Autonomy Portal in a Box
CompuTechnics (1990 to 1999)/Objective (from 1999)
Hummingbird (from 1999 with acquisition of PC DOCS)
Insight Technologies Knowledge Server (IKS) / Document Management System (DMS)
Intraspect Knowledge Server (IKS) (KM)
Onyx Enterprise Portal, with integration to various EDM applications
Open Text Livelink
Pitney Bowes Digital Document Delivery (D3)
PC DOCS (acquired by Hummingbird in early 1999)
ReadSoft (OCR processing)
Tower Software / TRIM Captura
1999 – Tarian Software founded
Tarian Software was founded in Canada in 1999 by Bruce Miller, the founder of Provenance Systems (later TrueArc) and creator of ForeMost. Tarian developed the Tarian eRecordsEngine, an embedded electronic recordkeeping technology for business application software. Tarian was the first e-Records technology in the world to be certified against the revised 5015.2 June 2002 standard. Tarian was acquired by IBM in 2002.
1999 – The Victorian Electronic Records Strategy (VERS)
Public Record Office Victoria (PROV), an agency of the (Australian) Victorian government, published a standard for the management of electronic records in 1999, Standard 99/007 ‘Standard for the Management of Electronic Records’. The standard, usually known as VERS, defined the (XML-based) format required for the transfer of permanent records to the PROV.
The Standard noted that:
Records must be self-documenting. It is possible to interpret and understand the content of the record without needing to refer to documentation about the system in which it was produced.
Records must be self-contained. All the information about the record is contained within the record itself.
The record structure must be extensible. It must be possible to extend the structure of the record to add new metadata or new record types without affecting the interoperability of the basic structure.
Several EDRMS vendors developed the capability to create VERS encapsulated objects (VEOs) as required by the standard.
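The three principles noted by the Standard (self-documenting, self-contained, extensible) can be illustrated with a short Python sketch that wraps content and metadata in a single XML object. The element names used here (Record, Metadata, Title, Creator, Created, Content) are invented for the example and are not the actual VEO schema defined by VERS.

```python
import base64
import xml.etree.ElementTree as ET


def encapsulate(title, creator, created, content: bytes) -> str:
    """Wrap content and its metadata in one XML object (hypothetical structure)."""
    record = ET.Element("Record")
    meta = ET.SubElement(record, "Metadata")
    # Self-documenting: descriptive metadata travels with the record.
    ET.SubElement(meta, "Title").text = title
    ET.SubElement(meta, "Creator").text = creator
    ET.SubElement(meta, "Created").text = created
    # Self-contained: the content itself is embedded, base64-encoded.
    body = ET.SubElement(record, "Content", encoding="base64")
    body.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(record, encoding="unicode")


def add_metadata(xml_text: str, name: str, value: str) -> str:
    """Extensible: add a new metadata element without touching the rest."""
    record = ET.fromstring(xml_text)
    ET.SubElement(record.find("Metadata"), name).text = value
    return ET.tostring(record, encoding="unicode")


def read_content(xml_text: str) -> bytes:
    """Recover the original content from the encapsulated object."""
    record = ET.fromstring(xml_text)
    return base64.b64decode(record.find("Content").text)


original = b"Minutes of the committee meeting"
xml_record = encapsulate("Committee minutes", "Records Unit", "1999-11-01", original)
extended = add_metadata(xml_record, "Disposal", "Permanent")
print(extended)
```

A reader that only understands the original elements can still extract the content from the extended object, which is the interoperability point made in the third principle.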
2000 – Spescom (South Africa) acquires Altris (UK)
The South African company Spescom acquired the UK firm Altris Software in 2000, as noted in this (South Africa) ITWeb article of 3 May 2000. Altris was described in the article as ‘a global leader in integrated electronic document management software, with well established channels to international markets’. As a result of this acquisition, Altris UK was renamed Spescom Ltd (UK).
The same journal announced in 2001 that Spescom KMS was ‘the UK operation of Spescom Limited’s US based subsidiary, Altris Software Inc, which specialises in the provision of asset information management software to markets including transportation, utilities and telcos’.
From 2000 – Microsoft adopts XML for Office documents
In 2000, Microsoft released an initial version of an XML-based format for Microsoft Excel, which was incorporated in Office XP.
In 2002, a new file format for Microsoft Word followed. The Excel and Word formats, known as the Microsoft Office XML formats (with an ‘x’ on the end of the document extension), were later incorporated into the 2003 release of Microsoft Office.
Microsoft’s XML formats, known as Office Open XML, became ECMA 376 in 2006 and ISO 29500 in 2008 ‘amid some controversy’ over the need for another XML format (see below).
Before 2001 – Intranet Solutions (later Stellent)
Intranet Solutions had developed software called ‘IntraDoc!’. The product was briefly renamed Xpedio! before the company and product were renamed Stellent in 2001. (Source: ‘Wikipedia article on Oracle Acquisitions‘)
2001 – EDM systems with RM functionality
The following is a list of ‘EDM systems with records management’ functionality available by early 2001:
TRIM (Tower Software, Australia) – integrated ERM and EDM.
In addition to the EDMS/ERMS differences, organisations were also seeking solutions for knowledge management (KM) and content management (CM).
CM solutions were usually portal-based options that mostly became some form of intranet.
Some of the options in the early 2000s included:
Hummingbird’s PowerDOCS for DM and CyberDOCS as the web client for the DM solution, along with Hummingbird’s (formerly Fulcrum) Knowledge Server for KM and PD Accord for web-based collaboration, with the Hummingbird Enterprise Information Portal (EIP) as the portal solution. Plumtree Corporate Portal could also be used as an Enterprise Portal.
iManage’s DeskSite for DM and WorkTeam for collaboration. For KM, WorkKnowledge Server and Concept Search (based on the Autonomy Server). The portal to link all of these was called WorkPortal.
Open Text’s Livelink for DM, KM and collaboration.
Elite’s Encompass, built on Microsoft’s new SharePoint Portal Server (SPS).
Autonomy Server for KM, with Plumtree Corporate Portal as the portal.
Documentum’s DM and CM product coupled with Plumtree Corporate Portal.
Digital Asset Management (DAM) systems, used to manage other types of digital content such as photographs, also appeared around this time.
Information Technology Decisions published a paper on DOD 5015.2 certified products in November 2001 (original source/location has been lost). It noted that there were two types of products:
Products that started life as electronic document management (EDM) systems. Examples included Documentum, Livelink, and DOCS Open.
Products that started life as electronic recordkeeping (ERK/ERM) systems. Examples given included Tower Software’s TRIM, Foremost, iRIMS, Cuadra Star.
The presentation noted that DOD 5015.2 certification was based on alternative options:
Standalone. For example, TrueArc ForeMost, TRIM, iRIMS, Cuadra Associates Star, Relativity Records Manager, Hummingbird RM 4.0, Tarian eRecords, MDY/FileSurf, Cimage and Access Systems, Highland Technologies Highview-RM, Open Text Livelink.
Partnership. For example, Saperion with e-Manage 2000, Impact Systems eRecords Manager, FileNet with Foremost.
The report included three interesting points:
Both EDMS and ERKS required an enterprise view of information.
An EDMS is driven by business process requirements.
An ERKS (ERMS) is driven by enterprise requirements for the long-term preservation of information.
2001 – The first MoReq
The first version of MoReq was published in 2001. Volume 1 was 500 pages long.
MoReq emphasised the central importance of an electronic records management system, or ERMS. Its stated purpose was:
To provide guidance to organisations wishing to acquire ERMS.
As a tool to audit or check an existing ERMS.
As a reference document for use in training or teaching.
To guide product development by ERMS suppliers and developers.
To help define the nature of outsourced records management solutions.
Few, if any, products were certified against this version of MoReq.
2002 – Optika Acorde
Optika eMedia was rebranded to Optika Acorde in 2002, according to this website ‘The Case for 11g‘.
A June 2002 Gartner report titled ‘Optika Acorde Document Imaging, Workflow and Collaboration Suite‘ noted that Optika Acorde was an ‘integrated software family for managing the content associated with business transactions’ leveraging ‘Optika’s core strengths in document imaging, workflow and enterprise report management.’
2002 – FileNet BrightSpire, later P8 ECM
FileNet released BrightSpire in 2002. This product ‘leveraged the experience gained from integrated document management, web content management and workflow into what became ECM’. (Source: Wikipedia article on FileNet)
By 2002 – Enterprise Content Management (ECM)
The term ‘Enterprise Content Management’ (ECM) began to appear more frequently by 2002. The Wikipedia post on ECM noted that ECM technologies descended from ‘electronic Document Management Systems (DMS) of the late 1980s and early 1990s’.
The integration of records management (RM) with business practices.
The capability for integration between RM products, EDM, various other digital products (such as OCR/character recognition technologies), and web publishing products.
Incorporation of Knowledge Management (KM) concepts.
The key word here was ‘integration’ with EDM and other systems, rather than standalone systems. Web-based access became increasingly essential. IBM’s acquisition of Tarian and Documentum’s acquisition of TrueArc’s Foremost were examples of these integrations (see below).
According to the Wikipedia article on ECM: ‘Before 2003, the ECM market was dominated by medium-sized independent vendors which fell into two categories: those who originated as document management companies (Laserfiche, Saros, Documentum, docStar, and OpenText) and began adding the management of other business content, and those who started as web content management providers (Interwoven, Vignette, and Stellent) tried to branch out into managing business documents and rich media’.
The emergence of ECM quite possibly created the first challenge to centralised ERM through the integration of multiple elements, some of which created, captured or stored records in ever increasing formats.
2002 – OpenDocument XML format
According to the Wikipedia article on the OpenDocument standard, the OpenDocument standard was developed by a Technical Committee (TC) under the Organization for the Advancement of Structured Information Standards (OASIS) industry consortium. Sun and IBM apparently had a large voting influence but the standardization process involved the developers of many office suites or related document systems. The first ODF-TC meeting was held in December 2002.
2002 – An updated GAO report into electronic records
The US General Accounting Office (GAO) released a new report in June 2002 titled ‘Information Management: Challenges in Managing and Preserving Electronic Records’ (GAO-02-586). This report, which was more detailed than the earlier 1999 one, noted among other things that:
The DOD had by March 2002 certified 31 applications against standard 5015.2.
Progress had been made on the development of the Open Archival Information System (OAIS) model which, while initially developed by NASA for archiving the large volume of data produced by space missions, could be applied to ‘any archive, digital library or repository’. XML-based solutions were considered the most likely to be accepted.
From that date, IBM released the IBM Records Manager Version 2.0 (IRM), previously known as the Tarian eRecords Engine (TeRe). Tarian’s e-Records management technology was integrated into IBM’s software offerings, including IBM Content Manager, DB2 database and Lotus software. (Source: IBM press release)
2002 – Documentum 5 and TrueArc Foremost acquisition
Documentum released Documentum 4i, its first Web-native platform, in 2000. In 2002, it launched Documentum 5 as ‘a unified enterprise content management (ECM) platform for storing a virtually unlimited range of content types within a shared repository’.
Documentum acquired TrueArc’s Foremost product in October 2002. The Documentum Wikipage above noted that this acquisition ‘added records management capabilities and augmented Documentum’s offerings for compliance solutions.’ The press release cited in this paragraph noted that ‘Documentum and TrueArc are existing technology partners and have worked together to provide an integration for TrueArc’s enterprise-scalable records management solution with the Documentum ECM platform.’
2003 – TNA 2003
The National Archives (TNA) released an updated version of its PRO standard in 2003, known as TNA 2003. This standard would be superseded by MoReq2. (Source: @ZenInformation on Twitter, 12 February 2021).
In October 2003, Open Text acquired the (German) DOMEA-certified SER eGovernment Deutschland GmbH, based in Berlin, Germany as well as SER Solutions Software GmbH, based in Salzburg, Austria. (Source: Open Text Acquires SER eGovernment)
From 2003 – CNIPA (Italy)
The Italian Centro Nazionale per l’Informatica nella Pubblica Amministrazione (CNIPA) published a protocol for the management of electronic records, Protocollo Informatico in 2003.
CNIPA was renamed DigitPA in 2009 and Agenzia per l’Italia digitale (AGID) in 2012. AGID is responsible for defining standards for the management of electronic records in Italian government agencies. (Source: Protocollo Informatico)
Mid 2003 – The challenges of Enterprise Records Management
In Industry Trend Reports of May 2003, Bruce Silver (of Bruce Silver Associates) made the case for Enterprise Records Management in the wake of various ‘scandals’ involving the management of records at the time, including Enron/Andersen.
Silver argued that EDM, email archive, and back-up solutions did not meet the ‘new statutory and regulatory records management requirements’ – DOD 5015.2, SEC Rules 17a-3 and 17a-4, NASD Rules 2210, 3010, and 3110, NYSE Rules 342 and 440, ISO 15489 and MoReq.
Silver also noted that an effective (‘total’) ERM solution would ‘be implemented as an extension of the company’s ECM infrastructure’, providing for a single interface for all records stored in multiple locations ‘including third-party document management repositories in addition to the email system and network file system’.
2003 – Key integrated EDM/RM vendors
The following is a list of ‘key vendors in the (Integrated Document Management) IDM Market Space’ in October 2003. The list is believed to have come from a Butler Group report:
The report of the sale in The Register stated that Stellent’s CEO said that ‘customers are looking to consolidate their content management needs, including imaging, business process management, web content management and record management with one vendor.’ The new product line was named Stellent Imaging and Business Process Management (IBPM). The article also noted that Oracle would probably acquire Stellent following this acquisition (see 2006, below).
2004 – ReMANO (Netherlands)
In 2004, the Netherlands government established a catalogue of software specifications for ERM systems (ReMANO) used by Dutch government bodies. (Source: ‘ERM System Requirements’, published in INFuture, 4-6 November 2009, by Marko Lukicic, Ericsson)
ReMANO was replaced by NEN 2082 ‘Eisen voor functionaliteit van informatie- en archiefmanagement in programmatuur’ in 2008. (NEN 2082:2008 nl)
2005 – C6 (France) builds D2 on top of EMC Documentum
The French ECM company C6 built a solution named D2, ‘a fully configurable web application for creating, managing, storing and delivering any type of information’, on top of EMC’s Documentum. (Source: C6 website ‘Company’ tab).
2006 – The National Archives of Australia ERMS standard
The National Archives of Australia (NAA) released its ‘Functional Specifications for Electronic Records Management Systems Software’ in February 2006 (ISBN 1 920807 34 9). The introduction noted that:
(The document) provided Australian Government agencies with a set of generic requirements for ensuring adequate recordkeeping functionality within electronic records management systems (ERMS) software.
Agencies were encouraged to make use of the ERMS specifications when designing or purchasing new, or upgrading existing, ERMS software. They could also be used when auditing, assessing or reviewing an agency’s existing ERMS software.
The requirements were not intended to be a complete specification, but rather provide a template of key functional requirements that agencies may incorporate into their tender documentation when preparing to select and purchase new ERMS software. Agencies were expected to assess and amend the functional requirements, and select requirements that best suit their own business and technical requirements and constraints.
Very few products met the specific requirements of the ERMS specifications, which led to some suggestion at the time that the specifications limited choice.
2006 – Rival XML Office document standards
The OpenDocument (ODF) standard was published as ISO/IEC 26300 in 2006.
Microsoft submitted initial material to the Ecma International Technical Committee TC45, where it was standardized to become ECMA-376, approved in December 2006. It was released as ISO/IEC 29500 in 2008.
According to the Wikipedia article on Office Open XML (OOXML), ‘The ISO standardization of Office Open XML was controversial and embittered’, as it seemed unnecessary to have two rival XML standards.
2006 – The world of collaboration
The Butler Group published a paper titled ‘Document Collaboration – Linking People, Process and Content’ in December 2006. The report noted that EDM systems had helped improve internal efficiency but there was now a need to ‘extend these systems to partners and stakeholders’ and deliver ‘sophisticated collaborative experiences’.
The paper listed the following EDM products:
Adobe Acrobat family
IBM Notes/Domino, Workplace collaboration services, QuickPlace
Microsoft Office 2007
Open Text Livelink ECM – eDOCS (incorporating the former Hummingbird product suite acquired by Open Text in 2006)
Oracle Collaboration Suite, Content DB and Records DB
Stellent Collaboration Management
2006 – Spescom Software Inc
A US SEC submission in January 2006 noted that Spescom Software Inc, a San Diego-based provider of computer integrated systems, was the successor to Alpharel Inc and Altris Software Inc.
2006 – Oracle acquires Stellent
A 2006 Oracle press release titled ‘Oracle Buys Stellent‘ stated that Stellent was a global provider of enterprise content management (ECM) software solutions that included Document and Records Management, Web Content Management, Digital Asset Management, Imaging and Business Process Management, and Risk and Compliance. It also noted that the acquisition would ‘complement and extend Oracle’s existing content management solution portfolio’. Despite the acquisition, the ‘Stellent’ name persisted.
2006 – Google enters the online EDM productivity and collaboration market
In 2006, Google launched Google Apps for Your Domain, a collection of cloud computing, productivity and collaboration tools, software and products. Various apps and elements were acquired and/or added over the years but a key one from an EDM point of view was Google Docs (Wikipedia article). However, Google Docs had no RM capability.
A ZDNet article in June 2007 noted that Google Apps offered a tool for switching from Exchange Server and Lotus Notes, making Google a real alternative to Microsoft and IBM. Google Apps would later be rebranded G Suite in 2016.
2007 – Spescom exits the EDM market / Enterprise Informatics
In 2007, Spescom exited the enterprise software sector with the sale of its US operation Enterprise Informatics. (Source – Wikipedia article on Spescom, original reference no longer accessible).
Enterprise Informatics, originally founded in 1981, continued in existence as a subsidiary of Bentley Systems, Incorporated. It continued to market a suite of integrated document, configuration, and records management software products, mostly under the name eB.
2007 – Zoho enters the online EDM collaboration market
The India-based Zoho Corporation, known as AdventNet Inc from 1996 to 2009, released Zoho Docs in 2007.
2007 – Autonomy acquires Meridio
Meridio was acquired by Autonomy (later HP Autonomy, a company that had had a long business partnership with Kainos) in 2007. The parent company Kainos continued to work with SharePoint-based solutions.
2007 – EDRMS vendors
Forrester released a report into electronic records management vendors in early 2007. The products that it evaluated were as follows:
CA MDY FileSurf v7.5 ^
EMC Records Manager 5.3
IBM FileNet P8 Records Manager v3.7 ^
IBM Records Manager v4.1.3 #
Interwoven Records Manager v 5.1 *
Meridio Document and Records Manager v4.4 *
Open Text Livelink ECM – Records Management v3.8 ^
Open Text Livelink – eDOCS RM (formerly Hummingbird) v6.0.1
Oracle (formerly Stellent) Universal Records Management v7.1 #
Oracle Records DB v1.0 ~
Tower Software TRIM Context v6.0 *
Vignette Records & Documents v 7.0.5
(Forrester assessment: ^ = leaders, # = close behind leaders, * = have hurdles to remain competitive, ~ = basic functionality only)
2008 – NEN 2082
The Dutch government replaced ReMANO with NEN 2082 ‘Eisen voor functionaliteit van informatie- en archiefmanagement in programmatuur’ (‘Requirements for functionality of information and archive management in software’) in 2008 (NEN 2082:2008 nl). NEN 2082 was derived from MoReq, DOD 5015.2 and Australian standards. (See Eric Burger’s blog post ‘Nee, NEN 2082 is geen wettelijke verplichting‘ about its legal standing)
2008 – MoReq2
MoReq2 was published in 2008. It included new sections to support the testing of ERMS software for compliance with the standard. MoReq2 included the following vendors on its panel (Acknowledgements section):
EDRM Solutions, USA
ErgoGroup AS, Norway
Haessler Information, Germany
ICZ, Czech Republic
Lockheed Martin, USA
Objective Corporation, UK
Open Text Corporation, UK
SER Solutions Deutschland, Germany
Tower Software, UK
Both MoReq and MoReq2 were based on the premise of a central ERMS being acquired and implemented by organisations to manage unstructured records, the types of records that are stored across network drives and in email systems. MoReq2 specifically excluded the management of ‘structured data … stored under the management of a data processing application’. (Source: MoReq2, section 1.2 ‘Emphasis and Limitations of this Specification’, page 12.)
The first software product certified against MoReq2 was Fabasoft Folio. It was the only certified product until June 2014.
2008 – EDMS and ERMS
In 2008, the International Organization for Standardization (ISO), under ISO/TC171/SC2 ‘Document management applications’, proposed a framework for the integration of EDM and ERM systems. The definitions contained in that framework document noted that:
An EDMS was used to manage, control, locate and retrieve information in an electronic system.
An ERMS was used to manage electronic and non-electronic records according to accepted principles and practices of records management.
An integrated EDRMS would combine both capabilities.
Section 6 of the report described general (but fairly detailed) functional requirements for an integrated EDMS/ERMS, outlined in the following diagram:
2009 – Autonomy acquires Interwoven
In 2009, Autonomy acquired Interwoven, a niche provider of enterprise content management software, mostly to the legal industry. It primarily competed with Documentum in this space. Interwoven became Autonomy Interwoven and Autonomy iManage.
2010 – MoReq2010
MoReq was completely revised and published as MoReq2010 in 2010. There were key differences with its predecessor versions.
It de-emphasised, but did not remove, the idea of an ERMS being the central or sole recordkeeping system or repository for organisations.
It emphasised the need for line of business systems to incorporate a minimum, defined level of recordkeeping functionality.
It brought a degree of practicality about the management of records in other systems.
It provided for interoperability between all MoReq compliant systems, based on a common XML language.
MoReq2010 established ‘… a definition of a common set of core services that are shared by many different types of records systems’. It provided a set of modules that could be incorporated into any software solution, including line of business applications, so they can be ‘MoReq compliant records systems’ (MCRS).
2010 – Google’s DM capability enhanced
In March 2010, Google acquired DocVerse, an online document collaboration company. DocVerse allowed multiple user online collaboration on Microsoft Word documents, as well as other Microsoft Office formats, such as Excel and PowerPoint. (Source – Wikipedia article on Google Docs)
‘Microsoft SharePoint 2010 is a software product with a range of uses, including website development, content management and collaboration. SharePoint allows users to collaborate on the creation, review and approval of various types of content, including documents, lists, discussions, wiki pages, web pages and blog posts. SharePoint is not a recordkeeping system (i.e. a system purposely designed to capture, maintain and provide access to records over time). When implemented ‘out of the box’, SharePoint has limited capacities for capturing and keeping records in a way that supports their ability to function as authentic evidence of business.’
Adam Harmetz, the Lead Program Manager for the SharePoint Document and Records Management engineering team at Microsoft said in a recent online interview about Records in SharePoint 2010, “We constantly get questions from around the world about how to deal with local government and industry standards for information management. Let me throw just a few at you… MOREQ2, VERS, ISO 15489, DOMEA, TNA, ERKS, the list goes on. Some of these standards are loosely based on one another and some have contradictory elements. Rather than focus our engineering efforts on addressing each of these standards in turn, we made the choice to deliver the usability and innovation required to make records management deployments successful and allow our partner ecosystem to build out the SharePoint platform to deal with specific requirements for those customers that are mandated to adhere to a specific standard.”
Despite these comments, SharePoint 2010 was assessed by at least one consultant (Wise Technology Solutions) to meet 88% of the requirements of the then ICA Standard that became ISO 16175 Part 2.
On 16 December 2011, State Records NSW published a blog post titled ‘Initial advice on implementing recordkeeping in SharePoint 2010‘. The post noted that the Wise report had concluded that ‘SharePoint is 88% compliant with the ICA requirements’. It added that the areas where full compliance could not be achieved relate to:
ease of email capture
native security classification and access control
physical and hybrid records management
The report states that third party providers are able to offer products that plug SharePoint’s gaps in these areas.
The blog post also stated that ‘… the report very clearly makes the point that ‘we note that the achievement of these results is reliant on appropriate design and governance of implementation, configuration and set up to ensure consistency with desired records management outcomes’.’
2011 – Microsoft launches Office 365
Microsoft launched Office 365 on 28 June 2011. Office 365 was designed to be a successor to Microsoft’s Business Productivity Online Suite (BPOS). (Source: Wikipedia article on Office 365). It would not be until the mid 2010s that Office 365 would become an effective counter-solution to G Suite.
2011 – HP acquires Autonomy
In 2011, Hewlett-Packard acquired Autonomy, a deal that resulted in some interesting subsequent legal issues regarding the value of the company.
As noted in the scope section of the standard, ‘(The) specification provides models for software services to support management activities for electronic records’. Further, ‘… models are provided that describe the platform independent model (PIM) that defines the business domain of Records Management and the RM services to be provided’. Three technology-specific implementations are specified:
PSM-1 – Web Services definition for Records Management Services in Web Service Description Language (WSDL). This is actually supplied as ten WSDL files; one for each Records Management Service.
PSM-2 – A Records Management Service XSD. The XSD is for use in creating XML files for import/export of Managed Records from compliant environments and to use as a basis for forming XQuery/XPath statements for the query service.
PSM-3 – An Attribute Profile XSD. The XSD is for capturing and communicating attribute profiles to permit flexible attribution of certain types of Records Management Objects.
2011 – The death of ERM systems?
In a May 2011 blog post on MoReq2010, James Lappin suggested that traditional systems used to manage electronic documents and records, while not being entirely dead in the water, had ‘lost momentum’.
James proposed two specific reasons for this situation:
The global financial crisis (GFC) from 2008 that limited the ability of organisations to acquire and implement hugely expensive ERMS solutions.
The rise of Microsoft SharePoint and particularly SharePoint 2010. In some ways, SharePoint 2010 had the potential to take – and may have already taken – the ERMS wind from the records managers’ sails.
He also noted that a series of interrelated user-environment issues may have also played a part in the loss of momentum.
Usability and take up rates of the ERMS. These solutions are sometimes seen as ‘yet another system’ to manage the same records, using a classification structure that doesn’t make sense to most end users and is different from the way end users see and categorise their world.
The ongoing availability of and access to alternative places to store information, including network drives and email folders, and cloud-based storage and email solutions.
The rise and general availability of social networking tools and mobile applications used to create and share new forms of information content, and collaborate and communicate, including wikis, blogs, Twitter, Facebook, and similar solutions, often in an almost parallel ‘personal’ world to the official record.
The inability of ERMS solutions to manage structured data or to maintain and reproduce easily the diverse range of content created and stored in products like SharePoint. Indeed, one reasonably well known product has been described as an archive for SharePoint, even though the latter can quite easily manage its own archives.
The rise of search as a tool to find relevant information in context, and the related change from unstructured to structured in XML-based documents generated by products such as Microsoft Office 2007 and 2010.
From 2013 – GEVER (Switzerland)
From 2013, the Swiss Federal Chancellery was responsible for managing all activities relating to electronic records and process management (Elektronische Geschäftsverwaltung), or GEVER. GEVER consisted of a collection of five standards for the management of electronic records. (Source with current update: Gever Bund)
2015 – Hewlett Packard separates
In October 2015, the software products previously under the Autonomy banner were divided between HP Inc and Hewlett Packard Enterprise (HPE). HP Inc was assigned Autonomy’s content management software components including TeamSite, Qfiniti, Qfiniti Managed Services, MediaBin, Optimost, and Explore.
2015 – EDRMS vendors
Despite the alleged death of ERMS products in around 2010, many continued to thrive and grow. Some were acquired by others.
The following is a list of EDRMS vendors in December 2015 taken from a Gartner report diagram titled ‘Product or Service Scores for Trusted System of Record’ (with the scores included). Many of these products also appeared in the October 2016 ‘Magic Quadrant’ for Enterprise Content Management Systems, as indicated.
Alfresco (2.47) (also ECM)
Open Text (4.07) (also ECM)
EMC Documentum (3.94) (as Dell EMC)(Acquired by Open Text)
IBM (3.91) (also ECM)
Oracle (3.45) (also ECM)
Laserfiche (2.45) (also ECM)
Microsoft (SharePoint) (2.38) (also ECM)
Hyland OnBase (2.37) (also ECM)
Lexmark (2.32) (also ECM)
Newgen (2.18) (also ECM)
Objective (2.10) (also ECM)
By 2015 – Oracle departing the scene?
In a July 2015 article titled ‘Looking for an Oracle IPM replacement‘ in the blog softwaredevelopmentforECM, it was noted that Oracle was ‘clearly, and publically, going in a different direction and moving away from traditional enterprise imaging and transactional content management’.
2016 – OpenText, Micro Focus
In May 2016, OpenText acquired HP TeamSite, HP MediaBin, HP Qfiniti, HP Explore, HP Aurasma, and HP Optimost from HP Inc.
The following is a list of products identified by the Public Record Office Victoria (PROV) in 2020. These products were all certified against the VERS standard, which required organisations to be able to create XML-based VERS Encapsulated Objects (VEOs) for long-term preservation.
AvePoint RevIM, Records, Cloud Records
Bluepoint Content Manager
Canon Therefore 2012
ELOprofessional / ELOenterprise
HP Records Manager
IBM Enterprise Records
IBM FileNet P8 Records Manager
Micro Focus Content Manager
OpenText eDOCS RM
OpenText Records Management
Oracle WebCenter Content
Technology One ECM
2021 – EDRMS vendors
The following is a list of dedicated vendors that offered EDRMS solutions (and more in most cases) by early 2021. Many of these vendors have a long history not necessarily reflected in the above text. Most of these vendors provide Enterprise Content Management (ECM) services, including EDM and ERM capabilities.
Alfresco ECM (alfresco.com)
Hyland OnBase (hyland.com)
IBM ECM (ibm.com)
Knowledgeone RecFind EDRM (knowledgeonecorp.com)
Laserfiche RME (laserfiche.com)
Lexmark RIM (lexmark.com)
Micro Focus Content Manager (microfocus.com)
Microsoft 365 (microsoft.com)
Newgen RMS (newgensoft.com)
Objective ECM (objective.com)
Open Text ECM (opentext.com)
Oracle ECM (docs.oracle.com)
TechnologyOne ECM (technologyonecorp.com)
2021 – Dedicated EDMS vendors
EDM vendors never went away, but many – like Google Drive, Dropbox and Box – were built in and for the cloud. This Capterra website has a fairly detailed listing of current EDMS vendors.
The future of standards-based ERM/EDRM/ECM systems
Although the definition of a record has remained largely intact for the past two decades – ‘evidence of business activity’ (ISO 15489) – the form of records has evolved and continues to do so.
The ever-expanding world of digital content has made it increasingly difficult to accurately and consistently identify, capture and manage records in all forms, which challenges the notion that all records can be stored in a single system.
The ‘in place’ approach to managing electronic records – wherever they are stored – has strong appeal. But where will we be in another 20 years? Some thoughts:
Electronic databases, whether on-premise or cloud-based (including subscription based), will be the primary method of capturing and storing a wide range of digital content rather than network file shares.
Metadata will be automatically captured or auto-generated for all digital content based on the content itself.
Artificial Intelligence (AI) will continue to grow in maturity, allowing records to be identified from all other digital content, classified, aggregated, and managed through to disposal/disposition or transfer to archives.
Email will, slowly, disappear as the current workforce transitions to chat- and video-based communication methods.
The classification of records is a fundamental recordkeeping activity. It is defined in the international standard ISO 15489-1:2016 (Information and Documentation – Records Management) as the ‘systematic identification and/or arrangement of business activities and/or records into categories according to logically structured conventions, methods and procedural rules‘. (Terms and Definitions, 3.4)
The purpose of classification is defined by State Records NSW as follows: ‘In records management, records are classified according to the business functions and activities which generate the records. This functional approach to classification means that classification can be used for a range of records management purposes, including appraisal and disposal, determining handling, storage and security requirements, and setting user permissions, as well as providing a basis for titling and indexing‘. (Records Classification, accessed 13 January 2021.)
The ever-increasing volume of digital records, the many different ways to create them, and the multitude of record types and storage locations have made it more difficult to classify records manually in an accurate and consistent way, including through the creation of pre-defined ‘containers’ or aggregations based on classification terms. Despite this, the requirement to link the classification of records with their retention and disposal remains.
For over three decades, Microsoft’s applications and technology platforms have been used to create, capture, store and manage records. Some of these records (in the earlier period) were printed and placed on paper files, or stored (from around 2000) in dedicated electronic document and records management (EDRM) systems.
But the volume and type of digital content, including with new types of records (e.g., chat messages) and storage locations, continues to grow. In response, Microsoft invested heavily in addressing the need to classify records ‘at scale’.
This post looks at various ways to classify records, for retention and disposition purposes, in Microsoft 365.
The old-school, manual method – metadata
Most of the records in Microsoft 365 will be created, captured or stored in one of the four primary workloads: Exchange mailboxes, SharePoint sites/libraries, MS Teams chats (a ‘compliance copy’ of which is stored in Exchange mailboxes), and OneDrive for Business libraries. Some records may also exist in Yammer or other web page content (e.g., intranet).
Most SharePoint sites, as well as Teams (that have a SharePoint site), will be created according to some form of business need to create, capture, store and share records; that is, the site or team purpose may be based on a business function or activity. This grouping can itself serve as a form of classification – by SharePoint site (e.g., function) or document library (e.g., activity).
Records may be stored in multiple document libraries, or within a folder structure of a single library.
A number of methods (some of which rely on others) can be used to add classification (and other) metadata to records stored in SharePoint document libraries:
1 – Creating the classification taxonomy in the Managed Metadata Service (MMS)/Term Store via the SharePoint Admin portal – Content Services – Term store, and then applying these terms in content types that are then deployed in SharePoint sites.
2 – Creating global content types from the SharePoint Admin portal, in the Content Services – Content type gallery area (see ‘Finance Document’ example below) and then deploying these in specific SharePoint sites where site columns that contain classification terms will be added.
3 – Creating site columns that contain classification terms, including from the MMS, and adding these to global or site content types or document libraries where they can be applied to records.
4 – Creating site content types and adding site columns (including MMS-based columns), then adding these content types to document libraries.
However, most of the above is somewhat complicated and cumbersome, and would normally only be used for, and manually applied to, specific types of records.
The simplest way to apply BCS/File Plan terms at the document (or document set) level is to (a) store records relating to the same BCS function or activity in the same library, and (b) create site or library columns with default values and add these to the library. This means that the default terms are applied automatically as soon as a new record is uploaded, including when shared/inherited from the site columns added to a document set that ‘contains’ a document content type.
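The default-value approach can be illustrated with a minimal sketch. This is a simple in-memory model, not the SharePoint API: the `Library` class, the column names (`Function`, `Activity`) and the classification terms are all hypothetical and chosen only for illustration. It shows the behaviour described above, where a record uploaded to a library picks up that library's default terms unless a value is supplied explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    """Hypothetical stand-in for a document library with column defaults."""
    name: str
    column_defaults: dict[str, str]
    items: list[dict[str, str]] = field(default_factory=list)

    def upload(self, filename: str, **metadata: str) -> dict[str, str]:
        # Start from the library defaults, then let explicit values override them.
        item = {"FileName": filename, **self.column_defaults, **metadata}
        self.items.append(item)
        return item


# One library per BCS activity; the terms below are illustrative only.
invoices = Library(
    name="Accounts Payable",
    column_defaults={
        "Function": "Financial Management",
        "Activity": "Accounts Payable",
    },
)

# No metadata supplied by the user: the defaults classify the record.
record = invoices.upload("invoice-001.pdf")

# A user can still override a default for an exceptional record.
override = invoices.upload("budget.xlsx", Activity="Budgeting")

print(record)
print(override)
```

The design point is that the user does nothing extra at upload time; classification follows from where the record is stored.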
However, keep in mind that SharePoint is just one of the workloads where records are stored.
Records in the form of emails, chats and ‘personal’ content (as well as Yammer messages and web pages) are created in and stored across the other workloads. Some attempt may be made to copy these other records (especially emails) to SharePoint sites, but this becomes complicated or impossible with things like Teams chat messages.
In most cases (and according to Microsoft’s own recommendations), it is better to leave the records where they were created or captured (‘in place’), and apply centralised compliance controls (classification, retention labels and policies) to this content.
Leaving the records in place in this way does not exclude the ability to create SharePoint sites and document libraries in those sites that map to classification terms, and/or to use the site column approach described above, but these are more likely to be exceptions.
In fact, some form of logical structure is almost certain anyway as most end-users will probably want to access and manage information in their own specific work context (the Team/SharePoint site).
Because not all records are stored in SharePoint, and given the ever-increasing volume of digital content stored across the Microsoft 365 platform, Microsoft needed to find a way to classify records ‘at scale’.
The solution was to use machine learning (ML) via trainable classifiers accessed in the ‘Data Classification’ section of the Microsoft 365 Compliance portal. This capability is only available to E5 licences.
‘This classification method is particularly well suited to content that isn’t easily identified by either the manual or automated pattern matching methods. This method of classification is more about training a classifier to identify an item based on what the item is, not by elements that are in the item (pattern matching).’
Organisations (including E3 licence holders) may make use of five pre-defined trainable classifiers: Resumes, Source Code, Targeted Harassment, Profanity, and Threat. (A sixth classifier, ‘Offensive language’, is to be deprecated.) Custom classifiers require an E5 licence.
Custom classifiers require ‘significantly more work’ than the pre-existing classifiers and the process is quite involved (see the process flow diagram in the ‘Learn about’ page link above) but in summary it involves the following steps:
Creating the custom classifier.
Creating a set of manually selected example records (50 to 500) in a dedicated SharePoint Online site as the ‘seed’. This would include a range of emails in the seed examples.
Testing the classifier with the seeded documents.
Re-training with additional content – both positive and negative matches.
Once the classifier is published, it can be used to identify and classify related content across SharePoint Online, Exchange, and OneDrive (but not Teams).
The page ‘Default crawled file name extensions and parsed file types‘ provides details of all the record types that can be classified in this way. Note it is not clear if trainable classifiers can crawl the compliance copy of Teams chat messages stored in hidden folders in Exchange mailboxes.
Label-based retention policies can then be automatically applied to content that has been identified through the trainable classifier.
Note, however, that the classifier does not ‘group’, aggregate or ‘present’ (list) the records for review (except broadly via the Content Explorer). The label applied to the records can nevertheless be searched via the ‘Content Search’ option in the Compliance portal. This is a much better option than not having any idea how many records of a particular classification may exist in Exchange mailboxes, OneDrive accounts, or general SharePoint sites. It does require some degree of ‘letting go’ of the ability to view and browse content classified this way, and trusting the system.
The main limitation of trainable classifiers is that they require an E5 or E5 Compliance licence.
The other limitation is the management of the disposition of records that have been identified with trainable classifiers and had a label-based retention policy applied. There are significant shortcomings with the current ‘Disposition Review’ process, specifically the lack of adequate metadata to review records due for disposal or the details of what has been destroyed.
Another (but limited) option might be to use SharePoint Syntex (see ‘Introduction to Microsoft SharePoint Syntex‘ for an overview), although its range is limited to SharePoint and – it seems – only records that have a relatively consistent structure and format.
SharePoint Syntex evolved out of Project Cortex’s ability to extract and capture metadata from records. It can also be used through its ‘Document Understanding Model‘ (DUM) to provide a way to classify records stored in SharePoint Online (only). It makes use of a ‘seeding’ model that is similar to trainable classifiers (and may be based on the same underlying AI engine).
Broadly speaking, the DUM works on the basis of loading a small ‘seed’ set of (relatively consistently formatted) example files into a dedicated Content Center (or Centers). This is very similar to the process of using trainable classifiers, except that the latter does not require a ‘content center’ SharePoint site to be created.
The example files are ‘trained’ by being ‘classified’ through the document understanding process based on a set of ‘explanation types‘ that are used to help find the relevant content. The three explanation types are: (a) phrase list (a list of words, phrases, numbers, or other characters used in the document or information that you are extracting); (b) pattern list (patterns of numbers, letters, or other characters); and (c) proximity (describes how close other explanations are to each other).
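To make the three explanation types more concrete, here is a minimal Python sketch. It is not how SharePoint Syntex is implemented; the sample text, the phrase list, the `INV-` identifier format and the `token_index` helper are all assumptions made for illustration. It simply shows a phrase list, a pattern and a proximity measure applied to one line of text.

```python
import re

# Assumed sample content from an invoice-like document.
text = "Invoice number: INV-20417 issued to Contoso. Total due: 1,250.00"

# (a) Phrase list: literal words or phrases expected in the document.
phrases = ["invoice number", "total due"]
phrase_hits = [p for p in phrases if p in text.lower()]

# (b) Pattern list: a pattern of letters and digits (an assumed ID format).
pattern = re.compile(r"\bINV-\d{5}\b")
pattern_hit = pattern.search(text)


# (c) Proximity: how many tokens separate a phrase from a pattern match.
def token_index(char_pos: int) -> int:
    # Count the whitespace-separated tokens that start before char_pos.
    return len(text[:char_pos].split())


phrase_pos = text.lower().find("invoice number")
distance = token_index(pattern_hit.start()) - token_index(phrase_pos)

print(phrase_hits, pattern_hit.group(), distance)
```

In this sketch the identifier counts as 'close' to the phrase because it starts only a couple of tokens after it; a real model would combine several such explanations rather than rely on one.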
The document understanding model (DUM) produced through the explanation types is associated (and deployed) with a new or existing content type.
Once applied to a SharePoint site library, the DUM/content type provides the basis for identifying and tagging (with metadata) other similar records in the location (e.g., the library) where the DUM has been deployed.
If the documents have consistent content such as invoices, certain data from those documents can be extracted as metadata.
The answer to this question will depend on your compliance requirements.
Smaller organisations may be able to set up SharePoint sites and document libraries with site columns/metadata that maps to their business classification scheme or file plan, and copy emails to those libraries. There may be little need to use AI-based classification methods.
In large and more complex organisations (with E5 licences), especially those with a lot of content stored across Exchange mailboxes and SharePoint sites (including Teams-based sites) there will most certainly be a need for some form of AI-based classification in addition to classification-mapped SharePoint sites (and Teams).
Organisations with E3 licences might use the manual methods described above for specific types of records, and consider acquiring additional E5 Compliance licences to make use of trainable classifiers or SharePoint Syntex for other records.
Microsoft released its ‘Records Management’ solution for Microsoft 365 during 2020. The solution is only accessible to organisations with an E5 licence (or an E5 Security and Compliance licence).
Some of the retention-related options previously available to E3 licences, such as disposition review, are now only available with an E5 licence. However, for cost and other reasons, many organisations have decided to stay with E3 and asked if it is still possible to manage the retention of records. The following options remain available with an E3 licence:
A basic organization-wide or location-wide Exchange mailbox retention policy and/or a manually applied non-record retention label on mailbox data.
A basic SharePoint or OneDrive retention policy and/or a manually applied non-record retention label on files in SharePoint or OneDrive.
A Teams retention policy.
This post describes how the retention of records could be managed with an E3 licence and recommends that organisations intending to deploy Microsoft’s retention policies (whether E3 or E5) develop a plan for their deployment based on a detailed understanding of the following:
What records exist and where they are stored across the main Microsoft 365 workloads – Exchange Online (EXO) mailboxes, SharePoint Online (SPO) sites, Microsoft Teams (MS Teams), and OneDrive for Business (ODfB) accounts.
What retention is required, including for EXO mailboxes and personal/ODfB accounts for which retention was previously ‘managed’ through backups, and MS Teams chats.
What type of retention policy will apply to what workloads.
How will disposition be managed (including of original storage locations, not just the records), and what evidence of destruction is required?
How will any limitations and shortcomings be addressed to minimise legal risk?
Where are the records?
A good starting point with retention planning is to establish where the records that will be subject to retention policies are stored.
Most records are created and stored in one of the primary four workloads – Exchange Online (EXO) mailboxes, SharePoint Online (SPO) sites, Microsoft Teams (MS Teams), and OneDrive for Business (ODfB) accounts.
While SharePoint sites (including Teams-based sites and Teams channels) may be used to logically ‘group’ records (e.g., in a team site, a document library, or a channel), this is not the case for EXO mailboxes, ODfB accounts, MS Teams 1:1 or private channel chats (a compliance copy of these chats is stored in the personal EXO mailboxes of participants in the chat).
EXO mailboxes and ODfB accounts usually contain a range of records on different subjects, with different retention requirements. Personal chat messages could be about any subject.
For most of the past three decades, the requirement to keep specific emails or other records meant that end-users had to copy those records to another system, including an EDRMS or even SharePoint, or print and place them on a file. The ability to find old records depended more often than not on the ability to recover them from back-up tapes, a process ironically often referred to as ‘archiving’.
Action: Organisations should establish a list of where records are stored in Microsoft 365 and how retention (including via backups) is currently managed.
What retention is required?
The next step in the process is to establish how long these records need to be kept.
Most of the time this will be based on an organisational records retention policy or schedule. These policies or schedules describe groups of records and how long they must be retained – for example, financial records generally must be retained for seven years. Specific types of financial records may require shorter or longer retention.
This model generally works well when records have been stored in logical groups (e.g., a whole SPO site/Team, or document library), but is more difficult to apply to individual records stored with other records in personal EXO mailboxes, ODfB accounts, SPO sites, or MS Teams chats. If records with longer retention requirements in these locations are not stored elsewhere, or cannot be specifically identified, all the records may have to be kept for the longest retention period.
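The 'longest retention period' rule can be expressed in a few lines of Python. This is only a sketch: the record types and the periods shown are hypothetical examples, not taken from any particular retention schedule.

```python
# Hypothetical retention periods (in years) for record types found together
# in one location, such as a personal mailbox.
retention_years_by_type = {
    "routine correspondence": 2,
    "financial records": 7,
    "contract negotiations": 10,
}


def location_retention(retention_by_type: dict[str, int]) -> int:
    """Return the period a whole location must be kept when its records
    cannot be separated: the longest period among the types it holds."""
    return max(retention_by_type.values())


print(location_retention(retention_years_by_type))
```

The practical consequence is that one long-retention record type that cannot be isolated drives the retention of everything stored alongside it.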
The start of retention is usually based on a trigger action. Typically these are based on (a) date created, (b) date modified, or (c) date of last action. However, they may also be based on less specific events, for example: (a) 7 years after a contract has expired, (b) 25 years after an employee leaves, (c) when a child turns 21. These less specific events need special attention.
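The difference between date-based and event-based triggers can be sketched in Python as follows. The function names are hypothetical and the sketch assumes retention is counted in whole years from the trigger date. The point is that an event-based trigger has no date until the event occurs, so no disposal date can be calculated before then.

```python
from datetime import date
from typing import Optional


def add_years(start: date, years: int) -> date:
    """Add whole years to a date; 29 February falls back to 28 February
    when the target year is not a leap year."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def disposal_due(trigger: Optional[date], retention_years: int) -> Optional[date]:
    """Return the date disposal falls due, or None if the trigger event
    (e.g., contract expiry) has not yet occurred."""
    if trigger is None:
        return None
    return add_years(trigger, retention_years)


# Date-based trigger: 7 years after a contract expired on 30 June 2020.
print(disposal_due(date(2020, 6, 30), 7))
# Event-based trigger that has not happened yet: employee still employed.
print(disposal_due(None, 25))
```

This is why the less specific events need special attention: someone, or some system, has to record the event date before the retention period can start.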
Action: Organisations should ‘map’ retention requirements to the locations where the records are stored. This process will likely involve a discussion regarding the replacement (or supplementation) of email backups with retention policies, and may end up looking something like the following:
EXO mailboxes of senior managers
EXO mailboxes of all other employees
MS Teams 1:1 chats of senior managers
MS Teams 1:1 chats of all other employees
MS Teams private channel chats
As per retention policies
ODfB accounts of senior managers
ODfB accounts of all other employees
SPO sites that are not subject to more specific policies
Microsoft 365 Groups (includes mailbox and SPO site)
As per retention policies
SPO sites with specific retention requirements, per site or library
As per retention policies
What are the retention options with an E3 licence?
The three main options available with an E3 licence – labels, label policies, and retention policies – are all set from the Compliance admin portal under the ‘Information Governance’ section.
The table below summarises the options:
Type of policy, and the workloads it can be used for:
‘Implicit’, ‘safety net’ retention policies (can be used for EXO, SPO, MS Teams, ODfB). These policies: (a) work in the back end and cannot be changed by an end-user; (b) create a preservation hold library in SPO sites and ODfB accounts, and hold deleted emails in a hidden EXO mailbox folder; (c) provide an alternative to back-ups, although it should be kept in mind that all content that is retained in this way contributes to the overall storage quota; (d) do not retain a record of what was destroyed.
‘Explicit’ label-based retention policies (can be used for EXO, SPO, ODfB). Labels must be published to the required workloads before they become visible. Records cannot be deleted once a label has been applied, but end-users can change or remove the label and then delete the record. Label-based policies do not retain a record of what was destroyed.
‘Explicit’ label-based retention policies that are auto-applied based on three limited options – see below (can be used for EXO, SPO, ODfB). Records cannot be deleted once a label has been applied. These policies do not retain a record of what was destroyed.
The three auto-apply options are as follows:
Apply label to content that contains sensitive information. Unlikely to be used as retention is never based on sensitivity.
Apply label to content that contains specific words or phrases, or properties. Possibly useful. Recommend that organisations do a Content Search first to see what records may exist.
Apply label to content that matches a trainable classifier. E3 licences provide six out of the box, limited classification options including ‘profanity’. These are unlikely to be of any value.
Mapping the E3 options
Given the options available, the following is a suggested mapping of records, retention and policy options:
EXO mailboxes of senior managers
Retention policy, plus possibly labels auto-applied to specific records
EXO mailboxes of all other employees
Retention policy, plus possibly labels auto-applied to specific records
MS Teams 1:1 chats of senior managers
MS Teams 1:1 chats of all other employees
MS Teams private channel chats
As per retention policies
ODfB accounts of senior managers
Retention policy, plus possibly labels auto-applied to specific records
ODfB accounts of all other employees
Retention policy, plus possibly labels auto-applied to specific records
SPO sites that are not subject to more specific policies
Microsoft 365 Groups (includes mailbox and SPO site)
As per retention policies
SPO sites with specific retention requirements, per site or library
As per retention policies
Retention policy (safety net) plus label-based retention policy for specific records
Organisations planning to deploy retention policies should be aware of the limits on custom policies, as described on this Microsoft page, ‘Create and configure retention policies‘. There are no limits on policies that apply to an entire workload (e.g., all EXO mailboxes). For policies scoped to specific locations, the maximum numbers of items that can be included in a single policy are:
1,000 Microsoft 365 groups
1,000 users for Teams private chats
100 sites (OneDrive or SharePoint)
The page above notes also that ‘There is also a maximum number of policies that are supported for a tenant: 10,000. However, for Exchange Online, the maximum number is 1,800. The maximum number includes retention policies, retention label policies, and auto-apply retention policies.’
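These figures can be turned into a simple planning check. The Python sketch below is illustrative only: it assumes the three location figures are per-policy maximums, the dictionary keys and function names are hypothetical, and the numbers are copied from the text above rather than read from any Microsoft 365 API.

```python
# Limits as listed above; the dictionary keys are hypothetical names.
PER_POLICY_LIMITS = {
    "microsoft_365_groups": 1000,
    "teams_private_chat_users": 1000,
    "sites": 100,
}
MAX_POLICIES_PER_TENANT = 10_000
MAX_POLICIES_EXCHANGE = 1_800


def check_policy_scope(scope_counts: dict[str, int]) -> list[str]:
    """Return a list of problems for one planned policy's scoped locations."""
    problems = []
    for location, count in scope_counts.items():
        limit = PER_POLICY_LIMITS.get(location)
        if limit is not None and count > limit:
            problems.append(f"{location}: {count} exceeds the limit of {limit}")
    return problems


def policies_needed(item_count: int, location: str) -> int:
    """Number of policies needed to cover item_count items of one location type."""
    limit = PER_POLICY_LIMITS[location]
    return -(-item_count // limit)  # ceiling division


def within_tenant_limits(total_policies: int, exchange_policies: int) -> bool:
    """True if the planned policy counts fit within the tenant-wide maximums."""
    return (total_policies <= MAX_POLICIES_PER_TENANT
            and exchange_policies <= MAX_POLICIES_EXCHANGE)


print(check_policy_scope({"sites": 250}))
print(policies_needed(250, "sites"))
```

Under these assumptions, a plan to apply one retention period to 250 specific SPO sites would need three policies rather than one, which is a reason to prefer workload-wide policies wherever possible.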
How will disposition be managed?
Microsoft 365 retention policies retain records for a specified period and usually then delete the records automatically. No record is retained of what was deleted. Even with an E5 licence, only limited metadata is retained and only on those records subject to disposition review.
Organisations deploying retention policies with an E3 licence need to understand the potential risks associated with being unable to provide evidence of what was destroyed.
There are at least two ways to approach this point, but in all cases the approach and options must be legally defensible:
Ensure that organisational policies clearly indicate what records will be destroyed without any record being kept. For example, with certain exceptions, most emails, Teams chats, SPO sites and the content of ODfB accounts will only be kept for 7 years and then destroyed.
Establish a process to ensure that a record is kept of specific records that have been destroyed. For example, the metadata of records, due for disposal and stored in document libraries in specific SPO sites, will be captured (and stored separately) before the records are destroyed, and then the document library will be deleted. This is a labour-intensive process and may only be used on some sites.
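The second approach amounts to building a destruction register before anything is deleted. The Python sketch below shows one possible shape for such a register as a CSV file. The metadata fields, the sample records and the `build_destruction_register` function are hypothetical; in practice the metadata would first have to be exported from the document library.

```python
import csv
import io
from datetime import date

# Hypothetical metadata for records due for disposal; in practice this would
# be exported from the document library before it is deleted.
records_due = [
    {"FileName": "contract-001.docx", "Library": "Contracts 2014",
     "Created": "2014-03-02", "RetentionLabel": "Contracts - 7 years"},
    {"FileName": "contract-002.docx", "Library": "Contracts 2014",
     "Created": "2014-05-19", "RetentionLabel": "Contracts - 7 years"},
]


def build_destruction_register(records, approved_by: str, destroyed_on: date) -> str:
    """Return CSV text listing each record with who approved its
    destruction and when it was destroyed."""
    fields = ["FileName", "Library", "Created", "RetentionLabel",
              "ApprovedBy", "DestroyedOn"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for record in records:
        writer.writerow({**record, "ApprovedBy": approved_by,
                         "DestroyedOn": destroyed_on.isoformat()})
    return buffer.getvalue()


register = build_destruction_register(records_due, "Records Manager", date(2021, 6, 30))
print(register)
```

The register itself then becomes a record that must be stored separately and kept, since it is the only evidence of what was destroyed.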
Addressing E3 licence limitations and shortcomings
The limitations of E3 licence retention policies (and also E5 licences from an evidentiary point of view) should not put organisations off using the out of the box options or cause them to acquire third-party products.
The retention options available in Microsoft 365 now provide functionality that assists recordkeeping compliance, for example the ability to apply ‘back end, safety net’ retention policies to EXO mailboxes, MS Teams chats, ODfB accounts and whole SPO sites. Coupled with the ability to locate all records, including ‘deleted’ records via Content Search, this should be a boon to records managers and also to IT (saving the latter from having to recover records from backup tapes).
Once retention policies have been deployed, the next most complex task may be the disposition approval and destruction process for those records for which a formal disposition review and approval process is required, including the requirement to establish a list of records to be destroyed.
As noted above, even the E5 licence options for disposition review and ‘proof of disposition’ are inadequate in terms of the metadata that is presented and retained. Until Microsoft provides the ability to record the metadata of every record that is due for disposal and/or destroyed, based on ANY type of retention policy, this process will continue to be a manual task for records managers.
Organisations may consider acquiring a limited number of E5 Security and Compliance licences to gain access to the E5 ‘Records Management’ capability and its other options, including File Plan, auto-classification, and Disposition Review. However, aside from these options (and the ability to auto-apply labels more broadly), the E5 capability may not add much to what is already available with an E3 licence.
Summarising the options
Based on the above details, organisations with E3 licences might do the following:
Create several ‘safety net’ retention policies for EXO mailboxes, ODfB accounts, MS Teams chats and general SPO sites. EXO/ODfB retention policies may be (a) based on current back-up periods and (b) split into ‘senior’ and all other employees. Organisations may also create policies for Microsoft 365 Groups, which will cover the Group’s EXO mailbox and SPO site. Some of these policies may map to existing retention classes in the records retention schedules, but others such as the EXO, ODfB and MS Teams policies may need to be added.
Create label-based retention policies for content stored in specific SPO sites where more granular retention policies are required per library. However, keep in mind these can be changed or removed by end-users with add/edit access. Generally it is not recommended to apply label-based policies to EXO or ODfB unless end-users will apply these accurately and consistently.
Auto-apply some labels to specific content stored in SPO, EXO and ODfB (having identified that these records exist first via a Content Search). These labels should have a retention period that is longer than the safety net policy or any other label-based policy.
Establish a ‘manual’ disposition review process for records that are subject to label-based retention policies.
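The recommendation that auto-applied labels should outlast the safety net policy can be checked during planning. The following Python sketch uses hypothetical label names, periods and workload abbreviations; it does not query Microsoft 365 and only compares figures that a records manager would enter by hand.

```python
# Hypothetical safety net retention periods, in years, per workload.
SAFETY_NET_YEARS = {"EXO": 7, "ODfB": 7, "SPO": 7}

# Hypothetical auto-apply labels and the workloads they are published to.
auto_apply_labels = [
    {"name": "Contracts - 10 years", "years": 10, "workloads": ["SPO", "EXO"]},
    {"name": "Drafts - 2 years", "years": 2, "workloads": ["ODfB"]},
]


def labels_shorter_than_safety_net(labels, safety_net):
    """Return (label name, workload) pairs where the label does not
    outlast the safety net policy for that workload."""
    return [
        (label["name"], workload)
        for label in labels
        for workload in label["workloads"]
        if label["years"] <= safety_net[workload]
    ]


print(labels_shorter_than_safety_net(auto_apply_labels, SAFETY_NET_YEARS))
```

Any pair returned by the check is a label that adds nothing beyond the safety net policy in that workload and is a candidate for review.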