Posted in Compliance, Conservation and preservation, Electronic records, Governance, Information Management, Information Security, Legal, Records management, Retention and disposal, Security

Destroying digital records – are they really destroyed?

Most people should be aware that pressing the ‘delete’ option for a file stored on a computer doesn’t actually delete the item; it only makes the file ‘invisible’. The actual file remains on the disk and can be retrieved, either relatively easily or with forensic tools, until the space it occupied is overwritten.

Traditional legacy electronic document and records management (EDRM) systems have two components:

  • A database (e.g., SQL, Oracle) where the metadata about the records are stored
  • A linked file share where the actual objects are stored, most of which are copies of emails or network file share files that remain in their original location.

In most on-premise systems, email mailboxes, network file shares, and the EDRMS database and linked file share are likely to be backed up.

When a digital record comes to the end of its retention and is subject to a ‘destruction’ process, how do you know if the record has actually been destroyed? And even if it is, how can you be sure that the original isn’t still stored in a mailbox, network file share, or a back up?

This post examines what actually happens when a file is ‘deleted’ from a Windows NT File System (NTFS), and questions whether digital records stored in an EDRMS are really destroyed at the end of the retention period.

The Windows NTFS Master File Table (MFT)

Details of every file stored on a computer drive will be found in the NTFS Master File Table (MFT).

In some ways, the MFT operates like a traditional electronic document management system – it is a kind of database that records metadata about the attributes of the digital objects stored on the drive. These attributes include the following:

attriblist

As noted in the diagram above, the details stored by the MFT include the $File_Name and $Data attributes.

  • The $File_Name attributes include the actual name of the file as well as when it was created and modified, and its size. This is the information that can be seen via File Explorer and is often copied to the EDRMS metadata.
  • The $Data attribute contains details of where the actual data in the file is stored on the disk (in 0s and 1s) or the complete data if the file is small enough to fit in the MFT record.

If the MFT record has many attributes or the file data is stored in multiple fragments on a disk (for example as a file is being edited), additional MFT ‘extension’ records may be created.

When a file is deleted, the MFT records the deletion.

  • If the file is simply deleted, the record will remain on the disk and can be recovered from the Recycle Bin.
  • If the file is deleted through SHIFT-DEL or by emptying the Recycle Bin, the MFT record is marked as ‘Deleted’ and the cluster bitmap is updated to flag the file’s clusters (where the data is stored) as free for reuse. The MFT record remains until it is re-used or the data clusters are allocated in whole or part to another file.

So, in summary, ‘deleting’ a file does not actually delete it. It may either:

  • Store the file in the Recycle Bin, making it relatively easy to recover, or
  • Change the MFT record to show the file as being deleted but leave the file data on the disk until it is overwritten.

How does an EDRMS store and manage files?

The following summary relates to a well-known Electronic Document and Records Management System (EDRMS). Other systems may work differently but the point is that records managers should understand exactly how they work and what happens when electronic files are destroyed at the end of a retention period.

Most EDRM systems are made up of two parts:

  • A database (SQL, Oracle etc) to store the metadata about the record.
  • An attached file store that stores the actual digital objects.

When EDRM systems are used to register paper or physical records (files and boxes), only the database is used.

When digital records are uploaded to the EDRMS:

  • The metadata in the original file, including the file type, original file name, date created, date modified and author are ‘captured’ by the system and recorded in the new database record.
  • Additional metadata may be added, including a content or record ‘type’.
  • The record will usually be associated with a ‘container’ (e.g., ‘file’). This containment makes the record appear to be ‘contained’ within that container, whereas in fact it is simply a metadata record of an object stored elsewhere.
  • The original record filename is changed to random characters (to make it harder to find, in theory) and then stored on the attached (usually Windows NTFS) file store, often in a series of folders.
  • A link is made between the database record and the record object stored in the file store (the MFT record).

When the end-user opens the EDRMS, they can search for or navigate to containers/files and see what appears to be the digital objects ‘stored’ in that container/file. In reality, they are seeing a link to the object stored (randomly) in the file store.
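The two-part design described above can be sketched as follows. This is a minimal illustration, not how any particular EDRMS product is implemented: the table layout, the function names (`open_metadata_db`, `capture_record`, `demo`) and the use of SQLite and a random hexadecimal filename are all assumptions made for the example. The point it shows is that 'capturing' a record creates a metadata row plus a renamed copy in the file store, while the original stays where it was.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


def open_metadata_db() -> sqlite3.Connection:
    """Create an in-memory metadata database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records ("
        " id INTEGER PRIMARY KEY,"
        " original_name TEXT, container TEXT, stored_name TEXT)"
    )
    return db


def capture_record(db: sqlite3.Connection, file_store: Path,
                   source: Path, container: str) -> str:
    """Copy the object into the file store under a random name and
    record its metadata, including the link to the stored object."""
    stored_name = uuid.uuid4().hex
    (file_store / stored_name).write_bytes(source.read_bytes())
    db.execute(
        "INSERT INTO records (original_name, container, stored_name)"
        " VALUES (?, ?, ?)",
        (source.name, container, stored_name),
    )
    db.commit()
    return stored_name


def demo() -> dict:
    """Capture one file and report what exists afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        share = Path(tmp) / "network_share"
        store = Path(tmp) / "edrms_store"
        share.mkdir()
        store.mkdir()
        original = share / "Contract - Acme.docx"
        original.write_bytes(b"contract text")

        db = open_metadata_db()
        stored_name = capture_record(db, store, original, "Contracts 2019")
        row = db.execute(
            "SELECT original_name, container, stored_name FROM records"
        ).fetchone()
        result = {
            "row": row,
            "stored_name": stored_name,
            "original_still_exists": original.exists(),
            "stored_copy_exists": (store / stored_name).exists(),
        }
        db.close()
        return result
```

Calling `demo()` shows two copies of the same content after capture: the renamed object in the file store and the untouched original on the share.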

What happens when an EDRMS record is destroyed?

If there is no requirement to extend their retention, or keep them on a legal hold, records may be destroyed at the conclusion of a retention period.

For physical records, this usually means destroying the physical objects so they cannot be recovered, a process that may include bulk shredding or pulping.

For digital records, however, there may be less certainty about the outcome of the destruction. While the EDRMS may flag the record as being ‘destroyed’, it is not always clear whether the destruction process has actually overwritten the digital record in a way that ensures its destruction to the same level as destroyed paper files.

Also:

  • If the original associated NTFS file share becomes full and a new one is used, the original is likely to be made read only.
  • There is likely to be a backup of the EDRMS.
  • The original records uploaded to the EDRMS probably continue to exist on network file shares, in email, or on backup tapes.
  • Digital forensics can be used to recover ‘deleted’ files from the associated file share.

Consider this scenario:

  • An email containing evidence of something is saved to a container in an EDRMS.
  • The container of records is ‘destroyed’ after the retention period expires.
  • A legal case arises after the container is ‘destroyed’.
  • A subpoena is made for all records, including those specific records.
  • Has the record actually been destroyed, or could it still be recoverable, including from backups or the digital originals?

Is it really possible to destroy digital records, and does it matter?

Yes, records can be destroyed by overwriting the cluster where the record is kept, and some EDRM systems may offer this option.

But:

  • Do EDRM systems overwrite the cluster when a digital record is destroyed in line with your records retention and disposal authorities, or simply mark the record as being deleted, when it is still technically recoverable?
  • Could the record still exist in the network file shares or email, or in backups of these or the EDRMS?
  • Might it be possible to recover the record with digital forensics tools?
  • Does it matter?

It might be worth asking IT and your EDRMS vendor.
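To make the idea of overwriting concrete, here is a minimal sketch of overwriting a file's contents in place before deleting it. The function names are hypothetical, and the sketch assumes a conventional hard disk and a file system that writes in place. On SSDs, on copy-on-write or journaling file systems, with compressed files, or where copies exist in backups, mailboxes or file shares, it gives no guarantee that the old data is gone, which is exactly the kind of detail to raise with IT and the vendor.

```python
import os
from pathlib import Path


def overwrite_in_place(path: Path, chunk_size: int = 1024 * 1024) -> None:
    """Overwrite the current contents of `path` with zero bytes.

    Assumption: the file system writes over the same clusters, which is
    not guaranteed on all storage.
    """
    remaining = path.stat().st_size
    with open(path, "r+b") as handle:  # opens at position 0, no truncation
        while remaining > 0:
            size = min(chunk_size, remaining)
            handle.write(b"\x00" * size)
            remaining -= size
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to push the writes to disk


def overwrite_and_delete(path: Path) -> None:
    """Overwrite the file, then remove its directory entry."""
    overwrite_in_place(path)
    path.unlink()
```

Splitting the overwrite from the delete makes it possible to check the overwritten content before the file is removed.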

Posted in Classification, Compliance, Electronic records, Governance, Information Management, Legal, Office 365, Office 365 Groups, Products and applications, Records management, SharePoint Online, Training and education

AI curated chaos or control – the equally valid but opposite ends of the SharePoint spectrum

There are, broadly speaking, two ‘bookend’ options when it comes to creating new SharePoint Online sites and the document libraries in those sites:

  • ‘Controlled’ model: The creation of new sites is restricted to a small group of individuals with admin rights, who also oversee the creation of document libraries and application of metadata. A combination of controlled and manually applied classification and metadata and retention policies are used to access and manage content over time. Artificial intelligence (AI) tools can also be used to manage content.
  • ‘Chaos/uncontrolled’ model: The creation of new sites, including the creation of document libraries is not restricted. AI tools (including auto-classification) and auto-applied retention policies are used to classify, access and manage content over time. This model assumes that any form of random categorisation applied by end users (e.g., library names, metadata) is mostly ignored by AI tools.

From a traditional information governance and records management (ISO 15489/ISO 16175) point of view, the second ‘chaos’ or uncontrolled model option seems to run counter to conventional wisdom and agreed standards.

From a practical point of view, the first ‘control’ model option seems to run counter to common sense given the volume and range of digital information and the difficulty of classifying or categorising information and records correctly.

Which option is better?

Confusingly, perhaps, the answer may be a combination of both.

  • Certain types of more formal records, such as those required for corporate compliance, formal policies, staff files, accounting information not stored in a finance system, property information, and/or product information, are almost certainly going to be better off in controlled SharePoint sites with pre-defined libraries and metadata. These types of documents are more likely to be subject to records retention requirements and may well be subject to eDiscovery and legal holds.
  • Other types of less formal records, including ‘working’ documents, chats and conversations may be better off stored in uncontrolled SharePoint sites, including SharePoint sites linked with Office 365 Groups and Teams, and in MS Teams/Outlook. These types of records are less likely to be subject to records retention requirements but may be subject to eDiscovery and legal holds.

Ultimately, the way the organisation needs to implement Office 365, including SharePoint Online and apply retention policies and other options will depend on its need to comply with oversight and legal requirements (including minimum retention periods), and/or its tolerance for risk.

How does this work in Office 365/SharePoint Online?

Organisations need to make a conscious decision to allow both options, and be prepared to manage both.

The key features of Office 365 and SharePoint to allow both options are listed below:

  • Office 365 retention policies apply to all of Exchange Online, all OneDrive for Business accounts, entire sites (invisible to users) or parts of sites (visible to users).
  • Some retention policies may be applied based on the auto-classification of records, subject to review.
  • The creation of SharePoint sites is either controlled (requested and provisioned) or uncontrolled (created by end users) via either (a) ‘Create sites’ in the end-user SharePoint portal or (b) when a new Team is created in MS Teams.
  • All sites, including Office 365 Group/Team sites are reviewed regularly for activity and inactive sites with no content of value deleted.
  • All controlled sites are assigned either an invisible retention policy or individual visible retention policies (with disposal review), depending on their content.
  • All uncontrolled sites are assigned an invisible retention policy. Uncontrolled and inactive sites with content are also made read only.
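The rules in the list above can be read as a simple decision procedure. The sketch below encodes them in plain Python purely to make the logic explicit. It does not call any Office 365 or SharePoint API, and the `Site` fields and action strings are invented for the example. In particular, 'content of value' and 'needs disposal review' are treated as simple yes/no flags, which in practice are judgement calls.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical summary of a SharePoint site for planning purposes."""
    name: str
    controlled: bool
    active: bool
    has_content_of_value: bool
    needs_disposal_review: bool = False


def plan_site(site: Site) -> list[str]:
    """Return the actions the rules above would suggest for a site."""
    # Inactive sites with no content of value are deleted after review.
    if not site.active and not site.has_content_of_value:
        return ["delete site"]

    actions: list[str] = []
    if site.controlled:
        # Controlled sites get visible policies with disposal review, or
        # an invisible site-wide policy, depending on their content.
        if site.needs_disposal_review:
            actions.append("apply visible retention policies with disposal review")
        else:
            actions.append("apply invisible site retention policy")
    else:
        # Uncontrolled sites always get an invisible policy; inactive
        # ones that still hold content are also made read only.
        actions.append("apply invisible site retention policy")
        if not site.active:
            actions.append("make site read only")
    return actions
```

For example, an inactive, uncontrolled Team site that still holds content would come back with both the invisible retention policy and the read-only action.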

Features of controlled and uncontrolled SharePoint sites

SharePoint Online is quite different from older versions of the application and those who dismiss it based on previous experience should consider having another look as a lot has changed in the past couple of years.

SharePoint Online allows the creation of sites that contain important content that needs to be controlled or managed as records, as well as sites created and managed entirely by end-users. And, as an added bonus, all the content is stored in the one place, not in multiple locations (network drives, email servers, EDRM system, etc).

The elements that make up both types of sites, as well as ‘informational’ sites, are described below:

  • Controlled sites
    • Where the organisation’s official records are stored and managed.
    • Created by SharePoint Administrators.
    • More formal in nature, containing the official records.
    • Structure decided by business areas – for example, document libraries using agreed naming conventions.
    • Use of Content Types and site column or local library metadata to define the content.
    • Application of Office 365 retention policies to entire sites or individual document libraries, with disposal reviews. Auto-classification is less likely to be required as the content has already been structured as required.
  • Uncontrolled sites
    • Usually based on end-user created Office 365 Groups or MS Teams.
    • Where ‘working documents’ are created and managed, with the emphasis on allowing end-users to collaborate and communicate easily and effectively – and move content to formal sites when required.
    • Created by end-users but naming monitored by SharePoint administrators (or using rules).
    • Informal in nature, used for working documents (effectively replacing personal and network file shares, and other unapproved systems).
    • A fluid structure for document libraries, driven by end-user requirements (not imposed by others).
    • Little if any use of Content Types or metadata.
    • Retention based on Group activity (E5 licences), otherwise based on Office 365 site retention policies and/or auto-classification options.
    • No disposal reviews – content is deleted after a given period of time.
  • Informative
    • Communication sites (e.g., ‘intranet’)
    • Used to publish information to the organisation

Things to watch out for

It is largely true that if you give people an option, someone is bound to try it, sooner or later, especially if it says ‘Create site’, ‘Create team’, or ‘Create group’. Early adopters learn quickly and can just as quickly abandon something that provides no benefit. 

In a ‘free for all’ SharePoint environment, where end-users can create new sites, teams or groups (both of the latter have a SharePoint site), the most likely issues will include:

  • Sites with names that are very similar to ones that already exist, created because the end-user didn’t know another existed (it may not be obvious) or didn’t like the name.
  • Sites with names that make no sense (including common acronyms) or are just ‘wrong’ or contrary to preferred naming conventions.
  • Sites used to create and store content that really should be stored in a more formal site or, conversely, doesn’t belong in the organisation’s official information systems (e.g., photos of someone’s wedding).

All of these issues require some general rules about the creation of new sites (or Office 365 Groups or Teams or Yammer Groups), including suggested naming.
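One of the issues above, near-duplicate site names, is easy to check for before or after a site is created. The sketch below is a rough illustration that uses Python's standard `difflib` module to flag existing names that closely resemble a proposed one. The function name, the example site names and the 0.8 similarity cutoff are arbitrary choices for the example, and the list of existing names is assumed to have been exported by an administrator.

```python
import difflib


def similar_site_names(proposed: str, existing: list[str],
                       cutoff: float = 0.8) -> list[str]:
    """Return existing site names that look very similar to `proposed`.

    Names are compared case-insensitively; `cutoff` is the minimum
    similarity ratio (0 to 1) for a name to be reported.
    """
    by_key = {name.casefold().strip(): name for name in existing}
    hits = difflib.get_close_matches(
        proposed.casefold().strip(), list(by_key), n=5, cutoff=cutoff
    )
    return [by_key[key] for key in hits]
```

A check like this could back up a naming rule: `similar_site_names("Finance Teams", ["Finance Team", "HR Projects", "Marketing"])` flags 'Finance Team', giving an administrator the chance to point the requester at the existing site.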

Global and SharePoint admins can monitor the environment and fix issues when they arise rather than wielding a big stick.

What’s great about it

You can have the best of both worlds with SharePoint Online.

  • Keep formal official records in ‘formal’ sites with controlled structures and metadata.
  • Allow end-users to get on with creating, collaborating, sharing (one copy, not attachments), chatting, on any device.

If your communications and change management are good, end-users will soon learn how much fun it can be to use Teams, or access their content from File Explorer (or both!), without having to be trained how to save records. All they need to know is how to use the ‘Move’ option to move the final version of records to a formal site.

The foundation of any compliance program is knowing where all of your data lives and then classifying, labeling, and governing it appropriately.

Posted in Disasters, Electronic records, Information Management, Information Security, Legal, Office 365, Records management, Retention and disposal, SharePoint Online, Training and education

Why is it so hard to ‘go digital’?

I visited a local fast-food outlet recently and could not help but notice the ‘Lever Arch’ binders in the small office behind the counter. A small two-drawer filing cabinet was also located below the desk.

It made me wonder – in this day and age when pretty much everyone has access to the internet including via their smart phone, why are there any paper records?

And, why is it so hard to ‘go digital’, when so many better and safer digital options are available?

Reasons for not going digital

People probably want to keep paper records in this digital age for a few fairly common reasons, all of which I’ve encountered over the years.

  • Ease of access. It is much ‘easier’ to access a record if it’s in the folder with an obvious name, like ‘Rosters’.
  • Speed of access. You can access a paper record in a couple of seconds. Accessing the same record on a computer means logging on then searching or navigating to where it is stored (potentially including on personal removable storage devices).
  • Easier to archive. At the end of a given period the records can ‘simply’ be placed in an archive box and sent off for archiving.
  • Keeping digital records is too ‘hard’.
  • The company doesn’t offer any other option.
  • ‘Computers are hard’.
  • No obvious or pressing business reason to go digital.
  • A preference for paper, or belief that paper records must be kept.

Which of the above have you encountered? Let me know via this anonymous form:

https://forms.office.com/Pages/ResponsePage.aspx?id=DQSIkWdsW0yxEjajBLZtrQAAAAAAAAAAAAN__td1WRVUM0hJM0g2Q1NCWFdLS0JYM0k5QUlOUVUxRC4u

Keeping paper records can be risky

Keeping paper records can be all well and good, unless this sort of thing happens:

burger-king-fire-hed-2017-1260x840
Source: https://finance.yahoo.com/news/burger-king-used-photos-real-105654804.html

If you keep paper records when better digital options exist, you are taking a calculated risk that doing so is ‘OK’.

Of course, not all businesses (a) store the only copy of their physical records locally or (b) burn down (including by being constructed in fire-prone areas). However, these are not the only risks. Other risks include:

  • Flooding, from burst pipes, storms, or floodwaters. Water-damaged records are not easy to recover.
  • Damage from falling objects, including trees or other objects falling from the sky.
  • Theft or vandalism.
  • Business closure and leaving records behind in the abandoned building.
  • Any combination of the above.

What’s the back up for physical records?

What’s the back up for these paper records when disaster strikes?

Generally, unless the physical records have been transferred off-site, or they are the printed version of a digital original that can still be accessed, there isn’t one.

Is there a better, digital way?

Yes.

Printed records are likely to fall into several broad categories, each of which can be managed in its own way. For example, in the business above:

  • Policies and procedures, including ‘operating manuals’ and similar types of instructions are likely to be the printed version of digital originals. They can be made available on the company intranet or, if one doesn’t exist, sent via email.
  • Financial records (e.g., invoices). Again, these are likely to be the printed version of a digital original. If they were in printed form when received (e.g., by mail, with a delivery), the company should (a) ask for digital copies to be sent by email, or (b) scan them and store them digitally.
  • Rosters and general documents relating to groups of employees (as opposed to individual staff ‘files’). Rosters could still be printed for display purposes, but the original should be kept in digital form.
  • Staff files. The format of these may depend on the organisation, but there should be no reason for ‘local’ staff files to be kept in an organisation that has a centralised HR system.
  • Other types of business documents. If necessary, these could be scanned and kept in digital form.

And, of course, all of these could be kept in Office 365, including SharePoint for document storage and MS Teams for teams chat, including for front line workers.

Additional training and support may be required to help these areas ‘go digital’.

 

 

Posted in Electronic records, Information Management, Legal, Products and applications, Records management, SharePoint Designer, SharePoint Online, Training and education

Auto-populating document templates via a form in SharePoint

Most organisations have standard agreements or contracts or similar types of documents.

The common factor between them is that the original template remains the same while elements within the document change. For example, a client name, address and phone number, or differing contract terms.

There are several different ways this is achieved, including:

  • Printing the form and completing it manually.
    • This is time-consuming, handwriting can be difficult to read or require the form to be re-completed, and there is no easy way to extract the data. These types of forms are often scanned for storage.
  • Completing a digital version in Word (and sometimes printing/scanning or saving as a PDF).
    • This is also time consuming and in many cases it can be faster to print the form to fill it in by hand. Errors and omissions are possible and if the metadata appears in more than one place it must be re-typed. There is no easy way to extract the data.
  • Using editable PDF forms, sometimes using (Adobe or other) digital signatures.
    • These are very common (and very useful for specific purposes such as simple forms, less so for common agreements). They are time consuming, errors and omissions are possible and metadata must be re-typed. There is no easy (or cheap) way to extract the data.

Common factors in all of the above are that they are time-consuming and the data is hard to extract from the form.

A better and more efficient option

This post describes how to create a form in SharePoint that, via a very simple workflow:

  • Auto-creates one or more Word documents (multiple based on metadata choices contained in the form).
  • Auto-populates the Word documents where required with the metadata in the form. Where the same metadata value (e.g., ‘Client Name’) appears more than once, that value appears throughout the document where required at the same time.
  • Stores that document (or documents) in a folder (actually a document set) that can be used to add other content.
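The underlying idea, one set of form values filling every matching placeholder in a template, can be sketched in a few lines. This is only a conceptual illustration in plain Python: the real mechanism described in this post uses Word content controls bound to SharePoint columns, not text substitution, and the `populate` function and square-bracket placeholder syntax here are assumptions made for the example.

```python
import re


def populate(template: str, metadata: dict[str, str]) -> str:
    """Replace every [Field Name] placeholder with its metadata value.

    Placeholders with no matching metadata key are left untouched so
    that gaps remain visible in the output.
    """
    def substitute(match: re.Match) -> str:
        field_name = match.group(1)
        return metadata.get(field_name, match.group(0))

    return re.sub(r"\[([^\[\]]+)\]", substitute, template)
```

Because the substitution is driven by the metadata rather than by retyping, a value such as 'Client Name' entered once appears in every place the template asks for it.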

Additional benefits are that:

  • The metadata is easily accessible for export and other uses.
  • The Word document can be ‘signed’ with a touchscreen computer.
  • The Word document can be saved as a PDF.
  • Other documents can be added to the same folder.

This post is based on several actual examples that I developed (with the assistance of our SharePoint Developer) in a very large (9,000 staff) organisation.

The primary uses were for client agreements based on standard templates, including up to 10 different documents per client. We also deployed other designs that used a similar methodology, but the underlying principle was the same.

Note that, while the model is actually simple to implement, this post contains all of the details to follow step by step. I’m not a fan of posts that only provide part of the details and leave the rest to the imagination.

Setting up the model

Important note: The SharePoint site MUST have the document set feature enabled in the Site Collection Administration settings. Otherwise, the option to create a custom document set will not appear.

The model consists of the following elements that can be created by a SharePoint Administrator, a Site Collection Administrator or a Site Owner.

  • New site columns that will map to the elements in the form. For example, ‘Client Name’, ‘Client Address’, ‘Client Phone Number’. Note that every SP site has a lot of standard site columns so some of these can be used instead of creating new ones.
  • A new document set site content type containing all the site columns that should appear in the form. (‘Add from existing site columns’ option). It is recommended you give the document set a name that will be clear to end users as they will select this from a list. For example ‘Client Folder’ or ‘Agreement folder’.
  • A new document site content type for every template that is needed. The actual document templates are not added now, only after the content type has been added to the document library – see below. It is recommended that you give each of these document CTs a name that is similar to the name of the document template.
  • A document library. It is recommended that you create a dedicated library for this purpose with a name that makes it very clear what it houses, for example ‘Client Agreements’. See below for the set up of the library.

Once all of these options are in place, the SharePoint Designer workflow can be set up – see below.

Setting up the document library

Library settings

The document library needs to be set up as follows in the Library Settings section.

  • In Advanced settings, enable the option ‘Enable management of content types’. This will make a new section ‘Content Types’ appear in the Library Settings.
  • In the newly visible ‘Content Types’ section, choose ‘Add from existing site content types’ and add all the new site Content Types that were created.
  • The newly added CTs will now be visible, along with the default ‘document’ content type.

Document set CT settings

Click on the new document set CT. The metadata site columns that were added should be visible in the ‘Columns’ section.

Click on ‘Document Set settings’. In the section ‘Allowed content types’, select each document CT that is required and use the ‘Add’ option to add it. These will now appear in the right-hand section.

 

O365_SPO_AddDocCTtoDocSetCT

Scroll down to ‘Shared columns’ and select all the document set columns. It does not matter that these will be shared with document CTs that don’t use the columns, as we will see below.

Click OK and return to the library settings area.

Adding the templates

At this point it is assumed that you have one or more document templates ready to upload. The template/s should be in a newer version of Word (e.g., .docx NOT .doc).

The ‘Content Types’ section of Library Settings displays a list of all the CTs that were added, including the document set CT (which will not be changed).

To add the template, click on the name of the (document) CT. In the new page that opens, you will see the list of site columns that have been shared from the document set.

Click on ‘Advanced settings’, where you will see the ‘Document Template’ section. Click the ‘upload a new document template’ option, choose your document template, and click OK.

O365_SPO_AddTemplatetoDocCT

Link the metadata columns with the template

Now, return to the document CT ‘Advanced Settings’ (if you are not still there) and click on ‘Edit Template’ to open the template document in Word.

Now, add the metadata site columns where they are required in the template. For example, next to ‘Client Name’, place the cursor where you want the metadata to appear (don’t forget to include a space!).

In Word, go to the ‘Insert’ option on the ribbon menu and then go to the ‘Text’ section. Choose ‘Quick Parts’ > ‘Document Property’ and you should see the metadata columns as shown below.

O365_SPO_InsertMetadatainDocTemplate

Add the relevant document metadata where it should appear in the Word template. You will notice that the same metadata element can be used in multiple locations throughout the document. You can also use these in the header and footer and apply different formatting as required.

If you have made an error, do not ‘delete’ the added metadata in square brackets; instead, right-click it and choose ‘Remove content control’. Be careful with formatting too, especially different fonts and font sizes. Some of these will be more visible once you create the first document (see below).

The finished template will look something like the screenshot below.

 

 

O365_SPO_MetadataInWordTemplate3

Repeat for each content type template.

Summary and outcomes of the first stage

The site and library set up stage is now complete. The new content types now appear in the ‘New’ menu as shown below. You may want to edit the new menu options to remove any option you don’t want to appear, such as ‘Folder’ and ‘Document’ (you cannot remove ‘Link’).

O365_SPO_CustomCTDocLibNew.JPG

If the end user selects ‘Client Agreements’, they will be presented with a form to complete such as the example below – but this does NOT yet create the template document. That’s the next step below.

 

O365_SPO_CT_DocSet_NewForm3.JPG

 

Note that the order of these metadata elements can be moved around as required via the document set settings.

Create the workflow

You will need access to and be able to use SharePoint Designer to complete this section.

Remember: The workflow is based on the end user creating a new document set by completing the form as shown above. The workflow is triggered by the fact that a new item has been created, which in turn creates and saves a new document (or documents as required) with the metadata populated automatically ‘inside’ the new document set.

Open SharePoint Designer

First, click on ‘Lists and Libraries’, choose the library that the workflow will be associated with, then click on ‘List Workflow’ as shown in the ribbon menu below.

SPD_NewListWorkflow

Give the workflow a name that will help to identify it in future – in this example, ‘Create Client Agreement’ would be a suitable name. Note:

  • You must create this as a SharePoint 2010 workflow.
  • The workflow can create one or more documents. In this example, only one document is created.

New workflow settings

A new tab will open. On the top right of the ribbon menu, click on ‘Workflow Settings’.

In the ‘Start Options’ section, check the box to start the workflow automatically when an item is created. The manual start checkbox should already be checked. This will allow the end user to run it again if required.

SPD_StartOptionsSP2010

Note – Some organisations may prefer not to allow the workflow to start automatically because they want to check the form first. In this case, the document set-based form can still be created, but after it is created the end-user must choose to run the workflow via the ‘More – Workflows’ option from the 3-dot menu.

 

Create local variables

Click on the ‘Local variables’ option on the top right of the ribbon menu to create (Add) two local variables:

  • DocSetName < this one is used to record the name on the document set.
  • DocumentPathforClientAgreement < this one is used to save the new document ‘under’ the document set.

Create the workflow

In the Workflow settings, click on ‘Edit workflow’ to create the workflow. For this example, there are two steps.

Click on ‘Step’ to change the name to something like ‘Initialisation’ or ‘Initialise variables’.

SPD_WorkflowNewStep1Blank.JPG

In this part we add and configure the two local variables that were created.

Click where it says (‘Start typing …’), click on ‘Action’ in the ribbon menu, and choose ‘Set workflow variable’ to set the two variables.

  • Set Variable: DocSetName to Current Item:Name
  • Set Variable: DocumentPathforClientAgreement to [%Variable: DocSetName%]

Both of these will be set as a String value.

SPD_SetSP2010WorkflowVariable1.JPG

Click just underneath the step; a short orange line should appear. Click on ‘Step’ from the ribbon menu to create the next step.

(Note – a screenshot of all the following steps can be seen below)

  • Rename the step if required (e.g., to ‘Create Agreement’).
  • Click in this new step where it says (‘Start typing …’), then click on ‘Action’ (ribbon menu) and choose ‘Create List Item‘.
  • Click where the new action says ‘this list‘. A new dialogue box opens ‘Create new list item’. Select the name of the library from the drop down list in that dialogue box.
  • As soon as you do this, ‘Path and Name (*)’ appears below ‘Content Type ID’. You must complete the second part of this command before it can be saved.
  • Click on Path and Name (*) and click ‘Modify’. The ‘Set this field’ option should not be changed, only the option ‘To this value’. To the right of the blank field click the ‘fx’ option, then do the following.
    • For ‘Data Source’, choose ‘Workflow variables and parameters’.
    • For ‘Field from Source’, choose ‘Variable: DocumentPathforClientAgreement’
    • For ‘Return field as’, leave it as a ‘string’ value.
  • After you click save, the ‘Value assignment’ dialogue box should still be open. If not, click on the ‘Path and Name (*)’ option, then ‘Modify’, which will open the ‘Value assignment’ dialogue.
  • Click on the three dot menu option (to the left of fx) to open the ‘String Builder’ dialogue. Modify it as shown below by adding the prefix text. This puts the name given to the document set as the first part of the document name: [%Current Item:Name%]/[%Variable: DocumentPathforClientAgreement%]
  • Note, you can add anything else you want after the last ‘]’, for example ‘- Client Agreement’, as a suffix to the document name.
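
The String Builder expression above can be pictured as simple string substitution. The following is only an illustrative Python sketch, not something SharePoint Designer runs; the function name build_document_path and the sample document set name are hypothetical.

```python
def build_document_path(current_item_name: str,
                        document_path_variable: str,
                        suffix: str = "") -> str:
    """Mimic the String Builder expression
    [%Current Item:Name%]/[%Variable: DocumentPathforClientAgreement%]
    followed by an optional suffix typed after the last ']'."""
    return f"{current_item_name}/{document_path_variable}{suffix}"


# In the 'Initialisation' step both variables are derived from the
# name of the document set (the current item).
doc_set_name = "Acme Pty Ltd"  # hypothetical document set name
document_path_for_client_agreement = doc_set_name

print(build_document_path(doc_set_name,
                          document_path_for_client_agreement,
                          " - Client Agreement"))
```

The first part of the result is the document set (which acts as the folder), and the second part is the name of the new document, including any suffix.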

SPD_WorkflowNewStep2CreateListItemA.JPG

Click OK (several times) to close the dialogue.

Add a ‘Stop the workflow and log’ option from the Action menu.

The final workflow is shown below:

SPD_TwoStepWorkflowFinal

Publish the workflow

Finally, publish the workflow. You can also press ‘Save’ to save without publishing. Publishing also saves any changes.

Allow some time for the workflow to appear in the document library. Generally this is fairly quick – refreshing the site page may assist.

Confirm the workflow is ready

To confirm the workflow is ready, click the three dot menu to the right of the document set and click on ‘More’, then ‘Workflow’.

The new workflow should appear similar to the screenshot below.

Note that this is the primary interface for most actions relating to the workflow. From here you can click the workflow to run it again any time (Manual start). If the workflow has a problem you will see that message here under ‘Running Workflows’, and from there you can terminate it (this sometimes happens – the clue is that the document was not created).

SPD_LibraryWorkflowReady

End result

When the end user completes and saves the form, the workflow will run, creating one or more documents (based on the template) ‘inside’ the document set. Each document will have the correct metadata based on the template.

O365_SPO_MetadataInWordTemplateFilled3.JPG

 

Benefits

There are many benefits to creating this model to manage common document agreements, contracts and other templates.

  • The document template always remains the same and can be updated at any time (but note that entire template updates require re-connecting all the metadata elements).
  • If a mistake is made in the metadata, the end user can simply delete the documents that were created and re-run the workflow as many times as required, saving a lot of effort in having to re-populate an entire document. If there is concern about deleting documents, the manager can set an alert on the library. The Recycle Bin keeps deleted documents for 90 days.
  • All Word documents created this way include the metadata from the library in their properties (the ‘metadata payload’). This includes the Document ID (if enabled).
  • Once the Word document has been created it can be ‘signed’ electronically using touch screen technology. If you really need a more sophisticated signing process, consider acquiring a third-party product.
  • Once the Word document has been signed in this way, it can be saved as a PDF, preventing changes.
  • If saved as a PDF, the default save location is the same location as the Word document. Saving to PDF is a three-step process: open the Word document, click ‘Save as’, and change the option to PDF.
  • All the metadata site columns can be exported for analysis and reporting purposes. They may also be used to create groupings of records, for example ‘All contracts created by users’ or ‘All contracts that have a specific metadata choice option’.
  • The newly created Word or PDF documents can be shared, including with external people if required.

Negatives

In practice we found that there were not many negatives associated with this model and it brought considerable productivity benefits to the business areas that regularly created multiple agreements with clients, based on standard templates.

The primary negatives we found were:

  • Poor bandwidth meant that the new Word agreement might not be created as quickly as required. Business areas with this problem kept both digital copies of the agreement to complete and printed versions.
  • If the entire template had to be changed, all the metadata links had to be re-connected. It was usually much easier only to update the part of the document that needed to be updated, including by adding new pages.
  • Every once in a while the workflow would not work. Our first clue to this was that an end user would call to say the document was not created or a metadata field was blank. We could usually track this problem down to either a network ‘glitch’ or other minor issue.
  • If metadata fields were left blank in the form, the square brackets metadata option remained visible. This then had to be deleted from the final document.
  • From time to time, for various reasons, the end user would create a second copy of the document template without deleting the first. This simply creates a new document with the date and time as a suffix to the document name.

Posted in Compliance, Electronic records, Governance, Information Management, Legal, Office 365, Office 365 Groups, Products and applications, Records management, Retention and disposal, SharePoint Online

SharePoint Online – records management options and settings

This post summarises the primary records management options, settings and ideas that can be applied in SharePoint Online to manage records.

This post should be read as the second part of my previous post on the records management options and settings available in the Office 365 admin and security and compliance portals. Some of these settings will be referred to in this post.

The options and settings described in this post should ideally form part of your SharePoint governance documentation.

SharePoint Governance

We have already seen in the previous post that Office 365 Global Admins (GAs) have access to all parts of the Office 365 ecosystem. But they should rarely be solely responsible for SharePoint Online (SPO).

Some form of governance arrangement is necessary for SPO, especially if you plan to manage records in that application.

Some of the key considerations are as follows.

  • Who is responsible for ‘marketing’ or promoting SharePoint in the organisation, and making sure it is used correctly? The area responsible in IT for change management should probably take the lead on this as SPO is only one part of the O365 ecosystem. Records managers should have a role too, or be consulted.
  • SharePoint Administrator. You should already have a SharePoint Administrator and that person (or persons) is likely to be sitting in your IT department. Records managers will rarely also be SharePoint administrators; the two need to work closely together.
  • Who is responsible for training people to use SharePoint, especially to highlight the recordkeeping aspects of the application?
  • Who are the Site Collection Administrators? See next point.
  • Who are the Site Owners?
  • Who can create Office 365 Groups?

Answers to these questions should all be documented in your governance documentation.

SharePoint Online Admin Portal

SharePoint Online customised administrator

The SPO administrator role, a ‘customised administrator’ set in the Office 365 (O365) portal, should normally have a log on that is separate from that person’s O365 user log on. The SPO administrator account should not be a generic one (and generic accounts should generally be avoided).

The SPO administrator accesses the SPO admin portal from the Office 365 admin portal. They will also have access to the O365 Message Centre and Service Health sections.

SharePoint Online Architecture

Why a design model is good to have

Organisations should have some sort of design model for their SPO architecture. Most records will be kept in document libraries in SharePoint team sites under the /teams path but some could also be under the /sites path.

The design model should include naming conventions for sites to avoid site names that have unknown acronyms or complex names. Site names form part of the total 400 characters allowed from https to the document suffix (e.g., .docx) so site names should ideally be no longer than around 16 characters. For example:

https://tenantname.sharepoint.com/teams/sitename
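
As a rough way to test proposed site names against these limits, a short Python sketch along the following lines could be used. It takes the 400 character total and the suggested 16 character site name from the paragraph above. The function name, the sample URL, and the decision to count raw (unencoded) characters are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 400          # total limit quoted above, from https to the file suffix
SUGGESTED_SITE_NAME_MAX = 16  # suggested upper length for a site name


def check_document_url(url: str) -> dict:
    """Report how much of the URL budget a document URL uses.

    Assumes the site name is the path segment straight after
    /teams/ or /sites/, and counts characters exactly as typed.
    """
    parts = urlsplit(url).path.strip("/").split("/")
    site_name = parts[1] if len(parts) > 1 and parts[0] in ("teams", "sites") else ""
    return {
        "url_length": len(url),
        "remaining": MAX_URL_LENGTH - len(url),
        "url_ok": len(url) <= MAX_URL_LENGTH,
        "site_name": site_name,
        "site_name_ok": 0 < len(site_name) <= SUGGESTED_SITE_NAME_MAX,
    }


# Hypothetical document URL built on the example site above.
example = "https://tenantname.sharepoint.com/teams/sitename/Meetings 2019/Agenda.docx"
print(check_document_url(example))
```

A check like this could be run over proposed site, library and folder names before they are approved, to see how much of the character budget remains for document names.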

Records managers should be involved in designing this architecture model and could also be part of any approval process for new sites, to ensure the proposed names are suitable.

The names of SPO sites should generally map to business functions. Where the main function is very large (e.g., Financial Management), you may decide to create sites based on the ‘sub-function’. That is, under the broader Financial Management (or simply ‘Finance’) site, you could have a separate site for Finance AP and another for Finance AR. These can be linked to a hub site that could be the ‘parent’ function site.

Don’t mix functions (such as personnel and IT) in the same site, if only because this site is likely to become very large.

Try to aim for team site coverage of all business areas as all areas are likely to create or maintain records.

One relatively easy way to do this is to consult with the business area and understand how they use their current Network File Share (NFS) location. This has the additional benefits of ‘mapping’ their SPO site to their existing NFS structure (generally or very specifically) so it is familiar to them, and assisting with the migration of NFS to SPO later on.

Creating new Site Collections

Generally speaking, there are now three types of SPO site:

  • A team site not linked to an O365 Group (but can be retrospectively linked)
  • A team site linked to an O365 Group
  • A communication site

Again, generally speaking:

  • SPO team sites (linked or not with O365 Groups) are the functional replacement for network file shares and, accordingly, contain most of the ‘document’ type records.
  • SPO communication sites are used for publishing purposes, including the intranet. They may contain documents in document libraries that, again, replace network file shares previously used for this purpose.

New sites can be created:

  • Directly from the SPO admin portal.
  • Via the ‘Create Site’ dialogue available in each user’s SharePoint portal, when this option is enabled. When this option is enabled, users can create either a Team site (linked with an O365 Group), a Communication site, or a ‘classic’ site (not linked with an O365 Group).
  • When a new Office 365 Group is created. This includes, if enabled, when a new Team in MS Teams is created, a Yammer group is enabled, or a person chooses to create a new group from Outlook. If this option is allowed, whoever creates the O365 Group becomes the Site Collection Administrator and the SharePoint admin will be unable to access the site. For this reason, organisations that want to control their SPO environment may wish to limit who can create Office 365 Groups.
  • Via a PowerShell script.

SharePoint Sites

Site Collection Administrators (SCA)

Every SPO site has Site Collection Administrators. To ensure that records managers can access every site to manage records, it may be useful to add them to the membership of a Security Group that is in turn added to every site’s Site Collection Administrators after it is created.

Site Collection Administrators are added and managed in Advanced Permission Settings.

SPO_AdvancedPermissions

When you click on Site Collection Administrators, this dialogue appears:

SPO_SCAs.JPG

As noted above, if the ability to create O365 Groups is not controlled, the person who creates the O365 Group (as noted in the screenshot above) will become the SCA. The SharePoint administrator will be able to see the site in the SPO admin portal but may not be able to change the SCA settings. They may need to ask a Global Admin to do this.

Being a Site Owner only is not sufficient for records managers. Site Owners should be people in the business area that ‘owns’ the SPO site and who will manage it on a day to day basis.

Site collection features – document IDs and Document Sets

Site collection features are only accessible to Site Collection Administrators. The list below expands as new features are activated; as can be seen, the ‘Document ID service’ feature has been enabled on this site. (Note, ‘Site features’ are activated from the Site Administration section, see below).

SPO_SiteCollectionAdminOptions.JPG

The Document ID feature is required for recordkeeping purposes as it assigns a unique Document ID to every object (including document sets but not folders) stored in a library.

If Document Sets are to be used in the site, the Document Set feature is also enabled in the Site collection features section.

SPO_DocIDDocSetSiteCollectionFeatures.JPG

After the Document ID service is enabled a new option appears in the Site Collection Administration section called ‘Document ID Settings’ (as noted above).

As can be seen in the screenshot below, all Document IDs begin with a unique set of up to 12 characters. Ideally, the Site name should be used as this will immediately give a clue to the site name on the document.

SPO_DocIDSettings.JPG

Document IDs take the form:

  • Prefix (e.g., ‘SITENAME’)
  • Library number. This is a unique and un-modifiable number of the library where the document is stored. It is not based on the library GUID.
  • Next sequential number.

If a document is deleted or moved from the library, the document ID (the sequential number) is not re-used.
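
The three-part structure described above can be sketched in Python as follows. This is only an illustrative model, not how SharePoint generates IDs internally. The hyphen separator, the class and function names, and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentId:
    prefix: str           # e.g. 'SITENAME', up to 12 characters
    library_number: int   # fixed number of the library holding the document
    sequence_number: int  # next sequential number, never re-used

    def __post_init__(self) -> None:
        if not 0 < len(self.prefix) <= 12:
            raise ValueError("prefix must be 1 to 12 characters")

    def __str__(self) -> str:
        # Assumption: the three parts are joined with hyphens.
        return f"{self.prefix}-{self.library_number}-{self.sequence_number}"


def parse_document_id(value: str) -> DocumentId:
    """Split from the right so a prefix containing a hyphen still parses."""
    prefix, library_number, sequence_number = value.rsplit("-", 2)
    return DocumentId(prefix, int(library_number), int(sequence_number))


class LibraryCounter:
    """Hands out sequential numbers for one library and never re-uses them."""

    def __init__(self, prefix: str, library_number: int) -> None:
        self.prefix = prefix
        self.library_number = library_number
        self._last = 0

    def next_id(self) -> DocumentId:
        self._last += 1
        return DocumentId(self.prefix, self.library_number, self._last)


counter = LibraryCounter("SITENAME", 12)
first = counter.next_id()
second = counter.next_id()
# Even if 'first' is later deleted or moved, the next document gets number 3.
third = counter.next_id()
print(first, second, third)
```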

Note that Document Sets use the same Document IDs. These cannot be separately modified.

Site collection features – Site Audit logs

The option ‘Site collection audit settings’ will already be visible in the Site Collection Administration section of all new sites. However, (a) the options in the audit settings need to be enabled and (b) the ‘Reporting’ Site collection feature must be activated to enable the production of Site Audit Logs as required.

Note, the Site collection audit settings page notes that ‘If you’d like to keep audit data for longer than this, please specify a document library where we can store audit reports before trimming occurs’. The default storage location is /_catalogs/MaintenanceLogs. However, the various options shown below must be selected for anything to be saved.

SPO_SiteCollectionAuditOptions.JPG

Enabling ‘Reporting’ results in a new section in the Site collection administration section called ‘Audit log reports’. This section allows the Site Collection Administrators to create one-off audit logs for a range of activity on the site, going back 90 days.

  • Content Activity
    • Content viewing
    • Content modifications
    • Deletion
    • Content type and list modifications
  • Information Management
    • Policy modifications
    • Expiration and Disposition
  • Security and Site Settings
    • Auditing settings
    • Security settings
  • Custom
    • Run a custom report

The 90 day time period is the same as the O365 audit logs accessible from the Security and Compliance ‘Search’ section. If audit logs are required for longer periods, an add on may be required.
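
To check quickly whether an event is still likely to be covered by the audit logs, the 90 day window can be worked out as in the Python sketch below. The function names are hypothetical, and treating both ends of the window as inclusive is an assumption.

```python
from datetime import date, timedelta

AUDIT_WINDOW_DAYS = 90  # period quoted above


def earliest_reportable_date(today: date) -> date:
    """Oldest date an audit log report run today could reach back to."""
    return today - timedelta(days=AUDIT_WINDOW_DAYS)


def is_within_audit_window(event_date: date, today: date) -> bool:
    """True if the event falls inside the 90 day window (inclusive)."""
    return earliest_reportable_date(today) <= event_date <= today


print(earliest_reportable_date(date(2019, 6, 30)))
print(is_within_audit_window(date(2019, 3, 15), date(2019, 6, 30)))
```

If an investigation is likely to need activity older than this, the reports need to be generated and saved before the window closes.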

Metadata – Site columns or the Managed Metadata Service

The architecture model and/or business requirements may require the use of specific metadata across your environment. Metadata may be set in three ways.

Managed Metadata Service. This option is effective if you need to use the same metadata columns on multiple sites. Experience suggests that this option will be used selectively.

Site columns. These are in addition to the many columns that already exist by default on every site. This option is very effective if the same metadata needs to be used in multiple document libraries or lists on the same site. It is not accessible on any other site. In document libraries or lists, it must be added as an existing site column (i.e., not via the Create new column option).

Library columns. These columns are created in individual libraries or lists and are not accessible on any other library or list.

All new Site and Library columns have the following options:

SPO_NewColumnOptions.JPG

Each new column may be created in an existing or new group. They may also (a) be made mandatory and/or (b) enforce unique values. Note that making a Site Column mandatory and adding it to a document library will make the library read only in File Explorer if it is synced there.

Columns may have default values and may also include JSON formatting codes.

When Site columns are added to a document library, including via a Content Type (see next section), users may be required to fill in the required metadata (especially if it is mandatory).

Site Content Types

Site Content Types are a way to define metadata requirements for different types of documents, using Site columns. The default ‘document’ Content Type on every new SPO site is simply ‘Document’; all new document-based Content Types will be created using that one as a template.

Site Content Types may also incorporate standard document templates (via the ‘Advanced’ section). These templates can be auto-populated using the library metadata. In any case, all metadata in a document library is added to any Office document as its metadata ‘payload’.

Once created, Site Content Types must be added to each individual library where they are to be used. To do this, the individual library must have the setting ‘Allow management of content types’ enabled in the ‘Advanced’ section of the document library settings.

SPO_DocLib_ContentTypesYesNo.JPG

When Content Types are enabled in this way, some other drop down features in the ‘+ New’ option on the library disappear, such as the ability to create Word, Excel or PowerPoint documents as can be seen below (the option on the right shows when Content Types are allowed).

Aggregations, containers, ‘files’ – Site Libraries

SharePoint document libraries are the container, aggregation or ‘file’ (if you will) in which records are stored. They are the functional replacement for network file shares. You may end up migrating from those NFS to SharePoint.

Naming conventions for new document libraries are useful to have but the extent to which you require people to follow them (if Site Owners create them) may differ between organisations.

Document libraries ideally should contain only a year of content; including the year in the library name is a good way to maintain year-based content, which in turn makes it easier to manage at the end of the record’s life.

Avoid using the generic ‘Documents’ library that comes with every new site because users will create folders with uncontrolled names and content.

All SPO document libraries and lists have default views of the metadata. These views can be modified as required (via the option on the top right of the menu bar) with a range of additional metadata that is by default hidden from view. Multiple views can be created; pre-defined views may sometimes be easier than expecting users to depend on searches.

Document libraries include all the usual and expected document management functions including check out/in, copy to or move to and versioning.

SPO_DocumentOptionsincCheckOut.JPG

Users with Contribute or Edit permissions can view and restore versions.

SPO_DocLib_VersionHistory.JPG

If there is a requirement to know who modified what part of a document, it is recommended to enable track changes on that document.

Note, with co-authoring now available, the last person to edit the document will create the last version.

Folders and document sets

Folders should be seen as visual ‘dividers’ within a file, not as ‘hard-coded’ structures as they are in file shares.

Document sets can include additional metadata (including a document ID), making them suitable for use in breaking down a document library. However, most of the time, folders are a more logical ‘divider’ for users.

Note that both document sets and folders look the same in a synced library.

Both folders and document sets can have unique permissions.

Create and capture records

One of the best reasons for using SharePoint is the ability to create a single source of truth. That is, a single record stored in a library that multiple people can access and work on at the same time.

Having a single source of truth avoids the requirement to (a) create an initial copy on a personal drive or network file share, and (b) attach that copy to an email and send it to multiple people who are all likely to save it somewhere and also send back a changed version.

In SPO, users can create a new record directly within a document library (or in the synced library on a drive). Anyone with access to that library can access it; alternatively the document can be shared. Co-authoring means that anyone with edit access can edit the document. Every time it is edited and closed a new version is created.

If it is necessary to refer to the original from another library, the ‘Link’ option can be used.

Access controls and permissions

All SharePoint sites contain three default permission groups. Individuals will usually be added to one of these groups only, depending on their access requirements:

  • Site Owner – Full Control across the site but cannot see the Site Collection Administration section (shown above). There will normally be only two to four Site Owners. Site Owners are responsible for managing their sites.
  • Site Member – Update and edit.
  • Site Visitor – Read only.

All content on a SharePoint site inherits the default permissions above. However, at any point the default permission inheritance can be broken and unique permissions applied. This is a manual process for document libraries (via Advanced Permissions) but is applied automatically if a folder or document is shared with someone who is not in a default permission group.
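
The inheritance behaviour can be pictured with a small Python model. This is a simplified sketch and not SharePoint’s actual implementation. The class name, the permission level labels and the sample site, library and user names are all hypothetical, and the model assumes that breaking inheritance starts from a copy of the inherited permissions.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# The three default groups, with simplified permission levels.
DEFAULT_GROUPS = {
    "Site Owners": "Full Control",
    "Site Members": "Edit",
    "Site Visitors": "Read",
}


@dataclass
class Securable:
    """A site, library, folder or document in the permission hierarchy."""
    name: str
    parent: Optional["Securable"] = None
    unique_permissions: Optional[Dict[str, str]] = None

    def effective_permissions(self) -> Dict[str, str]:
        # Walk up the hierarchy until something with unique permissions is found.
        node: Optional["Securable"] = self
        while node is not None:
            if node.unique_permissions is not None:
                return dict(node.unique_permissions)
            node = node.parent
        return {}

    def share_with(self, principal: str, level: str) -> None:
        # Sharing breaks inheritance: copy what was inherited, then add the new entry.
        if self.unique_permissions is None:
            self.unique_permissions = self.effective_permissions()
        self.unique_permissions[principal] = level


site = Securable("Finance AP", unique_permissions=dict(DEFAULT_GROUPS))
library = Securable("Invoices 2019", parent=site)
folder = Securable("Supplier A", parent=library)
document = Securable("Invoice 001.docx", parent=folder)

folder.share_with("external.user@example.com", "Read")
print(library.effective_permissions())   # still only the default groups
print(document.effective_permissions())  # now includes the external user
```

Anything ‘under’ the shared folder picks up the folder’s unique permissions, while the library and site above it are unchanged.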

Note, one of the leading support issues in SharePoint is understanding and unravelling complex permissions, especially when applied to individual documents that are placed ‘under’ folders with unique permissions, in libraries with unique permissions.

Retention and disposal

Generally speaking, a SPO site collection will consist of multiple libraries, each (ideally) containing content that is specific to the activity that it relates to. For example, a library for ‘Meetings’ and one for local forms. Consequently, the records stored in document libraries may require different retention.

If O365 Classification labels are used for retention, and depending on how these are configured, these must be applied per library; the individual documents stored in the library, not the entire library as such, are then governed by the retention requirement.

It is also possible to apply a policy to an entire site collection via site policies. This option will only be useful if the entire site can be subject to a single retention requirement – for example, inactive old sites that have a range of content all likely to be covered by the same retention period, or project sites.

Once retention policies are applied to a library, users cannot delete any content in the library so it may be prudent to apply them only when the libraries are no longer used. Hopefully you will have implemented year-based libraries, which will facilitate this. Alternatively, the retention period trigger can start when the actual policy is applied.
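
The difference between the two starting points can be illustrated with a short date calculation in Python. This is a sketch for planning purposes only and does not reflect how O365 calculates disposal dates. The function names are hypothetical, and counting from 31 December of the library’s year is an assumption about how a year-based library might be treated.

```python
from datetime import date
from typing import Optional


def add_years(start: date, years: int) -> date:
    """Add whole years to a date."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no equivalent in a non-leap year; use 28 February.
        return start.replace(year=start.year + years, day=28)


def disposal_due_date(library_year: int,
                      retention_years: int,
                      policy_applied: Optional[date] = None) -> date:
    """Estimate when the content of a year-based library falls due.

    If policy_applied is given, the period starts when the policy was
    applied; otherwise it is assumed to start at the end of the
    library's year.
    """
    trigger = policy_applied if policy_applied is not None else date(library_year, 12, 31)
    return add_years(trigger, retention_years)


print(disposal_due_date(2019, 7))
print(disposal_due_date(2019, 7, policy_applied=date(2021, 3, 1)))
```

Applying the policy later pushes the due date out by the same amount, which may matter if the retention requirement is meant to run from the end of the year in which the records were created.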

It may be useful for the records manager to review the content of document libraries, and perhaps export the metadata of the library, before the content is disposed of via the O365 Security and Compliance area as any unique metadata is not visible in the Dispositions area.

When records are due to be disposed of, an email is automatically sent to whoever is in the Records Management role in the O365 Security and Compliance admin portal. The activity of reviewing and approving disposals also happens in the O365 Security and Compliance admin portal.

It may be useful to set up an ‘Archives’ SPO site to keep records of all disposal activities, including metadata from document libraries.

Note the library will remain even after the documents are destroyed. An alternative and perhaps better disposal model would be to use the notifications to alert the records manager to the records due for disposal; the records manager may then export and save the metadata in a SharePoint archives site, and then delete the library entirely.

Note that the retention of records in Exchange Online mailboxes and OneDrive may be managed differently by the organisation.

Minimising duplication of content

SharePoint allows organisations to have a single source of truth, to avoid the duplication of using NFS and then uploading to a document management system.

Users can create the record within a SharePoint library, upload it there, or use the ‘save as’ option (where they will see all their SPO sites to choose from).

The ability to share with external users (when this is enabled) also helps to reduce duplication and email attachments.

Hybrid records

As noted above, links can be created in any SPO document library to point to resources in a different location. If paper records are managed in a SPO list, the document library can include a link to that SPO list.

Syncing document libraries to File Explorer

Users with an O365 licence and Windows 10 may use the ‘sync’ option available on the ribbon menu of every library. This option syncs the document library to the user’s File Explorer from where they may continue to access and work on the documents.

Note, as discussed in this post, if there is any mandatory metadata in the library, the synced library will become read only.

End users like using the sync option as, although it doesn’t (yet) display any unique metadata on the library, it allows them to work the way they have always worked and they get the added bonus of being able to do it on any device.

eDiscovery

eDiscovery cases are created in the O365 Security and Compliance portal. Essentially, an eDiscovery case uses search and other options to find records. Once found, these records can be placed on Legal Hold, which prevents their disposal.

If a document library has no retention label applied, and all or some of the content is identified as part of the eDiscovery case with a Legal Hold, and a user deletes a record, that record remains in a hidden library but is still visible to the eDiscovery case manager. Once the Legal Hold is lifted, the record will resume the 90 day deletion process after which it will no longer be available.
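
Taken together with the retention behaviour described earlier in this post, the outcome of a user deleting a record can be summarised as a simple decision. The Python sketch below only restates what this post describes; the names are hypothetical, and the order of the checks when both a retention label and a Legal Hold apply is an assumption.

```python
from enum import Enum

DELETION_PERIOD_DAYS = 90  # deletion period quoted in this post


class DeleteOutcome(Enum):
    BLOCKED = "deletion is blocked because retention applies to the library"
    PRESERVED_ON_HOLD = "kept in a hidden library, visible to the eDiscovery case manager"
    RECYCLED = f"starts the {DELETION_PERIOD_DAYS} day deletion process"


def outcome_of_user_delete(has_retention_label: bool, on_legal_hold: bool) -> DeleteOutcome:
    """Summarise what happens when a user tries to delete a record."""
    if has_retention_label:
        # Assumption: retention is checked first when both apply.
        return DeleteOutcome.BLOCKED
    if on_legal_hold:
        return DeleteOutcome.PRESERVED_ON_HOLD
    return DeleteOutcome.RECYCLED


for label, hold in [(False, False), (False, True), (True, False)]:
    print(label, hold, "->", outcome_of_user_delete(label, hold).value)
```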

Search

Search in SPO, and across all of Office 365, is very powerful. A single click in the Search box in the user’s SPO portal will result in suggestions before anything is entered.

Searches will return anything the user has access to. The access limits plus the Artificial Intelligence (AI) engine will return different search results for different users.

Users may also take advantage of the Office Graph-powered Delve (E3 licences and above) or the Discovery option in OneDrive to see information that may be of interest to the user. This works on the basis of the various ‘signals’ between users and objects, as depicted in the graphic below.

microsoft-graph_hero-image.png