Revisiting the past
It has been a little over a year and half since my last data integration blog, I had started this blog but never finished it, and now have some time to review and actually finish this series (and to help me prepare for the CRMUG in Reading on March 1st!).
My previous data integration blogs can be found here:
- Part 1: Defining Data Integration Types
- Part 2: Part 2: Data Integration Tools
- Part 3: Part 3: Which Data Integration Tool should I use?
Most of the content in the above posts are still relevant and valid; except Dynamics CRM Online is now Dynamics 365 after the re brand (yes – this was 2016).
I have decided to split out my the content for this topic into several articles, as I was writing the topic I realised that there was a lot to cover; so please be patient!
Replaying back to Part one of this series, I mentioned that Data Integration needed to be defined into the correct terminology, such as Migration, Integration or Replication.
Whilst the term helps define the data integration tools that we can use to meet that scenario, best practices can be applied to all types of data integration to ensure a consistent and uniform approach to all scenarios where integration with Dynamics 365 is required.
Data Integrations Best Practices will help:
- Reduce time spent building up successful integrations.
- Increase the performance of the integration (highly beneficial with large volume moving data).
- Reduce time spent investigating issues and working to a resolution of these issues
Part 1: Error Reporting and Integration Accounts
Types of Error
All types of data integration will at some point cause an error, there are so many working parts to an integration and errors that may arise are:
- Source Data – the quality and format maybe incorrect even though the customer the adamant that they have fixed any errors.
- String Lengths – too long for the target field.
- Decimal/Whole Number data mismatches.
- Assigned Owner does not have the correct permissions to own the new record.
- Plugins triggering on update and expects a value present.
- Target Data/Metadata – a recent change to the system may have changed the metadata of the field that is apart of the integration , the source data no longer maps correctly.
- Source or Target availability, the systems have gone offline.
- And the list goes on for a whole number of reasons.
The Learning Curve
When I first started learning how to use data integration tools, I often would focus on the core functionality that the tool offered. This would just be to perform the basic operations (such as Read, Create, Update) and to get the sample data to move from A to B.
Any errors that arose, I would ignore and investigate at the end and put it down to my lack of understanding/knowledge of the tool.
In some cases it would take many hours actually figuring what was causing the issue in the first place as there was no meaningful message presented at the outset.
As my exposure to different data migration/integration projects grew over time, if I encountered an error, I would learn new ways (based on the tool I was using) to investigate and identify these errors earlier on in integration picking up hints and tips from online forums and my colleagues. Ultimately, I leaned how to handle and prevent some of these error from ever occurring by building conditional logic and extra validation checks.
When you an encounter an error, there are two actions that need to be completed:
- Investigation – The integration is failing and you do not know why – so time will be required to investigate what is causing failure and identifying the error
- Resolution – Can you prevent/handle the error as part of your integration or how else will you resolve the error.
If you are using 3rd party data integration tools, error reporting functionality should be provided as standard, and that if used correctly can, be very useful.
Scribe Insight has several tables in its internal database (known as the execution/transaction log) where it stores captured details for every data transaction (row) that it handles, the errors provide some very descriptive detail when an error is encountered (e.g. the length of the value XYZ is greater than the length of field 123). You can also assign messages/output text to Steps/Actions that Scribe takes throughout the DTS package which you can additionally write back to your source data to help you read the errors in more meaningful language to you.
Additionally Scribe has some out of the box SQL SSRS reports which you can view from the application which can provide useful statistics on the import and help you group types of errors or data integration failures.
Whilst KingswaySoft is built on top of SSIS, this can output failed rows along with the errors whilst also parsing some very good error details and description to the failed rows (along with additional row information). This can be outputted to custom Excel spreadsheets and tables or a separate Error Reporting Database and Tables (my preference as I have SQL scripts to recreate tables and schema for me!), then it is just a case of mapping these rows to the respective columns. This will allow you to view these errors in a meaningful way and help you diagnose the issues quicker.
With both tools, you could even have error logging Entities inside of Dynamics 365.
When either tool encounters an error, you can write the record of the transaction to Dynamics 365 for visibility to Users in the Application (unless CRM down is the issue!) and the Users can start the investigation.
Whilst you will not be able to prevent some or all of the data related errors (i.e. the data is the customers and they should be resolving the data issues or instructing you what to do) you can prevent some of the errors you encounter.
Both packages allow you to determine what happens to each transaction if the tool encounters an error (data or system generated) – i.e. error output rows. Whilst the default action is to record these errors into your error capture area and move onto the next row, you could re-process the data with another set of steps the tool must take to try and correct itself.
For example, if you have an integration which imports Accounts, Contacts, Opportunities and Products, and each of these processes are happening at different times such as the Sales Process Load below. What happens if a required record (such as Customer) does not exist if you are loading opportunities? If the Opportunity is created without the Customer value present (a required field), this will cause an error.
Q. How could this scenario be handled as part of the integration?
A. By conditionally checking for the customer record to exist before you set it, instead of making the assumption that this will exist will prevent an error for being generated. This row could then be re-directed for a second attempt in the next Opportunity data load or wait a few minutes for the Account data load to catch up (Send to Retry).
You will not be able to cater for every scenario, additional checks do come with added performance hits which I discuss later on in a follow up article discussing performance and are not always necessary.
Be proactive and think about these scenarios, build the data load logic to handle and prevent these types of errors from occuring.
Learn how to use the error functionality of these tools as part of using the product, they will save you invaluable time further on down the road and allow you to identify the root cause quicker.
- Report the Error in a meaningful format
- Handling of errors or scenarios to prevent errors
Some of the free tools, such as the OOB import wizard unfortunately do not always present us with a nice detailed error or have the ability to retry, they may even just give us a code! So when choosing a data integration tool, think about the requirement for error handling and whether investing in a 3rd party tool would outweigh the time spent investigating potential errors with Free tools (a consultant spending a day or two on an error may cost the same as a 3rd party tool!).
The use of service accounts with Dynamics 365 is nothing new these days and is common practice with On Premise Dynamics 365 (CRM) installations to run CRM Services or external web services that connect to Dynamic 365 programmatically.
When discussing service accounts with a data integration context, always consider the size and complexity of the integration/migration and whether this would offer real value or not.
For small scale one-off data migrations or using the OOB import Wizard/Configuration Migration Utility tool , service accounts are generally not required; and for the example of the OOB Import Wizard – cannot be used as they cannot be used to access features through the user interface.
For larger data migrations or data integrations (where data will be integrated with one or more external systems over a period of time), a service account should be considered as best practice.
Service Accounts provide:
- Tractability of integration: it will be easier to identify data that has been modified by an Service Account especially with Auditing and the Createdon/ModifiedOn fields in Dynamics 365.
- Full control of which applications use the service account and can reduce the number of concurrent connections used to the Dynamics 365 server at any one time as to not interfere with connection timeouts
- No Front End access, this will stop any one with the credentials from logging into the application and using the product with the Web Client.
Service Account in Dynamics 365 Online are know as Non Interactive users, you are allowed to setup 5 Non interactive user accounts per Office 365 Tenant. I recommend creating these in your first instance so that if you copy the instance, then you will not have to re-configure this User as a non interactive User.
As they are free, there is no reason why you should not use one and are very easy to setup!
Azure AD account (Online Only)
If considering an data integration/migration with Dynamics 365 Online, ask for an account with the default .onmicrosoft.com domain. (e.g. firstname.lastname@example.org). If a company has their on premise AD domain synced with Azure AD and something corrupts/interupts the sync and those Users cannot log into Office 365. The default .onmicrosoft.com would remain unaffected and the data integration would not experience the same issue (unless the source system was affected by the AD issue!!).
Think about the Dynamics Security Role the integration account actually requires to complete its assigned task. At the beginning of a data integration, System Administrator will get over the initial hurdle permission issues (especially in the increase in security requirements are drawn out) and may help reduce the initial build time for the integration. For UAT and Production, the integration account should only require the permissions it actually needs to complete its task.
- This would stop this user/integration from being used to update/delete/create other data sets that it was not intended for. Essentially it locks down a potential weak spot in the security architecture.
- The account would only see the metadata it requires for the data integration (which would help with metadata refreshes at build level also).
I would generally name an integration account along the lines of “Data-source Integration User” or something that would easily identify the account which is being used to integrate the data-source.
So to summarise the Integration Account:
- Think about using a Service Account (or Non Interactive Account) for data integrations.
- Use a default .onmicrosoft.com Azure AD account if allowed to. (Online only)
- Security Role: Give only the permissions needed to perform the data integration.
- Choose a good name!
I have many many for tips, tricks and best practices still to give, such as (not the full list either):
- Performance improvements (increase that Rows per Minute counter) and other considerations
- Matching for data.
- Maintenance Schedules.
Thanks for reading,