Data Migration: A Fact of Life
Implementing a new application, consolidating applications and data after a merger or acquisition, or converting legacy data to an on-premises data center or the cloud is a major IT change initiative for any organization, aimed at standardizing business processes, automating back-office operations, and achieving the associated business benefits and ROI. The success of such a program depends on data migration, also known as data conversion.
Data-migration projects are unique because they are built for a single execution: all migration work is retired after go-live, whereas other projects are supported and maintained post-implementation.
The data-migration market is expected to grow from $5.14 billion in 2016 to $11.49 billion by 2022, according to market research.
Data-migration projects are risky and have a tendency to fail: research suggests 67% of data-migration projects overrun in time or cost, or fail outright. Return on investment is at risk if the data-migration implementation is delayed.
Why do data-migration projects fail?
Proposed Solutions
Organizations can minimize delay and risk by adopting a customized data-migration approach, leveraging relevant tools and technologies, looking for potential areas to automate mundane tasks, planning several mock-conversion runs to surface data quality and conversion issues before the actual cutover, and applying a set of best practices for a successful migration.
Data-migration approach – A six-step process can be followed, with a focus on data quality.
Tools & technologies –
ETL (extract, transform, load) middleware tools offer several key advantages for data migration.
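To make the extract-transform-load pattern concrete, here is a minimal sketch in plain Python. The function and field names (legacy_customers, CUST_NO, customer_id) are illustrative assumptions, not taken from any specific ETL tool; real middleware would handle connections, batching, and error handling.

```python
def extract(source_rows):
    """Pull raw records from the legacy source (here: an in-memory list)."""
    return list(source_rows)

def transform(rows):
    """Standardize field names and clean values for the target schema."""
    out = []
    for r in rows:
        out.append({
            "customer_id": int(r["CUST_NO"]),          # cast legacy text key
            "name": r["CUST_NAME"].strip().title(),    # trim and normalize case
        })
    return out

def load(rows, target):
    """Write transformed records into the target store; return rows loaded."""
    target.extend(rows)
    return len(rows)

# Example run with one legacy record:
legacy_customers = [{"CUST_NO": "101", "CUST_NAME": "  ACME CORP "}]
target_table = []
loaded = load(transform(extract(legacy_customers)), target_table)
```

The same three-stage shape scales up whether the source is a flat file, a database, or an API.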
Automation – Explore opportunities to leverage automation tools such as Tosca, Selenium, and RPA to automate time-consuming testing areas and mock-conversion runs (multiple mock-conversion runs are required before go-live).
Data Migration Framework – Explore opportunities to develop a data-conversion framework for a simplified, consistent, and cost-effective approach that accelerates the data-migration process through its various phases. Building such a framework also eases the data validation and reconciliation process. A high-level architecture for such a Data Migration Framework can, for example, be built on Azure Data Framework.
Data Migration Toolkit – Build a centralized repository where the team can refer to best practices and industry-standard templates for unit testing, code review, test cases and scenarios, requirement-gathering questionnaires, quality processes, and assets and accelerators. Explore opportunities to automate unit testing, code review, test-case creation, mock-conversion runs, source data validation, rejected-record analysis, data reconciliation, and error reporting, and make these artifacts available to the team in the data-migration toolkit.
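One reconciliation utility a toolkit like this might contain is a source-to-target comparison by row count and per-record checksum. This is a sketch under assumed inputs (lists of dicts keyed by "id"); production reconciliation would run against database extracts.

```python
import hashlib

def row_checksum(row):
    """Stable checksum over a record's sorted key/value pairs."""
    payload = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode()).hexdigest()

def reconcile(source_rows, target_rows, key):
    """Compare source and target: counts, missing records, changed records."""
    src = {r[key]: row_checksum(r) for r in source_rows}
    tgt = {r[key]: row_checksum(r) for r in target_rows}
    missing = sorted(set(src) - set(tgt))                     # never migrated
    mismatched = sorted(k for k in src.keys() & tgt.keys()
                        if src[k] != tgt[k])                  # migrated but altered
    return {"source": len(src), "target": len(tgt),
            "missing": missing, "mismatched": mismatched}

# Example: one record failed to migrate.
report = reconcile(
    [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    [{"id": 1, "name": "a"}],
    key="id",
)
```

Automating this check after every mock-conversion run turns reconciliation from a manual spreadsheet exercise into a repeatable gate.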
Best Practices / Key things to keep in mind
i. Data Migration Strategy – Decide on the business cutover strategy during the planning phase: either a “Big Bang” migration or an incremental migration in phases.
Advantages of the big-bang approach: there is no need to run the legacy (old) and new systems simultaneously, and after a successful migration the legacy system can be shut down.
Disadvantages: a long migration window may impact the business; the team must stay attentive continuously for hours or days to complete the data transfer; and the migration risks overrunning while unexpected issues are fixed. Synchronization may not be an issue, but fallback strategies can be challenging if issues are found after the migration.
Because of the risks associated with the big-bang approach, organizations are increasingly adopting the lower-risk approach of incremental migration in phases.
ii. Data Migration Scope
Structured and unstructured data – Understand the requirements for structured versus unstructured data migration, perform a volumetric analysis accordingly, and plan the migration architecture and solution. For example, migrating documents and their associated metadata can be a time-consuming exercise and can delay the overall migration because of slow document transfer between the source and target servers.
Volumetric analysis – Prepare an inventory of all sources, with the data structure type and expected data volume for each, and capture any non-functional requirements. Identify in advance all the factors that can influence the migration activities, and plan optimization and performance-tuning exercises accordingly.
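A volumetric inventory can be kept as simple structured data and aggregated by type to drive sizing decisions. The source names, counts, and sizes below are invented for illustration.

```python
def volumetric_summary(inventory):
    """Aggregate expected row counts and sizes (GB) by data structure type."""
    summary = {}
    for src in inventory:
        bucket = summary.setdefault(src["type"],
                                    {"sources": 0, "rows": 0, "gb": 0.0})
        bucket["sources"] += 1
        bucket["rows"] += src["rows"]
        bucket["gb"] += src["gb"]
    return summary

# Hypothetical inventory of migration sources:
inventory = [
    {"name": "orders_db", "type": "structured",   "rows": 5_000_000, "gb": 12.0},
    {"name": "crm_db",    "type": "structured",   "rows": 800_000,   "gb": 2.5},
    {"name": "contracts", "type": "unstructured", "rows": 40_000,    "gb": 150.0},
]
summary = volumetric_summary(inventory)
```

Even a rough summary like this makes it obvious where the transfer-time risk sits (here, the unstructured documents dwarf the structured data by size despite far fewer records).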
Data on cloud – Understand the requirements for transferring data from on-premises to the cloud: organizations now increasingly ask to migrate all data, or only non-critical business data, to the cloud to meet speed-to-market, scalability, and security requirements. Look for optimal, cost-effective network connectivity and data transfer services from on-premises to the cloud when migrating petabyte-scale datasets.
iii. Mock conversions – Introduce sufficient testing cycles and mock-conversion runs to identify data conversion issues, data issues, and data cleansing needs, and address them well ahead of the actual cutover. Set a pass percentage for each testing cycle and log all defects in an appropriate tool so they can be tracked from fix through retest to pass, ensuring a defined quality process is followed.
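The pass-percentage gate for a testing cycle can be expressed as a small check. The 90% threshold in the example is an assumed value; each program would set its own per-cycle targets.

```python
def cycle_pass_rate(results):
    """Pass percentage for one testing cycle (results: list of booleans)."""
    return 100.0 * sum(results) / len(results) if results else 0.0

def gate(results, threshold):
    """Return True when the cycle meets its target pass percentage."""
    return cycle_pass_rate(results) >= threshold

# Example cycle: 95 test cases passed, 5 failed.
results = [True] * 95 + [False] * 5
meets_90 = gate(results, 90.0)
meets_98 = gate(results, 98.0)
```

Wiring this into the mock-conversion pipeline means a cycle cannot be declared done informally; it either meets its threshold or the defects go back for fixing and retest.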
iv. Data quality – Improve data quality beforehand, not during migration. Plan to profile the data to identify key data-quality anomalies and address them by applying cleansing rules.
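A basic data profile covers null counts, distinct counts, and duplicate rows per source. This sketch assumes in-memory dict records; profiling tools do the same over database tables at scale.

```python
def profile(rows, columns):
    """Per-column null and distinct counts, plus duplicate-row count."""
    stats = {c: {"nulls": 0, "distinct": set()} for c in columns}
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(r.get(c) for c in columns)
        if key in seen:
            duplicates += 1          # exact repeat of an earlier row
        seen.add(key)
        for c in columns:
            v = r.get(c)
            if v in (None, ""):
                stats[c]["nulls"] += 1
            else:
                stats[c]["distinct"].add(v)
    return ({c: {"nulls": s["nulls"], "distinct": len(s["distinct"])}
             for c, s in stats.items()}, duplicates)

# Hypothetical sample: a missing email and a duplicated record.
rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": None},
    {"id": 1, "email": "a@x.com"},
]
stats, dupes = profile(rows, ["id", "email"])
```

Running such a profile during planning tells you which cleansing rules (de-duplication, null handling, format standardization) the migration actually needs.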
v. Data masking – Identify requirements for masking sensitive data such as PHI and PII, and write the masking program before starting unit tests, system tests, or mock-conversion runs.
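One common masking technique, sketched here, replaces sensitive values with a one-way hash token so test data stays internally consistent (the same SSN always maps to the same token) without exposing the real value. The field names and token format are illustrative assumptions.

```python
import hashlib

def mask_record(record, sensitive_fields):
    """Replace sensitive values with a one-way hash token; keep other fields."""
    masked = {}
    for k, v in record.items():
        if k in sensitive_fields and v is not None:
            digest = hashlib.sha256(str(v).encode()).hexdigest()[:12]
            masked[k] = f"MASK-{digest}"   # irreversible, but deterministic
        else:
            masked[k] = v
    return masked

# Hypothetical patient record with one PII field to mask:
record = {"patient_id": 42, "ssn": "123-45-6789", "city": "Pune"}
masked = mask_record(record, {"ssn"})
```

Deterministic tokens preserve join keys and duplicate detection across mock runs; for regulated data, salting or format-preserving encryption may be required instead.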
Future direction
A data-migration program opens the door to an ongoing data quality and data governance program.
Conclusion
Data-migration projects are unique and risky, with a tendency to fail or to overrun in time or cost. It is therefore important to define the migration scope carefully, plan the business cutover strategy, adopt the six-step data-migration process matured over time, and identify potential areas to automate mundane tasks and mock-migration runs, leading to a successful data migration and, with it, the business benefits and ROI.