Feature Article: June 2017
Check out Craig S. Mullins’ blog on data + database technology. more>
Better data warehouse testing has gained in need; qualified testers are in demand. Today there is an increase in business mergers, data center migrations, compliance regulations along with management's greater focus on data and data driven decision making. Among data warehouse testing focus areas are the ETL (extraction, transformation, load) process, business intelligence infrastructures, and applications that rely on data warehouses.
Organizational decisions greatly depend on the enterprise records in data warehouses and that data must be of the utmost quality. Complex business rules and transformation logic, built using ETL logic, demand diligent and thorough testing.
Planning for the DWH Testing Process
A good understanding of data modeling and source to target data mappings help equip the QA analyst with information to develop an appropriate testing strategy. Hence, it’s important that during the project’s requirements analysis phase, the QA team works to understand technical implementation of the DWH.
Different stages of the DWH implementation (source data profiling, DWH design, ETL development, data loading and transformations, etc.), require the testing team’s participation and expertise. Unlike some traditional testing, test execution should not start at the end of the DWH implementation. In short, test execution itself has multiple phases and should be staggered throughout the life cycle of the DWH implementation (see Figure 1 below).
A key element contributing to the success of the DWH solution is the ability of the test team to plan, design and execute a set of effective tests that help identify multiple issues related to data inconsistency, data quality, data security, failures in the extract, transform and load (ETL) process, performance related issues, accuracy of business flows and fitness for use from an end user perspective.
Overall, the primary focus of testing should be on the end-to-end ETL process. This includes, validating the loading of all required rows, correct execution of all transformations and successful completion of the data cleansing operation. The team should also thoroughly test stored procedures and processes like those of Netezza that produce aggregate and summary tables.
The challenges of planning DWH tests: Listed here are just a few of many reasons to thoroughly test the DWH and use a QA process that is specific to data and ETL testing:
Figure 1. Proposed lifecycle DWH testing process
(graphic courtesy Raj Kamal, Microsoft Corp.)
By Wayne Yaddow
Data and ETL Quality Analyst
Testing Phases To Be Considered For The DWH QA Strategy
Figure 1 shows a representative DWH implementation from identification of source data (upper left) to report and portal reporting (upper right). In between, several typical phases of the end to end DWH development process are depicted such as source extract to staging, dimension data to the operational data store (ODS), fact data to the DWH and report and portal functions extracting data for display and reporting. The graphic illustrates that all data movement programs and resulting data loads should be verified throughout the end-to-end QA process.
Planning for QA staffing: Since a DWH primary handles data, a major portion of the test effort is spent on planning, designing and executing tests that are data oriented. Planning and designing most of the test cases requires the test team to have experience in SQL and performance testing.
Following are several tester skills that are in demand for DWH testing. QA staff for the DWH should be considered based on these abilities:
Best practice DWH testing is needed as organizations seek to develop, migrate, or consolidate their data warehouses. It’s vital that data and systems are tested systematically for errors, bugs, and inconsistencies before production.
One of the greatest risks to the success of any company implementing a business intelligence system is rushing the data warehouse into service before testing effectively with an experienced QA ETL testing team. Whether you are expanding your data warehouse or building one from the ground up, it is important to develop a well-planned and executed DWH testing process.
Wayne Yaddow is a data and ETL quality analyst with Data Gaps (datagaps.com).
About the Author:
|What is a Database?|
|The History and Future of Database Change Management|
|Fixing Corrupt Microsost Access Databases|
|How to Work Remotely and Still Be The Best|
|Getting in Touch with Big Data|
|Planning for Effective Data Warehouse Testing|
|Social Data Has Become Social Big Data|
|The Future of Data Centers: Achieving Agility in a Rapidly Shifting World|
|Here’s a News Year’s Resolution: Master Your Database|
|Making the Grade: Cost Savings Upgrades for Today's Data Center|
|How to Choose the Best DBA for Your Company|
|Virtualization: Wading Through the Deluge of Data|
|SQL Databases and Network Attached Storage|
|Why Big Data Needs Cloud|
|Ten reasons why you should use data models to build apps|
|Beware Big Schema|
|How to Implement Successful Data Integration Cross-Regionally|
|Forging a Path Beyond Hadoop - Software Database Mgmt Sys for Big Data Analytics|
|Database Tips and Tricks|
|Why Data Still Matters|