data warehouse - ETL Testing Datasets / Framework


I'm trying to create reasonable tests for our ETL processes.

I'm thinking that a reference / testing ingestion dataset is needed. I don't want to use client data (which is the other alternative here).

I would run the current ETL on the testing dataset to produce reference transformations. That way, when the source code changes, I can test against the references produced by the ETL and make sure no regressions were introduced.

I'm not sure this is the right approach. For example, if a transformation is changed in the source code, the tests that compare against the reference transformation will rightly fail. We'd then have to create a new reference dataset for that transformation. I can see this getting crazy once a team of developers starts making changes to separate transformations.

Ultimately, I need a way to produce a test dataset and test the transformations. Any ideas?

Create a test data set containing at least one row for every possible transformation outcome. You'll use this test data set as the source for every ETL test run. As new transformations or bugs come up, add rows to the test data set to cover them.
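As a minimal sketch of that idea, the test data set can be kept as plain rows, each tagged with the transformation outcome it is meant to exercise, plus a small check that no outcome is left without a row. The column names (`raw_amount`, `outcome`), the outcome labels, and the helper `uncovered_outcomes` are hypothetical and only illustrate the pattern; a real data set would list the outcomes of your own transformations.

```python
# Hypothetical outcomes of an "amount" transformation (an assumption for
# illustration; replace with the outcomes of your own ETL rules).
EXPECTED_OUTCOMES = {
    "valid_amount",
    "missing_amount",
    "negative_amount",
    "invalid_amount",
}

# The test data set: at least one source row per transformation outcome.
# When a new transformation or bug appears, add a row (and an outcome) here.
TEST_ROWS = [
    {"id": 1, "raw_amount": "12.50", "outcome": "valid_amount"},
    {"id": 2, "raw_amount": "", "outcome": "missing_amount"},
    {"id": 3, "raw_amount": "-3.00", "outcome": "negative_amount"},
    {"id": 4, "raw_amount": "abc", "outcome": "invalid_amount"},
]


def uncovered_outcomes(rows, expected_outcomes):
    """Return the outcomes that no row in the test data set exercises."""
    covered = {row["outcome"] for row in rows}
    return set(expected_outcomes) - covered


if __name__ == "__main__":
    missing = uncovered_outcomes(TEST_ROWS, EXPECTED_OUTCOMES)
    print("Outcomes without a test row:", sorted(missing))
```

Running the coverage check as part of the test suite makes it harder for someone to add a new outcome without also adding a row that exercises it.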

In the ETL destination, create tests that verify the transformation of the source data set. You'll need to test every transformation outcome to ensure complete code coverage. Since the test data set is a known and consistent source, the tests should have predictable outcomes.
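A destination-side test might look like the sketch below. It assumes an in-memory SQLite database standing in for the real destination, and a toy `transform` that converts a text amount to integer cents. The table name `fact_amount`, the functions `run_etl` and `destination_mismatches`, and the sample rows are all hypothetical. In practice, `run_etl` would be your actual ETL job pointed at a test destination.

```python
import sqlite3

# Known, fixed source rows: (id, raw_amount). Hypothetical sample data.
SOURCE_ROWS = [
    (1, "12.50"),
    (2, ""),
    (3, "abc"),
]

# Expected destination content per source id: (amount_cents, status).
EXPECTED = {
    1: (1250, "ok"),
    2: (None, "missing"),
    3: (None, "invalid"),
}


def transform(raw_amount):
    """Toy transformation: text amount -> (integer cents, status)."""
    if raw_amount == "":
        return None, "missing"
    try:
        return round(float(raw_amount) * 100), "ok"
    except ValueError:
        return None, "invalid"


def run_etl(source_rows, conn):
    """Stand-in for the real ETL job: transform rows and load them."""
    conn.execute(
        "CREATE TABLE fact_amount ("
        "id INTEGER PRIMARY KEY, amount_cents INTEGER, status TEXT)"
    )
    for row_id, raw_amount in source_rows:
        cents, status = transform(raw_amount)
        conn.execute(
            "INSERT INTO fact_amount VALUES (?, ?, ?)", (row_id, cents, status)
        )
    conn.commit()


def destination_mismatches(conn, expected):
    """Compare destination rows to expectations; return differing ids."""
    cursor = conn.execute("SELECT id, amount_cents, status FROM fact_amount")
    actual = {row[0]: (row[1], row[2]) for row in cursor}
    return {
        row_id: (expected.get(row_id), actual.get(row_id))
        for row_id in expected.keys() | actual.keys()
        if expected.get(row_id) != actual.get(row_id)
    }


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_etl(SOURCE_ROWS, connection)
    print("Mismatches:", destination_mismatches(connection, EXPECTED))
```

Because the expectations are stated per outcome rather than as a full snapshot of the ETL output, an intentional change to one transformation only requires updating the expected values for the rows that exercise it.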

Automated ETL testing isn't complex, but it is complicated and can be time consuming to set up. It also requires a disciplined development team to maintain. Good luck.

