Combine Data from Multiple Databases Into 1 Table
Question: if I have different customers, each with their own DBs with similar structure, how do I combine/append them into a single data-warehouse table?
- All goes into DW-T1, with additional
Approach: multiple data imports, and a transform/view
We suggest to create multiple data imports to load data from different customers tables into different tables, like:
- C1-T1 go to
- C2-T1 go to
- C3-T1 go to
And create a data transform that governs these customer’s data:
-- data transform that writes to dw.master_t1 table select *, 'c1' as customer_id from dw.c1_t1 union all select *, 'c2' as customer_id from dw.c2_t1 union all select *, 'c3' as customer_id from dw.c3_t1
You can also convert this data transform into a incremental transform, so that each time it’ll only capture the latest data for each customer table.
This approach is modular, reducing the chance of data inconsistency, and is also easy to modify at every step along the way.
So the total steps needed are:
- Create N data imports for N customers tables
- Create 1 data transform (query above) to group N tables into 1 master table