Combine Data from Multiple Databases Into 1 Table

Question: if I have different customers, each with their own DBs with similar structure, how do I combine/append them into a single data-warehouse table?

  • C1-T1
  • C2-T1
  • C3-T1
  • All goes into DW-T1, with additional customer_id column

Approach: multiple data imports, and a transform/view

We suggest to create multiple data imports to load data from different customers tables into different tables, like:

  • C1-T1 go to dw.c1_t1
  • C2-T1 go to dw.c2_t1
  • C3-T1 go to dw.c3_t1

And create a data transform that governs these customer’s data:

-- data transform that writes to dw.master_t1 table

select *, 'c1' as customer_id
from dw.c1_t1

union all

select *, 'c2' as customer_id
from dw.c2_t1

union all

select *, 'c3' as customer_id
from dw.c3_t1

You can also convert this data transform into a incremental transform, so that each time it’ll only capture the latest data for each customer table.

This approach is modular, reducing the chance of data inconsistency, and is also easy to modify at every step along the way.

So the total steps needed are:

  1. Create N data imports for N customers tables
  2. Create 1 data transform (query above) to group N tables into 1 master table