Question 1
Walk me through how you choose an incremental strategy in dbt. What are the trade-offs between append, merge, and delete+insert?
Model Answer
I start by asking what the data looks like and what correctness guarantees I need. For
event streams where records never change, append or
insert_overwrite with partitions is cheapest because I load new rows and
move on. When records can be updated after the fact, such as order status changes, I use
merge, which upserts on a unique key. The trade-off is that merge often
requires a full scan of the target table to find matching keys, which gets expensive at
scale. Delete+insert is my middle ground: I delete the target rows in the window covered
by the new batch, typically a few recent partitions, and re-insert the corrected data.
That limits the scan to the affected window while still handling late updates. I default
to merge for correctness, then optimize to delete+insert once I understand the partition shape.
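
To make the difference between the three strategies concrete, here is a minimal Python sketch that models a target table as an in-memory list of rows. It is a toy illustration of the semantics described above, not how dbt or any warehouse implements them. The column names (order_id, status, order_date) and the function names are hypothetical, and delete_insert models the partition-window variant from the answer, assuming each batch carries every row for the partitions it touches.

```python
from datetime import date

# Toy model: a "table" is a list of row dicts. All names here are hypothetical.

def append(target, batch):
    """Append: insert every incoming row and never touch existing rows."""
    return target + batch

def merge(target, batch, key):
    """Merge: replace rows whose key already exists, insert the rest."""
    by_key = {row[key]: row for row in target}
    for row in batch:
        by_key[row[key]] = row  # upsert on the unique key
    return list(by_key.values())

def delete_insert(target, batch, partition_col):
    """Delete+insert: drop target rows in the partitions the batch covers,
    then insert the batch. Assumes the batch is complete for those partitions."""
    partitions = {row[partition_col] for row in batch}
    kept = [row for row in target if row[partition_col] not in partitions]
    return kept + batch

target = [
    {"order_id": 1, "status": "placed", "order_date": date(2024, 1, 1)},
    {"order_id": 2, "status": "placed", "order_date": date(2024, 1, 2)},
]
# Order 2 changed status after the first load, and order 3 is new.
batch = [
    {"order_id": 2, "status": "shipped", "order_date": date(2024, 1, 2)},
    {"order_id": 3, "status": "placed", "order_date": date(2024, 1, 2)},
]

appended = append(target, batch)                                     # order 2 appears twice
merged = merge(target, batch, key="order_id")                        # order 2 is updated in place
replaced = delete_insert(target, batch, partition_col="order_date")  # the 2024-01-02 partition is rebuilt
```

In this sketch, append leaves a stale duplicate of order 2, which is why it only suits immutable event data. Merge and delete+insert both end with three correct rows, but merge has to consider every key in the target, while delete+insert only touches the partitions present in the batch.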
Why Interviewers Ask This
They want to know if you understand the cost and correctness trade-offs of each strategy, not just that the configuration exists.