Question 6 of 10

What is data leakage in feature engineering, and how do you prevent it? Describe specific scenarios where leakage commonly occurs.

Sample answer preview

Data leakage occurs when information that would not be available at prediction time, such as the target itself or statistics computed from held-out data, is used to build the model. The result is an overly optimistic performance estimate that does not generalize to real-world deployment.

data leakage, target leakage, train-test contamination, temporal leakage, feature selection, cross-validation
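
As a rough illustration of train-test contamination, the sketch below standardizes one feature in two ways: once with statistics computed over all rows (leaky) and once with statistics learned from the training rows only (safe). The feature values, the split point, and the helper names `fit_scaler` and `transform` are hypothetical and chosen only for this example; it uses the Python standard library and is not tied to any particular ML framework.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn standardization parameters from the given values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 if the variance is zero
    return mu, sigma


def transform(values, mu, sigma):
    """Apply previously learned parameters without re-estimating them."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values in time order; the last two rows are held out.
feature = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]
train, test = feature[:4], feature[4:]

# Leaky: the mean and standard deviation see the held-out rows,
# so information from the test set shapes the training features.
leaky_mu, leaky_sigma = fit_scaler(feature)
leaky_train = transform(train, leaky_mu, leaky_sigma)

# Safe: parameters come from the training rows only and are then
# reused unchanged on the held-out rows.
safe_mu, safe_sigma = fit_scaler(train)
safe_train = transform(train, safe_mu, safe_sigma)
safe_test = transform(test, safe_mu, safe_sigma)

print("leaky mean:", leaky_mu)
print("safe mean:", safe_mu)
```

The same fit-on-train, apply-to-test discipline carries over to cross-validation: any learned preprocessing step would be refit inside each training fold rather than once on the full dataset.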
