Question 4 of 10

How do you handle high-cardinality categorical features with hundreds or thousands of unique values? What are the trade-offs of different approaches?

Sample answer preview

High-cardinality categorical features present a significant challenge because standard one-hot encoding creates an explosion of dimensions. A feature with 10,000 unique values would create 10,000 new columns, leading to memory issues, longer training times, and increased risk of…
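As a rough illustration of the dimension explosion described above, the sketch below counts the columns one-hot encoding would need and contrasts that with two single-column alternatives, frequency encoding and feature hashing. The function names, the sample data, and the bucket count of 32 are all assumptions made for this example, not part of the original answer, and a real pipeline would more likely use a library encoder.

```python
import hashlib
from collections import Counter


def one_hot_width(values):
    """Columns one-hot encoding would create: one per unique value."""
    return len(set(values))


def frequency_encode(values):
    """Replace each category with its relative frequency (a single column)."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def hash_encode(values, n_buckets=32):
    """Map each category to one of n_buckets indices using a stable hash.

    Distinct categories can collide in the same bucket; that is the
    price paid for a fixed, small output size.
    """
    return [
        int(hashlib.md5(str(v).encode("utf-8")).hexdigest(), 16) % n_buckets
        for v in values
    ]


# Hypothetical data: 10,000 distinct IDs, each appearing twice.
values = [f"user_{i}" for i in range(10_000)] * 2

width = one_hot_width(values)                # one column per unique ID
freq = frequency_encode(values)              # one numeric column
buckets = hash_encode(values, n_buckets=32)  # indices in [0, 32)
```

Frequency encoding gives up the ability to distinguish categories that occur equally often, and hashing gives up categories that collide, so both trade some information for a bounded feature count.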

high cardinality, target encoding, frequency encoding, feature hashing, embeddings, rare categories