Question 4 of 10
How do you handle high-cardinality categorical features with hundreds or thousands of unique values? What are the trade-offs of different approaches?
Sample answer preview
High-cardinality categorical features present a significant challenge because standard one-hot encoding creates an explosion of dimensions. A feature with 10,000 unique values would create 10,000 new columns, leading to memory issues, longer training times, and increased risk of…
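To make the dimension explosion concrete, here is a minimal sketch in plain Python. It compares the column count one-hot encoding would need against two of the alternatives named in the tags for this question: frequency encoding and feature hashing. The data, the helper names (`one_hot_width`, `frequency_encode`, `hash_bucket`), and the bucket count of 32 are all hypothetical choices made for illustration, not part of the original answer.

```python
import hashlib
from collections import Counter


def one_hot_width(values):
    """Columns one-hot encoding would create: one per unique value."""
    return len(set(values))


def frequency_encode(values):
    """Replace each category with its relative frequency (one numeric column)."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def hash_bucket(value, n_buckets=32):
    """Map a category to one of n_buckets columns using a stable hash.

    n_buckets=32 is an arbitrary illustrative choice; distinct categories
    can collide in the same bucket.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


# Hypothetical data: 10,000 distinct IDs, with one ID repeated heavily.
values = [f"user_{i}" for i in range(10_000)] + ["user_0"] * 10_000

width = one_hot_width(values)           # one column per unique ID
freq = frequency_encode(values)         # a single column of floats
buckets = {hash_bucket(v) for v in values}

print("one-hot columns:", width)
print("frequency-encoded columns:", 1)
print("hashed columns (upper bound):", 32, "used:", len(buckets))
```

The trade-off shows up directly: frequency encoding collapses everything into one column but gives identical codes to categories that occur equally often, while hashing caps the width at a fixed size at the cost of collisions between unrelated categories.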
high cardinality, target encoding, frequency encoding, feature hashing, embeddings, rare categories