How to Build a Bad Predictive Model


  1. Don’t pay attention to data types
    1. Treat anything that contains digits as numeric
    2. Don’t remove infrequently occurring factor levels
  2. Ignore seasonality
    1. When modeling retail data, use only a single month or quarter
    2. Holidays don’t matter
  3. Ignore effects of time when generating error metrics
    1. Mix the same time period into both your train and test sets
  4. Don’t use a holdout set to determine model parameters
  5. Don’t do counts of rows / columns
    1. As long as you don’t get an error message when you run your code, everything is OK: no rows or columns were silently removed.
  6. Don’t profile your data before building a model
  7. Assume that unknowns in the target variable of your classification model are negatives (or positives).
  8. Assume that high-scoring models don’t need to be examined
  9. Don’t worry about the time-stamping of your data
  10. Don’t set a seed – no need to reproduce the results
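
Read in reverse, a few of the items above translate directly into code. The following is a minimal sketch in plain Python, not a recipe from the list itself: the toy daily sales data, the `store` and `sales` fields, the `min_count` threshold, the cutoff date, and the helper names `collapse_rare_levels` and `time_split` are all assumptions made for illustration. It fixes a seed (item 10), folds infrequent factor levels into one bucket (item 1.2), checks row counts after a transformation (item 5), and keeps the train and test periods from overlapping (item 3.1).

```python
import random
from collections import Counter
from datetime import date, timedelta

random.seed(42)  # item 10: fix the seed so the run can be reproduced

# Hypothetical toy data: one row per day for roughly two years,
# with a store category and a sales figure.
start = date(2023, 1, 1)
rows = [
    {
        "day": start + timedelta(days=i),
        "store": random.choice(["A", "A", "A", "B", "B", "C"]),
        "sales": random.randint(50, 150),
    }
    for i in range(730)
]
rows[5]["store"] = "Z"  # force one rarely occurring level


def collapse_rare_levels(rows, key, min_count):
    """Item 1.2: replace levels seen fewer than min_count times with 'OTHER'."""
    counts = Counter(r[key] for r in rows)
    return [
        {**r, key: r[key] if counts[r[key]] >= min_count else "OTHER"}
        for r in rows
    ]


def time_split(rows, cutoff):
    """Item 3.1: train strictly before the cutoff, test on or after it."""
    train = [r for r in rows if r["day"] < cutoff]
    test = [r for r in rows if r["day"] >= cutoff]
    return train, test


n_before = len(rows)
rows = collapse_rare_levels(rows, "store", min_count=10)
# Item 5: check counts explicitly instead of trusting the absence of an error.
assert len(rows) == n_before, "rows were silently dropped"

train, test = time_split(rows, cutoff=date(2024, 7, 1))
assert len(train) + len(test) == n_before
# No day may appear on both sides of the split.
assert max(r["day"] for r in train) < min(r["day"] for r in test)
print(len(train), len(test))
```

The assertions matter more than the helpers: each one turns a silent assumption (nothing was dropped, no period leaks across the split) into a check that fails loudly. With real data, the same checks would sit after every join, filter, and reshape step.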
