I've learned to always pit my model against ...

2016-08-24T19:08:28.461-06:00

I've learned to always pit my model against a random predictor and an averaging predictor. If your regressor or classifier can't beat a simple average (or worse, totally random guessing), well... there's no point continuing until you find more signal.
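To make that concrete, here is a rough sketch of the kind of sanity check I mean, for the regression case. The function names (`mse`, `baseline_report`), the toy numbers, and the choice of uniform sampling over the training range as the "random" predictor are all my own assumptions for illustration, not a standard recipe.

```python
import random
import statistics


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def baseline_report(y_train, y_test, model_preds, seed=0):
    """Score a model's predictions next to two trivial baselines.

    Assumed baselines (illustrative choices):
      - "mean": always predict the training-set mean
      - "random": draw uniformly between the training min and max
    """
    rng = random.Random(seed)
    mean_value = statistics.fmean(y_train)
    lo, hi = min(y_train), max(y_train)

    mean_preds = [mean_value] * len(y_test)
    random_preds = [rng.uniform(lo, hi) for _ in y_test]

    return {
        "model": mse(y_test, model_preds),
        "mean": mse(y_test, mean_preds),
        "random": mse(y_test, random_preds),
    }


# Toy numbers, made up purely for illustration.
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [2.0, 3.0, 4.0]
model_preds = [2.1, 2.9, 4.2]  # pretend these came from your model

report = baseline_report(y_train, y_test, model_preds)
beats_baselines = report["model"] < min(report["mean"], report["random"])
print(report, beats_baselines)
```

If `beats_baselines` comes out false on held-out data, that's the cue to go looking for more signal before tuning anything else. For a classifier, the analogous baselines would be majority-class and random-label guessing.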

I also like benchmarking against VW (Vowpal Wabbit) or a Random Forest with 500 trees. That gives an estimate of how hard the problem is and of how well your optimization is doing compared with very standard modeling techniques.

I really dig the data-halving trick. Have to try that out soon.

Comments on natural language processing blog: Debugging machine learning
