Why is this extra step important in this model? The problem is called overfitting: If we supply too much data into our model creation, the model will actually be created perfectly, but just for that data. Remember: We want to use the model to predict future unknowns; we don't want the model to perfectly predict values we already know. This is why we create a test set. After we create the model, we check to ensure that the accuracy of the model we built doesn't decrease with the test set. This ensures that our model will accurately predict future unknown values. We'll see this in action using WEKA.