Servers typically ignore those, but they pollute the results of information extraction. The error categories are mutually exclusive, because our mock server code tests for incorrect endpoints first and rejects nonconforming URLs immediately, before any query parameters are examined. The numbers for overall errors are the sum of endpoint and query errors. As Table I shows, the log files contain varying, and occasionally very high, levels of noise. Note, though, that a large number of errors does not necessarily imply that our method will produce bad results. For example, the same superfluous query parameter may appear across multiple URLs and thus be counted many times, while its overall impact remains limited.
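The classification order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual mock server code: the endpoint table `KNOWN_ENDPOINTS`, the function names `classify` and `error_counts`, and the rule that any unlisted parameter counts as a query error are all hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Hypothetical API description: endpoint path -> permitted query parameters.
KNOWN_ENDPOINTS = {
    "/users": {"page", "limit"},
    "/orders": {"status"},
}


def classify(url: str) -> str:
    """Return 'endpoint_error', 'query_error', or 'ok' for one logged URL."""
    parts = urlsplit(url)
    allowed = KNOWN_ENDPOINTS.get(parts.path)
    # The endpoint is checked first; a nonconforming URL is rejected here,
    # so it can never also be counted as a query error.
    if allowed is None:
        return "endpoint_error"
    # Assumption: a query error is any parameter not listed for the endpoint.
    params = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    if not params <= allowed:
        return "query_error"
    return "ok"


def error_counts(urls):
    """Tally the categories; overall errors are the sum of the two error types."""
    counts = Counter(classify(u) for u in urls)
    counts["overall_error"] = counts["endpoint_error"] + counts["query_error"]
    return counts
```

Because the endpoint check returns early, each URL lands in exactly one category, which is what makes the simple sum for overall errors valid. A URL with both an unknown endpoint and a superfluous parameter is counted once, as an endpoint error.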