These activity traces are embedded in search queries [36,50–76],
social media messages [77–92], and web server access logs
[34,72,93]. At a basic level, traces are extracted by counting query
strings, words or phrases, or web page URLs that are related to the
metric of interest, forming a time series of occurrences for each
item. A statistical model is then created that maps these input time
series to a time series estimating the metric’s changing value. This
model is trained on time period(s) when both the internet data and
the true metric values are available and then applied to estimate
the metric value over time period(s) when the true values are not
available, i.e., forecasting the future, nowcasting the present, and
anti-forecasting the past (the latter two being useful in cases where
availability of the true metric lags real time).
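
The count-then-fit procedure described above can be illustrated with a minimal sketch. It is not the method of any cited study: the function names (`count_series`, `fit_line`), the item name `flu_page`, and all numbers are hypothetical and made up for illustration, and a single-input least-squares line stands in for whatever statistical model a real system would use over many input series.

```python
from collections import Counter


def count_series(log, terms, n_periods):
    """Count occurrences of each tracked term in each time period.

    `log` is an iterable of (period_index, item) pairs, for example one
    entry per search query or page request (hypothetical layout).
    Returns {term: [count in period 0, count in period 1, ...]}.
    """
    counts = Counter((t, item) for t, item in log if item in terms)
    return {term: [counts[(t, term)] for t in range(n_periods)] for term in terms}


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up access log: weekly requests for one metric-related page,
# plus one unrelated request that the counting step should ignore.
weekly_hits = {0: 2, 1: 4, 2: 6, 3: 5}
log = [(week, "flu_page") for week, n in weekly_hits.items() for _ in range(n)]
log.append((1, "unrelated_page"))

series = count_series(log, ["flu_page"], n_periods=4)
x = series["flu_page"]

# True metric values exist only for weeks 0-2 (the training period).
metric_known = [10.0, 20.0, 30.0]
slope, intercept = fit_line(x[:3], metric_known)

# Apply the fitted model to week 3, where the true value is not yet available.
nowcast = slope * x[3] + intercept
print(series, nowcast)
```

In this toy setup the model is fit on weeks 0 to 2, where both the counts and the true metric are known, and then applied to week 3, where only the counts exist, which corresponds to the nowcasting case.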