At an econometric level, the problem with VAR is whether the (properly integrated) processes we observe are (weakly) stationary. If they are weakly stationary, then ergodic theory tells us that we can estimate parameters with a confidence level in some proportion to the sample size. Assuming "stationarity," for higher-dimensional processes, such as a vector of uncorrelated security returns, and for Markov-switching distributions with strong asymmetry, we may need centuries, sometimes hundreds of centuries, of data. Some people compute a monstrous covariance matrix from a limited number of data points, then make up additional data through a poor application of the bootstrap technique (or the Geman and Geman Gibbs sampler). Clearly no amount of quantitative sophistication will expand your information set; by a similar argument, no amount of mathematical knowledge will help me estimate someone's phone number.
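A minimal numerical sketch of the covariance point, with hypothetical sizes (100 securities, 60 observations): when there are fewer observations than assets, the sample covariance matrix is rank-deficient, and resampling the same data with the bootstrap cannot raise its information content above what the original sample contains.

```python
import numpy as np

rng = np.random.default_rng(0)

n_assets, n_obs = 100, 60  # hypothetical: more securities than data points

# Simulated i.i.d. daily returns for n_assets securities over n_obs days
returns = rng.normal(0.0, 0.01, size=(n_obs, n_assets))

# The 100 x 100 sample covariance matrix: after centering, its rank
# cannot exceed n_obs - 1 = 59, so it is singular and non-invertible.
cov = np.cov(returns, rowvar=False)
rank = np.linalg.matrix_rank(cov)
print(rank)  # 59

# Bootstrapping rows of the SAME sample manufactures more "days", but every
# resampled row lies in the span of the original 60 rows, so the rank of the
# bootstrapped covariance matrix is still capped by the original sample.
boot = returns[rng.integers(0, n_obs, size=500)]  # 500 resampled "days"
boot_cov = np.cov(boot, rowvar=False)
print(np.linalg.matrix_rank(boot_cov))  # still at most 59, never 100
```

The bootstrap reshuffles the information already in the sample; it does not create the centuries of data that a high-dimensional, asymmetric process would require.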