The most commonly used term weighting scheme is TF-IDF (Term Frequency-Inverse Document Frequency), which is based on empirical observations regarding text (Salton, 1989):
Rare terms are not less relevant than frequent terms (IDF assumption).
Multiple occurrences of a term in a document are not less relevant than single occurrences (TF assumption).
Long documents are not preferred to short documents (normalization assumption).
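
The three assumptions above can be illustrated with a minimal sketch. The text does not give an exact formula, so the one used here is an assumption: term frequency is the raw count divided by the count of the most frequent term in the document, IDF is log(N / n_k) where N is the number of documents and n_k is the number of documents containing the term, and each document vector is scaled to unit Euclidean length. The function name tfidf_vectors and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formula (not taken from the source text):
      tf  = count / max count in the document   (TF assumption)
      idf = log(N / n_k)                        (IDF assumption)
      weights divided by the vector's L2 norm   (normalization assumption)
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        max_count = max(counts.values()) if counts else 1
        raw = {
            term: (count / max_count) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Unit-length scaling so long documents gain no advantage.
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append(
            {term: (w / norm if norm > 0 else 0.0) for term, w in raw.items()}
        )
    return vectors


# Hypothetical toy corpus, already tokenized on whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the dog barked".split(),
]
vectors = tfidf_vectors(docs)
```

In this toy corpus, "the" occurs in every document, so its IDF is log(1) = 0 and it receives zero weight everywhere, while a term found in only one document, such as "sat", outweighs "cat", which appears in two. Other TF-IDF variants (smoothed IDF, logarithmic TF) would fit the same assumptions equally well.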