Web content aggregation is an ETL (Extract, Transform, Load)
process: it extracts, analyzes, and sorts information from
different sources and finally stores the reorganized
information in a database or some kind of data warehouse.
However, there is a large gap between semantic web content
and the structured content of a database. In many
application areas there is therefore a prominent need to
extract relevant data from HTML sources and translate it into
a structured format such as XML or a suitable relational
database.
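
As a rough illustration of the HTML-to-XML translation described above, the following minimal sketch uses only the Python standard library. The page layout (headlines as links inside `<h2>` elements), the sample markup, and the names `HeadlineExtractor`, `html_to_xml`, and `SAMPLE_HTML` are assumptions made for this example and are not taken from any particular aggregation system.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class HeadlineExtractor(HTMLParser):
    """Hypothetical extractor: collects (href, text) pairs from <a> tags inside <h2>."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False   # True while the parser is inside an <h2> element
        self.href = None     # href of the link currently being read, if any
        self.text = []       # text fragments of the current link
        self.items = []      # extracted (href, title) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
        elif tag == "a" and self.in_h2:
            self.href = dict(attrs).get("href") or ""
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.items.append((self.href, "".join(self.text).strip()))
            self.href = None
        elif tag == "h2":
            self.in_h2 = False


def html_to_xml(html):
    """Extract headline links from HTML and serialize them as an XML string."""
    parser = HeadlineExtractor()
    parser.feed(html)
    parser.close()
    root = ET.Element("items")
    for href, title in parser.items:
        item = ET.SubElement(root, "item", {"url": href})
        item.text = title
    return ET.tostring(root, encoding="unicode")


# Assumed sample input; a real aggregator would fetch this from a web source.
SAMPLE_HTML = """
<html><body>
  <h2><a href="/news/1">First headline</a></h2>
  <p>Some teaser text.</p>
  <h2><a href="/news/2">Second headline</a></h2>
</body></html>
"""

print(html_to_xml(SAMPLE_HTML))
```

The extract step here is the event-driven parse, the transform step is the reduction to (URL, title) pairs, and the load step is stood in for by XML serialization; writing the same pairs to a relational table would follow the same pattern.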