QUERY OPTIMIZATION
In parallel database systems, query processing and optimization techniques have to
address difficulties arising from the fragmentation and distribution of data. To deal with
fragmentation, data localization techniques are used, where an algebraic query, which is specified
on global relations, is transformed into one that operates on fragments rather than global
relations. In the process, opportunities for parallel execution are identified and unnecessary
work is eliminated. Localization requires the optimization of global operations, which is
undertaken as part of global query optimization. This in turn involves permuting the order of
operations in a query, determining the execution sites for the various distributed operations, and
identifying the best distributed execution algorithm for each of them [6].
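
The localization step described above can be illustrated with a minimal sketch. It assumes a global relation that is horizontally fragmented by key range; the names Fragment, localize and select_range, and the sample EMP data, are hypothetical and serve only as illustration. A range selection on the global relation is rewritten into selections on the individual fragments, and fragments whose key range cannot satisfy the predicate are dropped, which is one way unnecessary work is eliminated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One horizontal fragment of a global relation (hypothetical layout)."""
    name: str
    lo: int        # inclusive lower bound of the fragmentation key
    hi: int        # exclusive upper bound of the fragmentation key
    rows: tuple    # rows stored as (key, payload) pairs


def localize(fragments, key_lo, key_hi):
    """Keep only fragments whose key range overlaps [key_lo, key_hi)."""
    return [f for f in fragments if f.lo < key_hi and key_lo < f.hi]


def select_range(fragments, key_lo, key_hi):
    """Evaluate a range selection on the fragments instead of the global relation."""
    result = []
    for f in localize(fragments, key_lo, key_hi):
        # Each surviving fragment could be scanned at its own site.
        result.extend(r for r in f.rows if key_lo <= r[0] < key_hi)
    return result


# Assumed sample data: a relation EMP split into three key ranges.
EMP = [
    Fragment("EMP1", 0, 100, ((10, "ann"), (90, "bob"))),
    Fragment("EMP2", 100, 200, ((120, "cy"), (150, "dee"), (190, "eve"))),
    Fragment("EMP3", 200, 300, ((250, "fay"),)),
]
```

With this data, a selection on keys in [120, 180) only needs to touch EMP2; the other two fragments are pruned before any row is read.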
Query execution techniques can be classified into two forms of parallelism:
i) Intra-operator parallelism - One operation is parallelized over several processors. This can
be achieved by partitioning the data among the processors. Each processor then carries out
the query on its own data set, and the partial results are combined into the final result.
ii) Inter-operator parallelism - Here, several operations are executed simultaneously, each
processor carrying out one operation. There are two forms of inter-operator parallelism:
independent and pipelined [7] [8].
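
The two forms of parallelism listed above can be contrasted in a short sketch. It is an illustration under stated assumptions, not a description of any particular system: threads from Python's standard concurrent.futures module stand in for processors, the relation is a plain list of integers, and all function names are hypothetical. The generator pipeline only shows the dataflow of pipelined execution (the consumer starts before the producer has finished); it does not by itself run on several processors.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Round-robin split of the rows into n partitions."""
    return [rows[i::n] for i in range(n)]


def local_sum_even(part):
    """The single operation that every worker runs on its own partition."""
    return sum(x for x in part if x % 2 == 0)


def intra_operator_sum(rows, workers=4):
    """Intra-operator: one operation, many workers, partial results combined."""
    parts = partition(list(rows), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(local_sum_even, parts))
    return sum(partials)


def scan(rows):
    """Producer operator: emits rows one at a time."""
    for r in rows:
        yield r


def select_even(stream):
    """Consumer operator: filters rows as soon as they arrive."""
    for r in stream:
        if r % 2 == 0:
            yield r


def pipelined_sum(rows):
    """Inter-operator, pipelined: select consumes scan output row by row."""
    return sum(select_even(scan(rows)))


def independent_ops(rows_a, rows_b):
    """Inter-operator, independent: two unrelated operations run side by side."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total = pool.submit(sum, rows_a)
        largest = pool.submit(max, rows_b)
        return total.result(), largest.result()
```

Both intra_operator_sum and pipelined_sum compute the same aggregate; they differ only in whether the work is split by data (partitions) or by operator (stages).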
Various implementations exist for achieving intra-operator parallelism. An approach
known as de-clustering can be used to partition the query into fragments which can then be
allocated to the processors. This approach is most useful when optimizing complex queries.
Various sections of a query can be split off, and each part can be allocated to a separate
processor. The various query subparts can also be allocated such that those sharing
the same data set are placed on adjacent or nearby processors. This helps in reducing the
costs associated with moving the query results from one stage to another [9].
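
The allocation idea in the preceding paragraph can be sketched as follows. This is a simplified, hypothetical heuristic rather than the method of [9]: subqueries are grouped by the data set they read, and each group is placed as a whole on the currently least-loaded processor, so that subparts sharing a data set end up on the same processor. The function name allocate and the sample subqueries are assumptions made for the example.

```python
from collections import defaultdict


def allocate(subqueries, num_processors):
    """Map each subquery name to a processor id.

    subqueries: list of (name, dataset) pairs.
    Subqueries that read the same dataset are kept on the same processor.
    """
    by_dataset = defaultdict(list)
    for name, dataset in subqueries:
        by_dataset[dataset].append(name)

    load = [0] * num_processors
    placement = {}
    # Place the largest groups first; ties are broken by dataset name.
    groups = sorted(by_dataset.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for dataset, names in groups:
        # Pick the least-loaded processor (lowest id on ties).
        target = min(range(num_processors), key=lambda p: (load[p], p))
        for name in names:
            placement[name] = target
        load[target] += len(names)
    return placement


# Assumed sample workload: q1 and q3 both read the "orders" data set.
SUBQUERIES = [
    ("q1", "orders"),
    ("q2", "customers"),
    ("q3", "orders"),
    ("q4", "parts"),
]
```

A real optimizer would also weigh the estimated cost of each subpart and the distance between processors; the sketch only captures the co-location of subparts that share data.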