Abstract
A major cost in executing queries in a distributed database system is the data transfer cost incurred in transferring relations (fragments) accessed by a query from different sites to the site where the query is initiated. The objective of a data allocation algorithm is to determine an assignment of fragments at different sites so as to minimize the total data transfer cost incurred in executing a set of queries. This is equivalent to minimizing the average query execution time, which is of primary importance in a wide class of distributed conventional as well as multimedia database systems. The data allocation problem, however, is NP-complete, and thus requires fast heuristics to generate efficient solutions. Furthermore, the optimal allocation of database objects highly depends on the query execution strategy employed by a distributed database system, and the given query execution strategy usually assumes an allocation of the fragments. We develop a site-independent fragment dependency graph representation to model the dependencies among the fragments accessed by a query, and use it to formulate and tackle data allocation problems for distributed database systems based on query-site and move-small query execution strategies. We have designed and evaluated evolutionary algorithms for data allocation for distributed database systems.
Original language | English |
---|---|
Pages (from-to) | 5-32 |
Number of pages | 28 |
Journal | Distributed and Parallel Databases |
Volume | 11 |
Issue number | 1 |
DOIs | |
Publication status | Published - Jan 2002 |
Externally published | Yes |
Keywords
- Data allocation
- Distributed database systems
- Genetic algorithm
- Mean field annealing
- Neighborhood search
- Optimal allocation
- Query processing
- Simulated evolution