Decision-makers in various domains, such as finance, transportation, manufacturing, and healthcare, often need to make optimal decisions based on constraints and objectives. Many such problems can be solved by selecting a set of tuples (a package) in a relational database that satisfy linear constraints and maximize or minimize a linear objective function. While previous work has developed efficient algorithms for these package queries [1], challenges remain in scaling and adapting to dynamic, streaming environments.
This paper introduces a framework that leverages the Reduced Representation Hypothesis, suggesting that a subset of data points can achieve the same optimization goals as the full dataset. Using a Multiple Layer Skyline (MLS)[2], we construct reduced representations, capturing key candidates while reducing computational costs. We also introduce Incremental Package Maintenance (IPM) algorithms to efficiently update solutions in dynamic streaming settings.
Our approach avoids costly recomputation, scales efficiently, and preserves near-optimal results by incrementally integrating new data. Experiments show up to a 7x improvement in query execution time for large datasets, demonstrating scalability and practical effectiveness. These contributions advance dynamic package queries in data management and optimization.
References:
[1] Matteo Brucato, Juan Felipe Beltran, Azza Abouzied, and Alexandra Meliou.
Scalable package queries in relational database systems. Proceedings of the VLDB Endowment,
9(7):576–587, March 2016.
Link
[2] Wenhui Yu, Zheng Qin, Jinfei Liu, Li Xiong, Xu Chen, and Huidi Zhang. Fast Algorithms for Pareto Optimal Group-based Skyline. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pages 417–426, Singapore, November 2017. ACM. Link