
Optimized Bootstrap Sampling for σ-AQP Error Estimation: A Pilot Study

10 pages. Published: October 4, 2021

Abstract

Approximate query processing (AQP) aims to efficiently provide an approximate answer, close to the exact one, for a complex query on large datasets, especially big data. It brings substantial benefits to many data science fields where the efficiency of query execution matters more than accuracy. However, assessing the accuracy of an approximate answer from AQP deserves more study. Existing work usually relies on strong assumptions about the dataset that may not hold for real-world data. In this work, we employ bootstrap sampling to assess the estimation errors of AQP for selection queries (called σ-AQP). We implement a prototype system that calculates confidence intervals for the estimated query results. Experimental results demonstrate that the confidence intervals generated by the prototype system cover the ground truth of the query results with high accuracy and at low computing cost. In addition, we implement optimization strategies for the bootstrap sampling that significantly improve the overall computing efficiency.
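To make the idea of bootstrap-based error estimation concrete, the following is a minimal sketch of a percentile bootstrap confidence interval for the estimated result of a selection COUNT query. It is not the prototype system described in the abstract, and it does not reproduce the paper's optimization strategies, which the abstract does not detail. The function name `bootstrap_count_ci`, its parameters, and the synthetic table are all hypothetical and chosen only for illustration; the sketch assumes a uniform random sample of the table is available.

```python
import random


def bootstrap_count_ci(sample, predicate, population_size,
                       n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for a selection COUNT estimated from a sample.

    Hypothetical helper for illustration; assumes `sample` was drawn
    uniformly at random from a table with `population_size` rows.
    """
    rng = random.Random(seed)
    n = len(sample)
    scale = population_size / n  # scales a sample count up to the full table

    # Evaluate the selection predicate once; resampling reuses the 0/1 flags.
    hits = [1 if predicate(row) else 0 for row in sample]
    point_estimate = scale * sum(hits)

    # Resample the flags with replacement and re-estimate the count each time.
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(hits, k=n)
        estimates.append(scale * sum(resample))
    estimates.sort()

    # Take the alpha/2 and 1 - alpha/2 percentiles as the interval bounds.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return point_estimate, estimates[lo_idx], estimates[hi_idx]


# Usage on synthetic data: estimate how many rows satisfy `value < 20`.
data_rng = random.Random(42)
table = [data_rng.randint(0, 99) for _ in range(100_000)]
sample = data_rng.sample(table, 1_000)
estimate, lower, upper = bootstrap_count_ci(sample, lambda v: v < 20, len(table))
print(f"estimated count: {estimate:.0f}, 95% CI: [{lower:.0f}, {upper:.0f}]")
```

Evaluating the predicate once before resampling is a simple design choice in this sketch: each bootstrap iteration then only sums precomputed flags rather than re-running the selection.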

Keyphrases: Approximate Query Processing, bootstrap sampling, error assessment, query estimation

In: Frederick C. Harris Jr, Rui Wu and Alexander Redei (editors). Proceedings of ISCA 30th International Conference on Software Engineering and Data Engineering, vol 77, pages 144--153

BibTeX entry
@inproceedings{SEDE2021:Optimized_Bootstrap_Sampling_for,
  author    = {Semih Cal and En Cheng and Feng Yu},
  title     = {Optimized Bootstrap Sampling for \ensuremath{\sigma}-AQP Error Estimation: A Pilot Study},
  booktitle = {Proceedings of ISCA 30th International Conference on Software Engineering and Data Engineering},
  editor    = {Frederick Harris and Rui Wu and Alex Redei},
  series    = {EPiC Series in Computing},
  volume    = {77},
  pages     = {144--153},
  year      = {2021},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-7340},
  url       = {https://easychair.org/publications/paper/v1b5},
  doi       = {10.29007/bkw9}}