Big social data provenance framework for Zero-Information Loss Key-Value Pair (KVP) Database.

Asma Rani, Navneet Goyal, Shashi K Gadia
Published in: International journal of data science and analytics (2021)
Social media plays a vital role in information sharing at massive scale due to its easy access, low cost, and fast dissemination of information. Its ability to disseminate information across a wide audience has raised a critical challenge: determining the social data provenance of digital content. Social Data Provenance describes the origin, derivation process, and transformations of social content throughout its lifecycle. In this paper, we present a Big Social Data Provenance (BSDP) Framework for key-value pair (KVP) databases using the novel concept of a Zero-Information Loss Database (ZILD). In our proposed framework, a huge volume of social data is first fetched from social media (Twitter's network) through live streaming and simultaneously modelled in a KVP database using a query-driven approach. The proposed framework is capable of capturing, storing, and querying provenance information for different query sets, including select, aggregate, standing/historical, and data update (i.e., insert, delete, update) queries on Big Social Data. We evaluate the performance of the proposed framework in terms of provenance capturing overhead for different query sets, including select, aggregate, and data update queries, and average execution time for various provenance queries.
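The abstract does not describe the framework's data model, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that "zero information loss" can be approximated by never overwriting a value (each update closes the old version and appends a new one) and that provenance capture amounts to logging every query with a logical timestamp. The class `ProvenanceKVStore`, its methods, and the `tweet:1` key are all hypothetical names chosen for the example.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Version:
    value: Any
    start: int                  # logical time at which this version became current
    end: Optional[int] = None   # logical time at which it was superseded (None = current)


class ProvenanceKVStore:
    """Hypothetical in-memory key-value store that keeps every version
    of every key and records each query in a provenance log."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)          # simple logical clock
        self._versions: Dict[str, List[Version]] = {}
        self.query_log: List[dict] = []           # captured query provenance

    def _log(self, kind: str, key: str, t: int) -> None:
        self.query_log.append({"time": t, "kind": kind, "key": key})

    def put(self, key: str, value: Any) -> None:
        """Insert or update: the old version is closed, never overwritten."""
        t = next(self._clock)
        history = self._versions.setdefault(key, [])
        kind = "insert"
        if history and history[-1].end is None:
            history[-1].end = t
            kind = "update"
        history.append(Version(value, t))
        self._log(kind, key, t)

    def delete(self, key: str) -> None:
        """Logical delete: close the current version but keep its history."""
        t = next(self._clock)
        history = self._versions.get(key, [])
        if history and history[-1].end is None:
            history[-1].end = t
        self._log("delete", key, t)

    def get(self, key: str, as_of: Optional[int] = None) -> Any:
        """Select the value current now, or at a past logical time (historical query)."""
        t = next(self._clock)
        self._log("select", key, t)
        at = t if as_of is None else as_of
        for v in self._versions.get(key, []):
            if v.start <= at and (v.end is None or at < v.end):
                return v.value
        return None

    def history(self, key: str) -> List[tuple]:
        """Provenance query: every version the key has ever had."""
        return [(v.value, v.start, v.end) for v in self._versions.get(key, [])]


store = ProvenanceKVStore()
store.put("tweet:1", {"text": "hello", "retweets": 0})   # insert at logical time 1
store.put("tweet:1", {"text": "hello", "retweets": 5})   # update at logical time 2
store.delete("tweet:1")                                   # delete at logical time 3
current = store.get("tweet:1")                            # deleted, so nothing is current
past = store.get("tweet:1", as_of=2)                      # historical read at time 2
```

In this sketch a deleted key still answers historical reads, and `store.query_log` holds one entry per operation, which is the kind of information a provenance query would draw on. A real deployment over streamed Twitter data would need persistent storage and a richer log schema.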
Keyphrases
  • social media
  • big data
  • health information
  • electronic health record
  • healthcare
  • mental health
  • machine learning
  • emergency department
  • low cost
  • artificial intelligence
  • data analysis
  • dual energy