GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants

CoRR(2023)

Cited 0|Views16
No score
Abstract
Privacy policies inform users about the data management practices of organizations. Yet, their complexity often renders them largely incomprehensible to the average user, necessitating the development of privacy assistants. With the advent of generative AI (genAI) technologies, there is an untapped potential to enhance privacy assistants in answering user queries effectively. However, the reliability of genAI remains a concern due to its propensity for generating incorrect or misleading information. This study introduces GenAIPABench, a novel benchmarking framework designed to evaluate the performance of Generative AI-based Privacy Assistants (GenAIPAs). GenAIPABench comprises: 1) A comprehensive set of questions about an organization's privacy policy and a data protection regulation, along with annotated answers for several organizations and regulations; 2) A robust set of evaluation metrics for assessing the accuracy, relevance, and consistency of the generated responses; and 3) An evaluation tool that generates appropriate prompts to introduce the system to the privacy document and different variations of the privacy questions to evaluate its robustness. We use GenAIPABench to assess the potential of three leading genAI systems in becoming GenAIPAs: ChatGPT, Bard, and Bing AI. Our results demonstrate significant promise in genAI capabilities in the privacy domain while also highlighting challenges in managing complex queries, ensuring consistency, and verifying source accuracy.
More
Translated text
Key words
privacy assistants,benchmark,ai-based
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined