TauJud: test augmentation of machine learning in judicial documents

Zichen Guo,Jiawei Liu,Tieke He,Zhuoyang Li, Peitian Zhangzhu

ISSTA '20: 29th ACM SIGSOFT International Symposium on Software Testing and Analysis Virtual Event USA July, 2020(2020)

引用 2|浏览57
暂无评分
摘要
The booming of big data makes the adoption of machine learning ubiquitous in the legal field. As we all know, a large amount of test data can better reflect the performance of the model, so the test data must be naturally expanded. In order to solve the high cost problem of labeling data in natural language processing, people in the industry have improved the performance of text classification tasks through simple data amplification techniques. However, the data amplification requirements in the judgment documents are interpretable and logical, as observed from CAIL2018 test data with over 200,000 judicial documents. Therefore, we have designed a test augmentation tool called TauJud specifically for generating more effective test data with uniform distribution over time and location for model evaluation and save time in marking data. The demo can be found at https://github.com/governormars/TauJud.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要