Convolutional Neural Network Based Ensemble Approach For Homoglyph Recognition

MATHEMATICAL AND COMPUTATIONAL APPLICATIONS(2020)

引用 1|浏览11
暂无评分
摘要
Homoglyphs are pairs of visual representations of Unicode characters that look similar to the human eye. Identifying homoglyphs is extremely useful for building a strong defence mechanism against many phishing and spoofing attacks, ID imitation, profanity abusing, etc. Although there is a list of discovered homoglyphs published by Unicode consortium, regular expansion of Unicode character scripts necessitates a robust and reliable algorithm that is capable of identifying all possible new homoglyphs. In this article, we first show that shallow Convolutional Neural Networks are capable of identifying homoglyphs. We propose two variations, both of which obtain very high accuracy (99.44%) on our benchmark dataset. We also report that adoption of transfer learning allows for another model to achieve 100% recall on our dataset. We ensemble these three methods to obtain 99.72% accuracy on our independent test dataset. These results illustrate the superiority of our ensembled model in detecting homoglyphs and suggest that our model can be used to detect new homoglyphs when increasing Unicode characters are added. As a by-product, we also prepare a benchmark dataset based on the currently available list of homoglyphs.
更多
查看译文
关键词
convolutional neural network, deep learning, homoglyphs, phishing, unicode
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要