Novel architecture for long short-term memory used in question classification.

Neurocomputing (2018)

Cited by 65 | Viewed 53
Abstract
Long short-term memory (LSTM) is an extension of the recurrent neural network (RNN) and has achieved excellent performance in various tasks, especially sequential problems. The LSTM is chain-structured, and its architecture is limited by sequential information propagation. In practice, it struggles with problems involving very long-term dependencies. Recent studies have indicated that adding recurrent skip connections across multiple timescales may help the RNN better capture long-term dependencies. Moreover, capturing local features can improve the performance of the RNN. In this paper, we propose a novel LSTM architecture (Att-LSTM), which connects the hidden states of consecutive previous time steps to the current time step and applies an attention mechanism to these hidden states. This architecture not only captures local features effectively but also helps learn very long-distance correlations in an input sequence. We evaluate Att-LSTM on various sequential tasks, such as the adding problem, sequence classification, and character-level language modeling. In addition, to demonstrate the generality and practicality of the novel architecture, we design a character-level hierarchical Att-LSTM and refine the word representation with a highway network. This hierarchical model achieves excellent performance on question classification.
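The abstract's core idea, attending over the hidden states of consecutive previous time steps and feeding the result back into the current step, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the fixed window size, additive attention scoring, and tanh combination layer are assumptions, and names such as AttLSTMSketch and window_size are hypothetical.

```python
import torch
import torch.nn as nn


class AttLSTMSketch(nn.Module):
    """Illustrative sketch: an LSTM whose current step attends over a
    window of the K most recent hidden states (window size, scoring
    function, and combination layer are assumptions, not the paper's)."""

    def __init__(self, input_size, hidden_size, window_size=5):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.window_size = window_size
        # Additive attention score over (current state, past state) pairs.
        self.attn_score = nn.Linear(hidden_size * 2, 1)
        # Combines the current state with the attention context.
        self.combine = nn.Linear(hidden_size * 2, hidden_size)

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        batch = x.size(1)
        h = x.new_zeros(batch, self.cell.hidden_size)
        c = x.new_zeros(batch, self.cell.hidden_size)
        history = []   # hidden states of previous time steps
        outputs = []
        for x_t in x:
            h, c = self.cell(x_t, (h, c))
            if history:
                # Window of the most recent hidden states: (batch, K, H)
                past = torch.stack(history[-self.window_size:], dim=1)
                query = h.unsqueeze(1).expand_as(past)
                scores = self.attn_score(torch.cat([query, past], dim=-1))
                weights = torch.softmax(scores, dim=1)        # (batch, K, 1)
                context = (weights * past).sum(dim=1)         # (batch, H)
                # Mix the attention context into the current hidden state.
                h = torch.tanh(self.combine(torch.cat([h, context], dim=-1)))
            history.append(h)
            outputs.append(h)
        return torch.stack(outputs)   # (seq_len, batch, hidden_size)
```

Under this reading, the attention window gives each step direct access to a span of earlier states (local features) while the skip-like shortcut shortens the gradient path across long distances.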
Keywords
LSTM, Hierarchical model, Highway network, Recurrent skip connections, Attention mechanism