NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
arXiv (2024)
Abstract
Transformer-based Language Models have become ubiquitous in Natural Language
Processing (NLP) due to their impressive performance on various tasks. However,
expensive training as well as inference remains a significant impediment to
their widespread applicability. While enforcing sparsity at various levels of
the model architecture has shown promise in addressing scaling and efficiency
issues, there remains a disconnect between such sparsity methods and the
network topologies they induce. Inspired by brain neuronal networks, we explore sparsity approaches
through the lens of network topology. Specifically, we exploit mechanisms seen
in biological networks, such as preferential attachment and redundant synapse
pruning, and show that principled, model-agnostic sparsity approaches are
performant and efficient across diverse NLP tasks, spanning both classification
(such as natural language inference) and generation (summarization, machine
translation), even though optimizing performance is not our sole objective.
NeuroPrune is competitive with (or sometimes superior to) baselines on
performance and can be up to 10x faster in training time for a given level of
sparsity, while also exhibiting measurable improvements in inference time in
many cases.
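The abstract names two biologically inspired mechanisms, preferential attachment and redundant synapse pruning, but does not spell out the algorithm. The sketch below is only an illustrative assumption of how such mechanisms might act on a single weight matrix: magnitude-based removal stands in for redundant synapse pruning, and regrowth biased toward well-connected output neurons stands in for preferential attachment. All function names, thresholds, and the update schedule are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of neuro-inspired sparsification on one weight matrix W
# (e.g., a single attention or MLP projection). Not the paper's actual algorithm.
import numpy as np


def redundant_synapse_pruning(W: np.ndarray, sparsity: float) -> np.ndarray:
    """Remove the smallest-magnitude weights (a simple stand-in for 'redundant'
    synapses) until roughly `sparsity` of the entries are zero."""
    k = min(int(sparsity * W.size), W.size - 1)       # number of weights to drop
    thresh = np.sort(np.abs(W), axis=None)[k]         # magnitude cut-off
    mask = (np.abs(W) >= thresh).astype(W.dtype)
    return W * mask


def preferential_attachment_regrow(W: np.ndarray, n_regrow: int,
                                   rng=np.random) -> np.ndarray:
    """Re-activate up to `n_regrow` pruned weights, preferring rows (output
    neurons) that already have many nonzero connections ('rich get richer')."""
    degree = (W != 0).sum(axis=1).astype(float) + 1e-8   # per-row connectivity
    p_row = degree / degree.sum()
    zero_rows, zero_cols = np.nonzero(W == 0)
    if len(zero_rows) == 0:
        return W
    probs = p_row[zero_rows]                             # bias toward busy rows
    probs = probs / probs.sum()
    idx = rng.choice(len(zero_rows),
                     size=min(n_regrow, len(zero_rows)),
                     replace=False, p=probs)
    W = W.copy()
    W[zero_rows[idx], zero_cols[idx]] = 0.01 * rng.standard_normal(len(idx))
    return W


# Example: one prune-then-regrow step on a random 8x8 weight matrix.
W = np.random.randn(8, 8)
W = redundant_synapse_pruning(W, sparsity=0.5)
W = preferential_attachment_regrow(W, n_regrow=4)
```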