Online Cascade Learning for Efficient Inference over Streams
CoRR (2024)
Abstract
Large Language Models (LLMs) have a natural role in answering complex queries
about data streams, but the high computational cost of LLM inference makes them
infeasible in many such tasks. We propose online cascade learning, the first
approach to addressing this challenge. The objective here is to learn a
"cascade" of models, starting with lower-capacity models (such as logistic
regressors) and ending with a powerful LLM, along with a deferral policy that
determines the model that is used on a given input. We formulate the task of
learning cascades online as an imitation-learning problem and give a no-regret
algorithm for the problem. Experimental results across four benchmarks show
that our method parallels LLMs in accuracy while cutting down inference costs
by as much as 90% [...]
processing.
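
As an illustration of the cascade structure described above, the following minimal sketch routes each input through increasingly expensive models and stops at the first one that is confident enough. It is not the paper's algorithm: the learned deferral policy is replaced here by fixed confidence thresholds, and all names (`Level`, `cascade_predict`, `keyword_model`, `stub_llm`), the cost values, and the threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# A model maps an input string to (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Level:
    name: str
    model: Model
    cost: float       # hypothetical per-call cost, arbitrary units
    threshold: float  # answer at this level if confidence >= threshold


def cascade_predict(levels: Sequence[Level], x: str) -> Tuple[str, str, float]:
    """Return (label, name of the answering level, total cost incurred)."""
    total_cost = 0.0
    for i, level in enumerate(levels):
        label, confidence = level.model(x)
        total_cost += level.cost
        is_last = i == len(levels) - 1
        # Assumption: a fixed threshold stands in for a learned deferral policy.
        if is_last or confidence >= level.threshold:
            return label, level.name, total_cost
    raise ValueError("cascade must contain at least one level")


def keyword_model(x: str) -> Tuple[str, float]:
    """Toy low-capacity model: confident only when it sees a known keyword."""
    text = x.lower()
    if "great" in text:
        return "positive", 0.95
    if "awful" in text:
        return "negative", 0.95
    return "positive", 0.5  # unsure, so the cascade should defer


def stub_llm(x: str) -> Tuple[str, float]:
    """Stand-in for an expensive LLM call; always treated as final."""
    label = "negative" if "not" in x.lower() else "positive"
    return label, 1.0


LEVELS = [
    Level("keyword", keyword_model, cost=1.0, threshold=0.9),
    Level("llm", stub_llm, cost=100.0, threshold=0.0),
]

if __name__ == "__main__":
    for query in ["A great film", "Not my favourite"]:
        print(query, "->", cascade_predict(LEVELS, query))
```

In this toy setup, inputs the cheap model handles confidently never reach the expensive level, which is where the cost savings of a cascade come from. The online setting described in the abstract would additionally update the smaller models and the deferral policy as the stream is processed, which this sketch does not attempt.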