Scalable Multitask Representation Learning for Scene Classification Supplementary Material

semanticscholar(2014)

1. Implementation Details

In this section we discuss certain implementation details of our STL-SDCA and MTL-SDCA solvers. We begin with some notation and then proceed with the technical details of each solver.

Notation: Let {(xi, yit) : 1 ≤ i ≤ n, 1 ≤ t ≤ T} be the input/output pairs of the multitask learning problem, where xi ∈ R^d, yit ∈ {±1}, T is the number of tasks, and n is the number of training examples per task. We assume that all tasks share the same training examples, although this is easily generalized. The standard single task learning (STL) approach learns linear predictors wt in the original space R^d. In contrast, the proposed multitask learning (MTL) method learns a matrix U ∈ R^{d×k}, which maps the original features xi into a new representation zi via zi = U^T xi. The linear predictors wt are then learned in the subspace R^k. Let X ∈ R^{d×n} be the matrix of stacked vectors xi, Z ∈ R^{k×n} the matrix of stacked vectors zi, Y ∈ {±1}^{n×T} the matrix of labels, and W ∈ R^{·×T} the matrix of stacked predictors wt (the dimensionality of wt will be clear from the context). We define the following kernel matrices: K = K_X = X^T X, K_Z = Z^T Z, and M = K_W = W^T W. As mentioned in the paper, both solvers use precomputed kernel matrices and work with dual variables αt ∈ R^n. We define A ∈ R^{n×T} as the matrix of stacked dual variables for all tasks.

STL-SDCA: The STL optimization problem for a task t is defined as follows:
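The notation above maps directly onto a few matrix products. The following minimal NumPy sketch illustrates how the kernel matrices K_X, K_Z, and M are precomputed and how a generic SDCA dual coordinate update can be driven from a kernel row alone. All sizes, the random data, and the hinge-loss objective are assumptions of this illustration, not the paper's exact formulation:

```python
import numpy as np

# Illustrative sizes; the names d, k, n, T follow the notation above,
# but the values and the random data are assumptions for this sketch.
rng = np.random.default_rng(0)
d, k, n, T = 5, 3, 10, 4

X = rng.standard_normal((d, n))   # columns x_i, the training examples
U = rng.standard_normal((d, k))   # subspace map (random stand-in for the learned U)
W = rng.standard_normal((k, T))   # columns w_t, the task predictors

Z = U.T @ X                       # z_i = U^T x_i, new k-dimensional features
K_X = X.T @ X                     # K = K_X = X^T X   (n x n)
K_Z = Z.T @ Z                     # K_Z = Z^T Z       (n x n)
M = W.T @ W                       # M = K_W = W^T W   (T x T)

# Generic kernel SDCA for a single task with hinge loss and L2
# regularization (standard closed-form coordinate update); the paper's
# actual objective may differ, so treat this as a schematic only.
lam = 0.1                            # regularization strength (assumed)
y = np.sign(rng.standard_normal(n))  # illustrative +-1 labels for one task
alpha = np.zeros(n)                  # dual variables alpha_t for that task
for _ in range(20):                  # a few epochs of dual coordinate ascent
    for i in rng.permutation(n):
        # w^T x_i expressed through the precomputed kernel row K_X[i]
        margin = (K_X[i] @ alpha) / (lam * n)
        # closed-form hinge-loss update, keeping alpha_i * y_i in [0, 1]
        new_ai_yi = min(1.0, max(0.0, alpha[i] * y[i]
                                 + (1.0 - y[i] * margin) * lam * n / K_X[i, i]))
        alpha[i] = y[i] * new_ai_yi
```

Note how only kernel rows K_X[i] are ever touched inside the inner loop; the same update applied to K_Z instead of K_X gives the corresponding solver in the learned subspace, which is why both solvers can share one implementation over precomputed kernels.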