Instructions and logic to perform floating-point and integer operations for machine learning

user-5d4bc4a8530c70a9b361c870(2017)

Abstract
One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
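
The arithmetic described in the abstract can be illustrated in software. The following is a minimal sketch, not the hardware instruction itself: it assumes the 16-bit operands are IEEE 754 half-precision values and that the 32-bit product and sum are single-precision, which is one possible reading of the abstract. The function names (`to_f16`, `to_f32`, `mma_f16_f32`) are hypothetical and chosen only for this example.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Raises OverflowError if x is too large for half precision.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mma_f16_f32(a, b, c):
    """Return D = A x B + C for 2D matrices given as lists of rows.

    Assumption for this sketch: elements of A and B are rounded to 16 bits,
    each product is held in 32 bits, and the running sum (seeded from C)
    is kept in 32 bits.
    """
    m, k, n = len(a), len(b), len(b[0])
    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = to_f32(c[i][j])  # 32-bit accumulator
            for p in range(k):
                # 16-bit operands, 32-bit intermediate product
                prod = to_f32(to_f16(a[i][p]) * to_f16(b[p][j]))
                # 32-bit sum based on the 32-bit intermediate product
                acc = to_f32(acc + prod)
            d[i][j] = acc
    return d
```

Keeping the product and the sum in 32 bits preserves precision that a 16-bit accumulator would lose. For example, `mma_f16_f32([[1.0, 1.0]], [[2048.0], [1.0]], [[0.0]])` accumulates 2048 + 1 exactly, whereas 2049 is not representable in half precision and `to_f16(2049.0)` rounds to 2048.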
Keywords
Graphics processing unit, Operand, Matrix multiplication, Floating point, Multiprocessing, Thread (computing), AND gate, Parallel computing, Computer science