Abstract

As of December 2020, GPGPU-Sim supports only the first-generation (Volta) NVIDIA Tensor Core. This distribution contains a GPGPU-Sim build with the Turing WMMA API enabled, together with its benchmark results. Each directory inside the Benchmark directory holds the hardware benchmark results and the results from the revised GPGPU-Sim.
This study proposes a microarchitecture for the Tensor Core in the Turing architecture. Because NVIDIA does not disclose the internals of the Tensor Core, they have to be profiled through microbenchmarking. Previous studies have also dissected NVIDIA GPUs, but they did not cover the experimental features of the Turing architecture, namely the INT4 (4-bit integer) and B1 (1-bit binary) operation modes. This study analyzes both modes.
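
To make the two experimental modes concrete, the following host-side C++ sketch computes what a single output element accumulates in each mode. It assumes the tile depths that the CUDA WMMA API exposes for these sub-byte types on Turing (K = 128 bits for b1, K = 32 elements for u4) and, for b1, the XOR-then-popcount operation that the API provides. The function names `b1_dot_xor_popc` and `u4_dot` are hypothetical and do not come from this repository; this is a CPU reference for checking results, not a model of the Tensor Core datapath.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical CPU reference for one b1 output element.
// A row of A and a column of B each hold K = 128 bits, stored as
// four 32-bit words. XOR marks the differing bit positions and
// popcount counts them, which is the value that gets accumulated.
int b1_dot_xor_popc(const uint32_t a[4], const uint32_t b[4]) {
    int acc = 0;
    for (int w = 0; w < 4; ++w) {
        acc += static_cast<int>(std::bitset<32>(a[w] ^ b[w]).count());
    }
    return acc;
}

// Hypothetical CPU reference for one u4 output element.
// A row of A and a column of B each hold K = 32 unsigned 4-bit values,
// packed eight per 32-bit word. Both operands are unpacked in the same
// nibble order, so the result does not depend on that order.
int u4_dot(const uint32_t a[4], const uint32_t b[4]) {
    int acc = 0;
    for (int w = 0; w < 4; ++w) {
        for (int n = 0; n < 8; ++n) {
            const int x = static_cast<int>((a[w] >> (4 * n)) & 0xFu);
            const int y = static_cast<int>((b[w] >> (4 * n)) & 0xFu);
            acc += x * y;
        }
    }
    return acc;
}
```

For example, with a = {0xFFFFFFFF, 0, 0xF0F0F0F0, 1} and b = {0, 0, 0x0F0F0F0F, 1}, `b1_dot_xor_popc` returns 64, because two of the four words differ in all 32 bits and the other two are identical.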

Repository Structure

gpgpu-sim
- GPGPU-Sim with the Turing WMMA API enabled
Benchmark
- b1(1-bit)
- u4(unsigned 4-bit)
- u8(unsigned 8-bit)
- fp16(floating point 16-bit)
- mixed(mixed precision)
Paper
- Thesis paper

Recommended environment for running the benchmarks

GPGPU-Sim 4.0 (refer to https://github.com/gpgpu-sim/gpgpu-sim_distribution)
CUDA 10 or higher
NVIDIA graphics card with sm_75 or higher (Turing or later, i.e. after the Volta architecture)

Hardware benchmarking

Go to the directory of the benchmark you want to run.
Set the matrix size in test.cu inside the hard directory.
$ make
See the results in the log file.

GPGPU-Sim benchmarking

Build GPGPU-Sim (check that its CUDA version is 10 or higher).
Set the matrix size in test.cu inside the sim directory.
$ make
See the results printed by the simulator.

Results

Proposed 2nd Gen tensor core architecture (figure: proposed_arch.png)
Benchmark results (figure: benchmark_results.png)

mayshin10/GPGPU_Sim-Enabled-Turing-WMMA-API