Showing 8 changed files with 99 additions and 27 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,7 +24,7 @@ h3 { | |
margin-bottom: 40px; | ||
} | ||
|
||
li { | ||
.document li { | ||
margin: 0 0 10px; | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
Environment Variables | ||
===================== | ||
|
||
`Kernel Launcher` recognizes the following environment variables: | ||
|
||
* **KERNEL_LAUNCHER_TUNE** (default: ``0``): | ||
Kernels for which a tuning specification will be exported on the first call to the kernel. | ||
The value should be a comma-separated list of kernel names. | ||
Additionally, a ``*`` can be used as a wildcard. | ||
|
||
Examples: | ||
|
||
* ``foo,bar``: matches kernels ``foo`` and ``bar``. | ||
* ``vector_*``: matches kernels that start with ``vector``. | ||
* ``*_matrix_*``: matches kernels that contain ``matrix``. | ||
* ``*``: matches all kernels. | ||
|
||
|
||
* **KERNEL_LAUNCHER_WISDOM** (default: ``.``): | ||
The default directory where wisdom files are located. Defaults to the current working directory. | ||
|
||
* **KERNEL_LAUNCHER_LOG** (default: ``info``): | ||
Controls how much logging information is printed to stderr. There are three possible options: | ||
|
||
* ``debug``: Everything is logged. | ||
* ``info``: Only warnings and high-level information are logged. | ||
* ``warn``: Only warnings are logged. | ||
|
||
* **KERNEL_LAUNCHER_INCLUDE** (default: ``.``): | ||
Comma-separated list of directories that are searched for header files when compiling kernels. |
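
The pattern syntax described for ``KERNEL_LAUNCHER_TUNE`` (a comma-separated list in which ``*`` acts as a wildcard) can be illustrated with a small standalone sketch. This is not Kernel Launcher's actual implementation: the helper names ``wildcard_match`` and ``matches_any`` are hypothetical and exist only to show how such patterns could be evaluated against a kernel name.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Kernel Launcher): returns true if `name`
// matches `pattern`, where '*' matches any (possibly empty) run of characters.
bool wildcard_match(std::string_view pattern, std::string_view name) {
    constexpr std::size_t npos = std::string_view::npos;
    std::size_t p = 0, n = 0;
    std::size_t star = npos;  // index of the most recent '*' in `pattern`
    std::size_t mark = 0;     // index in `name` where that '*' started matching

    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;  // remember the wildcard and first try matching nothing
            mark = n;
        } else if (p < pattern.size() && pattern[p] == name[n]) {
            ++p;         // literal character matches
            ++n;
        } else if (star != npos) {
            p = star + 1;  // backtrack: let the last '*' absorb one more character
            n = ++mark;
        } else {
            return false;
        }
    }
    // Only trailing wildcards may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Hypothetical helper: returns true if `name` matches at least one pattern in
// the comma-separated list `patterns` (the assumed format of the variable).
bool matches_any(std::string_view patterns, std::string_view name) {
    std::size_t start = 0;
    while (start <= patterns.size()) {
        std::size_t end = patterns.find(',', start);
        if (end == std::string_view::npos) end = patterns.size();
        std::string_view pattern = patterns.substr(start, end - start);
        if (!pattern.empty() && wildcard_match(pattern, name)) return true;
        start = end + 1;
    }
    return false;
}
```

With these helpers, ``matches_any("foo,bar", "bar")`` and ``matches_any("vector_*", "vector_add")`` both hold, while ``matches_any("foo,bar", "foobar")`` does not, which mirrors the examples listed above.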
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,6 @@ | ||
Kernel Registry | ||
=============== | ||
|
||
The kernel registry essentially acts like a global cache of compiled kernels. | ||
|
||
TODO |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,44 +1,39 @@ | ||
#include "kernel_launcher.h" | ||
|
||
// Namespace alias. | ||
namespace kl = kernel_launcher; | ||
|
||
int main() { | ||
// Namespace alias. | ||
namespace kl = kernel_launcher; | ||
|
||
// Create a kernel builder | ||
kl::KernelBuilder build_kernel() { | ||
kl::KernelBuilder builder("vector_add", "vector_add_kernel.cu"); | ||
|
||
// Define tunable parameters | ||
|
||
auto threads_per_block = builder.tune("block_size", {32, 64, 128, 256, 512, 1024}); | ||
auto elements_per_thread = builder.tune("elements_per_thread", {1, 2, 4, 8}); | ||
|
||
// Define expressions | ||
auto elements_per_block = threads_per_block * elements_per_thread; | ||
|
||
// Define kernel properties | ||
|
||
builder | ||
.block_size(threads_per_block) | ||
.grid_divisors(threads_per_block * elements_per_thread) | ||
.template_args(kl::type_of<float>()) | ||
.define("ELEMENTS_PER_THREAD", elements_per_thread); | ||
|
||
// Define configuration | ||
kl::Config config; | ||
config.insert(threads_per_block, 32); | ||
config.insert(elements_per_thread, 2); | ||
return builder; | ||
} | ||
|
||
int main() { | ||
kl::set_global_wisdom_directory("wisdom/"); | ||
kl::set_global_tuning_directory("tuning/"); | ||
|
||
// Define the kernel. "vector_add" is the tuning key. | ||
std::string tuning_key = "vector_add"; | ||
kl::KernelBuilder builder = build_kernel(); | ||
kl::WisdomKernel vector_add_kernel(tuning_key, builder); | ||
|
||
// Compile kernel | ||
kl::Kernel<int, int*, const int*, const int*> vector_add_kernel; | ||
vector_add_kernel.compile(builder, config); | ||
|
||
// Initialize CUDA memory. This is outside the scope of kernel_launcher. | ||
unsigned int n = 1000000; | ||
float *dev_A, *dev_B, *dev_C; | ||
/* cudaMalloc, cudaMemcpy, ... */ | ||
|
||
// Launch the kernel! | ||
unsigned int problem_size = n; | ||
vector_add_kernel | ||
.instantiate(problem_size) | ||
.launch(n, dev_C, dev_A, dev_B); | ||
vector_add_kernel(problem_size)(n, dev_C, dev_A, dev_B); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,47 @@ | ||
Wisdom Files | ||
============ | ||
|
||
TODO: Write about wisdom files | ||
In the previous example, we saw how it is possible to compile a kernel by providing both a ``KernelBuilder`` instance (describing the `blueprint` for the kernel) and a ``Config`` instance (describing the configuration of the tunable parameters). | ||
|
||
However, determining the optimal configuration is often difficult since it depends heavily on both the `problem size` and the type of `GPU` being used. | ||
`Kernel Launcher` offers a solution to this problem in the form of `wisdom` files (terminology borrowed from `FFTW <http://www.fftw.org/>`_). | ||
|
||
Let's see this in action. | ||
|
||
|
||
C++ source code | ||
--------------- | ||
|
||
The following snippet shows an example: | ||
|
||
.. literalinclude:: wisdom.cpp | ||
|
||
|
||
Notice how this example is similar to the previous example, except ``kl::Kernel`` has been replaced by ``kl::WisdomKernel``. | ||
On the first call to this kernel, the kernel searches for the wisdom file for the key ``vector_add`` and compiles the kernel for the given ``problem_size`` and the current GPU. | ||
If no wisdom file has been found, the default configuration is chosen (in this case, that will be ``block_size=32,elements_per_thread=1``). | ||
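
The fallback behaviour described above can be pictured with a purely conceptual sketch. Everything in it is an assumption made for illustration: the ``WisdomRecord`` struct, the ``find_wisdom`` and ``select_config`` functions, and in particular the rule of picking the record for the same device with the closest problem size are hypothetical and do not describe how Kernel Launcher stores or selects wisdom internally.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a tuned configuration: parameter name -> value.
using Config = std::map<std::string, int>;

// Hypothetical wisdom entry: the configuration that was found to work well
// for one device and one problem size.
struct WisdomRecord {
    std::string device;
    std::uint64_t problem_size;
    Config config;
};

// Assumed selection rule: among records for the same device, pick the one
// whose problem size is closest to the requested one.
std::optional<Config> find_wisdom(const std::vector<WisdomRecord>& records,
                                  const std::string& device,
                                  std::uint64_t problem_size) {
    const WisdomRecord* best = nullptr;
    std::uint64_t best_distance = 0;
    for (const auto& record : records) {
        if (record.device != device) continue;
        std::uint64_t distance = record.problem_size > problem_size
                                     ? record.problem_size - problem_size
                                     : problem_size - record.problem_size;
        if (best == nullptr || distance < best_distance) {
            best = &record;
            best_distance = distance;
        }
    }
    if (best == nullptr) return std::nullopt;
    return best->config;
}

// Use the wisdom if any applies; otherwise fall back to the default configuration.
Config select_config(const std::vector<WisdomRecord>& records,
                     const std::string& device,
                     std::uint64_t problem_size,
                     const Config& default_config) {
    return find_wisdom(records, device, problem_size).value_or(default_config);
}
```

With an empty record list, ``select_config`` returns the default configuration unchanged, which corresponds to the case where no wisdom file has been found.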
|
||
|
||
|
||
Export the kernel | ||
----------------- | ||
To tune the kernel, we first need to export the tuning specifications. To do this, we run the program with the environment variable ``KERNEL_LAUNCHER_TUNE=vector_add``:: | ||
|
||
KERNEL_LAUNCHER_TUNE=vector_add ./main | ||
|
||
This generates a file ``vector_add_1000000.json`` in the directory set by ``set_global_tuning_directory``. | ||
|
||
|
||
Tune the kernel | ||
--------------- | ||
TODO: Using kernel tuner | ||
|
||
|
||
Import the wisdom | ||
----------------- | ||
After tuning the kernel and obtaining the wisdom file, we place this wisdom file in the directory specified by ``set_global_wisdom_directory``. | ||
Now, when running the program, on the first call to ``vector_add_kernel``, the kernel finds the wisdom file and compiles the kernel using the optimal configuration. | ||
|
||
To confirm that the wisdom file has indeed been found, check the debug output by setting the environment variable ``KERNEL_LAUNCHER_LOG=debug``. | ||
|
||
|