Skip to content

Commit

Permalink
CNTK support for CUDA 9
Browse files Browse the repository at this point in the history
CNTK now supports CUDA 9/cuDNN 7. This requires an update to build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also added a preview for 16-bit floating point (a.k.a FP16) computation.

Please check out the example of FP16 in ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py

Notes on FP16 preview:
* FP16 implementation on CPU is not optimized, and it's not supposed to be used in CPU inference directly. User needs to convert the model to 32-bit floating point before running on CPU.
* Loss/Criterion for FP16 training needs to be 32bit for accumulation without overflow, using cast function. Please check the example above.
* Readers do not have FP16 output unless using numpy to feed data, cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently only implemented on GPU using NCCL2. Distributed training with FP16 with MPI is not supported.
* FP16 math is a subset of current FP32 implementation. Some model may get Feature Not Implemented exception using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.

To setup build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with following workloads and components. From command line (use Community version installer as example):
    vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
    /Tools/devInstall/Windows/DevInstall.ps1
* Start VCTools 14.11 command line, run:
    cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that starting CNTK.sln other than VCTools 14.11 command line, would causes CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).

To setup build and runtime environment on Linux using docker, please build Unbuntu 16.04 docker image using Dockerfiles /Tools/docker. For other Linux systems, please refer to the Dockerfiles to setup dependent libraries for CNTK.
  • Loading branch information
KeDengMS committed Jan 23, 2018
1 parent 3765da9 commit 3cf3af5
Show file tree
Hide file tree
Showing 297 changed files with 155,334 additions and 136,210 deletions.
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ Makefile text
*.asax text

*.h text
*.hpp text
*.cpp text
*.cc text
*.cu text
Expand Down
53 changes: 20 additions & 33 deletions CNTK.Cpp.props
Original file line number Diff line number Diff line change
Expand Up @@ -3,19 +3,10 @@
<Import Project="$(SolutionDir)\CNTK.Common.props" />
<PropertyGroup>
<CudaVersion />
<CudaVersion Condition="Exists('$(CUDA_PATH_V8_0)') And '$(CudaVersion)' == ''">8.0</CudaVersion>
<CudaVersion Condition="Exists('$(CUDA_PATH_V7_5)') And '$(CudaVersion)' == ''">7.5</CudaVersion>

<NvmlInclude />
<NvmlInclude Condition="'$(CudaVersion)' == '7.5'">"c:\Program Files\NVIDIA Corporation\GDK\gdk_win7_amd64_release\nvml\include"</NvmlInclude>
<NvmlInclude Condition="'$(CudaVersion)' == '8.0'" />

<NvmlLibPath />
<NvmlLibPath Condition="'$(CudaVersion)' == '7.5'">"c:\Program Files\NVIDIA Corporation\GDK\gdk_win7_amd64_release\nvml\lib"</NvmlLibPath>
<NvmlLibPath Condition="'$(CudaVersion)' == '8.0'" />
<CudaVersion Condition="Exists('$(CUDA_PATH_V9_0)') And '$(CudaVersion)' == ''">9.0</CudaVersion>

<NvmlDll>%ProgramW6432%\NVIDIA Corporation\NVSMI\nvml.dll</NvmlDll>
<NvmlDll Condition="Exists('c:\local\bindrop\NVSMI\nvml.dll')">c:\local\bindrop\NVSMI\nvml.dll</NvmlDll>
<NvmlDll Condition="Exists('c:\local\nvsmi9\NVSMI\nvml.dll')">c:\local\nvsmi9\NVSMI\nvml.dll</NvmlDll>

<HasOpenCv>false</HasOpenCv>
<HasOpenCv Condition="Exists('$(OPENCV_PATH)') Or Exists('$(OPENCV_PATH_V31)')">true</HasOpenCv>
Expand Down Expand Up @@ -65,16 +56,20 @@

<PropertyGroup Condition="!$(IsUWP)">
<MathLibrary>MKL</MathLibrary>
<MathIncludePath>$(MKLML_PATH)\include</MathIncludePath>
<MathIncludePath>$(MKL_PATH)\include</MathIncludePath>
<MathDefine>USE_MKL</MathDefine>
<!-- Only non-UWP configurations consume PerformanceProfiler -->
<ReaderLibs>Cntk.PerformanceProfiler-$(CntkComponentVersion).lib;$(ReaderLibs)</ReaderLibs>
<MathLibraryName>MKL-ML Library</MathLibraryName>
<MathLibraryPath>$(MKLML_PATH)\lib</MathLibraryPath>
<MathLibraryName>MKL Library</MathLibraryName>
<MathLibraryPath>$(MKL_PATH)\lib</MathLibraryPath>
<MathLinkLibrary>mklml.lib</MathLinkLibrary>
<MathDelayLoad>mklml.dll</MathDelayLoad>
<MathPostBuildCopyPattern>$(MathLibraryPath)\*.dll</MathPostBuildCopyPattern>
<UnitTestDlls>$(OutDir)mklml.lib;$(OutDir)libiomp5md.dll;</UnitTestDlls>
<HasMklDnn>false</HasMklDnn>
<HasMklDnn Condition="Exists('$(MKL_PATH)\include\mkldnn.h')">true</HasMklDnn>
<MathDefine Condition="$(HasMklDnn)">$(MathDefine);USE_MKLDNN</MathDefine>
<MathLinkLibrary Condition="$(HasMklDnn)">$(MathLinkLibrary);mkldnn.lib</MathLinkLibrary>
<MathDelayLoad Condition="$(HasMklDnn)">$(MathDelayLoad);mkldnn.dll</MathDelayLoad>
</PropertyGroup>
<PropertyGroup Condition="$(UseZip)">
<ZipInclude>$(ZLIB_PATH)\include;$(ZLIB_PATH)\lib\libzip\include;</ZipInclude>
Expand Down Expand Up @@ -109,31 +104,19 @@
<ProtobufLib Condition="$(DebugBuild)">libprotobufd.lib</ProtobufLib>
</PropertyGroup>

<PropertyGroup Condition="'$(CudaVersion)' == '8.0'">
<CudaPath>$(CUDA_PATH_V8_0)</CudaPath>
<CudaRuntimeDll>cudart64_80.dll</CudaRuntimeDll>
<CudaDlls>cublas64_80.dll;cusparse64_80.dll;curand64_80.dll;$(CudaRuntimeDll)</CudaDlls>
<PropertyGroup Condition="'$(CudaVersion)' == '9.0'">
<CudaPath>$(CUDA_PATH_V9_0)</CudaPath>
<CudaRuntimeDll>cudart64_90.dll</CudaRuntimeDll>
<CudaDlls>cublas64_90.dll;cusparse64_90.dll;curand64_90.dll;$(CudaRuntimeDll)</CudaDlls>

<!-- Use NvidiaCompute to define nvcc target architectures (will generate code to support them all, i.e. fat-binary, in release mode)
In debug mode we only include cubin/PTX for 30 and rely on PTX / JIT to generate the required native cubin format
http://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-applications-with-pascal-support -->
<NvidiaCompute Condition="$(DebugBuild)">$(CNTK_CUDA_CODEGEN_DEBUG)</NvidiaCompute>
<NvidiaCompute Condition="$(DebugBuild) And '$(NvidiaCompute)'==''">compute_30,sm_30</NvidiaCompute>

<NvidiaCompute Condition="$(ReleaseBuild)">$(CNTK_CUDA_CODEGEN_RELEASE)</NvidiaCompute>
<NvidiaCompute Condition="$(ReleaseBuild) And '$(NvidiaCompute)'==''">compute_30,sm_30;compute_35,sm_35;compute_50,sm_50;compute_60,sm_60;compute_61,sm_61</NvidiaCompute>
</PropertyGroup>

<PropertyGroup Condition="'$(CudaVersion)' == '7.5'">
<CudaPath>$(CUDA_PATH_V7_5)</CudaPath>
<CudaRuntimeDll>cudart64_75.dll</CudaRuntimeDll>
<CudaDlls>cublas64_75.dll;cusparse64_75.dll;curand64_75.dll;$(CudaRuntimeDll)</CudaDlls>

<NvidiaCompute Condition="$(DebugBuild)">$(CNTK_CUDA_CODEGEN_DEBUG)</NvidiaCompute>
<NvidiaCompute Condition="$(DebugBuild) And '$(NvidiaCompute)'==''">compute_30,sm_30</NvidiaCompute>

<NvidiaCompute Condition="$(ReleaseBuild)">$(CNTK_CUDA_CODEGEN_RELEASE)</NvidiaCompute>
<NvidiaCompute Condition="$(ReleaseBuild) And '$(NvidiaCompute)'==''">compute_30,sm_30;compute_35,sm_35;compute_50,sm_50</NvidiaCompute>
<NvidiaCompute Condition="$(ReleaseBuild) And '$(NvidiaCompute)'==''">compute_30,sm_30;compute_35,sm_35;compute_50,sm_50;compute_60,sm_60;compute_61,sm_61;compute_70,sm_70</NvidiaCompute>
</PropertyGroup>

<PropertyGroup>
Expand All @@ -144,11 +127,14 @@
<CudaMsbuildPath Condition="'$(CudaMsbuildPath)' == ''">$(VCTargetsPath)\BuildCustomizations</CudaMsbuildPath>
</PropertyGroup>

<PropertyGroup>
<PlatformToolset>v141</PlatformToolset>
</PropertyGroup>

<!-- TODO warn if ConfigurationType not (yet) defined -->

<PropertyGroup Condition="'$(ConfigurationType)' == 'StaticLibrary'">
<UseDebugLibraries>$(DebugBuild)</UseDebugLibraries>
<PlatformToolset>v140</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<WholeProgramOptimization>$(ReleaseBuild)</WholeProgramOptimization>
<LinkIncremental>$(DebugBuild)</LinkIncremental>
Expand All @@ -159,6 +145,7 @@
<PreprocessorDefinitions>CNTK_COMPONENT_VERSION="$(CntkComponentVersion)"</PreprocessorDefinitions>
<!-- UWP does not use MPI -->
<PreprocessorDefinitions Condition="!$(IsUWP)">%(PreprocessorDefinitions);HAS_MPI=1</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="'$(CudaVersion)' == '9.0'">%(PreprocessorDefinitions);CUDA_NO_HALF;__CUDA_NO_HALF_OPERATORS__</PreprocessorDefinitions>
</ClCompile>
</ItemDefinitionGroup>

Expand Down
Loading

0 comments on commit 3cf3af5

Please sign in to comment.