comparison docs/CompileCudaWithLLVM.rst @ 148:63bd29f05246
merged
author | Shinji KONO <kono@ie.u-ryukyu.ac.jp> |
---|---|
date | Wed, 14 Aug 2019 19:46:37 +0900 |
parents | c2174574ed3a |
children |
146:3fc4d5c3e21e | 148:63bd29f05246 |
---|---|
20 =================== | 20 =================== |
21 | 21 |
22 Prerequisites | 22 Prerequisites |
23 ------------- | 23 ------------- |
24 | 24 |
25 CUDA is supported in llvm 3.9, but it's still in active development, so we | 25 CUDA is supported since llvm 3.9. Current release of clang (7.0.0) supports CUDA |
26 recommend you `compile clang/LLVM from HEAD | 26 7.0 through 9.2. If you need support for CUDA 10, you will need to use clang |
27 <http://llvm.org/docs/GettingStarted.html>`_. | 27 built from r342924 or newer. |
28 | 28 |
29 Before you build CUDA code, you'll need to have installed the appropriate | 29 Before you build CUDA code, you'll need to have installed the appropriate driver |
30 driver for your nvidia GPU and the CUDA SDK. See `NVIDIA's CUDA installation | 30 for your nvidia GPU and the CUDA SDK. See `NVIDIA's CUDA installation guide |
31 guide <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_ | 31 <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_ for |
32 for details. Note that clang `does not support | 32 details. Note that clang `does not support |
33 <https://llvm.org/bugs/show_bug.cgi?id=26966>`_ the CUDA toolkit as installed | 33 <https://llvm.org/bugs/show_bug.cgi?id=26966>`_ the CUDA toolkit as installed by |
34 by many Linux package managers; you probably need to install nvidia's package. | 34 many Linux package managers; you probably need to install CUDA in a single |
35 | 35 directory from NVIDIA's package. |
36 You will need CUDA 7.0, 7.5, or 8.0 to compile with clang. | 36 |
37 | 37 CUDA compilation is supported on Linux. Compilation on MacOS and Windows may or |
38 CUDA compilation is supported on Linux, on MacOS as of 2016-11-18, and on | 38 may not work and currently have no maintainers. Compilation with CUDA-9.x is |
39 Windows as of 2017-01-05. | 39 `currently broken on Windows <https://bugs.llvm.org/show_bug.cgi?id=38811>`_. |
40 | 40 |
41 Invoking clang | 41 Invoking clang |
42 -------------- | 42 -------------- |
43 | 43 |
44 Invoking clang for CUDA compilation works similarly to compiling regular C++. | 44 Invoking clang for CUDA compilation works similarly to compiling regular C++. |
71 Typically, ``/usr/local/cuda``. | 71 Typically, ``/usr/local/cuda``. |
72 | 72 |
73 Pass e.g. ``-L/usr/local/cuda/lib64`` if compiling in 64-bit mode; otherwise, | 73 Pass e.g. ``-L/usr/local/cuda/lib64`` if compiling in 64-bit mode; otherwise, |
74 pass e.g. ``-L/usr/local/cuda/lib``. (In CUDA, the device code and host code | 74 pass e.g. ``-L/usr/local/cuda/lib``. (In CUDA, the device code and host code |
75 always have the same pointer widths, so if you're compiling 64-bit code for | 75 always have the same pointer widths, so if you're compiling 64-bit code for |
76 the host, you're also compiling 64-bit code for the device.) | 76 the host, you're also compiling 64-bit code for the device.) Note that as of |
77 v10.0 CUDA SDK `no longer supports compilation of 32-bit | |
78 applications <https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#deprecated-features>`_. | |
77 | 79 |
78 * ``<GPU arch>`` -- the `compute capability | 80 * ``<GPU arch>`` -- the `compute capability |
79 <https://developer.nvidia.com/cuda-gpus>`_ of your GPU. For example, if you | 81 <https://developer.nvidia.com/cuda-gpus>`_ of your GPU. For example, if you |
80 want to run your program on a GPU with compute capability of 3.5, specify | 82 want to run your program on a GPU with compute capability of 3.5, specify |
81 ``--cuda-gpu-arch=sm_35``. | 83 ``--cuda-gpu-arch=sm_35``. |
87 | 89 |
88 You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs. | 90 You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs. |
89 | 91 |
90 The `-L` and `-l` flags only need to be passed when linking. When compiling, | 92 The `-L` and `-l` flags only need to be passed when linking. When compiling, |
91 you may also need to pass ``--cuda-path=/path/to/cuda`` if you didn't install | 93 you may also need to pass ``--cuda-path=/path/to/cuda`` if you didn't install |
92 the CUDA SDK into ``/usr/local/cuda``, ``/usr/local/cuda-7.0``, or | 94 the CUDA SDK into ``/usr/local/cuda`` or ``/usr/local/cuda-X.Y``. |
93 ``/usr/local/cuda-7.5``. | |
94 | 95 |
95 Flags that control numerical code | 96 Flags that control numerical code |
96 --------------------------------- | 97 --------------------------------- |
97 | 98 |
98 If you're using GPUs, you probably care about making numerical code run fast. | 99 If you're using GPUs, you probably care about making numerical code run fast. |
140 | 141 |
141 ``<math.h>`` and ``<cmath>`` | 142 ``<math.h>`` and ``<cmath>`` |
142 ---------------------------- | 143 ---------------------------- |
143 | 144 |
144 In clang, ``math.h`` and ``cmath`` are available and `pass | 145 In clang, ``math.h`` and ``cmath`` are available and `pass |
145 <https://github.com/llvm-mirror/test-suite/blob/master/External/CUDA/math_h.cu>`_ | 146 <https://github.com/llvm/llvm-test-suite/blob/master/External/CUDA/math_h.cu>`_ |
146 `tests | 147 `tests |
147 <https://github.com/llvm-mirror/test-suite/blob/master/External/CUDA/cmath.cu>`_ | 148 <https://github.com/llvm/llvm-test-suite/blob/master/External/CUDA/cmath.cu>`_ |
148 adapted from libc++'s test suite. | 149 adapted from libc++'s test suite. |
149 | 150 |
150 In nvcc ``math.h`` and ``cmath`` are mostly available. Versions of ``::foof`` | 151 In nvcc ``math.h`` and ``cmath`` are mostly available. Versions of ``::foof`` |
151 in namespace std (e.g. ``std::sinf``) are not available, and where the standard | 152 in namespace std (e.g. ``std::sinf``) are not available, and where the standard |
152 calls for overloads that take integral arguments, these are usually not | 153 calls for overloads that take integral arguments, these are usually not |
546 | 547 |
547 | `gpucc: An Open-Source GPGPU Compiler <http://dl.acm.org/citation.cfm?id=2854041>`_ | 548 | `gpucc: An Open-Source GPGPU Compiler <http://dl.acm.org/citation.cfm?id=2854041>`_ |
548 | Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt | 549 | Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt |
549 | *Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO 2016)* | 550 | *Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO 2016)* |
550 | | 551 | |
551 | `Slides from the CGO talk <http://wujingyue.com/docs/gpucc-talk.pdf>`_ | 552 | `Slides from the CGO talk <http://wujingyue.github.io/docs/gpucc-talk.pdf>`_ |
552 | | 553 | |
553 | `Tutorial given at CGO <http://wujingyue.com/docs/gpucc-tutorial.pdf>`_ | 554 | `Tutorial given at CGO <http://wujingyue.github.io/docs/gpucc-tutorial.pdf>`_ |
554 | 555 |
555 Obtaining Help | 556 Obtaining Help |
556 ============== | 557 ============== |
557 | 558 |
558 To obtain help on LLVM in general and its CUDA support, see `the LLVM | 559 To obtain help on LLVM in general and its CUDA support, see `the LLVM |