================
@@ -1,119 +1,741 @@
-=============================================
-SYCL Compiler and Runtime architecture design
-=============================================
+============
+SYCL Support
+============
 
 .. contents::
    :local:
 
 Introduction
 ============
+The `SYCL 2020 specification <SYCL-2020_>`_ defines a single-source programming
+model and C++ run-time library interface for writing portable programs that
+support heterogeneous devices including GPUs, CPUs, and accelerators.
+The specification is intended to allow for a wide range of implementation
+possibilities, examples of which include:
+
+- A SYCL run-time library written in standard C++ that executes kernels on a
+  homogeneous set of host and device processors, each of which can execute
+  common compiled code from shared memory.
+- A SYCL run-time library that executes kernels on a heterogeneous set of
+  device processors for which each kernel is pre-compiled for each supported
+  device processor (Ahead-Of-Time (AOT) compilation) or for a family of device
+  processors (Just-In-Time (JIT) compilation).
+
+Since Clang is a conforming implementation of the C++ standard, no additional
+features are required for support of the first implementation strategy.
+This document details the core language features Clang provides for use by
+SYCL run-time libraries that use the second implementation strategy.
+
+.. _SYCL-2020:
+   https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html
+
+
+Example Usage
+=============
+SYCL is designed as an extension of C++ rather than as a distinct programming
+language.
+SYCL support is enabled with the `-fsycl <opt-fsycl_>`_ option.
+
+.. code-block:: sh
+
+   clang++ -c -fsycl source-file.cpp
+
+The choice of which target devices will be supported is made at compile time.
+By default, SYCL source files will be compiled with support for a host target
+dependent set of target devices.
+For example, when compiling for a ``x86_64-unknown-linux-gnu`` host target,
+target support will be enabled for ``spirv64-unknown-unknown`` devices.
+The set of supported target devices can be specified via a comma separated list
+of target triples with the `--offload-targets= <opt-offload-targets_>`_ option.
+The following Clang invocation enables support for AMD, NVIDIA, and Intel GPU
+targets.
+
+.. code-block:: sh
+
+   clang++ -c -fsycl \
+     
--offload-targets=amdgcn-amd-amdhsa,nvptx64-nvidia-cuda,spirv64-unknown-unknown 
\
+     source-file.cpp
+
+Object files built with the `-fsycl <opt-fsycl_>`_ option contain device
+images that require additional processing at link time.
+Programs linked with such object files must also be linked using the
+``clang++`` driver and the `-fsycl <opt-fsycl_>`_ option.
+
+.. code-block:: sh
+
+   clang++ -fsycl example.o source-file.o -o example
+
+.. _opt-fsycl:
+   
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fsycl
+.. _opt-offload-targets:
+   
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-offload-targets
+
+
+Compilation Model
+=================
+`SYCL 2020 section 5.1, "Offline compilation of SYCL source files" 
<SYCL-2020-5.1_>`_
+acknowledges two compilation models.
+
+- Single-source Multiple Compiler Pass (`SMCP`_) describes a compilation model
+  in which source code is separately parsed and analyzed for the host target
+  and each device target.
+
+- Single-source Single Compiler Pass (`SSCP`_) describes a compilation model
+  in which source code is parsed and analyzed once with code generation
+  performed separately for the host target and each device target.
+
+Clang only supports the `SMCP`_ compilation model currently, but the SYCL
+language support features have been designed to allow for support of the
+`SSCP`_ compilation model to be added in the future.
+
+By default, SYCL source files are compiled for the host target and for each
+device target.
+In some cases, it is useful to restrict compilation to just the host target or
+just the device targets; the `-fsycl-host-only <opt-fsycl-host-only_>`_ and
+`-fsycl-device-only <opt-fsycl-device-only_>`_ options are available for these
+purposes.
+
+.. _SMCP:
+   
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:smcp
+.. _SSCP:
+   
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:sscp
+.. _SYCL-2020-5.1:
+   
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_offline_compilation_of_sycl_source_files
+.. _opt-fsycl-host-only:
+   
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-offload-host-only
+.. _opt-fsycl-device-only:
+   
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-offload-device-only
+
+
+Supported Targets
+=================
+Support for SYCL is still in the implementation phase, but all targets
+supported by the `--offload-targets= <opt-offload-targets_>`_ option
+are intended to eventually be supported.
+
+
+Predefined Macros
+=================
+`SYCL 2020 section 5.6, "Preprocessor directives and macros" <SYCL-2020-5.6_>`_
+specifies macros that a SYCL implementation is required to provide.
+Most such macros are defined by the SYCL run-time library and require inclusion
+of the ``<sycl/sycl.hpp>`` header file.
+The following macros are conditionally predefined by the compiler.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Macro
+     - Description
+   * - ``__SYCL_DEVICE_ONLY__``
+     - Predefined by a `SMCP`_ implementation during device compilation (but 
not
+       during host compilation).
+   * - ``__SYCL_SINGLE_SOURCE__``
+     - Predefined by a `SSCP`_ implementation during (host and device)
+       compilation.
+
+Since Clang only supports the `SMCP`_ compilation model currently, the
+``__SYCL_SINGLE_SOURCE__`` macro is never predefined.
+
+.. _SYCL-2020-5.6:
+   
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_preprocessor_directives_and_macros
+
+
+Language Support
+================
+`SYCL 2020 section 3.12.3, "Library-only implementation" <SYCL-2020-3.12.13_>`_
+notes the intent that the SYCL specification be implementable as a C++ library
+with no requirements beyond a compiler that conforms to the C++17 standard.
+The SYCL specification therefore does not specify extensions to the C++ core
+language and a library-only implementation will work with Clang without any
+core language extensions.
+Clang provides the features described in this section to facilitate 
capabilities
+that are not possible with a library-only SYCL implementation.
+
+.. _SYCL-2020-3.12.13:
+   
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_library_only_implementation
 
-This document describes the architecture of the SYCL compiler and runtime
-library. More details are provided in
-`external document 
<https://github.com/intel/llvm/blob/sycl/sycl/doc/design/CompilerAndRuntimeDesign.md>`_\
 ,
-which are going to be added to clang documentation in the future.
-
-Address space handling
-======================
-
-The SYCL specification represents pointers to disjoint memory regions using C++
-wrapper classes on an accelerator to enable compilation with a standard C++
-toolchain and a SYCL compiler toolchain. Section 3.8.2 of SYCL 2020
-specification defines
-`memory model 
<https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_device_memory_model>`_\
 ,
-section 4.7.7 - `address space classes 
<https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_classes>`_
-and section 5.9 covers `address space deduction 
<https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_deduction>`_.
-The SYCL specification allows two modes of address space deduction: "generic as
-default address space" (see section 5.9.3) and "inferred address space" (see
-section 5.9.4). Current implementation supports only "generic as default 
address
-space" mode.
-
-SYCL borrows its memory model from OpenCL however SYCL doesn't perform
-the address space qualifier inference as detailed in
-`OpenCL C v3.0 6.7.8 
<https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-inference>`_.
-
-The default address space is "generic-memory", which is a virtual address space
-that overlaps the global, local, and private address spaces. SYCL mode enables
-following conversions:
-
-- explicit conversions to/from the default address space from/to the address
-  space-attributed type
-- implicit conversions from the address space-attributed type to the default
-  address space
-- explicit conversions to/from the global address space from/to the
-  ``__attribute__((opencl_global_device))`` or
-  ``__attribute__((opencl_global_host))`` address space-attributed type
-- implicit conversions from the ``__attribute__((opencl_global_device))`` or
-  ``__attribute__((opencl_global_host))`` address space-attributed type to the
-  global address space
-
-All named address spaces are disjoint and sub-sets of default address space.
-
-The SPIR target allocates SYCL namespace scope variables in the global address
-space.
-
-Pointers to default address space should get lowered into a pointer to a 
generic
-address space (or flat to reuse more general terminology). But depending on the
-allocation context, the default address space of a non-pointer type is assigned
-to a specific address space. This is described in
-`common address space deduction rules 
<https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace>`_
-section.
-
-This is also in line with the behaviour of CUDA (`small example
-<https://godbolt.org/z/veqTfo9PK>`_).
-
-``multi_ptr`` class implementation example:
+
+.. _sect-sycl-address-spaces:
+
+Address Space Attributes
+------------------------
+These attributes are intended for use in the implementation of SYCL run-time
+libraries and should not be used in any other context.
+
+The SYCL address space attributes listed below correspond to the five address
+spaces described by
+`SYCL 2020 section 3.8.2, "SYCL device memory model" <SYCL-2020-3.8.2_>`_ and
+`SYCL 2020 section 4.7.7, "Address space classes" <SYCL-2020-4.7.7_>`_.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Address space attribute
+     - SYCL address space
+     - Description
+   * - ``[[clang::sycl_global]]``
+     - global
+     - A memory region accessible by all work-items executing on a device.
+   * - ``[[clang::sycl_local]]``
+     - local
+     - A memory region accessible by all work-items of a single work-group.
+   * - ``[[clang::sycl_private]]``
+     - private
+     - A memory region that is private to a single work-item.
+   * - ``[[clang::sycl_generic]]``
+     - generic
+     - A virtual memory region from which the global, local, and private memory
+       regions may all be accessed.
+   * - ``[[clang::sycl_constant]]``
+     - constant
+     - (*deprecated*) A memory region that holds constant data for an executing
+       kernel.
+
+These attributes are intended to be used in the implementation of the following
+SYCL features.
+Specifically, they are intended to provide the implementation dependent
+decorated pointer and reference types described in the SYCL 2020 sections
+referenced above.
+
+- The ``sycl::buffer``, ``sycl::accessor``, and ``sycl::local_accessor``
+  classes.
+
+- The ``sycl::remove_decoration`` and ``sycl::remove_decoration_t`` type 
traits.
+
+- The ``sycl::multi_ptr`` class template and its explicit specializations.
+
+- The ``sycl::address_space_cast()`` function.
+
+- The ``sycl::static_addrspace_cast`` extension.
+
+- The ``sycl::dynamic_addrspace_cast`` extension.
+
+The SYCL address space attributes are type attributes that, when present in a
+type specifier, specify a distinct type from the otherwise unattributed type.
+For example, ``int *`` and ``int [[clang::sycl_global]] *`` designate distinct
+pointer types that participate in overload resolution and template
+specialization.
+
+Conversions between address space attributed types are permitted as follows.
+These conversions are consistent with the conversions permitted for the
+corresponding OpenCL address spaces as described in
+`OpenCL 3.0 section 3.3.1, "Fundamental Memory Regions" <OpenCL-3.0-3.3.1_>`_.
+
+- Types attributed with the global, local, or private address space attributes
+  are implicitly convertible to matching types with the generic address space
+  attribute.
+
+- Types attributed with the generic address space attribute may be converted
+  to a matching type with the global, local, or private address space attribute
+  by ``static_cast``.
+
+- All other conversions between matched types with either different address
+  space attributes or where one type lacks an address space attribute may be
+  performed by ``reinterpret_cast``.
+
+For OpenCL device targets, the SYCL address space attributes are synonymous
+with the `OpenCL address space attributes <attr-opencl-addrspace_>`_; e.g.,
+``int [[clang::sycl_global]]*`` and ``int [[clang::opencl_global]]*`` specify
+the same type.
+
+.. list-table::
+   :header-rows: 1
+
+   * - SYCL Address space attribute
+     - OpenCL address space attribute
+   * - ``[[clang::sycl_global]]``
+     - ``[[clang::opencl_global]]``
+   * - ``[[clang::sycl_local]]``
+     - ``[[clang::opencl_local]]``
+   * - ``[[clang::sycl_private]]``
+     - ``[[clang::opencl_private]]``
+   * - ``[[clang::sycl_generic]]``
+     - ``[[clang::opencl_generic]]``
+   * - ``[[clang::sycl_constant]]``
+     - ``[[clang::opencl_constant]]``
+
+Per `SYCL 2020 section 5.9, "Address-space deduction" <SYCL-2020-5.9_>`_,
+programmer use of address space annotated pointer and reference types is not
+required.
+The SYCL implementation is instead required to deduce which address space is
----------------
gmlueck wrote:

I would not say "required" here.  The SYCL spec attempts to provide two sets of 
rules for address space deduction.  One set of rules applies to devices that 
can support the generic address space.  The other set of rules applies to 
devices that can NOT support the generic address space.

The rules in the second case (devices that cannot support the generic address 
space) are not well defined, and it is not clear whether there will ever be any 
SYCL implementations for devices like this.  The downstream DPC++ compiler does 
not support these devices, and I do not think we plan to support these devices 
in our upstream SYCL contribution to clang.

It is debatable whether the first case (devices that can support the generic 
address space) really requires the compiler to do any address space deduction.  
I think it is more correct to say that all undecorated pointers live in the 
generic address space.  Compilers are encouraged to use address space deduction 
as an optimization in order to infer the actual address space of the pointed-at 
object if possible.  (This is an optimization because the generated code is 
usually faster if we know the actual address space of the reference.)  However, 
this address space deduction is just an optimization, and it is not required 
for a conformant implementation.

As a result, I don't think it is very useful for this document to have the code 
sample below showing the deduced address spaces.  This document seems to be 
mostly about the language rules, not about optimizations that may be performed.

https://github.com/llvm/llvm-project/pull/170602
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to