[PATCH] D24933: Enable configuration files in clang

Serge Pavlov via Phabricator via cfe-commits Mon, 02 Oct 2017 10:10:45 -0700

sepavloff added a comment.

Here is a list of design solutions used in this implementation of config files.


**How a config file is specified**

There are two ways to specify a config file:

- Encode it in the executable file name, such as `foo-clang`.
- Pass it in the command line arguments.

There were no objections to the `foo-clang` variant. It can be considered a
natural extension of the existing mechanism, in which invoking `foo-clang`
is equivalent to specifying the target on the command line: `clang --target=foo`.
A config file can specify more than one option, and its name is not confined
to the registered targets.

As for specifying a config file on the command line, there are two variants:

- Use the existing construct `@foo`.
- Use a special command line option, `--config foo`.

Each way has its own advantages.

The `@file` construct reuses existing command line syntax. Indeed, a config
file is a collection of command line arguments, and `@file` is just a way to
insert such arguments from a file. A config file may include other files, and it
uses `@file` for that, with the exception that `file` is resolved relative to
the including file rather than the current directory. A config file could thus
be considered an extension of the existing mechanism provided by `@file`.

Using `@file` creates compatibility issues, because existing uses of `@file`
must not be broken. The file in `@file` may be treated as a configuration
only if it cannot be treated according to the existing semantics.
A possible solution is to try loading `file` as a configuration if it does not
contain a path separator and is not found in the current directory.

The drawback of this syntax is that the meaning of `@foo` in the invocation
`clang @foo abc.cpp` depends on the contents of the current directory. If it
contains a file `foo`, `@foo` expands a response file; otherwise it
specifies a config file. This behavior causes errors if the current directory
happens to contain a file with the same name as the specified config file.
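
The ambiguity can be illustrated with a minimal sketch in Python. The function name `classify_at_argument` and the set `files_in_cwd` are hypothetical and only model the rule described above; this is not the code from the patch.

```python
def classify_at_argument(name, files_in_cwd):
    """Decide how `@name` would be treated under the rule sketched above.

    `files_in_cwd` is a set of file names present in the current directory
    (a stand-in for a real file system check).
    """
    # A name with a path separator keeps the existing response file meaning.
    if "/" in name or "\\" in name:
        return "response file"
    # An existing file in the current directory also keeps the old meaning.
    if name in files_in_cwd:
        return "response file"
    # Only otherwise is the name looked up as a config file.
    return "config file"
```

With an empty current directory `@foo` names a config file, but once a file `foo` appears there, the same command line changes meaning without any change to the arguments.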

Using a dedicated option to apply a config file makes the intention explicit.
It also allows config files to be loaded from arbitrary places. For instance,
the invocation `clang --config ./foo` treats the file `foo` in the current
directory as a config file.

Although a config file contains command line arguments, as a conventional
response file does, it differs from the latter:

- It resolves nested `@file` constructs differently: relative to the including
file, not the current directory.
- The file is searched for in a predefined set of directories, not only in the
current one.
- A config file is more like a part of the working environment. For instance, a
clang-based SDK supplier could deliver a set of config files as part of their
product. A response file, in contrast, is closer to transient build data, often
generated by some tool.
- Warnings about unused options are suppressed in config files.
- There was a proposal to extend the syntax of config files to enable comments
and the use of a trailing backslash to split long lines, although these
extensions would be useful for response files as well.
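
The first difference, how a nested `@file` is resolved, can be sketched as follows. This is an illustration only: the function `resolve_nested` and its parameters are hypothetical names, and POSIX-style paths are assumed.

```python
import posixpath


def resolve_nested(name, including_file, cwd, in_config):
    """Return the path that a nested `@name` would refer to.

    In a config file the name is resolved relative to the including file;
    in a conventional response file it is resolved relative to `cwd`.
    """
    base = posixpath.dirname(including_file) if in_config else cwd
    return posixpath.normpath(posixpath.join(base, name))
```

For a file `/etc/llvm/foo.cfg` that contains `@common.opts`, the config file rule yields `/etc/llvm/common.opts` no matter where the compiler is invoked, while the response file rule depends on the current directory.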

So loading a config file may deserve a special option. This way has
advantages:

- it expresses intentions explicitly and reduces the risk of accidental errors,
- it allows using absolute paths to specify a config file.

**Where config files reside**

There are several places where config files can be kept:

- The directory where the clang executable resides. This is convenient for SDK
developers as it simplifies packaging. A user can use several instances of
clang at the same time, and each can still use its own set of config files
without conflicts.
- A system directory (for instance, /etc/llvm) that keeps config files for use
by several users. This case is interesting for OS distribution maintainers and
SDK developers.
- A user directory (for instance, ~/.llvm). A user can collect config files
that tune the compiler for their tasks in this directory and use them to select
a particular option set.
- A config file can be specified by path, as in `clang --config ./foo`. This is
convenient for developers who want to ensure that a particular configuration is
selected.

For the sake of flexibility it makes sense to enable all these locations, as
they are useful in different scenarios. The locations of the user and system
directories are specified at configuration time; by default they are absent. If
a user directory is specified, it should take priority over the other search
locations so that the user can override system-supplied option sets.
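
A minimal sketch of such a search, in Python, is shown here. Only the priority of the user directory comes from the text above; the relative order of the system directory and the executable directory, as well as all function and parameter names, are assumptions made for illustration.

```python
import posixpath


def config_search_dirs(exe_dir, user_dir=None, system_dir=None):
    """Build the ordered list of directories to search for a config file."""
    dirs = []
    if user_dir:
        dirs.append(user_dir)      # highest priority, as stated above
    if system_dir:
        dirs.append(system_dir)    # assumption: tried before the executable directory
    dirs.append(exe_dir)           # the directory of the clang executable
    return dirs


def find_config(file_name, dirs, exists):
    """Return the first existing candidate path, or None if nothing is found.

    `exists` is a predicate on paths, e.g. os.path.isfile in real use.
    """
    for directory in dirs:
        candidate = posixpath.join(directory, file_name)
        if exists(candidate):
            return candidate
    return None
```

Because the user and system directories are optional, the list degenerates to the executable directory alone in the default build.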

**Driver mode**

If the config file is encoded in the executable name, such as `foo-clang`,
there is a concern about different driver modes. Which config file should be
searched for if the compiler is called as `foo-cpp`, `foo-cl`, etc.? These
tools support different sets of options, so a flexible solution should make it
possible to specify different config files for different driver modes.

Clang implements a flexible scheme of tool naming, in which a tool name has
components:

  <arch>-<something>-<driver-mode>[-<optional version suffix>][<optional version number>]

The part of the executable name that precedes the driver-mode suffix can be
arbitrary. It makes sense not to analyze the executable name by components but
to use the entire name without the version as the base name of the config file.
So the executables:

  i686-linux-android-g++
  i686-linux-android-g++5.0
  i686-linux-android-g++-release

would search for the file `i686-linux-android-g++.cfg`, while

  foo-clang
  foo-gcc
  foo-s++

would search for the files `foo-clang.cfg`, `foo-gcc.cfg` and `foo-s++.cfg`,
respectively.

On the other hand, an important use of config files is tuning compiler options
for cross compilation. In that case it is likely that different tools would use
the same options. Cloning the config file for each tool is awkward, so the
natural solution is a single file per target.

The flexible solution is to search for the long name (like
`i686-linux-android-g++.cfg`) first and, if it is not found, look for the short
name based on the 'target' only (such as `i686.cfg` and `foo.cfg`).
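
The naming rule can be sketched in Python as follows. The list `DRIVER_SUFFIXES` is a hypothetical, shortened set of driver-mode suffixes chosen to cover the examples in this comment, and the function names are illustrative; the real driver may recognize different suffixes.

```python
# Hypothetical driver-mode suffixes, enough for the examples in this comment.
DRIVER_SUFFIXES = ("clang-cpp", "clang", "gcc", "cpp", "cl", "++")


def strip_version(exe_name):
    """Drop an optional version number or version suffix from a tool name."""
    if exe_name.endswith(DRIVER_SUFFIXES):
        return exe_name
    # Version number directly after the driver mode, e.g. "g++5.0".
    trimmed = exe_name.rstrip("0123456789.")
    if trimmed.endswith(DRIVER_SUFFIXES):
        return trimmed
    # Version suffix after a dash, e.g. "g++-release".
    head, sep, _ = trimmed.rpartition("-")
    if sep and head.endswith(DRIVER_SUFFIXES):
        return head
    return exe_name


def config_candidates(exe_name):
    """Return config file names to try, long name first, then short name."""
    base = strip_version(exe_name)
    long_name = base + ".cfg"
    prefix, sep, _ = base.partition("-")
    if not sep:
        # A name without a prefix is not covered by the text above;
        # this sketch simply returns the long name.
        return [long_name]
    return [long_name, prefix + ".cfg"]
```

The short name here is taken to be everything before the first dash, which matches the `i686.cfg` and `foo.cfg` examples but is otherwise an assumption.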

It makes sense to use the same rule when the config file is specified
explicitly (but not as a path), so that the invocation `foo-clang` is
equivalent to `clang --config foo`. In this case the invocation:

  clang-cpp --config foo abc.c

would first search for the file `foo-clang-cpp.cfg`, then for `foo.cfg`.
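
As a rough sketch of this rule in Python: the function name is hypothetical, and it is assumed that the long name carries the `.cfg` extension like the other file names in this section and that an argument containing a path separator is used as given.

```python
def explicit_config_candidates(config_arg, driver_name):
    """Return the names to try for `--config <config_arg>`.

    `driver_name` is the name the compiler was invoked as, without a
    version, e.g. "clang" or "clang-cpp".
    """
    # A path such as "./foo" selects exactly that file.
    if "/" in config_arg or "\\" in config_arg:
        return [config_arg]
    # Otherwise mimic the executable name "<config_arg>-<driver_name>".
    return [f"{config_arg}-{driver_name}.cfg", f"{config_arg}.cfg"]
```

Under these assumptions `clang --config foo` produces the same candidates as invoking `foo-clang`.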

**Target reloading**

The configuration file is a general mechanism, but it was proposed as a
solution for cross compilation problems. In this case the config file holds the
options required to tune compilation for a particular target. There is a
difficulty here, because some command line options, like `-m32`, effectively
change the target, and the new target may require settings different from those
contained in the config file.

The proposed solution is to reload the config file. If:

- the config file name starts with an architecture component (like `x86_64-`),
- the command line contains option(s) that effectively change the target (like `-m32`),

then the driver tries to load a config file whose name is obtained by replacing
the architecture component with the actual architecture. For instance, if the
config file was `x86_64-clang.cfg`, the driver looks for `i686-clang.cfg`. If a
suitable config file is found, the options read from the previous config file
are removed and the contents of the new config file are inserted at the
beginning of the effective command line. If no such file is found, it is not an
error.

The effect of target reloading must be exactly the same as if the actual target
had been specified initially. For instance, the invocation `x86_64-clang -m32`
must be equivalent to `i686-clang`. This means that:

- Options read from the previous config file are removed entirely.
- The search for the new config file follows the same rule as for the original
file, that is, first `i686-clang.cfg`, then `i686.cfg`.
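
The reloading steps can be sketched in Python as below. Both function names and the representation of options as plain lists are hypothetical; the sketch only models the name substitution and the replacement of options described above.

```python
def reload_candidates(config_file, old_arch, new_arch):
    """Return config file names to try after the architecture has changed.

    An empty list means no reloading is attempted, because the current
    config file name does not start with the architecture component.
    """
    prefix = old_arch + "-"
    if not config_file.startswith(prefix):
        return []
    long_name = new_arch + "-" + config_file[len(prefix):]
    # Same rule as for the original file: long name first, then short name.
    return [long_name, new_arch + ".cfg"]


def effective_command_line(old_config_opts, user_opts, new_config_opts=None):
    """Combine config file options with the options given by the user.

    If a new config file was found, its options fully replace the old ones.
    A missing new file (None) is not an error; the old options are kept.
    """
    config_opts = old_config_opts if new_config_opts is None else new_config_opts
    return config_opts + user_opts
```

For `x86_64-clang -m32` the first function yields `i686-clang.cfg` and then `i686.cfg`, which are the names that `i686-clang` itself would search for.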

**Conflicting settings**

It is possible that clang is requested to load several config files. Consider
this possibility using target selection as a representative example of the
general case.

There are three ways to specify the target for compilation:

- a config file, like `clang --config i686`,
- an executable prefix, like `i686-clang`,
- a command line option, like `--target=i686`.

All of them may be combined.

Using `--target` is the existing way to change the target. It must remain
usable in combination with config files; the only difference is possible target
reloading. It takes precedence over the other ways for compatibility reasons.

Combining an executable prefix and an explicit config file may create a
conflicting choice, for instance:

  mips64-clang --config x86_64

Possible solutions are:

- emit an error,
- treat the command line option as having higher priority.

The latter solution looks more appropriate, as it is consistent with the way
clang processes command line options, `--target` in particular. Such a
combination may appear if `--config` comes from the set of compilation flags
while the compiler is specified in a different way. A user may use `--config`
instead of `--target` in an existing build system just to set several
target-specific options, so this combination can be allowed for compatibility
reasons.

Another conflicting case is two config files specified on the command line:

  clang --config mips64 --config x86_64

Again, there are possible solutions:

- treat this case as an error,
- treat the second option as having higher priority.

The latter way is consistent with general option treatment, but there is no
compatibility reason to enable it. As this combination may be the result of an
error, it is probably better to prohibit it.
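
The two preferred resolutions can be summarized in a small Python sketch. The function `select_config` is a hypothetical name, and the sketch deliberately leaves out `--target`, which is handled separately.

```python
def select_config(exe_prefix, config_args):
    """Pick the config name from the executable prefix and `--config` values.

    `exe_prefix` is the prefix encoded in the executable name (or None),
    `config_args` is the list of values given with `--config`.
    """
    # Two config files on the command line are likely a mistake.
    if len(config_args) > 1:
        raise ValueError("more than one --config option specified")
    # An explicit option has priority over the executable prefix.
    if config_args:
        return config_args[0]
    return exe_prefix
```

With this rule `mips64-clang --config x86_64` selects `x86_64`, while `clang --config mips64 --config x86_64` is rejected.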


https://reviews.llvm.org/D24933



