Design Document: Specifying environment variables for actions
Design documents are not descriptions of the current functionality of Bazel. Always go to the documentation for current information.
Status: Implemented. See documentation
Author: Klaus Aehlig
Design document published: 21 June 2016
Currently, Bazel provides a cleaned set of environment variables to the actions in order to obtain hermetic builds. This, however is not sufficient for all use cases.
Projects often want to use tools which are not part of the repository; however, their location varies from installation to installation. So, some sensible value for the
PATHenvironment variable has to be set.
Some set-ups depend on every program having access to specific variables, e.g., indicating the homebrew paths, or library paths.
Commercial compilers sometimes need to be passed the location of a license server through the environment.
We propose to add a new bazel flag,
--action_env which has two
valid forms of usage,
specifying a variable with unspecified value,
specifying a variable with a value,
--action_env=VARIABLE=VALUE; in the latter case, the value can well be the empty string, but it is still considered a specified value.
This flag has a “latest wins” semantics in the sense that if the option is given twice for the same variable, only the latest option will be used, regardless whether specified or unspecified value. Options given for different variables accumulate.
In every action executed
use_default_shell_env being true,
precisely the environment variables specified by
--action_env options are set as the default environment.
(Note that, therefore, by default, the environment for actions is empty.)
If the effective option for a variable has an unspecified value, the value from the invocation environment of Bazel is taken.
If the effective option for a variable specifies a value, this value is taken, regardless of the environment in which Bazel is invoked.
Environment variables are considered an essential part of an action. In other words, an action is expected to produce a different output, if the environment it is invoked in differs; in particular, a previously cached value cannot be taken if the effective environment changes.
Given that normally a rule writer cannot know which tools might need fancy
environment variables (think of the commercial compiler use case), the default
parameter will become true.
List of rc-files read by Bazel
The list of rc-files that Bazel takes options from will include, at
least, the following files, where files later in the list take precedence over
the ones earlier in the list for conflicting options; for the
--action_env option the already described “latest wins” semantics is
A global rc-file. This file typically contains defaults for a whole group of machines, like all machines of a company. On UNIX-like systems, it will be located at
A machine-wide rc-file. This file is typically set by the administrator of the machine or a group of machines with the same architecture. It typically contains settings that are specific to that architecture and hardware. On UNIX-like systems it will be next to be binary and called like the binary with
.bazelrcappended to the file name.
A user-specific file, located in
~/.bazelrc. This file will be set by each user for options desired for all Bazel invocations.
A project-specific file. This is the file
tools/bazel.rcnext to the
WORKSPACEfile. This file is considered project-specific and typically versioned in the same repository as the project.
A file specific to user, project, and checkout. This is the file
.bazelrcnext to the
WORKSPACEfile. As it is specific to the user and the machine he or she is working on, projects are advised to ignore that file in the repository of the project (e.g., by adding it to their
.gitignorefile, if they version the project with git).
When looking for those rc-files, symbolic links are followed; files not
existing are silently assumed to be empty. Note that all those are regular
rc-files for Bazel, hence are not limited to the newly introduced
--action_env option. Also, the rule that options for more specific
invocations win over common options still applies; but, within each level of
specificness, precedence is given according to the mentioned order of rc-files.
Example usages of environment specifications
The proposed solution allows for a variety of use cases, including the following.
Systems using commercial compilers can set the environment variables with information about the license server in the global rc file.
Users requiring special variables, like the ones used by homebrew, can set them in their machine specific rc-file. In fact, once this proposal is implemented, the homebrew port for Bazel could itself install that machine-wide rc-file.
Projects depending on the environment, e.g., because they use tools assumed to be already installed on the user’s systm, have several options.
If they are optimistic about the environment, e.g., because they are not very version dependent on the tools used, can just specify which environment variables they depend on by adding declarations with unspecified values in the
If dependencies are more delicate, projects can provide a configure script that does whatever analysis of the environment is necessary and then write
--action_envoptions with specified values to the user-project local
.bazelrcfile. As the configure script will only run when manually invoked by the user and the syntax of the user-project local
.bazelrcfile is so that it can be easily be edited by a human, it is OK if that script only works in the majority of the cases, as a user requiring an unusual setup for that project can easily modify the user-project local
.bazelrcby hand afterwards.
Irrespectively of the approach chosen by the project, a user where the environment changes frequently (e.g., on clusters or other machines using a traditional layout) can fix the environment by adding
--action_envoptions with specific values to the user-project local
To simplify this use case, and other “freeze on first use” approaches, Bazel’s
infocommand will provide a new key
client-envthat will show the environment variables, together with their values. More precisely, each variable-value pair will be prefixed with
build --action_env=, so that
bazel info client-env >> .bazelrccan be used to freeze the environment.
Currently, some users of Bazel already make use of the fact that
TMPDIR are being passed to actions. To allow those
projects a smooth
transition to the new set up, the global Bazel rc-file provided by upstream
will have the following content.
build --action_env=PATH build --action_env=LD_LIBRARY_PATH build --action_env=TMPDIR build --test_env=PATH build --test_env=LD_LIBRARY_PATH
Bazel’s own dependency on
Bazel itself also uses external tools, like
sh, but also
bash where the location differs between installations. In
particular, a value for
PATH needs to be provided. This will be covered
by the setting of the global bazel configuration file. Should the need arise, a
configure-like script can be added; at the moment it seems that this will not
Reasons for the Design Choices, Risks, and Alternatives Considered
Conflicting Interests on the environment influencing actions
There are conflicting requirements for the environment variables of an action.
Users expect Bazel to “just work”, i.e., the expectation is that if a tool works on the command line, it should also work when called from an action in a Bazel invocation from the same environment. A lot of compilers, however, depend, at least on some systems, on certain environment variables. An approach used by quite a few other build systems is to pass through the whole invocation environment.
Bazel wants to provide correct and reproducible builds. Therefore, everything that potentially influences the outcome of an action needs to be controlled and tracked; a cached result cannot be used if anything potentially changing the outcome has changed.
Users expect Bazel to not do rebuilds they (i.e., the users) know are unnecessary. And, while for a lot of users the environment variables that actually influence the build stay stable, the full environment constantly changes; take the
OLDPWDenvironment variable as an example.
This design tries to reconcile these needs by allowing arbitrary environment variables being set for actions, but only in an opt-in way. Variables need to be explicitly mentioned, either in a configuration file or on the command line, to be provided to an action.
Generic Solutions versus Special Casing
As Bazel already has quite a number of concepts, there is the valid concern that the complexity might increase too much and newly added concepts might become a maintenance burden. Another concern is that more configuration mechanisms make it harder for the user to know which one is the correct one to use for his or her problem. The general desire is to have few, but powerful enough mechanisms to control the build behaviour and avoid special casing.
Putting the environment variables visible in actions in the hand of the user avoids the need of special casing more and more “important” environment variables.
Building on the already existing mechanism to specify, inherit, and override command-line options reduces the amount newly introduced concepts. The main addition is a command-line option.
Source of Knowledge for Needed Environment Variables
Another aspect that went into the design is that different entities know about environment variables that are essential for the build to work.
Some variables are “obviously” relevant, like
TMPDIR. However, there is no “obvious” value for them.
Both depend on the layout of the system in question. A special fast file system for temporary files might be provided at a designated location. Binaries might be installed under
/usr/local/bin, or even versioned paths to allow parallel installations of different versions of the same tool. For example, on Debian Gnu/Linux the
bashis installed in
/bin, whereas on FreeBSD it is usually installed in
/usr/local/bin(but the prefix
/usr/localis at the discretion of the system administrator).
The user might have custom-built versions of tools somewhere in the home directory, thus making the user the only one who knows an appropriate value for the
PATHvariable. Moreover, a user who works on several projects requiring different versions of the same tool may even require different values of the
PATHvariable for each project.
The authors and users of a tool know about special variables the tools need to work. While the tool itself might serve a standard purpose, like compiling C code, the variables the tool depends on might be specific to that tool (like passing information about a license server).
The maintainers of a porting or packaging system know about environment variables a tool might additionally need (e.g., in the homebrew case). These might not be needed if the same tool is packaged differently.
The project authors know about environment variables special to their project that some of their actions need.
These different sources of information make it hard to designate a
single maintainer for the action environment. This makes approaches
undesirable that are based on a single source specifying the action
environment, like the
WORKSPACE file, or the rule definitions. While
those approaches make it easy to predict the environment an action will
have, they all require the user to merge in the specifics of the system
and his or her personal settings for each checkout (including rebasing
these changes for each upstream change of that file). Collecting environment
variables via the rc-file mechanism allows setting each variable within
the appropriate scope (global, machine-dependent, user-spefic, project-specific,
specific to the user-project pair) in a conflict-free way by the entity
in charge of that scope.