Persistent workers can make your build faster. If you have repeated actions in your build that have a high startup cost or would benefit from cross-action caching, you may want to implement your own persistent worker to perform these actions.
The Bazel server communicates with the worker using stdin/stdout. It supports
the use of protocol buffers or JSON strings.
The worker implementation has two parts:
- The worker.
- The rule that uses the worker.
Making the worker
A persistent worker upholds a few requirements:
- It reads WorkRequests from its stdin.
- It writes WorkResponses (and only WorkResponses) to its stdout.
- It accepts the --persistent_worker flag. The wrapper must recognize the --persistent_worker command-line flag and only make itself persistent if that flag is passed; otherwise it must do a one-shot compilation and exit.
If your program upholds these requirements, it can be used as a persistent worker!
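The three requirements above can be sketched as a small JSON worker. This is a minimal illustration, not Bazel's reference implementation: `do_work`, `iter_json_objects`, and `main` are hypothetical names, the action body is a placeholder, and the sketch assumes the reader on the Bazel side tolerates a trailing newline between JSON objects.

```python
import codecs
import json


def iter_json_objects(binary_stream, chunk_size=65536):
    """Yield JSON objects from a byte stream that carries no size prefix."""
    decoder = json.JSONDecoder()
    utf8 = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    while True:
        buffer = buffer.lstrip()
        if buffer:
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                pass  # Object is still incomplete; fall through and read more.
            else:
                yield obj
                buffer = buffer[end:]
                continue
        # read1() returns as soon as some bytes are available, so the worker
        # does not block waiting for a full chunk.
        chunk = binary_stream.read1(chunk_size)
        if not chunk:
            return  # EOF: stdin was closed, so the worker should exit.
        buffer += utf8.decode(chunk)


def do_work(arguments):
    """Hypothetical action body; returns (exit_code, output)."""
    return 0, "processed: " + " ".join(arguments)


def main(argv, stdin, stdout):
    """stdin and stdout are binary streams, e.g. sys.stdin.buffer."""
    if "--persistent_worker" not in argv:
        # One-shot mode: do the work once and exit.
        exit_code, output = do_work(argv)
        stdout.write(output.encode("utf-8") + b"\n")
        return exit_code
    for request in iter_json_objects(stdin):
        exit_code, output = do_work(request.get("arguments", []))
        response = {
            "exitCode": exit_code,
            "output": output,
            "requestId": request.get("requestId", 0),
        }
        # In persistent mode, only WorkResponses may be written to stdout.
        stdout.write(json.dumps(response).encode("utf-8") + b"\n")
        stdout.flush()
    return 0
```

A real entry point would call something like `sys.exit(main(sys.argv[1:], sys.stdin.buffer, sys.stdout.buffer))`. Passing the streams in as parameters keeps the loop testable without a running Bazel server.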
Work requests
A WorkRequest contains a list of arguments to the worker, a list of
path-digest pairs representing the inputs the worker can access (this isn’t
enforced, but you can use this info for caching), and a request id, which is 0
for singleplex workers.
NOTE: While the protocol buffer specification uses "snake case" (request_id),
the JSON protocol uses "camel case" (requestId). This document uses camel case
in the JSON examples, but snake case when talking about the field regardless of
protocol.
{
  "arguments" : ["--some_argument"],
  "inputs" : [
    { "path": "/path/to/my/file/1", "digest": "fdk3e2ml23d" },
    { "path": "/path/to/my/file/2", "digest": "1fwqd4qdd" }
  ],
  "requestId" : 12
}
The optional verbosity field can be used to request extra debugging output
from the worker. It is entirely up to the worker what and how to output. Higher
values indicate more verbose output. Passing the --worker_verbose flag to
Bazel sets the verbosity field to 10, but smaller or larger values can be used
manually for different amounts of output.
The optional sandbox_dir field is used only by workers that support
multiplex sandboxing.
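As an illustration of the fields described above, a JSON worker might map a request onto a small typed structure. This is a sketch under stated assumptions: the `WorkRequest` and `Input` dataclasses and `parse_work_request` are hypothetical helpers, not part of any Bazel library, and the camel-case key `sandboxDir` is assumed from the naming rule in the note above. Missing fields fall back to the protobuf defaults and unknown fields are ignored.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Input:
    path: str = ""
    digest: str = ""


@dataclass
class WorkRequest:
    arguments: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    request_id: int = 0      # 0 means a singleplex request
    cancel: bool = False
    verbosity: int = 0
    sandbox_dir: str = ""


def parse_work_request(text):
    """Parse one JSON WorkRequest, ignoring unknown keys."""
    raw = json.loads(text)
    return WorkRequest(
        arguments=list(raw.get("arguments", [])),
        inputs=[
            Input(path=item.get("path", ""), digest=item.get("digest", ""))
            for item in raw.get("inputs", [])
        ],
        request_id=int(raw.get("requestId", 0)),
        cancel=bool(raw.get("cancel", False)),
        verbosity=int(raw.get("verbosity", 0)),
        sandbox_dir=raw.get("sandboxDir", ""),
    )
```

Keeping the snake-case names on the Python side mirrors the convention this document uses when talking about fields, while the camel-case keys stay confined to the parsing function.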
Work responses
A WorkResponse contains a request id, a zero or nonzero exit code, and an
output message describing any errors encountered in processing or executing
the request. A worker should capture the stdout and stderr of any tool it
calls and report them through the WorkResponse. Writing that output to the
stdout of the worker process is unsafe, as it will interfere with the worker
protocol. Writing it to the stderr of the worker process is safe, but the
result is collected in a per-worker log file instead of ascribed to individual
actions.
{
  "exitCode" : 1,
  "output" : "Action failed with the following message:\nCould not find input file \"/path/to/my/file/1\"",
  "requestId" : 12
}
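A worker that wraps an external tool could capture the tool's output along these lines, so that nothing from the tool leaks onto the worker's own stdout. This is a hedged sketch: `run_tool` is a hypothetical helper, and merging stderr into stdout is one possible choice, not something the protocol requires.

```python
import subprocess
import sys


def run_tool(command, request_id):
    """Run a tool and build a JSON-style WorkResponse dict from its result."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured output
        text=True,
    )
    return {
        "exitCode": completed.returncode,
        "output": completed.stdout,
        "requestId": request_id,
    }
```

For example, `run_tool([sys.executable, "-c", "print('hi')"], 12)` would produce a response carrying the tool's exit code and its printed text, tied to request id 12.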
As per the norm for protobufs, all fields are optional. However, Bazel requires
the WorkRequest and the corresponding WorkResponse to have the same request
id, so the request id must be specified if it is nonzero. This is a valid
WorkResponse:
{
  "requestId" : 12
}
A request_id of 0 indicates a "singleplex" request, used when this request
cannot be processed in parallel with other requests. The server guarantees that
a given worker receives requests with either only request_id 0 or only
request_id greater than zero. Singleplex requests are sent in serial; that is,
the server doesn't send another request until it has received a response
(except for cancel requests, see below).
Notes
- Each protocol buffer is preceded by its length in varint format (see MessageLite.writeDelimitedTo()).
- JSON requests and responses are not preceded by a size indicator.
- JSON requests uphold the same structure as the protobuf, but use standard JSON and use camel case for all field names.
- In order to maintain the same backward and forward compatibility properties as protobuf, JSON workers must tolerate unknown fields in these messages, and use the protobuf defaults for missing values.
- Bazel stores requests as protobufs and converts them to JSON using protobuf's JSON format.
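The varint length prefix used for protobuf messages can be illustrated without any protobuf library. The sketch below treats the serialized message as opaque bytes; `encode_varint`, `read_varint`, `write_delimited`, and `read_delimited` are hypothetical helper names. A real proto worker would normally rely on its protobuf runtime's delimited read and write functions instead.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None on a clean EOF before the first byte."""
    result = 0
    shift = 0
    while True:
        data = stream.read(1)
        if not data:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        result |= (data[0] & 0x7F) << shift
        if not data[0] & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the payload preceded by its varint-encoded length."""
    stream.write(encode_varint(len(payload)) + payload)


def read_delimited(stream):
    """Read one length-prefixed payload, or None at end of stream."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside a message")
    return payload
```

Note that a zero-length payload is a legitimate message: since all fields are optional, a message with every field at its default serializes to no bytes at all, so only the length byte `0x00` appears on the wire.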
Cancellation
Workers can optionally allow work requests to be cancelled before they finish.
This is particularly useful in connection with dynamic execution, where local
execution can regularly be interrupted by a faster remote execution. To allow
cancellation, add supports-worker-cancellation: 1 to the
execution-requirements field (see below) and set the
--experimental_worker_cancellation flag.
A cancel request is a WorkRequest with the cancel field set (and
similarly a cancel response is a WorkResponse with the was_cancelled
field set). The only other field that must be in a cancel request or cancel
response is request_id, indicating which request to cancel. The request_id
field will be 0 for singleplex workers or the non-0 request_id of a previously
sent WorkRequest for multiplex workers. The server may send cancel requests
for requests that the worker has already responded to, in which case the cancel
request must be ignored.
Each non-cancel WorkRequest message must be answered exactly once, whether or
not it was cancelled. Once the server has sent a cancel request, the worker may
respond with a WorkResponse with the request_id set and the was_cancelled
field set to true. Sending a regular WorkResponse is also accepted, but the
output and exit_code fields will be ignored.
Once a response has been sent for a WorkRequest, the worker must not touch the
files in its working directory. The server is free to clean up the files,
including temporary files.
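The exactly-once rule and the "ignore late cancels" rule can be enforced with a little bookkeeping. The sketch below is one possible approach, not Bazel's own: `ResponseTracker` is a hypothetical class, `send` stands for whatever function writes a response to stdout, and actually interrupting the in-flight work is left out.

```python
import threading


class ResponseTracker:
    """Ensures each started request is answered exactly once."""

    def __init__(self, send):
        self._send = send
        self._lock = threading.Lock()
        self._pending = set()

    def start(self, request_id):
        """Record a non-cancel WorkRequest as in flight."""
        with self._lock:
            self._pending.add(request_id)

    def finish(self, request_id, exit_code, output):
        """Send the normal response unless the request was already answered."""
        with self._lock:
            if request_id not in self._pending:
                return False
            self._pending.discard(request_id)
            self._send({
                "requestId": request_id,
                "exitCode": exit_code,
                "output": output,
            })
            return True

    def cancel(self, request_id):
        """Answer a cancel request; ignore it if a response already went out."""
        with self._lock:
            if request_id not in self._pending:
                return False
            self._pending.discard(request_id)
            self._send({"requestId": request_id, "wasCancelled": True})
            return True
```

Sending the response while holding the lock also serializes writes, which keeps responses from different threads from interleaving on stdout in a multiplex worker.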
Making the rule that uses the worker
You'll also need to create a rule that generates actions to be performed by the worker. Making a Starlark rule that uses a worker is just like creating any other rule.
In addition, the rule needs to contain a reference to the worker itself, and there are some requirements for the actions it produces.
Referring to the worker
The rule that uses the worker needs to contain a field that refers to the worker
itself, so you'll need to create an instance of a *_binary rule to define
your worker. If your worker is called MyWorker.java, this might be the
associated rule:
java_binary(
    name = "worker",
    srcs = ["MyWorker.java"],
)
This creates the "worker" label, which refers to the worker binary. You'll then define a rule that uses the worker. This rule should define an attribute that refers to the worker binary.
If the worker binary you built is in a package named "work", which is at the top level of the build, this might be the attribute definition:
"worker": attr.label(
    default = Label("//work:worker"),
    executable = True,
    cfg = "exec",
)
cfg = "exec" indicates that the worker should be built to run on your
execution platform rather than on the target platform (i.e., the worker is used
as a tool during the build).
Work action requirements
The rule that uses the worker creates actions for the worker to perform. These actions have a couple of requirements.
- The "arguments" field. This takes a list of strings, all but the last of which are arguments passed to the worker upon startup. The last element in the "arguments" list is a flag-file (@-preceded) argument. Workers read the arguments from the specified flagfile on a per-WorkRequest basis. Your rule can write non-startup arguments for the worker to this flagfile.
- The "execution-requirements" field, which takes a dictionary containing "supports-workers" : "1", "supports-multiplex-workers" : "1", or both.
  The "arguments" and "execution-requirements" fields are required for all actions sent to workers. Additionally, actions that should be executed by JSON workers need to include "requires-worker-protocol" : "json" in the execution requirements field. "requires-worker-protocol" : "proto" is also a valid execution requirement, though it’s not required for proto workers, since they are the default.
  You can also set a worker-key-mnemonic in the execution requirements. This may be useful if you're reusing the executable for multiple action types and want to distinguish actions by this worker.
- Temporary files generated in the course of the action should be saved to the worker's directory. This enables sandboxing.
Assuming a rule definition with the "worker" attribute described above, in
addition to a "srcs" attribute representing the inputs, an "output" attribute
representing the outputs, and an "args" attribute representing the worker
startup args, the call to ctx.actions.run might be:
ctx.actions.run(
    inputs = ctx.files.srcs,
    outputs = [ctx.outputs.output],
    executable = ctx.executable.worker,
    mnemonic = "someMnemonic",
    execution_requirements = {
        "supports-workers" : "1",
        "requires-worker-protocol" : "json",
    },
    arguments = ctx.attr.args + ["@flagfile"],
)
For another example, see Implementing persistent workers.
Examples
The Bazel code base uses Java compiler workers, in addition to an example JSON worker that is used in our integration tests.
You can use their scaffolding to make any Java-based tool into a worker by passing in the correct callback.
For an example of a rule that uses a worker, take a look at Bazel's worker integration test.
External contributors have implemented workers in a variety of languages; take a look at Polyglot implementations of Bazel persistent workers. You can find many more examples on GitHub!