Module extensions

Module extensions allow users to extend the module system by reading input data from modules across the dependency graph, performing necessary logic to resolve dependencies, and finally creating repos by calling repo rules. These extensions have capabilities similar to repo rules, which enables them to perform file I/O, send network requests, and so on. Among other things, they allow Bazel to interact with other package management systems while also respecting the dependency graph built out of Bazel modules.

You can define module extensions in .bzl files, just like repo rules. They're not invoked directly; rather, each module specifies pieces of data called tags for extensions to read. Bazel runs module resolution before evaluating any extensions. The extension reads all the tags belonging to it across the entire dependency graph.

Extension usage

Extensions are hosted in Bazel modules themselves. To use an extension in a module, first add a bazel_dep on the module hosting the extension, and then call the use_extension built-in function to bring it into scope. Consider the following example — a snippet from a MODULE.bazel file to use the "maven" extension defined in the rules_jvm_external module:

bazel_dep(name = "rules_jvm_external", version = "4.5")
maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

This binds the return value of use_extension to a variable, which allows the user to use dot-syntax to specify tags for the extension. The tags must follow the schema defined by the corresponding tag classes specified in the extension definition. The following example specifies some maven.install and maven.artifact tags:

maven.install(artifacts = ["org.junit:junit:4.13.2"])
maven.artifact(group = "com.google.guava",
               artifact = "guava",
               version = "27.0-jre",
               exclusions = ["com.google.j2objc:j2objc-annotations"])

Use the use_repo directive to bring repos generated by the extension into the scope of the current module.

use_repo(maven, "maven")

Repos generated by an extension are part of its API. In this example, the "maven" module extension promises to generate a repo called maven. With the declaration above, labels such as @maven//:org_junit_junit in the current module resolve to the repo generated by the "maven" extension.

Extension definition

You can define module extensions similarly to repo rules, using the module_extension function. However, while repo rules have a number of attributes, module extensions have tag_classes, each of which has a number of attributes. The tag classes define schemas for tags used by this extension. For example, the "maven" extension above might be defined like this:

# @rules_jvm_external//:extensions.bzl

_install = tag_class(attrs = {"artifacts": attr.string_list(), ...})
_artifact = tag_class(attrs = {"group": attr.string(), "artifact": attr.string(), ...})
maven = module_extension(
  implementation = _maven_impl,
  tag_classes = {"install": _install, "artifact": _artifact},
)

These declarations show that maven.install and maven.artifact tags can be specified using the attribute schemas defined above.

The implementation functions of module extensions are similar to those of repo rules, except that they receive a module_ctx object, which grants access to all modules using the extension and all pertinent tags. The implementation function then calls repo rules to generate repos.

# @rules_jvm_external//:extensions.bzl

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")  # a repo rule
def _maven_impl(ctx):
  # This is a fake implementation for demonstration purposes only

  # collect artifacts from across the dependency graph
  artifacts = []
  for mod in ctx.modules:
    for install in mod.tags.install:
      artifacts += install.artifacts
    artifacts += [_to_artifact(artifact) for artifact in mod.tags.artifact]

  # call out to the coursier CLI tool to resolve dependencies
  # (the artifact list is appended so every argument is a plain string)
  output = ctx.execute(["coursier", "resolve"] + artifacts)
  repo_attrs = _process_coursier_output(output)

  # call repo rules to generate repos
  for attrs in repo_attrs:
    http_file(**attrs)
  _generate_hub_repo(name = "maven", repo_attrs = repo_attrs)

Extension identity

Module extensions are identified by the name and the .bzl file that appear in the call to use_extension. In the following example, the extension maven is identified by the .bzl file @rules_jvm_external//:extensions.bzl and the name maven:

maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

Re-exporting an extension from a different .bzl file gives it a new identity. If both versions of the extension are used in the transitive module graph, they are evaluated separately, and each one sees only the tags associated with its own identity.

As an extension author, you should make sure that users only ever use your module extension from a single .bzl file.

Repository names and visibility

Repos generated by extensions have canonical names in the form of module_repo_canonical_name~extension_name~repo_name. For extensions hosted in the root module, the module_repo_canonical_name part is replaced with the string _main. Note that the canonical name format is not an API you should depend on — it's subject to change at any time.

This naming policy means that each extension has its own "repo namespace"; two distinct extensions can each define a repo with the same name without risking any clashes. It also means that repository_ctx.name reports the canonical name of the repo, which is not the same as the name specified in the repo rule call.

Taking repos generated by module extensions into consideration, there are several repo visibility rules:

  • A Bazel module repo can see all repos introduced in its MODULE.bazel file via bazel_dep and use_repo.
  • A repo generated by a module extension can see all repos visible to the module that hosts the extension, plus all other repos generated by the same module extension (using the names specified in the repo rule calls as their apparent names).
    • This might result in a conflict. If the module repo can see a repo with the apparent name foo, and the extension generates a repo with the specified name foo, then for all repos generated by that extension foo refers to the former.

Best practices

This section describes best practices when writing extensions so they are straightforward to use, maintainable, and adapt well to changes over time.

Put each extension in a separate file

When extensions are in different files, one extension can load repositories generated by another extension. Even if you don't use this functionality, it's best to put them in separate files in case you need it later. This is because the extension's identity is based on its file, so moving the extension into another file later changes your public API and is a backwards-incompatible change for your users.

Specify the operating system and architecture

If your extension relies on the operating system or its architecture type, indicate this in the extension definition using the os_dependent and arch_dependent boolean attributes. This lets Bazel recognize that the extension needs to be re-evaluated when either of them changes.
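
A definition that sets both attributes might look like the following fragment. The extension name `host_tools` and its implementation function are hypothetical. The sketch assumes that the implementation reads the host platform through module_ctx.os.

```python
# Hypothetical extension whose generated repos depend on the host platform.
def _host_tools_impl(module_ctx):
    # Reading the host OS and architecture is what makes the result
    # platform-dependent.
    host = "{}_{}".format(module_ctx.os.name, module_ctx.os.arch)
    # Repo rule calls parameterized by `host` would go here.

host_tools = module_extension(
    implementation = _host_tools_impl,
    os_dependent = True,
    arch_dependent = True,
)
```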

Only the root module should directly affect repository names

Remember that when an extension creates repositories, they are created within the namespace of the extension. This means collisions can occur if different modules use the same extension and end up creating a repository with the same name. This often manifests as a module extension's tag_class having a name argument that is passed as a repository rule's name value.

For example, say the root module, A, depends on module B. Both modules depend on module mylang. If both A and B call mylang.toolchain(name="foo"), they will both try to create a repository named foo within the mylang module and an error will occur.

To avoid this, either remove the ability to set the repository name directly, or only allow the root module to do so. It's OK to allow the root module this ability because nothing will depend on it, so it doesn't have to worry about another module creating a conflicting name.