Solving Package Environments#

The libmambapy.solver submodule contains a generic API for solving requirements (MatchSpec) into a list of packages (PackageInfo) with no conflicting dependencies.

Note

Solving package environments can be cast as a Boolean satisfiability problem (SAT). Mamba currently uses a single SAT solver, LibSolv. For this reason, the generic interface is not yet complete, and some types must be accessed through the submodule libmambapy.solver.libsolv.
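To make the SAT framing concrete, here is a minimal brute-force sketch in plain Python. It does not use libmambapy or LibSolv and is not how Mamba solves environments. The toy index `CANDIDATES` and the function `solve` are hypothetical names introduced for illustration only. Each candidate package is a Boolean variable (installed or not), and the constraints are "at most one version per name", "every requested name is provided", and "every dependency clause of a chosen package is satisfied".

```python
from itertools import product

# Hypothetical toy index: candidate (name, version) -> list of dependency clauses.
# Each clause lists candidates, at least one of which must be chosen.
CANDIDATES = {
    ("app", "1.0"): [[("lib", "1.0"), ("lib", "2.0")]],
    ("lib", "1.0"): [],
    ("lib", "2.0"): [],
}


def solve(candidates, requested):
    """Exhaustively search for a smallest consistent set of candidates.

    Returns a set of (name, version) pairs, or None when no assignment
    satisfies all constraints.
    """
    variables = sorted(candidates)
    best = None
    for bits in product([False, True], repeat=len(variables)):
        chosen = {var for var, bit in zip(variables, bits) if bit}
        names = [name for name, _ in chosen]
        # Constraint 1: at most one version of each package name.
        if len(names) != len(set(names)):
            continue
        # Constraint 2: every requested name is provided.
        if not all(name in names for name in requested):
            continue
        # Constraint 3: each dependency clause of each chosen package holds.
        if not all(
            any(dep in chosen for dep in clause)
            for pkg in chosen
            for clause in candidates[pkg]
        ):
            continue
        if best is None or len(chosen) < len(best):
            best = chosen
    return best


solution = solve(CANDIDATES, ["app"])
no_solution = solve(CANDIDATES, ["missing"])
```

The exhaustive search is exponential in the number of candidates, which is why real solvers rely on dedicated SAT techniques instead.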

Populating the Package Database#

The first thing needed is a Database of all the packages and their dependencies. Packages are organised in repositories, described by a RepoInfo. This serves to resolve explicit channel requirements or channel priority. As such, the database constructor takes a set of ChannelResolveParams to work with Channel data internally (see the usage section on Channels for more information).

The first way to add a repository is from a list of PackageInfo using Database.add_repo_from_packages:

import libmambapy

channel_alias = libmambapy.specs.CondaURL.parse("https://conda.anaconda.org")

db = libmambapy.solver.libsolv.Database(
    libmambapy.specs.ChannelResolveParams(channel_alias=channel_alias)
)

repo1 = db.add_repo_from_packages(
    packages=[
        libmambapy.specs.PackageInfo(name="python", version="3.8", ...),
        libmambapy.specs.PackageInfo(name="pip", version="3.9", ...),
        ...,
    ],
    name="myrepo",
)

The second way of loading packages is through Conda’s repository index format, repodata.json, using Database.add_repo_from_repodata_json. This method is provided for convenience; because these files grow large, it is not a performant alternative to the former.

repo2 = db.add_repo_from_repodata_json(
    path="path/to/repodata.json",
    url="https://conda.anaconda.org/conda-forge/linux-64",
    channel_id="conda-forge",
)

One of the repositories can be given the special meaning of “installed repository”. The solver uses it as a reference point to compute changes. For instance, if a package is required but is already available in the installed repository, the solving result will not mention it. The function Database.set_installed_repo is used for that purpose.

db.set_installed_repo(repo1)

Binary serialization of the database (Advanced)#

The Database repositories can be serialized in binary format for faster reloading. To ensure the integrity and freshness of the serialized file, metadata about the packages, such as the source URL and RepodataOrigin, is stored inside the file when calling Database.native_serialize_repo. Upon reading, similar parameters are expected as inputs to Database.add_repo_from_native_serialization. If they do not match, loading results in an error.

A typical workflow first tries to load a repository from such a binary cache, and then quietly falls back to repodata.json on failure.
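The cache-then-fallback workflow can be sketched independently of the library. The helper `load_repo_with_fallback` and its parameters are hypothetical and not part of libmambapy. It takes two callables, so the exact arguments of the loading functions stay with the caller. The sketch assumes that a failed cache load raises an exception.

```python
def load_repo_with_fallback(load_from_cache, load_from_json, on_fallback=None):
    """Return the repo from the binary cache, or from repodata.json on failure.

    `load_from_cache` and `load_from_json` are zero-argument callables.
    `on_fallback`, if given, receives the exception that triggered the fallback.
    """
    try:
        return load_from_cache()
    except Exception as error:  # assumption: a stale or missing cache raises
        if on_fallback is not None:
            on_fallback(error)
        return load_from_json()


# Hypothetical wiring with a Database `db` (arguments elided on purpose):
# repo = load_repo_with_fallback(
#     load_from_cache=lambda: db.add_repo_from_native_serialization(...),
#     load_from_json=lambda: db.add_repo_from_repodata_json(...),
# )
```

Passing callables keeps the helper testable without a real database, and the optional `on_fallback` hook allows logging while the fallback itself stays quiet.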

Creating a solving request#

All jobs that need to be resolved are added as part of a Request. This includes installing, updating, and removing packages, as well as parameters that customize the solve.

Request = libmambapy.solver.Request
MatchSpec = libmambapy.specs.MatchSpec

request = Request(
    jobs=[
        Request.Install(MatchSpec.parse("python>=3.9")),
        Request.Update(MatchSpec.parse("numpy")),
        Request.Remove(MatchSpec.parse("pandas"), clean_dependencies=False),
    ],
    flags=Request.Flags(
        allow_downgrade=True,
        allow_uninstall=True,
    ),
)

Solving the request#

The Request and the Database are the two input parameters needed to solve an environment. This task is achieved with the Solver.solve method.

solver = libmambapy.solver.libsolv.Solver()
outcome = solver.solve(db, request)

The outcome can be of two types, either a Solution listing packages (PackageInfo) and the action to take on them (install, remove…), or an UnSolvable type when no solution exists (because of conflict, missing packages…).

Examine the solution#

We can test if a valid solution exists by checking the type of the outcome. The attribute Solution.actions contains the actions to take on the installed repository so that it satisfies the Request requirements.

Solution = libmambapy.solver.Solution

if isinstance(outcome, Solution):
    for action in outcome.actions:
        if isinstance(action, Solution.Upgrade):
            my_upgrade(from_pkg=action.remove, to_pkg=action.install)
        if isinstance(action, Solution.Reinstall):
            ...
        ...

Alternatively, an easy way to compute the update to the environment is to check for install and remove members, since they will populate the relevant fields for all actions:

Solution = libmambapy.solver.Solution

if isinstance(outcome, Solution):
    for action in outcome.actions:
        if hasattr(action, "install"):
            my_download_and_install(action.install)
        # WARN: Do not use `elif` since actions like `Upgrade`
        # are represented as an `install` and `remove` pair.
        if hasattr(action, "remove"):
            my_delete(action.remove)
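The reason for avoiding `elif` can be checked without a solver. The sketch below uses hypothetical stand-in classes (`Install`, `Remove`, `Upgrade`) that only mimic the `install` and `remove` members described above. They are not the libmambapy types, and the package strings are made up for illustration.

```python
from dataclasses import dataclass


# Hypothetical stand-ins that mimic the members of solution actions.
@dataclass
class Install:
    install: str


@dataclass
class Remove:
    remove: str


@dataclass
class Upgrade:
    remove: str
    install: str


def plan_changes(actions):
    """Collect what to install and what to remove from a list of actions."""
    to_install, to_remove = [], []
    for action in actions:
        if hasattr(action, "install"):
            to_install.append(action.install)
        # Separate `if`, not `elif`: an upgrade carries both members.
        if hasattr(action, "remove"):
            to_remove.append(action.remove)
    return to_install, to_remove


actions = [
    Install("numpy-1.26"),
    Upgrade(remove="python-3.9", install="python-3.12"),
    Remove("pandas-2.0"),
]
to_install, to_remove = plan_changes(actions)
```

With `elif`, the old package of the upgrade would never reach `to_remove`, leaving both versions in the environment.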

Understand unsolvable problems#

When a problem has no Solution, it is inherently hard to come up with an explanation. In the easiest case, a required package is missing from the Database. In the most complex case, many package dependencies are incompatible without a single culprit, and packages should then be rebuilt with weaker requirements or with more build variants. The UnSolvable class attempts to build an explanation.

UnSolvable.problems is a list of problems, as defined by the solver. It is not easy to understand without linking it to specific MatchSpec and PackageInfo. The method UnSolvable.problems_graph gives a more structured graph of package dependencies and incompatibilities. This graph is the underlying mechanism used in UnSolvable.explain_problems to build a detailed unsolvability message.