Solving Package Environments#

The libmambapy.solver submodule contains a generic API for solving requirements (MatchSpec) into a list of packages (PackageInfo) with no conflicting dependencies. This problem is NP-complete, which is why Mamba uses a SAT solver to solve it.
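To make the problem concrete, the toy sketch below resolves requirements by brute force over a tiny hand-written index. Everything in it (the `INDEX` table, `is_consistent`, `brute_force_solve`) is hypothetical and unrelated to the libmambapy API. It only illustrates that the number of candidate combinations multiplies with every package added, which is why a dedicated solver is used instead.

```python
from itertools import product

# Hypothetical toy index (not libmambapy data):
# package name -> {version: {dependency name: set of accepted versions}}
INDEX = {
    "app": {"1.0": {"lib": {"1.0"}}, "2.0": {"lib": {"2.0"}}},
    "lib": {"1.0": {}, "2.0": {}},
    "tool": {"1.0": {"lib": {"1.0"}}},
}


def is_consistent(selection):
    """Return True if every dependency of every selected package is satisfied."""
    for name, version in selection.items():
        for dep, accepted in INDEX[name][version].items():
            if selection.get(dep) not in accepted:
                return False
    return True


def brute_force_solve(required):
    """Enumerate every version combination and return the first consistent one.

    Returns a {name: version} mapping, or None when no combination works.
    """
    names = sorted(INDEX)
    # For each package: either absent (None) or one of its versions, newest first.
    choices = [[None, *sorted(INDEX[name], reverse=True)] for name in names]
    for combo in product(*choices):
        selection = {n: v for n, v in zip(names, combo) if v is not None}
        if all(r in selection for r in required) and is_consistent(selection):
            return selection
    return None
```

With three packages there are only 18 combinations to try, but each additional package multiplies that count, so exhaustive search quickly becomes impractical on a real channel.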

Note

There is currently only one solver available in Mamba: LibSolv. For this reason, the generic interface has not been fully completed and users need to access the submodule libmambapy.solver.libsolv for certain types.

Populating the Package Database#

The first thing needed is a Database of all the packages and their dependencies. Packages are organised in repositories, described by a RepoInfo. This serves to resolve explicit channel requirements or channel priority. As such, the database constructor takes a set of ChannelResolveParams to work with Channel data internally (see the usage section on Channels for more information).

The first way to add a repository is from a list of PackageInfo using Database.add_repo_from_packages:

import libmambapy

db = libmambapy.solver.libsolv.Database(
    libmambapy.specs.ChannelResolveParams(channel_alias="https://conda.anaconda.org")
)

repo1 = db.add_repo_from_packages(
    packages=[
        # Other PackageInfo fields are elided here.
        libmambapy.specs.PackageInfo(name="python", version="3.8"),
        libmambapy.specs.PackageInfo(name="pip", version="3.9"),
        # ... more packages
    ],
    name="myrepo",
)

The second way of loading packages is through Conda’s repository index format repodata.json, using Database.add_repo_from_repodata. This is meant as a convenient and performant alternative to the former method, since these files grow large.

repo2 = db.add_repo_from_repodata(
    path="path/to/repodata.json",
    url="https://conda.anaconda.org/conda-forge/linux-64",
)

One of the repositories can be given the special meaning of “installed repository”. The solver uses it as a reference point to compute changes. For instance, if a package is required but is already available in the installed repo, the solving result will not mention it. The function Database.set_installed_repo is used for that purpose.

db.set_installed_repo(repo1)
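The role of the installed repository as a reference point can be pictured with a plain dictionary diff. This is a minimal sketch under simplifying assumptions, not how the solver computes its result: environments are reduced to hypothetical `{name: version}` mappings, and `compute_changes` is an illustrative helper that does not exist in libmambapy.

```python
def compute_changes(installed, resolved):
    """Compare the installed environment with the resolved one.

    Both arguments are {package name: version} mappings.
    Returns (to_install, to_remove); packages that are already installed at
    the resolved version appear in neither mapping.
    """
    to_install = {n: v for n, v in resolved.items() if installed.get(n) != v}
    to_remove = {n: v for n, v in installed.items() if resolved.get(n) != v}
    return to_install, to_remove


installed = {"python": "3.8", "pip": "3.9"}
resolved = {"python": "3.8", "pip": "3.9", "numpy": "1.26"}
to_install, to_remove = compute_changes(installed, resolved)
```

Here only `numpy` is reported for installation, because `python` and `pip` are already present in the installed mapping at the resolved versions.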

Binary serialization of the database (Advanced)#

The Database repositories can be serialized in a binary format for faster reloading. To ensure the integrity and freshness of the serialized file, metadata about the packages, such as the source URL and RepodataOrigin, are stored inside the file when calling Database.native_serialize_repo. Upon reading, similar parameters are expected as input to Database.add_repo_from_native_serialization. If they mismatch, loading results in an error.

A typical workflow first tries to load a repository from such a binary cache, then quietly falls back to repodata.json on failure.
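The try-then-fall-back pattern can be sketched independently of the exact loading calls. In the sketch below, `load_with_fallback` is a hypothetical helper, and the two callables stand in for whatever wraps Database.add_repo_from_native_serialization and Database.add_repo_from_repodata in real code. It also assumes that a failed load surfaces as a Python exception.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_with_fallback(load_cache: Callable[[], T], load_repodata: Callable[[], T]) -> T:
    """Try the fast binary cache first and fall back to repodata.json on failure."""
    try:
        return load_cache()
    except Exception:
        # Assumption: any exception means the cache is missing, stale, or
        # mismatched, so the slower but authoritative source is used instead.
        return load_repodata()
```

Catching every exception keeps the fallback quiet, as described above. A stricter variant could catch only the error type raised by the loader, once that type is known.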

Creating a solving request#

All jobs that need to be resolved are added as part of a Request. This includes installing, updating, and removing packages, as well as solving customization parameters.

Request = libmambapy.solver.Request
MatchSpec = libmambapy.specs.MatchSpec

request = Request(
    jobs=[
        Request.Install(MatchSpec.parse("python>=3.9")),
        Request.Update(MatchSpec.parse("numpy")),
        Request.Remove(MatchSpec.parse("pandas"), clean_dependencies=False),
    ],
    flags=Request.Flags(
        allow_downgrade=True,
        allow_uninstall=True,
    ),
)

Solving the request#

The Request and the Database are the two input parameters needed to solve an environment. This task is achieved with the Solver.solve method.

solver = libmambapy.solver.libsolv.Solver()
outcome = solver.solve(db, request)

The outcome can be of two types: either a Solution listing packages (PackageInfo) and the action to take on them (install, remove…), or an UnSolvable type when no solution exists (because of conflicts, missing packages…).

Examine the solution#

We can test if a valid solution exists by checking the type of the outcome. The attribute Solution.actions contains the actions to take on the installed repository so that it satisfies the Request requirements.

Solution = libmambapy.solver.Solution

if isinstance(outcome, Solution):
    for action in outcome.actions:
        if isinstance(action, Solution.Upgrade):
            my_upgrade(from_pkg=action.remove, to_pkg=action.install)
        if isinstance(action, Solution.Reinstall):
            ...
        ...

Alternatively, an easy way to compute the update to the environment is to check for install and remove members, since they will populate the relevant fields for all actions:

Solution = libmambapy.solver.Solution

if isinstance(outcome, Solution):
    for action in outcome.actions:
        if hasattr(action, "install"):
            my_download_and_install(action.install)
        # WARN: Do not use `elif` since actions like `Upgrade`
        # are represented as an `install` and `remove` pair.
        if hasattr(action, "remove"):
            my_delete(action.remove)
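The effect of using two independent `if` statements can be checked with stand-in action types. The dataclasses below are hypothetical simplifications in which packages are plain strings. They are not the libmambapy classes and only mimic the `install` and `remove` members described above.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for solution actions (packages reduced to strings).
@dataclass
class Install:
    install: str


@dataclass
class Remove:
    remove: str


@dataclass
class Upgrade:
    remove: str
    install: str


def split_actions(actions):
    """Collect the packages to install and to remove from a list of actions."""
    to_install, to_remove = [], []
    for action in actions:
        if hasattr(action, "install"):
            to_install.append(action.install)
        # A second `if`, not `elif`: an Upgrade contributes to both lists.
        if hasattr(action, "remove"):
            to_remove.append(action.remove)
    return to_install, to_remove


actions = [
    Install("numpy-1.26"),
    Remove("pandas-2.0"),
    Upgrade(remove="python-3.8", install="python-3.9"),
]
to_install, to_remove = split_actions(actions)
```

With `elif`, the `python-3.8` removal that belongs to the upgrade would be skipped, leaving the old package in place.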

Understand unsolvable problems#

When a problem has no Solution, it is inherently hard to come up with an explanation. In the easiest case, a required package is missing from the Database. In the most complex, many package dependencies are incompatible without a single culprit. In this case, packages should be rebuilt with weaker requirements, or with more build variants. The UnSolvable class attempts to build an explanation.

UnSolvable.problems gives a list of problems, as defined by the solver. It is not easy to understand without linking it to specific MatchSpec and PackageInfo. The method UnSolvable.problems_graph gives a more structured graph of package dependencies and incompatibilities. This graph is the underlying mechanism used in UnSolvable.explain_problems to build a detailed unsolvability message.