[fpm] Discovery of installed fpm packages

Hello Fortran Discourse readers,

I’m starting this thread to gather your thoughts on an upcoming task: discovery of installed fpm packages. Here are the key developments we’d like to introduce:

  1. Support for config file:

    • a global config file for fpm that holds all the information for the remote registry (fpm-registry or a self-hosted fork) and the local registry (if you prefer to store your dependencies locally, e.g. during development or to bundle your dependencies with your code, you can define a directory that acts as your local registry).
    • a local config file for fpm. Users can specify a custom location for the file with a command-line option, which allows keeping all configuration in a single folder. Separate config files can also act as separate environments, similar to conda/Python environments.
  2. Discovery of packages installed:

    • discovery of installed packages from the local registry. Because multiple config files could exist (each package could have its own), there could be multiple local registries (environments). Should we search for packages in multiple local registries, or only in the project’s local registry and the default local registry ~/.local/share/fpm/dependencies? Your input on this decision would be valuable.
    • functionality to display installed packages, plus support for adding dependencies to a project directly from the CLI, similar to pip install. Your input on this decision would be valuable.
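
To make the multi-registry question concrete, here is a minimal Python sketch of discovery across several local registries. It is an illustration only, not fpm code. The directory layout `<registry>/<namespace>/<package>/<version>/fpm.toml` and the names `InstalledPackage` and `discover` are assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, Iterator, NamedTuple


class InstalledPackage(NamedTuple):
    registry: Path
    namespace: str
    name: str
    version: str


def discover(registries: Iterable[Path]) -> Iterator[InstalledPackage]:
    """Yield every package found under the given local registry directories.

    Assumed (hypothetical) layout:
        <registry>/<namespace>/<package>/<version>/fpm.toml
    """
    for registry in registries:
        if not registry.is_dir():
            continue  # a configured registry may not exist yet
        # Sort so the listing order is stable within one registry.
        for manifest in sorted(registry.glob("*/*/*/fpm.toml")):
            version_dir = manifest.parent
            yield InstalledPackage(
                registry=registry,
                namespace=version_dir.parent.parent.name,
                name=version_dir.parent.name,
                version=version_dir.name,
            )
```

Searching only the project registry and the default one would then amount to passing two paths, for example `discover([project_dir, Path.home() / ".local/share/fpm/dependencies"])`, while searching every environment just means a longer list.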

We’re keen to receive your feedback, ideas, and criticisms. Please share your thoughts on this page, and let’s work together on this task.

Thank you,
Henil

CC @certik @awvwgk @FedericoPerini @arteevraina

3 Likes

I would suggest a new sub-command for helping to maintain the registry cache(s). Perhaps something of the form

fpm registry-cache clean|get|list|search [--namespace=GPF [--package M_strings] [--release=1.0.0]] [--dir DIRNAME]

EXAMPLES

fpm registry-cache  # list current local registry cache directories being searched

# get and install into /usr/share/fpm/dependencies a registered package from current list of remote repositories 
fpm registry-cache get --namespace=GPF --package M_strings --release=1.0.0 --dir /usr/share/fpm/dependencies

# list manifest information about locally installed registry packages
#    perhaps restrict the listing with --namespace, --dir, --package, --release, supporting
#    globbing and maybe allowing --dir to be a list of directories.
fpm registry-cache list 

# show manifest information that matches requested string
# search and list functionality could perhaps be combined into a single verb
fpm registry-cache search "date"

# remove specific package 
# should defaults be all found or should globbing be required? That is, should
# the following get rid of any versions?
fpm registry-cache clean --namespace GPF --package M_strings 
# 
# should this be allowed to remove an entire namespace?
fpm registry-cache clean --namespace GPF 
# should a --quiet switch be needed or should cleaning or removing a package produce prompting?
fpm registry-cache clean --namespace GPF --package "*"
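
One way to explore the questions about defaults and globbing is a small Python sketch of the selection step. It is an illustration only, not fpm code. It assumes every omitted filter defaults to `*` (match everything), which is just one of the options being asked about; the function name `select` and the tuple layout are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

Package = Tuple[str, str, str]  # (namespace, package, release)


def select(
    packages: List[Package],
    namespace: str = "*",
    package: str = "*",
    release: str = "*",
) -> List[Package]:
    """Return the packages whose fields match the given glob patterns.

    Assumption: an omitted filter means "*", so leaving out the release
    selects every release of the matching packages.
    """
    return [
        entry
        for entry in packages
        if fnmatchcase(entry[0], namespace)
        and fnmatchcase(entry[1], package)
        and fnmatchcase(entry[2], release)
    ]
```

Under that assumption, `clean --namespace GPF --package M_strings` would select all cached releases of M_strings, and `clean --namespace GPF` would select the whole namespace, which is why a confirmation prompt or a --quiet switch may be worth having.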

On clusters with multiple users a shared registry cache should
be supported to minimize duplication of files and to allow for an
administrator to populate the cache. This is particularly necessary for
systems not connected to the www.

Multiple caches should be supported. This raises the question of
whether, when duplicates are found and a release is not specified,
the first one encountered should be used or the highest release
found should be used.

3 Likes

For inspiration you can have a look at what nuget is doing: https://learn.microsoft.com/en-us/nuget/reference/cli-reference/cli-ref-locals
In essence you have a command locals that gives you access to installed packages:

nuget locals all -list

On the disk, they are installed under c/users/xxx/.nuget/packages

2 Likes

Thanks @urbanjost and @davidpfister, these are great starting points.

I think we should provide a flag or option that lets the user choose which method to use, something like:

strategy = "first_encounter"
# Possible values: "first_encounter", "highest_release", "prompt_user"
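
A minimal Python sketch of how such a strategy setting could be applied when the same package is found in several registries. It is an illustration only, not fpm code. The name `resolve`, the tuple layout, and the assumption that releases are dot-separated integers are all hypothetical; "prompt_user" is left out because it needs interaction.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[str, str]  # (registry path, release string)


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so releases compare numerically."""
    return tuple(int(part) for part in release.split("."))


def resolve(
    candidates: List[Candidate], strategy: str = "first_encounter"
) -> Optional[Candidate]:
    """Pick one candidate; candidates are listed in registry search order."""
    if not candidates:
        return None
    if strategy == "first_encounter":
        return candidates[0]
    if strategy == "highest_release":
        # max() keeps the first of equal maxima, so ties fall back to
        # search order.
        return max(candidates, key=lambda c: parse_release(c[1]))
    raise ValueError(f"unsupported strategy: {strategy!r}")
```

Comparing parsed tuples rather than strings matters here: as plain strings, "1.9.0" would sort above "1.10.0".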

We could also set up a shared local registry by setting the cache_path to a shared path on clusters that is accessible to all users. Also, how should we support selecting a config file: an entry in fpm.toml, a flag used during the build, or both?

1 Like

  1. I think it would be beneficial to have
    a. global config file (e.g. /etc/fpm.toml)
    b. user config file (e.g. $HOME/.config/fpm.toml)
    c. project config file (but we already have this, e.g. /project/path/fpm.toml)
  2. It depends on what you mean by installed. Is it “source code downloaded” or “compiled and placed in a findable location”?

I.e.

- /install/location
  | - proj1/1.2.3
      | - src
          | - ...
  | - proj2/3.2.1
      | - src
          | - ...

Or

- /install/location
  | - proj1/1.2.3/gfortran_abcd1234
      | - lib/proj1.so
      | - include/proj1.mod
  | - proj1/2.3.4/ifx_1234abcd
      | - lib/proj1.so
      | - include/proj1.mod
  | - proj2/4.3.2/gfortran_abcd1234
      | - lib/proj2.so
      | - include/proj2{,_utils,_misc,_etc}.mod
...
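
Returning to the three config levels in item 1, here is a minimal Python sketch of how they could be layered. It is an illustration only, not fpm code. The precedence order (project overrides user, user overrides global) and the name `merge_layers` are assumptions; the thread does not settle either.

```python
import copy
from typing import Any, Dict, Iterable


def merge_layers(layers: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge config tables, lowest precedence first.

    Nested tables are merged key by key; any other value is replaced
    by the value from the later (higher precedence) layer.
    """
    merged: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_layers([merged[key], value])
            else:
                # Copy so the result never shares state with the inputs.
                merged[key] = copy.deepcopy(value)
    return merged
```

With this ordering, an administrator could set a shared registry location globally while a user or project overrides only the keys it cares about.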

3 Likes

Some quick thoughts:

Ideally this is only affecting where a registered package is obtained.

Currently, for non-registry packages the manifest file must explicitly specify the remote site or a path starting at the top of the package directory. The remote dependencies are cached only in the build/ directory. Registered packages appear to take a different approach: the fpm.toml file does not change, only the place the registered packages are obtained from.

In this case the manifest file does not indicate the source. So far, the location of the registry is hard-coded into the fpm code or set in a config file. So I would say the same method should apply as is used with the fpm(1) publish subcommand. That is, the indicator of where to search should not be contained in the fpm.toml file.

The local directories currently hold expanded files without a revision history, rather than simulating the actual repository, which distributes archive files. That makes it easy for someone to alter the packages locally, which is a drawback of the approach. It has advantages as well, but if the goal is only to cache registered packages this could be an issue. So at a minimum, write access should be removed from the files by default to discourage casually changing registered packages. Keeping the cached packages expanded, as is done now, has advantages when working with abandoned registered packages, but the better approach would be to expand them and convert them to new projects if they need to be expanded.
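
Removing write access could look roughly like the following Python sketch. It is an illustration only, not fpm code, and the name `make_read_only` is hypothetical. It clears the write bits on regular files and leaves directories alone, so it discourages casual edits but does not prevent deleting or replacing files.

```python
import os
import stat
from pathlib import Path

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(root: Path) -> int:
    """Clear the write bits on every regular file under root.

    Returns the number of files whose mode was changed.
    """
    changed = 0
    for path in root.rglob("*"):
        if path.is_symlink() or not path.is_file():
            continue  # skip directories and symlinks
        mode = stat.S_IMODE(path.stat().st_mode)
        new_mode = mode & ~WRITE_BITS
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed += 1
    return changed
```

On POSIX systems this is comparable to running `chmod a-w` on each cached file after it is unpacked.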

Looking at some of the other package managers a more common name for “get” in the prototype command is “fetch”, and the default for a command like

fpm registry-cache fetch

would be to get all remote registry packages and cache them.
The commands typically appear to be done as plug-ins and not coded into the main executable. Instead of the name “registry-cache” names like “gallery”, “cache” and “locals” are used. There seems to be no common term used. This is a local registry so “cache” seems inappropriate, as a true cache would just be in the build/ directory and/or would typically have an expiration time.

1 Like

I think the appropriate term here would be more like “mirror”. All registries that have a version of a package should agree on its contents.

Published versions of packages should never change, so which copy of that version to use should not matter.

I believe that the decision to not specify a version implies that one desires the latest version, so all available registries should be queried to determine the latest version, and that version can be obtained from whatever registry is most efficient.

Agreed. Changing the contents of a published version of a package is a terrible idea. It should be discouraged in every way possible. Placing local mirrors in hidden folders and making them read-only is pretty standard practice, I think.
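
If mirrors are expected to agree on the contents of a published version, that could be checked with a content digest. The Python sketch below is an illustration only, not fpm code, and is not something the registry is known to do; the name `tree_digest` and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest over the relative paths and contents
    of all regular files under root, in sorted order."""
    digest = hashlib.sha256()
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        name = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (name, data) boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two mirrors holding the same version of a package should then produce the same digest, and a mismatch would flag a locally altered copy.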

2 Likes

The mirror will contain the source code downloaded from the registry in unzipped form. Thanks @everythingfunctional and @urbanjost, these are great pointers and have brought great clarity on the implementation details. I will keep updating this thread with further updates and input.

2 Likes