- Conda packages are typically not compiled for performance
- Even the compiler stack is pretty flaky (since they muck around with the names)
- They vendor old versions (e.g. `gfortran` on Windows has been outdated for years now)
- Some of these sharp edges are in the other article, some are here on the NumPy docs (older)
- More on general scientific software and Python is in the packaging docs
- It is very easy to get a shared library from `conda` which expects something on your system and will subsequently not work (I think `biber` is still a good example of that)
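
A minimal sketch of how one might spot this kind of breakage on Linux, assuming `ldd` is available on the `PATH`. The helper names `missing_dependencies` and `check_library` are made up for illustration, and the parsing relies on `ldd` printing `name => not found` for unresolved libraries:

```python
import subprocess


def missing_dependencies(ldd_output: str) -> list[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        line = line.strip()
        # Unresolved entries look like: "libfoo.so.1 => not found"
        if "=>" in line and line.endswith("not found"):
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path: str) -> list[str]:
    """Run ldd on a shared library or binary and list unresolved dependencies.

    Assumes a Linux system with ldd installed (hypothetical helper).
    """
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_dependencies(result.stdout)


# Example use (the path is a placeholder, not a real file):
# print(check_library("/path/to/env/lib/libsomething.so"))
```

An empty list only means the dynamic linker can find every dependency by name. It says nothing about whether the versions it finds are compatible.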
For more on why `conda` maybe isn’t great for scientific use, see Spack (which I use often) and EasyBuild (similar enough), which are really more robust, since they allow variants and different compilation options…
Since we are now far enough away from the original topic, I should point out this is not unique to Python. CRAN (for R) has strict restrictions on (among other things) the kind of compiled code you can distribute. For this, biologists have Bioconda (also for Python I guess). However, R packages are still typically developed with CRAN in mind, because that’s the ecosystem.
Similarly, Python ↔ PyPI, regardless of the many other ways you can get some kind of Python interaction / setup.
- Here is more on `pip` and `conda`.
- This is the EasyBuild comparison page
Conda is a package manager that runs on Windows, macOS and Linux, and is very popular in the scientific community.
It focuses on quick installation of software and ease of use, and lets users create a conda environment in which they can install one or more packages. These packages are usually pre-built generic binaries however, which may significantly impact the performance of the installations.
Despite wide adoption in the scientific community `conda` is not a good fit for HPC systems for a number of reasons, including poor support for multi-user environments, a lack of focus on performance, heavily relying on the home directory (which usually is limited in size on HPC systems), and more. There is also no guarantee that it will install libraries that are compatible with the hardware of the cluster you’re working on, so the Conda-installed software may not always talk properly to the cluster interconnect or resource manager. See this link for a more detailed discussion.
In addition, software installed via `conda` usually does not mix well with software installed through environment modules.
^ Relevant parts extracted. How much this matters to you as a developer / user will differ from person to person. In my day-to-day work I need high performance. I use `nix` on HPC or compile the world via `spack` with whatever compilers best suit the machine (once I did this by hand)… Though that is far away from the average `python` user trying to `pip install scipy` to fit a spline or something. I rarely use Windows either, but that doesn’t mean I would be willing to say it isn’t of relevance to anyone because there are better alternatives…
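
If you are unsure whether the home-directory concern from the excerpt above applies to your own setup, here is a rough sketch of a check. It assumes that a conda environment can be recognised by the `conda-meta` directory inside its prefix. The function names `is_conda_prefix` and `lives_under` are hypothetical helpers, not part of any conda API:

```python
import sys
from pathlib import Path


def is_conda_prefix(prefix: Path) -> bool:
    """Conda environments keep package metadata in a 'conda-meta' directory."""
    return (prefix / "conda-meta").is_dir()


def lives_under(prefix: Path, parent: Path) -> bool:
    """True if prefix is located inside parent (after resolving symlinks)."""
    try:
        prefix.resolve().relative_to(parent.resolve())
        return True
    except ValueError:
        # relative_to raises ValueError when prefix is not below parent
        return False


# Example use for the running interpreter:
# prefix = Path(sys.prefix)
# print("conda environment:", is_conda_prefix(prefix))
# print("under home directory:", lives_under(prefix, Path.home()))
```

On a cluster with a small home quota, an environment that reports `True` for both is the case the EasyBuild docs warn about.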