Node packaging is fucked. Node packaging remains fucked. And we have fucked it. How shall we comfort ourselves, the makers of all unmaintainable spaghetti? What was webscale and most utilitarian of all that the computers have yet executed has ground to a halt under our keyboards: who will wipe this blood off us?
Node packaging is fucked. Node packaging remains fucked.
I am sorry, but as a newbie user of npm I don’t understand. It works pretty well for me if you use it normally, for what it is meant for.
If used in larger systems it can be a pain to maintain code bases as you could install an innocuous package but that package may depend on 100 other packages which in turn could have other dependencies and it cascades.
This can introduce bugs into your code which can be a pain to resolve.
Isn’t this a problem with every package/library system? Is there really a solution to this that doesn’t limit packages with how they handle their dependencies?
This may also be about trust. npm could probably impose an arbitrary cap on the number of dependencies a single package can have, but they don’t, because they trust developers not to misuse their options. Well…
That’s a good question and I’m not sure, to be honest.
We use NPM at work client-side for React/TypeScript and NuGet server-side for C# .NET, and all I know is that the senior always complains about NPM but not NuGet. I do believe the backend of our applications is less package-reliant, so maybe that’s why it’s not as bad.
I’m curious if you mean this one issue talked about in the article is the only reason why node packaging is “fucked” or do you have any citations you can provide that point out other issues with it?
I feel this is just a natural progression of how the developers wanted it to function and this is an opportunity to resolve it.
Better that this is done by mistake and resolved than it being used in a malicious attack.
It’s the cascading nature of the dependencies. You could install a single package that might directly or indirectly depend on 100’s of other packages, which can introduce bugs into existing code bases which can be difficult to fix as you have no control over another library or dependency.
We’ve since realized there is an issue with “star” versions, a.k.a. depending on any/all versions of another package ( “package-xyz”: “*” ): any version of that package is now unable to be unpublished.
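To make the “star” problem concrete, here is a toy model of the rule as described in this thread: a version cannot be unpublished while something depends on it, and a `*` range counts as depending on every version. The `Dependent` type, the `canUnpublish` function and the package name are made up for illustration; this is not npm's actual policy code.

```typescript
// Toy model only: a dependent declares either an exact version or "*".
interface Dependent {
  name: string;  // hypothetical package name
  range: string; // exact version such as "1.0.0", or "*" for any version
}

// A version is blocked from unpublishing if any dependent's range matches it.
// "*" matches every version, so a single star dependent blocks them all.
function canUnpublish(version: string, dependents: Dependent[]): boolean {
  return !dependents.some((d) => d.range === "*" || d.range === version);
}

const exactOnly: Dependent[] = [{ name: "some-dependent", range: "2.0.0" }];
const withStar: Dependent[] = [...exactOnly, { name: "star-dependent", range: "*" }];

console.log(canUnpublish("1.0.0", exactOnly)); // nothing pins 1.0.0
console.log(canUnpublish("1.0.0", withStar));  // the star dependent matches 1.0.0
```

Under this model, one package that star-depends on everything in the registry is enough to block every unpublish.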
kinda reminds me of the ‘reply all’ snafu that microsoft caused themselves with early exchange server, the complete system failures, and the subsequent attempts at controlling that feature
This is hilarious, but now I’m wondering, what would a saner package manager look like?
- Like Python, have a large and featureful standard library such that > 80% of NPM packages are redundant. Other languages allow you to make very large projects with only a few tens of dependencies. JavaScript requires THOUSANDS.
- With this in place, stop with the recursive dependencies, immediately and forever. Every other package manager under the sun installs the dependencies next to each other.
I’d say `pip` is saner, though not by much as its support for private registries is very bad and seems designed to facilitate supply-chain attacks. I’ve heard a lot of good things about `cargo` but haven’t used it enough myself to have a strong opinion.
The lack of a standard library is really the worst offender. Most of a given node_modules directory is filled with middleware to handle JS’s lack of everything.
Is that still a valid argument in 2024? The standard library has grown since the leftpad scandal. JS does have standard leftpad now.
It’s a genuine question, I no longer write Javascript for a living.
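For reference, the built-in in question is `String.prototype.padStart`, standardized in ES2017. A minimal sketch comparing it with a hand-rolled helper; `leftPad` below is a hypothetical stand-in written for this comparison, not the source of the left-pad package, and it assumes a single-character fill:

```typescript
// Hypothetical hand-rolled helper, for comparison only.
// Assumes `fill` is a single character.
function leftPad(input: string, length: number, fill: string = " "): string {
  if (fill.length === 0) return input; // nothing to pad with
  let out = input;
  // Prepend the fill character until the target length is reached.
  while (out.length < length) {
    out = fill + out;
  }
  return out;
}

// Built-in equivalent, available since ES2017.
const viaBuiltin = "42".padStart(5, "0");
const viaHelper = leftPad("42", 5, "0");

console.log(viaBuiltin, viaHelper);
```

Both calls produce the same padded string, so for this case the dependency is redundant on any ES2017-capable runtime.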
Compared to other languages it’s still very barebones – but admittedly some of the bloat is also because the JS world is kinda set in its ways. I still see people use jQuery for basic selector queries and SASS for basic CSS variables.
Another factor is that developers these days assume that users have fast unmetered connections. Loading 800 kB of minified gzipped JS from ten different domains is seen as no big deal. When the cost of adding piles of dependencies is considered nil there’s no impetus to avoid them.
That last point truly bothers me, too. It’s fine to have a bloated work environment (some people use Visual Studio, after all). But that complexity should not get offloaded to your users. Webdevs need to do better on this front, it’s not 2015 anymore.
Pip is definitely not saner. The way installs are centralized has bitten me in the ass multiple times, when I wanted to have two different versions of Conan installed on a single machine.
And I know there are workarounds like virtualenvs, but they’re complex hacks. Stockholm syndrome yadda yadda yadda.
If it was sane, downloads would be centralized (no point in downloading the same package over and over again) but installs would be project-local (symlinks? There are multiple ways to do this, cf Conan)
Sure, NPM is wasteful with storage space but I’ll take inefficient over brittle any day.
It’s saner, not perfect. With virtualenvs it does basically what you describe except that it re-downloads everything for every virtualenv, but that does not typically matter much since it’s not downloading a billion dependencies.
With NPM there’s no choice but to have hundreds of duplicates installed for every project, that’s not just inefficient but it is a security, maintainability, and auditability nightmare.
NPM is definitely saner for that use case because it works out of the box. Pip is not because it is based on shakier foundations. With NPM, you don’t get to a point where you rely on things to work correctly, and they suddenly don’t and you have trouble understanding why. And it does not force me to look at its nuts and bolts to allow me to work with it.
I can afford big node_modules directories, even if it’s not optimal. It’s still small compared to the cruft I’ve accumulated on other projects I’ve worked on with other technologies. Remember the order of priority of things: make it work >> make it efficient. Software engineering is about delivering software, it’s not an art. It doesn’t have to be pretty everywhere.
I will concede that NPM is not perfect. Despite its flaws, I love how Conan solves the issue we’re talking about.
The standard library thing is a really valid point, but how do you avoid recursive dependencies? Do you just not allow library packages to depend on anything?
`pip` is saner
Is it? It is very bare bones in my experience. I could never bring myself to use it until they make it a more fully fledged tool, such as the cargo you mentioned, yes.
Other package managers, like nuget, throw errors if all dependencies on a package cannot be met by a single version.
This is probably the result of it copying all libraries in the same output directory and that .net cannot load 2 different versions of the same library so more an application restriction.
The downside of this is that packages often can’t use newer features if they don’t want to block the users of that library, and that utility libraries have to have high backwards compatibility so applications can use the latest version while dependent libraries target an older version. Often applications keep using older versions with known security issues.
Damn, sounds like a big headache x.x
`npm` downloads every dependency recursively. If `a` depends on `d (= 1.2.3)` and `b` depends on `d (= 1.2.4)`, then both versions of `d` get downloaded into `a` and `b`’s respective `node_modules`.
All other package managers I’m aware of resolve dependencies into a flat list then download, and you can only have one version of the same package on your system.
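A rough sketch of the two strategies described above, using the same `a`, `b` and `d` example. Everything here is a toy model: the `Manifest` type, `nestedInstall` and `flatResolve` are invented for illustration, dependencies are pinned to exact versions instead of ranges, and real package managers (including current npm, which deduplicates where it can) are considerably more involved.

```typescript
// Toy model: each package pins exact versions of its direct dependencies.
type Manifest = Record<string, Record<string, string>>;

// Hypothetical example data mirroring the comment above.
const manifest: Manifest = {
  a: { d: "1.2.3" },
  b: { d: "1.2.4" },
};

// Nested strategy: every package gets its own copy of each dependency.
function nestedInstall(m: Manifest): string[] {
  const paths: string[] = [];
  for (const [pkg, deps] of Object.entries(m)) {
    for (const [dep, version] of Object.entries(deps)) {
      paths.push(`node_modules/${pkg}/node_modules/${dep}@${version}`);
    }
  }
  return paths;
}

// Flat strategy: one version per package name, error on disagreement.
function flatResolve(m: Manifest): Map<string, string> {
  const chosen = new Map<string, string>();
  for (const [pkg, deps] of Object.entries(m)) {
    for (const [dep, version] of Object.entries(deps)) {
      const existing = chosen.get(dep);
      if (existing !== undefined && existing !== version) {
        throw new Error(`conflict on ${dep}: ${existing} vs ${version} (needed by ${pkg})`);
      }
      chosen.set(dep, version);
    }
  }
  return chosen;
}

const nested = nestedInstall(manifest); // two separate copies of d
let flatError = "";
try {
  flatResolve(manifest);
} catch (e) {
  flatError = (e as Error).message; // the flat resolver refuses the conflict
}
console.log(nested, flatError);
```

The nested version always succeeds but stores two copies of `d`; the flat version forces someone to reconcile the two pins before anything gets installed.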
You mean npm duplicates even if the two dependency versions are compatible?
you can only have one version of the same package on your system.
That couldn’t be, right? Otherwise, if you installed two packages that rely on different incompatible versions of another package, one of the two would break. Reading a bit they should check for “satisfiability”, I found some really interesting things on the topic looking around:
- pattern to install dependencies in Linux
- history of libsolv (used in e.g. DNF) (kinda flies over my head, but that’s where the term “satisfiability” came up)
You mean npm duplicates even if the two dependency versions are compatible?
By default yes, unless you explicitly use the “peer dependency” system, which isn’t the default. The “default” naive implementation is for every package in your `node_modules` to have a `node_modules` of its own, all the way down recursively. There are tricks nowadays to deduplicate packages with the exact same version, but not to automatically detect “compatible” versions and use those instead (in my experience nothing would work if that was the case, deleting package-lock.json causes way too many issues due to the… uh, let’s call it “brave” approach of JS devs to stability).
That couldn’t be, right? Otherwise, if you installed two packages that rely on different incompatible versions of another package, one of the two would break
Correct. This is intended behavior which is solved in several ways:
- Correctly declaring your dependencies. If newer versions of a dependency break your package, disallow them, but that is not normally needed for minor version changes.
- Focus on quality. Semver exists for a reason, and `1.2.3` should not break something built against `1.1.2`. JS and NPM’s cascade of stupid implementations bred a culture of “move fast and break things”, but that’s not the norm in any other commonly used ecosystem. Linux distros almost exclusively use curated repositories, so they are (mostly) internally consistent and incompatibilities are rare and quickly fixed. A good package manager will resolve dependencies and automatically detect incompatibilities, proposing several fixes (typically abort the upgrade or uninstall one of the problematic packages).
- Not breaking down packages into a constellation of smaller packages. `glibc6` is `glibc6`, not `glibc_string (1.2.3)` + `glibc_memory (2.6.5)` + `glibc_fs (1.5.3)` + `glibc_stdio (1.9.2)` + `glibc_threads (6.1.0)` + … Internally `glibc6` is a bunch of modules, but they get bundled into one package specifically to simplify dependency management.
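The semver expectation from the second point can be sketched as a small check: a release is treated as a safe drop-in if it has the same major version and is not older than the one you built against. This is a hand-rolled illustration, not the `semver` package's API; `parse` and `isCompatible` are hypothetical names, and the sketch ignores pre-release tags and the special rules for `0.x` versions.

```typescript
type Version = [number, number, number];

// Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build metadata).
function parse(v: string): Version {
  const parts = v.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`not a MAJOR.MINOR.PATCH version: ${v}`);
  }
  return [parts[0], parts[1], parts[2]];
}

// True if `installed` should be a drop-in for code built against `builtAgainst`:
// same major version, and not older in minor/patch.
function isCompatible(builtAgainst: string, installed: string): boolean {
  const [bMaj, bMin, bPat] = parse(builtAgainst);
  const [iMaj, iMin, iPat] = parse(installed);
  if (iMaj !== bMaj) return false;        // major bump: breaking changes allowed
  if (iMin !== bMin) return iMin > bMin;  // newer minor only adds features
  return iPat >= bPat;                    // same minor: newer patch only fixes bugs
}

console.log(isCompatible("1.1.2", "1.2.3"));
console.log(isCompatible("1.2.3", "2.0.0"));
```

A resolver that can trust this check is free to pick one shared version per package; the complaint above is that in practice the JS ecosystem often cannot trust it.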
Not being able to install two versions of the same package sounds restrictive, but it’s a HUGE security benefit: if `glibc6 (1.2.3)` is vulnerable to CVE-2024-1, then updating to `glibc6 (1.2.4)` secures your entire system at once. With NPM though, you have to either wait for every. single. dependency on that vulnerable package down your tree to recursively update, or patch those versions yourself (at your own risk because again, small version changes often break things since developers think that NPM’s dependency model means they don’t have to actually provide stability guarantees).
Wow, awesome explanation! I think I understand now
IDK any full-time JS or Node developers but they seem like they’re lazy and all have ADD. Packages developed for years still on version 0.x, packages depending on deprecated packages that were replaced by core functionality, packages still using CommonJS format (which I actually like better unfortunately), and popular packages without an update for 3 years. It feels like the entire ecosystem is for hobbyists only and businesses are like, “Cute language, but not for us.”
Ryan Dahl, the creator of Node, literally saw the npm problem(s) before incidents like this happened, and created Deno to fix his mistakes. And fix them he did! The Deno import system is incredible. It’s basically the only reason I use Deno. You can just import URLs directly, and Deno vendors (aka caches) them. Deno has an equivalent to npm.org (Deno.land/x), but anyone can import straight from GitHub, make their own npm.org equivalent, or import from their own private server. So if a company wants reliability, they can mirror deno.land while also avoiding unpublishing.
Yes, that’s really nice! Even though I haven’t touched it in a long time, I remember messing around with it as soon as it came out a few years ago. There’s also nest.land among the alternative repositories; I find their concept interesting.
have a look at nix
This situation is due to npm’s policy shift following the infamous “left-pad” incident in 2016, where a popular package left-pad was removed, grinding development to a halt across much of the developer world. In response, npm tightened its rules around unpublishing, specifically preventing the unpublishing of any package that is used by another package.
This already seems like a pretty strange approach, and takes away agency from package maintainers. What if you accidentally published something you want to remove…? It kind of turns npm into a very centralized system.
If they don’t want to allow hard-removals because of this, why not let people unpublish packages into a soft/hidden state instead? Maybe mark them with the current dependencies, but don’t allow new ones - or something
I prefer the approach of Azure DevOps more. When you publish any nuget, or npm into their system, the entire package dependency tree is pulled in and backed up there. So you don’t rely on NPM anymore to keep your referenced packages safe
crates.io (Cargo’s registry) also has no option to unpublish a package. There is, however, the option to yank a version, which excludes it from automated dependency resolution in new projects. Projects that already have that version pinned (for example in their Cargo.lock) can still use it.
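The yank idea can be sketched as follows. This is a simplified model of the behavior described in the comment, not Cargo's resolver: the `Release` type, the `pick` function and the sample index are all hypothetical, and releases are assumed to be listed oldest to newest.

```typescript
// Simplified model: a release stays in the index after a yank, it is only flagged.
interface Release {
  version: string;
  yanked: boolean;
}

// Hypothetical index data, listed oldest to newest.
const index: Release[] = [
  { version: "1.0.0", yanked: false },
  { version: "1.0.1", yanked: true },
];

// Fresh resolution skips yanked releases and takes the newest remaining one.
// An already pinned version is honored as long as it exists, yanked or not.
function pick(releases: Release[], pinned?: string): string | undefined {
  if (pinned !== undefined) {
    return releases.some((r) => r.version === pinned) ? pinned : undefined;
  }
  const candidates = releases.filter((r) => !r.yanked);
  return candidates.length > 0 ? candidates[candidates.length - 1].version : undefined;
}

console.log(pick(index));          // new project: the yanked 1.0.1 is skipped
console.log(pick(index, "1.0.1")); // existing pin: still resolves
```

The point of the design is that nothing ever disappears from under an existing build, while new projects are steered away from the bad release.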
I feel like you could also give the maintainers the power to “re-publish” using a different verified maintainer so that if such a thing does happen, it can be reversed without input from the maintainer that originally pulled it. I don’t know enough about the system to really know if this is a good idea tho.
Yeah then you’ve got security problems. If a maintainer pulls a package, you wouldn’t want some rando able to push a new one in its place.
What I want to know is how big would my node_modules be if I’d managed to install this?
Fastest answer I could find was from 2021: 285GB.
So make sure you have over half a terabyte of free storage before you try this. These libraries can do things on install, like download and compile binaries. Then there’s overhead for inodes and such, since we’re talking about millions of files. So the impact to the filesystem is going to be much, much bigger than any figure cited like the above.
Amazing, thank you!
yes
I see this as a delightful and very apropos revenge against how they treated Azer Koçulu for his use of kik as a package name, and how they tore it out of his hands simply because he wasn’t some million-dollar corp with an army of vampire lawyers.
Any truly fair system would have a “you used it legitimately first, you can have it” system.
You want to centralize control, and kick small devs in the teeth? Enjoy the fallout, f**kers.
Any site that uses AI generated images for the thumbnail can fuck off, I’d rather see nothing
I know it’s fun to mock `npm`, but is any package registry secure from something like this? Is there any public package registry that reviews all its packages?
It’s less of an issue of reviewing all packages than it is that this causes DOS in the first place. It’s pretty damn stupid that you can’t unpublish packages others depend on, and the whole recursive dependencies thing makes the situation a lot worse than it otherwise would be. Neither of these are issues with other package registries.
One problem that’s particular to node is that you can’t unpublish packages if another package depends on them. As it says in the article, that means that no one can unpublish their packages, including the everyone package since someone apparently depends on that.