Package Management

We all use software written by other people. Whether it is libraries that we statically link, like in the old days of C, packages we import, like in most modern frameworks, or even external microservices, no developer stands alone. While often shown as a solitary, isolating job in the old “hacker” type movies, software development has since become an environment requiring lots of collaboration. This collaboration still occurs, even if you don’t know (or communicate with) the person whose code you are relying upon. While this collaboration has allowed us to build wonderful things, it also comes with a price. As a friend describes it, hell is … other people’s code. You might be able to trust someone else’s code and you might not, but you can never trust the notion that someone else’s code is trustworthy for any given purpose.

There are lots of issues with bringing in dependencies to a codebase. The code might create security issues, performance problems, break your app in subtle and hard-to-detect ways, or even trap you in an ancient version of your framework of choice. While it may be cheaper and easier to import a software package to help with whatever you are doing, adding a dependency is not a completely free operation – it requires maintenance and verification of the code you are bringing in. At the very least, even if the current version is perfect (unlikely), subsequent versions might still cause problems.

In essence, as in economics, there is no free lunch. Anything you do has a cost, and that includes bringing in third party code to avoid doing certain work yourself. If you are going to use third party code (and you certainly are if you are going to get anything done), then you need to make sure that your development processes account for the unique risks and opportunities that such code brings. Done properly, these practices will not only insulate you from a lot of the risk of using other people’s code in your applications, but can also improve the overall quality of your code base.

Using third party code improves productivity and is often necessary in order to get things done in modern software development. However, doing so is not a zero cost act. Third party code can and will cause you problems in your development, testing, and production environments alike. These problems can range from small annoyances to system failures, to security holes (and intentionally introduced vulnerabilities) that can utterly destroy the quality of your system. If you are going to use third party code, you MUST take steps to make your implementation as stable and safe as possible, while still allowing you to update at a reasonable cadence. It takes practice and discipline to do this properly, but you need to do it.

Episode Breakdown

Be able to detect out of date packages

While your package manager will likely tell you that one or more packages are out of date, it’s unwise to trust your team to update packages as part of their normal course of work. It balloons the testing scope, takes time, and is easily missed. This probably means that you will want some sort of alerting that lets you know when packages are out of date, possibly even stopping builds and deployments if something is out of date “enough”.
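
As a rough illustration of what such an alerting gate could look like, the sketch below compares installed versions against the latest known versions and separates "warn" from "block the build". It assumes plain MAJOR.MINOR.PATCH version strings and that both version maps have already been collected from your package manager; the function names and the `block_on` parameter are hypothetical, not part of any real tool.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (pre-release tags are not handled)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_lag(installed: str, latest: str) -> str:
    """Return 'current', 'patch', 'minor', or 'major' for how far behind we are."""
    have, want = parse_version(installed), parse_version(latest)
    if have >= want:
        return "current"
    if have[0] < want[0]:
        return "major"
    if have[1] < want[1]:
        return "minor"
    return "patch"


def check_packages(
    installed: Dict[str, str],
    latest: Dict[str, str],
    block_on: Tuple[str, ...] = ("major",),  # hypothetical policy knob
) -> Tuple[List[str], List[str]]:
    """Split out-of-date packages into warnings and build-blocking failures."""
    warnings: List[str] = []
    failures: List[str] = []
    for name, version in sorted(installed.items()):
        # A package with no known latest version is treated as current.
        lag = classify_lag(version, latest.get(name, version))
        if lag == "current":
            continue
        message = f"{name}: {version} -> {latest[name]} ({lag} behind)"
        (failures if lag in block_on else warnings).append(message)
    return warnings, failures
```

A build step could print the warnings and exit with a non-zero status when the failures list is not empty. What counts as out of date "enough" is a team decision; blocking only on major versions is just one possible policy.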

Be aware that not all packages need to be updated immediately, and that people may not necessarily install the latest version immediately. You’ll need to stay on top of this to avoid ending up with wildly out of date packages being used by your codebase. You should also be aware that some package updates change things like licensing, interfaces and the like. You may not be able to upgrade immediately (or ever). This means that you’ll need to track the status of the various packages you are using. Make sure you are on the mailing and notification lists for critical packages, so that you can find out quickly if a major vulnerability surfaces.

Be able to determine surface area of changed packages

When a package changes, you need to know what part of the system is impacted. If you don’t, you are going to have to test for regressions across the board. This also has security implications. If an update has occurred, you need to know which system boundaries need to be re-tested before rolling the code to any publicly accessible location. Note that this also includes development and staging environments if those are exposed to the open internet.

Be especially careful when a package is communicating across a security boundary. In addition to the obvious security issues, such updates can result in compatibility issues that are difficult to track, such as changes in things like JSON serialization. Also pay attention to any packages that reference an updating package. If package A depends on package B, then the impacted surface area of a change to package B is the union of the impacted areas of both packages. Don’t assume that the package authors got this right.
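
One way to reason about which packages reference an updating package is to walk the dependency graph backwards. The sketch below assumes you can export a map of each package to its direct dependencies (most package managers can produce something like this); the `impacted_by` function and the graph format are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def impacted_by(changed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the changed package plus everything that depends on it,
    directly or transitively."""
    # Invert the graph: for each package, record who references it.
    dependents: Dict[str, Set[str]] = {}
    for package, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(package)

    # Breadth-first walk from the changed package up through its dependents.
    impacted = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted
```

The resulting set is only a starting point: it tells you which packages to look at, and you still have to map those packages to the parts of your own system (and the boundaries) that use them.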

Treat package updates as a regular, recurring, and separate ticket from other work

If package updates are expected as part of a normal course of development, they either don’t get done, or things are constantly breaking because of updates. Eventually the latter case causes everyone to be reluctant to update anything. Package updates, especially major updates, security updates, or updates that touch a system boundary, should be completely separate from other work. Not only does this make it easier to reason about what is wrong with an update if something happens, but it also decouples it from everything else that is going on.

Package updates should probably be batched (because they are annoying) in a single ticket. However, you may find that it is easier to reason about what broke when you only update a single package before running a full suite of tests. This is why it is important to know what a change is actually going to impact – otherwise you are going to be waiting forever for test runs to complete. Package updates should occur on a fairly regular schedule, probably triggered by a calendar reminder. You don’t want to try and remember to do this yourself, as you will likely forget for far too long.

Distrust and verify

In addition to unintentional vulnerabilities, breaking changes, and performance problems, in recent years there has been an increasing number of packages released where security issues have been purposely introduced. You can’t necessarily trust a package vendor, even if their updates are delivered by a reputable package management system. You should always be on the alert for changes to a package that indicate that a new party has taken over maintenance, that the package now does more stuff (especially if that “stuff” includes new features that require additional permissions), or large changes in general.

You should also be verifying cryptographic signatures on any package updates, to make sure that they came from a trusted party. If a cryptographic signature is invalid, the code should not be released, nor should it be run on development machines (because it may compromise those as well). When bringing in a library, you should always consider what a malicious author could do with it. This means that you need to have least privilege execution of your code and that you need to also have monitoring in place for unusual behavior from your application. In light of this, fewer dependencies is better.
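
Full signature verification depends on the tooling your package ecosystem provides, so it is not shown here. As a simpler stand-in, the sketch below checks a downloaded artifact against a SHA-256 digest that you pinned when you last reviewed the package. This is integrity checking, not signature verification: it only detects that the bytes changed, not who produced them. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_digest: str) -> bool:
    """True only if the artifact hashes to the digest pinned at review time."""
    # compare_digest avoids leaking where the two strings differ.
    return hmac.compare_digest(sha256_hex(data), pinned_digest.lower())
```

If the check fails, the safe default is the one described above: do not release the code, and do not install or run it on development machines either.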

Keep a minimal number of dependencies

You shouldn’t be adding new packages without carefully considering what they will do to your application. Period. While a dependency can reduce the amount of code you have to write yourself, it is not a maintenance-free alternative. In general, your development team should be made aware of any dependencies that are added to a project and you should be able to clearly describe a limited scope of what they do for you. This scope should be similar to what the package does as a whole.

In general, you shouldn’t add large packages with a huge amount of functionality in order to obtain only a small slice of that functionality. Doing so tends to mean that you either spend a lot of time trying to figure out whether a change impacts you, or you find out the hard way that something is a problem that you hadn’t considered. Limiting the number of packages you depend upon is also a good idea because it makes it easier to keep track of what dependencies should be updated and when. You likely will have some things that you can’t just update immediately, and this list quickly gets unmanageable as the number of packages grows.

Watch out for less obvious dependencies

While it sounds obvious now, it’s very easy to miss certain categories of system dependencies. Two common ones are dependencies of testing and development tools and dependencies of your build pipeline. Developer tools that are not ever deployed on a production server are easy to miss in terms of problems and vulnerabilities. While they aren’t likely to show up in an audit, they can cause compatibility and security problems all on their own.

Another similar category is tools that are used for your build and deploy pipeline. It’s easy to forget about these tools, especially for team members that don’t like dealing with them in the first place. They often get deployed to a server and forgotten. Ask yourself how many of your developers are running on an old version of npm (or node itself) and then realize that the worst case is probably worse than you guess and the situation on your build pipeline is probably worse than that.

Monitor for “stale” packages

In addition to dealing with package updates, you need to watch for packages that have no activity. While it is entirely possible for a package to be “done” in the sense that updates are no longer needed, it’s more likely that a project is simply abandoned. While this may have no security or usability implications in the short term, this is unlikely to be true in the long term. Consequently, you also need to track when the most recent package version updates occurred – if none have happened in a while, you need to look at replacing the package with something newer.

Really old packages can be very stable, but you may find that they limit your ability to update other parts of your system over time. A “stale” package can also indicate that a package was renamed, rebranded, etc., so it doesn’t necessarily mean that you have to replace it. You do have to be aware of it, however.
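
Tracking this can be as simple as recording the date of each package’s most recent release and flagging anything older than a threshold. The sketch below assumes you already have those dates from your package registry; the `find_stale` function and the two-year default are hypothetical choices, not a recommendation from any particular tool.

```python
from datetime import date, timedelta
from typing import Dict, List


def find_stale(
    last_release: Dict[str, date],
    today: date,
    max_age_days: int = 730,  # assumed threshold: roughly two years
) -> List[str]:
    """Return the names of packages whose latest release is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, released in last_release.items() if released < cutoff)
```

A flagged package is a prompt to investigate rather than a verdict: it may be finished, renamed, or abandoned, and only the last case calls for a replacement.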

Tricks of the Trade

With package management you want to distrust and verify. With people, however, especially those who are in charge, you want to do the opposite: trust but verify. You gain trust by trusting, but you don’t want to follow blindly. Trust the person, but also verify, as they may be wrong; not intentionally, just mistaken.
