Open Source: Theory of Operation

This is a short, practical guide to open source software for programmers at work. If you code on the job and wonder how open source works, why you can use some kinds, but not others, and what you have to do to get away with it, this guide’s for you. Feedback is welcome at github.com/kemitchell/oss.kemitchell.com.

Spoiler Alert: If your company or contract client has an open source policy or approval process, do what their policy says to do. Read the policy—I’m serious—and follow it. If you’re new to this stuff, you will probably feel better equipped to understand, talk about, and perhaps improve that policy after reading through this guide. The policy and the people who administer it might even start to make sense.

Either way, know going in that this guide abstracts away some details, exceptions, and edge cases to make the whole package more accessible. Fortunately, good sources of detail on open source licensing and compliance are easy to find. Almost all the good stuff is on the Internet. Buried in mounds of ill-considered histrionics, sure, but it’s there. You can learn to tell the difference.

Do not extrapolate from the breezy generalizations you find here to specific, business-critical decisions without further research. In other words, don’t make this guide the last source you read before taking a big decision about open source code. Educate yourself further, look at what others have done and got away with, and if at all possible, talk to your company’s lawyers and compliance pros. Doing all of that should be easier once you’ve read this guide.

In Brief

  1. Open source software licenses override legal defaults to make giving away and using software safe and easy.

  2. To enjoy that convenience as a user, you have to follow a few license requirements, like attributing open source work to those who licensed it.

  3. Copyleft licenses have more requirements. Usually you have to give users access to source code and use similar license terms for work that builds on the copyleft code.

  4. Some open source licenses take permission away from anyone who uses patents to sue other users or contributors.

  5. Contributor license agreements, assignments, and developer certificates create records showing the legal defaults get overridden by all those who have intellectual property in a project.

Operating Environment

Under current law, software never starts out open source. People who own rights in software turn it into open source software with legal tools called open source licenses. Open source licenses override legal defaults that fit the traditional, proprietary model of software developed in private and licensed to customers for a fee under negotiated contract terms. Open source licenses replace those defaults with written terms that make publishing, distributing, and using open source software relatively simple and low-risk.

When open source licenses work correctly, users and distributors, like your company, needn’t worry about the proprietary-style legal defaults. But you can’t understand how open source licenses work, or decide whether a license overrides all the defaults your company should worry about, if you don’t understand what the default rules for software are in the first place.

Defaults and Overrides

In summary, the most important legal defaults and open source overrides are:

Law       Area        Default Legal Rule                        Open Source Override
IP        Copyright   Owners can sue for copying work.          Owners license to make copies.
IP        Copyright   Owners can sue for distributing work.     Owners license to distribute.
IP        Copyright   Owners can sue for changing work.         Owners license to make changes.
IP        Patent      Owners can sue for using inventions.      Owners license to use.
IP        Patent      Owners can sue for making inventions.     Owners license to make.
IP        Patent      Owners can sue for selling inventions.    Owners license to sell.
IP        Patent      Owners can sue for importing inventions.  Owners license to import.
Contract  Warranties  Sellers guarantee merchantability.        Owners disclaim merchantability.
Contract  Warranties  Sellers guarantee fitness for purpose.    Owners disclaim fitness for purpose.
Contract  Warranties  Sellers guarantee noninfringement.        Owners disclaim noninfringement.
Contract  Liability   Buyers can sue for foreseeable damages.   Terms limit liability to zero.
Contract  Liability   Buyers can sue in contract, tort, &c.     Terms exclude all claims.
Contract  Formation   Each contributor licenses their part.     Same terms, from all contributors.

Not all open source licenses override every default listed here. Some open source licenses do a better job of overriding specific defaults than others. Many open source licenses address more legal rules, either to override defaults or make clarifications.

Intellectual Property Law

Intellectual property laws give monopolies over many kinds of techniques, ideas, symbols, and products of thinking work. The strongest core idea behind these laws—the primary policy behind them—is to reward those risking and investing with the right to control, and charge for, the benefits of their results.

Using intellectual property without permission is infringement. The law allows intellectual property owners to sue infringers for court orders to stop infringement (injunctions), for orders to pay money to the owners (damages), or both. Owners can sell permission, in the form of licenses, to do what would otherwise count as infringement. Owners can also sell or assign intellectual property to others, making them the new owners.

The key intellectual property laws for open source are copyright and patent laws.

Copyright

Copyright law rewards authors of creative works with rights to stop others from making copies of their work, or using it as a starting point for new work. The archetype of a copyright owner is the novelist who sells a publisher exclusive license to print and sell their work as a book.

If code, documentation, configuration, or data—almost anything you can store in files—is even a tiny bit creative, copyright applies automatically, from the moment it’s typed out. There’s no requirement to file with the government or pay any fees, at least until you want to sue for infringement. Since copyright damages for even small infringement can be very large, and making sure a piece of work utterly lacks legal “creativity” is hard and takes legal expertise, most savvy companies and savvy open source projects treat every bit of software and documentation they receive as though someone owns copyright in it.

Lawyers read nearly all common open source licenses to grant everyone a copyright license. By default, nobody has a license, and the copyright owner can sue them if they infringe. Giving everyone a license—a public license—reverses the default, so the owner can’t sue anyone, so long as they meet all the license conditions.

Professionally drafted license agreements make clear when a license is a copyright license by using the word “copyright”. For example, the copyright license in section 2 of the Apache License, Version 2.0 reads:

Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a … copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.

This is the style most professional licensing lawyers use and expect. Actually, it’s pretty modern by lawyer standards. It’s the style of better proprietary software licenses.

Many open source licenses use a less structured style. These licenses give “permission” or say that users “may” do things with the software that would otherwise infringe copyright, without saying “copyright”. For example, the first paragraph of The MIT License begins:

Permission is hereby granted … to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software…

Without this language, copying, modifying, publishing, or distributing would infringe any copyrighted work in the software. Using and selling, on the other hand, might infringe patents.

Patent

Patent law rewards inventors who publish inventions through the government with twenty-year rights to stop others from making, using, selling, or importing goods and services that implement them. The archetype of a patent owner invents a better mousetrap in their garage, secures a patent, sues a company that manufactures a similar trap without permission, and settles the lawsuit for a big check and a hefty royalty going forward.

Teaching a computer to do something useful, instead of a person or some other machine, does not, in itself, entitle the programmer to a patent. But new inventions that happen to be implemented in software, and new inventions in the art of making software, can be patented. Algorithms, data structures, protocols, and other methods of achieving useful results with software have all been covered by patents. It used to be easier to patent ideas in software, and the government issued a lot of bad patents. It’s harder now, but still possible.

Unlike copyright, patents don’t happen automatically. To receive patent rights, an inventor must apply for a patent, and the United States Patent and Trademark Office must grant it. The application process is technical, complex, and takes years, even for unsuccessful applications. It’s possible to apply for and receive a patent without professional help, but well-practiced experts, the patent attorneys and patent agents registered with the Patent Office, speak the language, know the rules, and markedly increase chances of approval for a patent that’s worded for strength and strategic value. Good professional time and government fees during a long application process add up to many thousands of dollars, in the ballpark of a new luxury car.

A patent lawsuit often costs even more than getting a patent in the first place, on the order of a new supercar. This is often as true for the side being sued as for the side with the patent. Many companies simply can’t afford to put up a patent fight. Many that can afford one put a lot of people and process into avoiding, slipping out of, and winning as many suits as they can. Some companies exist solely to purchase patents and abuse economies of scale in suing as many companies as possible, scaring them into pricey license agreements to settle lawsuits they couldn’t afford to win.

Lawyers make sure that commercial software agreements make clear if and when they grant patent licenses, by saying the word “patent” and listing the specific patents licensed whenever possible. Section 3 of the Apache License, Version 2.0 follows the general style:

Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a … patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work…

Again, many open source licenses use shorter, more general grants of permission. Many of these licenses are so short or general that lawyers aren’t sure whether they grant a patent license at all, or what it covers. For example, the preamble of The Two-Clause BSD License reads:

Redistribution and use in source and binary forms, with or without modification, are permitted…

Permission to “use” software covered by a patent might be a license under that patent. It’s not clear.

Ideally, drafters of open source licenses would like to completely override the default rules that allow patent owners to sue users and distributors of open source software. The fact that patenting requires application and approval means there may not be any relevant patents around to sue under.

Unfortunately, a key legal difference makes a complete override impossible. Independent discovery of patented technologies doesn’t stop an inventor with a patent from suing. If Ivan patents a new search algorithm and six months later Carol comes up with the same approach and uses it in code for a database, Ivan can sue Carol and users of Carol’s software. Even if Carol uses an open source license with a very clear patent license. Even if Carol came up with the algorithm on her own, knowing nothing of Ivan or his patent. Ivan owns the relevant patent. It isn’t Carol’s to license.

That’s the opposite of how copyright works. If Ivan writes a limerick and Carol just so happens to come up with the same limerick later, Ivan can’t sue Carol for publishing “his” limerick unless he can prove she copied him, rather than wrote it herself. Because copyright applies to code itself, a copyright license that properly covers all the material in a project’s code gets rid of copyright risk for users. But a patent works more like a category or pattern: it does or does not cover specific code. If some piece of code implements, or embodies, a patented invention, the patent owner can sue.

Drafters of recent open source licenses have tried to address this problem with patent-termination clauses. Suing users, distributors, or contributors under a patent triggers these clauses, which cancel the licenses granted for the software. In that way, patent-termination clauses work a bit like license conditions, except permissions disappear if you do something specific, like suing under patents, rather than fail to do something specific, like giving attribution.

Lawyers often use the last sentence of section 3 of the Apache License, Version 2.0 as the canonical example of a patent-termination clause:

If You institute patent litigation … alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.

Losing the licenses puts the one suing under a patent at risk of infringement suits if they use or distribute the software. But those with relevant patents may or may not use or distribute any kind of software. For patent owners that don’t, the termination clause doesn’t have teeth. They don’t need a license to begin with.

Whenever patents and open source cross, lawyers look very carefully at the language of the relevant patent licenses and any patent-termination clauses. Companies may also look into whether contributors to the project are likely to have or seek patents that may apply, or a competitive reason to apply for some.

Lack of clear patent language in the license for software that implements old, patent-free inventions, or software published by contributors who aren’t likely to have or seek patents, may not pose any risk. Conversely, even very clear, predictable patent language may not be enough for software that’s likely covered by patents belonging to a competitor or well-known patent bully. Companies considering whether or not to contribute code to an open source project also scrutinize patent terms, to make sure they understand which patents they’d be licensing.

Patents make software law hard, and the anti-defaults of open source make it even harder. If you find yourself staring down the double-barreled shotgun of open source patent law, get the best advice you can. Not just from lawyers. From engineers who know your company and what’s going on in your industry.

Contract Law

Contract laws give the power to use public courts to enforce private agreements, with rules for interpreting terms and deciding who bears what risk when the terms didn’t say. The prime directive of contract laws is to hold parties to contracts to the promises they intended, or at least probably intended, when they made their deal.

Warranties

Warranties law is a part of contract law. Warranties are promises about products and services covered by contracts. The archetypal warranty gives the purchaser of a new television the right to a new one, or a refund, if it quits working a week after purchase, due to shoddy materials or design.

Many warranties are written out, but the law implies some by default, as well. An implied warranty of merchantability says, essentially, that goods licensed or sold by someone in the business of selling that kind of thing will be up to usual standards, and not shoddy knock-offs. A default warranty of fitness for a particular purpose says that goods an industry player licenses or sells to a customer who isn’t in the biz to meet a specific need will do the job. A default warranty of noninfringement says that using what you’re buying as expected won’t immediately land you in an infringement lawsuit. These defaults are read into a contract, or implied, unless its terms clearly get rid of, or disclaim, them.

Implied warranties from sales-contracts law may or may not apply to terms for open source software, but in certain circumstances, they will. The implied warranties would then translate into promises about the quality and functionality of the software. Promises the one giving the software away could get sued for failing to keep.

Open source licenses disclaim all warranties, among them implied warranties. Typically the words that do this are set in ALL CAPS, and layer general disclaimer language on top of specific disclaimers of merchantability and fitness, for redundancy. Contract law expects contracts to look like sales, and for those selling to get something in return, especially money, that looks like the value of what is sold. Owners of software may indeed get something in return for licensing open source, like patches or interest in other products or services they otherwise wouldn’t. But they don’t receive payment, as they would for proprietary software. Without payment coming in, they don’t want to risk paying out on warranties. Even if the warranties arguably make sense.

Liability

The law gives each side of a contract the right to sue the other side for damages when they fail to hold up their end of a bargain. Damages are usually limited to what the court thinks the one at fault could have foreseen when they made their deal. If a software company licenses a badly flawed data backup tool that corrupts a customer’s financial data, the customer might collect the costs of digitizing its records once more. If the tool wrecks the control system for the customer’s top secret satellite project, crashing it into a chicken farm, the customer probably can’t get the cost of launching a new satellite, or of the chickens.

In order to collect damages, you have to bring a lawsuit, or credibly threaten to bring a strong one and convince the other side to settle. Bringing a lawsuit means filing a document that makes claims: references to laws you say apply to your situation and entitle you to protection from the court or compensation from someone responsible. Those laws might be the contract laws that say a party has to do what it promised, the tort laws covering physical harm, property damage, or fraud, and so on.

Commercial software licenses usually set a limit on the total amount of money the one providing the software can owe as damages. Often that limit on liability is the same as, or a multiple of, what the customer pays for the software. If the limit is high, the seller might buy insurance, so one claim can’t shut their doors.

Open source software licenses almost universally limit damages, too—to zero, nothing, nada, the same as contributors get paid. They also exclude all the kinds of claims—contract, tort, whatever. Liability under a license agreement makes a great deal more sense when the one giving the license gets paid. Giving software away for free and retaining the risk of liability would not make sense for many contributors.
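The difference between the two kinds of limits comes down to simple arithmetic. Here is a rough sketch in Python, assuming a hypothetical function name, made-up dollar figures, and a cap modeled as nothing more than a multiple of fees paid. Real contracts and courts consider foreseeability, carve-outs, and enforceability, all of which this ignores.

```python
def capped_damages(proven_damages, fees_paid, cap_multiple=1.0):
    """Return what a customer could collect under a simple liability cap.

    Illustrative only: the cap is modeled as a multiple of fees paid, and
    everything else a real contract or court would weigh is left out.
    """
    cap = fees_paid * cap_multiple
    return min(proven_damages, cap)


# Commercial license (hypothetical numbers): $50,000 in fees,
# liability capped at twice the fees.
commercial = capped_damages(proven_damages=400_000, fees_paid=50_000, cap_multiple=2)

# Open source license: no fees paid, so the cap works out to zero.
open_source = capped_damages(proven_damages=400_000, fees_paid=0)
```

Under these assumptions the commercial customer is limited to the cap rather than the full loss, and the open source user collects nothing, which matches what contributors were paid.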

Formation

The law requires two sides to take some kind of intentional action to create an enforceable contract. The law usually talks about this process in terms of one side making an offer of terms, and the other side accepting the offer. A contract is born.

If Able offers Bob terms for an agreement, and Bob accepts, Able and Bob have a contract. If they intended some of their terms to benefit Candice, as well, Candice may be able to enforce the parts of the Able-and-Bob contract to her benefit. But Able and Bob can’t make enforceable promises on behalf of Candice without her. If Able wants an enforceable promise from Candice, he needs to make a contract with Candice.

If Bob and Candice both contributed to an open source software project, Able wants licenses from both of them to know he’s free to use the code. Lawyers disagree about whether open source licenses are contracts, technically speaking. But they agree that users need everyone with intellectual property in the project to override the troublesome legal defaults. Everyone with copyright in code in the project should give a copyright license. Ideally, everyone with a patent covering the software should give a patent license, too.

Open source projects try to accomplish this in a number of ways. Each contributor can independently provide their contributions to the public under an open source license, perhaps using the Developer Certificate of Origin or a license with terms about contributions, like section 5 of the Apache License, Version 2.0, to clarify. Contributors can also sign a contributor license agreement to license intellectual property in their contributions to a single person or entity, like a foundation, which in turn licenses others. Or contributors can assign their rights in contributions to another, who in turn licenses them to others, as with some Free Software Foundation projects. The important point from a user’s point of view is that each contributor creates a written record of reversing the proprietary software defaults, and a written record users can find later, if they’re sued.
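Projects that rely on the Developer Certificate of Origin conventionally record it as a "Signed-off-by:" line at the end of each commit message. This guide doesn't describe that convention, so treat the details here as assumptions. As a minimal sketch in Python, with hypothetical function names and commits represented as plain tuples rather than read from any real revision control tool, a project could flag commits that lack a sign-off from their author:

```python
import re

# Matches trailer lines such as "Signed-off-by: Carol <carol@example.com>".
SIGN_OFF = re.compile(
    r"^Signed-off-by:[ \t]*(?P<name>.+?)[ \t]*<(?P<email>[^<>]+)>[ \t]*$",
    re.MULTILINE,
)


def sign_offs(message):
    """Return (name, email) pairs for every sign-off line in a commit message."""
    return [
        (match.group("name"), match.group("email").lower())
        for match in SIGN_OFF.finditer(message)
    ]


def missing_sign_off(commits):
    """Return the ids of commits whose author did not sign off.

    `commits` is an iterable of (commit_id, author_email, message) tuples,
    a hypothetical shape chosen for this example.
    """
    missing = []
    for commit_id, author_email, message in commits:
        signed_emails = {email for _, email in sign_offs(message)}
        if author_email.lower() not in signed_emails:
            missing.append(commit_id)
    return missing
```

A check like this only confirms that a written record exists. It says nothing about whether the person signing actually had the rights to license the code.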

Tools for all of these approaches tend to pay special attention to who owns the intellectual property that the individual contributors create. In typical employment and contractor relationships, copyrights and patent rights in job-related work end up belonging to employers and clients, rather than the coders. That means the employers and clients, rather than the coders, need to give the licenses, at least some of the time. Because of this, open source foundations looking carefully after project intellectual property usually require contributor agreements or assignments from individual contributors as well as their employers. They don’t want to take in and start distributing code that the contributor didn’t have permission to publish and license.

In the proprietary software world, nearly all software comes with a warranty that the seller has all the rights they need to give a license. With very rare exceptions, open source licenses don’t give that kind of warranty. That means it’s up to users and distributors, the consumers of open source, to do their own research and come to their own conclusion about the risk. As a result, a company may need to look past which open source license form a project uses to who has contributed, and how the project managed rights in their contributions. Again, the key is whether users can find written records that show everyone relevant reversed the scary legal defaults.

Specifics matter. If a project’s software implements patented inventions, uses an open source license with a clear patent license, and received contributor license agreements from patent-holding contributors (all good news), but the project used a form contributor license agreement with no patent language, using the software may put the company at unacceptable patent risk. If programmers contributed code they may have written at work or for a client, without making a certificate or signing a contributor license agreement, they may not have thought about whether the rights were theirs to license. The public records a software project has, like revision control data, are often as important as the license notice in determining whether a project is safe to use.

Real-World Risk Conditions

Few private companies match the legal and process rigor of the large, enterprise-focused open source foundations, even for proprietary software they sell for lots and lots of money. At the same time, it’s easy to meet an open source contributor who believes all code pushed to public repositories automatically becomes open source, and runs their projects accordingly. Very, very few open source projects do licensing completely right. There is almost always some cause for uncertainty, as there is in using pretty much any software at all. But plenty of big, important companies—the best targets for lawsuits—use lots of open source software, under a variety of licenses.

Enforcement of license conditions like attribution is relatively lax in some areas of software, like web application programming. In other areas, like embedded systems, compliance is better ingrained. The trend seems to favor enforcing conditions to gain compliance—whether for attribution or copyleft requirements—rather than to collect damages for infringement. On the other hand, naming-and-shaming license violators increasingly happens in the open, on the Internet, where it can take a very immediate, serious toll on reputation. Especially for companies that compete to recruit computer programmers who identify with open source.

Overall, threats of intellectual property infringement lawsuits in the open source world appear very rare. For individuals and many small companies, there would be little point in suing anyway, since there is no significant money to be won from them, at least for now. Community goodwill and reputation remain very valuable for users and creators of open source software alike. More valuable, in many cases, than insisting on even very valuable legal rights.

Implementations

Any terms that meet the Open Source Definition make up an “Open Source” license in enterprise-user parlance. Initially, this produced a flurry of new licenses, often specific to a single project. Over time, the open source community standardized on a few licenses reused for many projects. Reusing a well known license makes it easier for potential users to assess license terms.

The popular open source license tools can be grouped into a few classes.

Permissive Licenses

Permissive or academic licenses impose minimal conditions. These licenses almost always require attribution, including a copy of the copyright notice and license terms that came with the software, and may impose other, usually mechanical, chores, like listing changes to the open source original. Conditions satisfied, all are free to use the software, redistribute it, make changes, and combine it with other, even proprietary, software.

The MIT-style licenses, including The MIT License and the ISC License, are permissive licenses, popular in a variety of software communities. Much the same is true of the BSD family of licenses, in two- and three-clause variants. The Artistic licenses, versions 1.0 and 2.0, are popular in the Perl programming community, but also apply to a few important projects in other communities. The Apache License, Version 2.0 applies to many Apache Software Foundation and other projects. Apache 2.0 remains the benchmark for professional-style drafting in open source licensing.

Permissively licensed open source software is often the easiest for companies to approve for use as published, for reuse with changes, and for inclusion in binary or source-obfuscated programs. It’s usually legally straightforward to incorporate permissively licensed open source software into other software, even proprietary software, and to combine it with other open source. With older permissive licenses, like MIT- or BSD-style licenses, there may be stronger concerns about patent risk, since it isn’t clear if or how those licenses cover patent rights.

Copyleft Licenses

Copyleft or reciprocal licenses work like permissive licenses, with additional conditions. The additional conditions require those who distribute copies and changes to provide access to source code and give license terms similar to those for the original code. Many in the copyleft licensing community consider these conditions necessary to preserve essential freedoms to run, understand, and improve software.

Copyleft licenses are often compared by the relative strength of their additional, copyleft license conditions. Strong conditions trigger source code and license-terms requirements in more situations, for more code. Those requirements may themselves demand more by way of compliance. Weaker copyleft licenses give clear ways to avoid source and license-terms requirements by separating new code from that of the project, impose less stringent requirements when copyleft is triggered, or both.

Copyleft license conditions often clash with the conditions of other licenses: permissive, copyleft, or proprietary. When combining software triggers more than one set of conditions that can’t be satisfied at the same time—such as when the licenses of two copyleft libraries require different sets of terms for the whole program—the combination prevents distributing the combined software. Licenses that clash this way are incompatible.

To make combination and distribution possible, one or both libraries may relicense, or change their license terms. This can be very difficult or impossible for projects with many contributors who can’t be reached for agreement to offer the new terms. Some projects have chosen to dual license—offer a choice of two or more open source licenses—in hopes that users will be able to make at least one compatible choice. Other projects have adopted variations on standard license forms adding license exceptions to make them compatible with important copyleft licenses.

The best-known copyleft licenses are the GNU General Public Licenses, maintained by the Free Software Foundation. The Linux Kernel uses GPL version 2.0. Both GPL version 2.0 and GPL version 3.0 apply to a great deal of software frequently used with Linux. Linus Torvalds and the FSF have published clarifications of some of the more vague terms of the GPL for their software.

The FSF also maintains the weaker LGPL family of licenses, used for many linked libraries, to require source and licensing only of changes to the library, not programs built with it. On the other hand, the Affero variants of the GPL, or AGPL licenses, strengthen copyleft to trigger source and license-terms requirements when software is made available to use over a network.

The Mozilla and Eclipse foundations maintain the Mozilla Public License and Eclipse Public License for the projects they host. These are considered weaker copyleft licenses than the GPL, and use a drafting style more familiar to professional licensing lawyers, akin to that of The Apache License, Version 2.0.

Most companies that use open source software use significant amounts of copyleft software, often with GNU/Linux as a base. Using copyleft software internally, without changes, can be just as simple as using permissively licensed software. But it’s often a trick to stay within those lines, operationally, over time.

When a company needs to make changes to copyleft software, wants to make copyleft software part of its own software or services, or both, things get more complicated. Whenever possible, companies like to avoid questions about whether their plans will trigger copyleft conditions, and what they will have to do to comply. Relatively new, strong-copyleft licenses like the Affero GPL breed the most anxiety and confusion. Companies can find and hire both lawyers and compliance professionals familiar with copyleft to help manage copyleft complexity, but many prefer to avoid the problem altogether by avoiding copyleft code.

Copyleft licenses are harder to comply with, reliably, than permissive licenses. Wise company counsel have reason to scrutinize copyleft code coming into their domains more carefully than permissively licensed code. But much ill-founded fear, uncertainty, and doubt around copyleft—especially about “viral” licenses “infecting” proprietary code—continues to waft off early public relations efforts to scare customers away from copyleft software competing with proprietary products.

Overall, deciding whether to work with copyleft software is neither trivially easy, nor impossibly hard. From a typical business point of view, assessing copyleft software is akin to assessing whether to accept a give-and-take deal, while assessing permissive software feels more like deciding whether to accept a free gift. Weighing a give-and-take proposition usually means considering more specifics and long-term goals, but gifts can be welcome or burdensome. You might like a good deal on pizza delivery better than the generous gift of a puppy dog you’re not keen to care for. A great piece of copyleft software, even with complex copyleft conditions, might work out better for a company than permissively licensed code when all manner of intellectual property, usability, or maintenance issues are considered.

Applying Licenses to Code

There are all kinds of conventions, traditions, and fashions in how, exactly, open source licenses get applied to open source code. The good news is that they mostly boil down to putting a copy of a license, or mention of a license and where to find its text online, in files, next to the code to which it applies. Whatever method contributors choose to distribute their code—publicly accessible web server directory, hosted revision control repository, published archive—they make sure it comes with an easy-to-find statement, or notice, of license terms.

Probably the most common method is adding a file to the root of a project’s directory with a name like LICENSE or COPYING. A close second is mentioning a license in a header comment at the top of each source file. Some authors also mention licensing in standard documentation files, as at the bottom of README. More recently, package metadata formats—Maven’s, RubyGems’, npm’s—specify a field and format to identify standard open-source license terms.
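To make those conventions concrete, here is a minimal Python sketch of how a tool might look for license notices at the root of a project. The function name, the list of file names, and the output format are assumptions made up for illustration. It reads only npm’s package.json "license" field, and it does not look at header comments or README files.

```python
import json
from pathlib import Path

# File names conventionally used for license notices.
# An illustrative, incomplete list, not a standard.
NOTICE_NAMES = {"LICENSE", "LICENCE", "COPYING", "NOTICE"}


def find_license_notices(project_root):
    """Return a sorted list of license hints found at the root of a project."""
    root = Path(project_root)
    hints = []
    for entry in root.iterdir():
        # Match LICENSE, LICENSE.txt, COPYING.md, and so on, ignoring case.
        if entry.is_file() and entry.name.split(".")[0].upper() in NOTICE_NAMES:
            hints.append(f"file:{entry.name}")
    # npm-style package metadata: a top-level "license" string, if present.
    manifest = root / "package.json"
    if manifest.is_file():
        metadata = json.loads(manifest.read_text(encoding="utf-8"))
        license_field = metadata.get("license")
        if isinstance(license_field, str):
            hints.append(f"package.json:{license_field}")
    return sorted(hints)
```

A result with no hints, or with hints that disagree with each other, is a signal to read the project more closely, not an answer about what terms apply.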

The good news is that these methods seem to work. There have been relatively few court cases involving open-source licenses in the United States, but those we’ve seen have reached decisions on the assumption that the terms in the notice are the terms for the code. The bad news is that, in practice, issues frequently crop up when code under multiple licenses, from different contributors, ends up in the same archive or distribution artifact. It’s often unclear which license from whom applies to what code.

More recently, the Software Package Data eXchange (SPDX) project under the Linux Foundation published an XML-based standard for metadata to describe the license terms of software artifacts. This approach differs in allowing consumers, rather than maintainers and distributors, to review code for license information and annotate code with machine-readable metadata. A key motivation for the project is to enable consumers of open code to exchange licensing annotations, reducing duplication of effort. But parts of the group’s work have also been adopted to make it easier for maintainers to annotate their own work. SPDX’s maintained list of string identifiers for common form licenses has been used for validation of package metadata, for example.
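As a rough illustration of that kind of validation, the Python sketch below checks the identifiers in a package’s license field against a list of known identifiers. The identifiers shown are a small hand-picked subset of the SPDX license list. The function name is hypothetical, and the handling of AND and OR is a simplification that ignores parentheses and the other operators of real SPDX license expressions.

```python
import re

# A small, hand-picked subset of SPDX license identifiers.
# A real validator would load the full, current list published by SPDX.
KNOWN_IDS = frozenset({
    "MIT",
    "BSD-3-Clause",
    "Apache-2.0",
    "MPL-2.0",
    "EPL-2.0",
    "LGPL-2.1-only",
    "GPL-2.0-only",
    "GPL-3.0-or-later",
    "AGPL-3.0-only",
})


def unknown_license_ids(expression):
    """Return the parts of a simple license expression not found in KNOWN_IDS.

    Simplification: splits only on the AND and OR operators, and does not
    understand parentheses.
    """
    parts = re.split(r"\s+(?:AND|OR)\s+", expression.strip())
    return [part for part in parts if part not in KNOWN_IDS]
```

An empty result means every identifier was recognized. A value like "Apache 2" comes back as unknown, which is the kind of near miss that a standard list of identifiers helps catch.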

Contribution Management

Many open-source contributors believe that sending a patch to an existing open-source project without any mention of licensing gives permission to use their contribution under the same terms that apply to the existing code. This idea, which Richard Fontana calls “inbound=outbound”, is a very good idea. Unfortunately, it is not the law. That’s made clear, and now less likely to become a legal rule, by a popular code host’s felt need to inject it, in clumsy terms, into their terms of service.

To date, the lawyer-pleasing approach to contributor licensing has been the Apache Foundation’s layered use of a section about contributions in its license form and contributor license agreements between contributors and the Apache Foundation, which in turn licenses the public to use Apache software. For contributors with a company affiliation, Apache also signs a contributor agreement with the company, covering contributions by company personnel. Other institutional stewards of open-source projects, foundations and private companies, have made do with just contributor license agreements, likely the stronger half of the Apache two-part approach.

Unfortunately, contributor license agreements are imposing legal documents—almost universally longer and denser than academic licenses like The MIT License or even the community-oriented GPL version 2.0—especially for contributors starting from the false assumption that no license document from them at all is plenty clear enough. Ensuring that signed contracts precede patches accepted is also a drag, even with bots or scripts to help automate the process. The cost is particularly high for small and relatively informal projects, which can otherwise attract and take a lot of small commits, rapid fire, for which nobody is going to bother signing and filing a contract.

Shaken by the SCO lawsuits, but unwilling to impose the inconvenience of a centralized CLA service on a sprawling network of contributors, Linux kernel developers created the Developer Certificate of Origin. The DCO, a short statement from the coder’s point of view that they’re sure they have all the permission they need to contribute the code they’re offering, isn’t a license, nor is it a guarantee that the contributing coder has the rights they claim. But even when referenced in shorthand, via Signed-off-by tags in Git commit messages, it arguably creates a relevant and permanent written record of attribution and licensing diligence for all contributors. Those kinds of records might have been enough to stave off claims, like those from SCO, that code in the Linux codebase came not from open source contributors, but from their own, copyrighted code. The Eclipse Foundation has approved this mechanism, in lieu of its prior, Apache-esque contributor license agreements.
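Projects that rely on the DCO often check for the sign-off automatically. The Python sketch below shows one way such a check might read a commit message. The function name and the regular expression are assumptions for illustration, not the kernel’s or any code host’s actual tooling, and the email pattern is deliberately loose. Finding a tag shows only that the record exists, not that the person named really had the rights they certified.

```python
import re

# Matches a line like "Signed-off-by: Ada Example <ada@example.com>".
# Case is ignored so that variant capitalizations of the tag are also found.
SIGN_OFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>\s]+@[^<>\s]+)>$",
    re.MULTILINE | re.IGNORECASE,
)


def sign_offs(commit_message):
    """Return (name, email) pairs for each sign-off line in a commit message."""
    return [
        (match.group("name"), match.group("email"))
        for match in SIGN_OFF.finditer(commit_message)
    ]


def has_sign_off(commit_message):
    """Report whether a commit message carries at least one sign-off line."""
    return bool(sign_offs(commit_message))
```

Git can add the line itself when a contributor commits with the -s (or --signoff) option, so a check like this mostly catches contributors who forgot the flag.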

Finally, a special note on assignments, especially for projects under copyleft licenses. Historically, both some foundations and many private-company stewards required not licenses, but assignments of copyrights in contributions for acceptance into their open-source projects. The Clojure programming language project has extended that model, via adaptation of Oracle’s form agreement, to personal assignments of joint ownership to Rich Hickey, the leader of the project.

Assignments work much like contributor license agreements, and sometimes have essentially the same legal effect, due to rights granted back to contributors. But assignments inevitably entail more hassle, since many countries’ laws require notarization or other formalities to assign intellectual property rights by contract. Not to mention they sound scary and unnecessary. If the corporate steward on the other side is more than willing to use open-source software under license, why does it require assignment from contributors to its own projects?

Some organizations required assignments just because that was the most they could get, and legal advice was to get as much as possible. In some cases, this produced backlash, which was often deserved for lack of good and transparent communication, if not for the legal substance of the assignment agreements.

One early, rational case for assignment was relicensing. If the steward of a project anticipates that they may want to change the license terms for a project in the future, agreements to license contributions under the current license terms mean a whole lot of work hunting down old contributors later. Where there are many contributors, that may be impossible. Eventually, drafters of contributor license agreements turned to this problem, and built rights to relicense into many forms. Often these forms limit relicensing, so that the steward can’t choose whatever terms—say, proprietary terms—they may like. A choice of licenses approved by named foundations, like OSI and FSF, is common.

A more lasting argument for assignment is to create a single legal entity with all the relevant rights needed to bring a lawsuit for infringement. For reasons of legal procedure and negotiating leverage, having a single copyright owner makes it much easier to sue someone skirting license conditions for infringement, and compel a settlement. The more conditions the open-source license you use imposes, and the more meaningful the objectives those conditions aim to achieve, the more valuable centralized ownership becomes. Accordingly, copyleft-focused organizations like the Free Software Foundation and Software Freedom Law Center, both active enforcers of GPL copyleft conditions, routinely require assignments for contributed code. These organizations primarily enforce license conditions to secure compliance—release of source code—but need strong claims that they can sue in court for damages, as leverage.

Additional Resources

Important Institutions

You should be aware of a number of important institutions in the open source world. These institutions are a great source of good information.

Many also publish points of view on policy, operational, and legal issues. On some of these issues, they disagree. You should be aware of these positions and disagreements when they affect licenses or software that interest you. A foundation’s positions often carry extra weight for software projects that the foundation hosts.

Further Reading

The best book on open source for business, with especially strong coverage of copyleft licenses like the GPL, is Heather Meeker’s Open (Source) for Business. I recommend it to clients.

The best book on open source in the broader context of intellectual property law is Van Lindberg’s Intellectual Property and Open Source. At nearly ten years old, it’s a bit dated now, especially on patent and trade secret law, where we’ve seen very important legal changes. Van tells me he’s working on a second edition, with updated coverage of community as well as legal topics.

If you enjoyed this guide, you may enjoy my blog. Folks seem to like my guide to how business views specific licenses, a line-by-line rundown of The MIT License, and my thoughts on how the meaning of “open source” is changing. I’ve also published a field guide to the major species of intellectual property.

Thanks

My special thanks to Kunal Marwaha, for proofreading help.