December 23, 2017

Day 23 - Open Source Licensing in the Real World

By: Carl Perry (@edolnx)
Edited By: Amy Tobey (@AlTobey)

Before we get started, I need to say a couple of things. I am not a lawyer. What I am sharing should not be considered legal advice. The idea of this article is to share my experiences to date and provide you with resources you can use to start conversations with your lawyer, your boss, and your company’s legal team. You should also know that my experiences are primarily based on laws in the United States of America, the State of Texas, and the State of California. Your governmental structure may be different, but most of what I am going to talk about should be at least partially applicable no matter where you are. Now, with that out of the way, let’s get started!

Open Source vs Free Software

I don’t want to spend a lot of time getting into a philosophical debate about Free Software vs Open Source, so I’m going to define how I use those terms for this article. I use Free Software for things that either come from the Free Software Foundation (FSF) or use an FSF license. I use Open Source for anything where the source code is made publicly available. Yes, this means that to me FSF projects are Open Source. It also means things like “shared source” are open source. But the important part is the license, which is what we are here to talk about.

The mixing of two worlds that don’t always see eye to eye

Capitalism and Open Source Software don’t always mix well. There are exceptions, but most companies see software as no different from a brick: cheap, replaceable, frequently made by an outside provider, durable, and invisible (until it fails or is missing). We know this is not the case, and that is where the problems begin. Many corporate lawyers treat software identically to any other corporate purchase: a thing with a perpetual license that places all liability on the provider. We know that’s not true either, especially with Open Source. That means education is super important.

Everything starts with copyright

Copyright has been around for a long time, and the Berne Convention of 1886, which the United States finally joined in 1989, really formed the basis for the copyright we know today. Many countries ratified the treaty, and others are bound to follow it through various other treaties. The basics are that any work is automatically copyrighted by the “author” for a minimum of the author’s life plus 50 years. No filing is required. When something is copyrighted, all rights are reserved by the “author”. This is the crux of the problem: by default you cannot copy, reuse, integrate, or otherwise manipulate the work at all. This is where licensing comes in.

Licensing to the rescue?

Licensing a work is a contract that allows the “author” to grant rights to other parties, typically with restrictions. Many of the commercial software licenses simply allow use and redistribution of the work within an organization in binary form only in exchange for lack of warranty and lack of liability. Open source and free software licenses work quite a bit differently. Back in the early days of computing, even commercial software included the source code. Failing to include the source code usually meant you had something to hide, until some companies decided that source code was intellectual property and not to be shared. But, enough history for now, let’s get into open source and free software licenses and what they do.

Free Software licenses

The most common Free Software licenses are the GNU General Public License and its derivatives (GPLv2, GPLv3, LGPLv2, LGPLv3, AGPLv3, et cetera). These are all based on a very clever legal hack called “copyleft”: the idea is to use a license to enforce the exact opposite of all rights reserved. Instead, the license guarantees access to the source, the ability to modify the source, and the ability to redistribute, while prohibiting use as part of a closed source project and prohibiting license fees for the software itself. You can still charge for things like distribution costs, but you cannot stop a recipient from passing the software on for free. It’s important to realize that this is a one way street; things can start licensed by some other license but once it becomes GPL it cannot go back.

How businesses see the Free Software licenses

Typically, businesses do not have a favorable view of the GPL and its derivatives. Part of the reason is patents (the clauses are vaguely worded at best), but the biggest reason is the perception of “losing IP” and “competitive advantage”. Because you have to release all your changes, it can make life difficult. Great examples of this conflict are GPU and SoC drivers. There is a lot of proprietary tech in those, but you have to make all the code that controls it open to the public.

There are bounding boxes, however. Any code that is derived from a Free Software based product must use the same or a compatible license. However, when their code is not derived from a Free Software base, vendors often choose to make a Free Software shim that converts calls from the Free Software project (like the Linux kernel) to vendor proprietary binaries or code under a different license. This is how ZFS and the binary NVIDIA drivers work with the kernel. This approach does not carry over to every project: the Linux Kernel has a very clear line of demarcation in its COPYING file which states that public APIs are just that: public and not subject to the GPLv2. That makes the kernel a unique case. Otherwise, any lack of clarification like that leaves you with the poorly defined “linking clause” to deal with.

“Ah, but LGPL doesn’t have those restrictions!” someone is shouting at their keyboard. That’s not entirely true. The LGPL lacks the “linking clause”, which can help with adoption, but that’s it. The license is otherwise identical to the GPL. What does this mean? Great question. That is left up to the “author”. An excellent example of how to do it well is the libssh project, which clearly spells out that you can use libssh in a commercial product as long as you do not modify the library itself (see their features page under the “License Explained” section). But if a project does not do this in their documentation then it can be ambiguous.

The Academic licenses

This is a term I use to lump the BSD, MIT, CERN, X11, and their cousin licenses in one place. These licenses are super simple, typically less than a paragraph long. They also don’t do much except limit liability and allow for royalty free redistribution.

How businesses see the Academic licenses

Most companies do not have a problem with these licenses. They are simple, and allow for packaging them into commercial products without issue. The only major sticking point is patents. They are not covered at all by many of these licenses, and that can be a problem for projects that are sponsored by a company. However, using code licensed under one of the Academic licenses tends to be a non-issue.

The Common Open Source licenses

This is another term I use for things like the Apache v2 License, the Artistic License (used by Perl), the LLVM License, the Eclipse Public License, the Mozilla Public License, and a few others that are widely used and well understood. These are typically very much like the Academic licenses but were written by corporate lawyers and thus are verbose. But that’s not a bad thing: their verbosity means they are very clear about what they are trying to do. Most also have provisions for patents.

How businesses see Common Open Source licenses

Much like the Academic licenses, these are typically not an issue. There are some options available for patent protection and/or indemnification under different licenses. Typically projects that are under some form of corporate sponsorship tend to use Apache v2 for these reasons over Academic licenses.

Creative Commons

So far, everything we have talked about has been for software generally, and source code specifically. But what about things like blog articles, artwork, sound files, videos, schematics, 3D Objects, basically anything that isn’t code? That’s what Creative Commons is for. They have an excellent license chooser on their site: answer three questions and you are done. They have plain text descriptions of what each license does, and well structured legal text to back up those descriptions to keep the lawyers happy.

But why not just use one of the above licenses for things like 3D Objects, PCB Gerbers, and Schematics? Aren’t they just source code that uses a special compiler? Sort of, but frequently for non-code items the base components are remixed through other applications so it’s not so cut and dry. CC licenses help with this immensely. Also, there is CC0 for a “public domain work-alike” license (since it’s not always clear how to put something in public domain no matter how hard you try).

Every other license

There are far too many here to go over, but I’ll give some highlights: CDDL, Shared Source, The JSON License (yes, really), et cetera. My biggest lessons to pass along here are twofold: don’t create your own license, and if you stumble across one of these you will need to start having conversations with lawyers. CDDL was made so that Sun Microsystems didn’t have to see all the wonderful work from Solaris wind up in the Linux Kernel. I’ll leave it up to the reader to figure out how that worked out for them. The JSON License holds a special place in my heart: “The software shall be used for good, not evil.” Funny, right? Not so much. The maintainers of tools that use the JSON License are routinely hounded by IBM, because IBM wants to use those tools and cannot guarantee that IBM or its clients will only use the software for good. Literally, people get calls from IBM lawyers every quarter about this. Not so funny now. Don’t invent licenses, and don’t be cute in them. For everyone’s sake.

Why do you always say “author” in quotes like that?

Let’s take a quick segue to talk about when someone who wrote something is not the “author”. If you are writing code as part of your job, you are likely working under a “work for hire” contract. You need to check this and what the bounds are. In Texas, companies can get away with just about anything in this realm as there are not strong employee protection laws in the state. California on the other hand has very strong laws about this. In California, work done on your own time not using any company resources (like a work laptop or AWS account) cannot be claimed by the company as part of their work for hire. Texas is a lot less clear. Your local jurisdiction may vary. If you are working for the United States Government, for example, all work created for the US Government cannot be copyrighted by a company or individual. So, you may not be the author of the code you write. It’s important to understand that, because of what we are about to talk about next….

Sometimes the license is not enough

Many commercially sponsored projects have additional clarifications put in place to deal with shortcomings of the license, or to assign copyright back to the sponsoring organization to simplify distribution issues. These are typically done using a Contributor License Agreement (CLA), which is typically managed by an out of band process and enforced using some form of gating system. But there are problems here as well: many projects want contributing to be as lightweight as possible, so they implement a technical solution instead (an example is the Developer Certificate of Origin). This is great for expediency of individual contributors but can be a real pain for corporate contributors.
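To make the “gating system” idea a little more concrete: projects that use the Developer Certificate of Origin generally look for a `Signed-off-by:` trailer in each commit message (the line that `git commit -s` adds). Here is a minimal sketch of such a check in Python. The function names and the shape of the input are my own illustration, not any particular project’s tooling, and the pattern is deliberately loose about what counts as an email address.

```python
import re
from typing import Dict, List

# Matches a trailer line such as: Signed-off-by: Jane Doe <jane@example.com>
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>@\s]+@[^<>\s]+>$", re.MULTILINE)


def has_signoff(commit_message: str) -> bool:
    """Return True if the message carries at least one sign-off trailer."""
    return SIGNOFF_RE.search(commit_message) is not None


def unsigned_commits(messages: Dict[str, str]) -> List[str]:
    """Return the IDs of commits whose messages lack a sign-off trailer.

    `messages` is a hypothetical mapping of commit ID to full commit message.
    """
    return [cid for cid, msg in messages.items() if not has_signoff(msg)]
```

A gate built on this would reject a change when `unsigned_commits` returns anything. Note that it only checks that the line exists. Whether the person signing is actually entitled to contribute the code is exactly the part a script cannot answer.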

Protecting yourself

OK, so now that I’ve probably scared you let’s talk about risk and how to mitigate it. Step one is get a lawyer. It’s a hard step, but if/when you need a lawyer it’s better to know who to call instead of getting someone who just deals with traffic tickets. It’s also important to point out that, in almost all cases, your company lawyer/legal team does not have your best interests in mind. They work for the company, not you. If you need help finding a lawyer that can understand this field, EFF is a great resource.

Second: understand your employment contract. You have one, even if you don’t think you do. Many companies have you sign a piece of paper that says you agree to the Employee Manual (or the like) when you join: congratulations, that makes the employee manual your employment contract. Understand if you are a work for hire employee or treated as a contractor. This has huge ramifications on what you can contribute to outside Open Source projects and who is the “author” in those cases.

Third: it’s likely your contract will not cover things like this. Fix that. Talk to your boss/manager and get an understanding of what the company’s expectations are and what their expectations of you being an Open Source contributor are. It’s best to do this as part of your negotiations when being hired, but either way you need to have those conversations. It’s important to start with your boss/manager instead of legal because the last thing you want to do is confuse/annoy legal. If you work in a large company, expect this process to take a while, and while it does, do not contribute to open source projects during company time or using company resources. I cannot stress that enough. If the company doesn’t want you contributing and you do, then you are on the hook legally speaking. Be up front and transparent. If you have already contributed, start discussions now and stop contributing while you do. Hiding information is worse than making an honest mistake.

Fourth: Understand the licenses used for projects you are using and/or contributing to. Also understand that if there is no license, you legally cannot use it. This includes Stack Overflow, GitHub, and random things found in search results. It’s much safer to find something that is properly licensed or to use what you find as a reference and reimplement the concepts yourself.

Fifth: Leave an audit trail. Did you get a chunk of code from somewhere? Link to it in a comment. Note where you are getting libraries and support applications from, and the licenses they use. If you need to use an open source piece of code but modify it (like a Chef cookbook or an Ansible playbook), then add a file that records the source (including the version, or even better an immutable link like a GitHub URL with the revision SHA), what you changed (just a list of files), and why. That way, if it needs to be upgraded later, it’s easier to understand what your past self was thinking.
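What goes in that file matters more than its format, but as a rough sketch, here is one way to generate a consistent record in Python. The class, its field names, and the layout of the output are all made up for illustration, as are the values in the usage example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UpstreamRecord:
    """Provenance note for third-party code that was modified locally."""

    name: str
    source_url: str   # ideally an immutable link pinned to a revision
    revision: str     # commit SHA or release version
    license: str
    changed_files: List[str] = field(default_factory=list)
    reason: str = ""

    def render(self) -> str:
        """Return the record as plain text, one fact per line."""
        lines = [
            f"Name: {self.name}",
            f"Source: {self.source_url}",
            f"Revision: {self.revision}",
            f"License: {self.license}",
            "Changed files:",
        ]
        lines += [f"  - {path}" for path in self.changed_files]
        lines.append(f"Why: {self.reason}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example values; substitute the real source and revision.
    record = UpstreamRecord(
        name="example-cookbook",
        source_url="https://example.com/example-cookbook",
        revision="0123abc",
        license="Apache-2.0",
        changed_files=["recipes/default.rb"],
        reason="Pin package version",
    )
    print(record.render(), end="")
```

Checking the rendered text in next to the modified code keeps the note and the change in the same history.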

Sixth: If you are coding in an organization, find out what their open source policy is (or help build one). At several places I have worked, the company didn’t care about open source licenses we used, but we occasionally had contracts with customers/vendors who did. One in particular had a “No GPLv3 code would be delivered” clause in our contract and that caused some issues. It’s important to make this known to avoid surprises later.
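Once a policy like that exists, part of it can be checked mechanically. The sketch below assumes you already have an inventory mapping each component to a license identifier (for example, SPDX-style strings such as `GPL-3.0-only`), and it flags anything on a denied list. The function name and the inventory format are hypothetical, and real-world license strings are messier than an exact match can handle, so treat this as a first pass rather than a verdict.

```python
from typing import Dict, Iterable, List


def policy_violations(inventory: Dict[str, str], denied: Iterable[str]) -> List[str]:
    """Return the sorted names of components whose license is on the denied list.

    Matching is exact after trimming whitespace and ignoring case.
    """
    denied_normalized = {d.strip().lower() for d in denied}
    return sorted(
        name
        for name, license_id in inventory.items()
        if license_id.strip().lower() in denied_normalized
    )


if __name__ == "__main__":
    # Hypothetical inventory and policy for a "no GPLv3" contract clause.
    inventory = {"libfoo": "GPL-3.0-only", "libbar": "MIT"}
    print(policy_violations(inventory, ["GPL-3.0-only", "GPL-3.0-or-later"]))
```

Anything this flags, and anything with a license string it does not recognize, is a prompt for a conversation with a human.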

Protecting your organization

Protecting your organization doesn’t just mean protecting the company you work for, it’s just as important to protect the open source communities you participate in and contribute to. A lot of what I suggested in the last section works wonders for both. But there are a couple of other steps that can be effective depending on your level of involvement, or contractual obligations:

Dependency Audit

These two words tend to strike a sense of horror and dread into most developers. Don’t let them. There are some great tools out there to help in the popular languages. You may discover some interesting surprises when you dive all the way down your dependency tree. Use of language native package management (pip, npm, rubygems, et cetera) can make this easier. With other languages (like Java and C#) you may have a harder time and need to do a lot more manual work. You may also find that there are libraries in there you do not want due to license concerns, but don’t fret too much as there are usually replacements. A great example of this is libreadline, which is used frequently in tools with an interactive CLI. Good old libreadline is GPL (2 or 3 depending on version), but there exists the equivalent libedit, which is 100% API compatible and is BSD licensed. Things like that are a somewhat easy fix, although you may need to build some more dependencies in your build pipeline to get the license coverage you desire. Some may be harder.
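For Python specifically, a first look does not need any extra tooling. The sketch below uses the standard library’s `importlib.metadata` to list every distribution installed in the current environment along with the license it declares. The function name is my own, and the result is only as good as what each package author put in their metadata, so treat it as a starting point for the audit, not the audit itself.

```python
from importlib import metadata
from typing import Dict


def installed_licenses() -> Dict[str, str]:
    """Map each installed distribution to its declared license, or 'UNKNOWN'."""
    result: Dict[str, str] = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:
            continue
        name = meta.get("Name")
        if not name:
            continue
        # Prefer the newer License-Expression field, then the free-form License field.
        declared = meta.get("License-Expression") or meta.get("License")
        if not declared:
            # Fall back to trove classifiers such as
            # "License :: OSI Approved :: MIT License".
            classifiers = meta.get_all("Classifier") or []
            declared = "; ".join(
                c.split(" :: ")[-1] for c in classifiers if c.startswith("License ::")
            )
        declared = str(declared or "").strip()
        # Some packages paste the full license text; keep only the first line.
        result[str(name)] = declared.splitlines()[0] if declared else "UNKNOWN"
    return result


if __name__ == "__main__":
    for name, declared in sorted(installed_licenses().items()):
        print(f"{name}: {declared}")
```

Because transitive dependencies are installed too, this covers the whole tree for that environment. Anything reported as `UNKNOWN` still needs a manual look.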

Legal Fiction

Do you run an open source project? You may want to build, or have it become part of, an organization to help protect it and you from a legal and financial standpoint. Examples of larger groups are The Free Software Foundation, Software in the Public Interest, The Linux Foundation, and the Apache Software Foundation. Others build their own, like the Blender Foundation and VLC. The idea there is that the legal fiction (read: company) can absorb most/all of the legal risk from the individual contributors. It’s not perfect, as the recent unpleasantness with netfilter shows, but it can help. Larger groups can even provide legal support without you going out on your own.

Contributor License Agreements

Your organization may want to implement a CLA for external (and internal) contributors to do things like assign copyright to the organization or grant royalty free patent licenses. If you are going to try and do something like that, think about your users. If you are expecting or want contributions from corporations, think really hard about making something easier for corporate legal teams to work with rather than placing the onus solely on the contributor. As much of a pain as it can be, consider having a Corporate Contributor License Agreement (CCLA) that is out of band with your other process to allow contributions from that org. This will allow your lawyers and their lawyers to work it out, and can be beneficial for larger orgs that don’t yet get Open Source.

Wrapping Up

This stuff is important, and it’s complicated, but it can be surmounted by anyone. To quote Lawrence Lessig, “Code is Law”; flip that around and “Law is Code”. All these licenses are written in legal code, and like any other language they can (and should) be read. If you are inexperienced, start reading licenses. When you get confused, ask for help (preferably from a lawyer). The more we all understand this, the better the world will be. Thanks for your time, and feel free to reach out if you have questions!

-Carl @edolnx on Twitter and on the HangOps Slack. There is a discussion of this on my blog: https://www.gigofham.com/post/2017/07/23-sysadvent/
