Updated [12-Jun-2006]: Added some subheadings and a little editorial clean-up based on suggestions from friends.
At the end of the USENIX panel, someone from the audience asked if one could use open source to provide a time-to-market advantage. In the past couple of days, I have also had a discussion with a friend regarding open source licensing, product architecture, and risks with respect to the GPL. I wanted to outline how "flexible" the GPL can be, and demonstrate a product architecture that certainly worked for us and never exposed the Windows source base. Far too many people have too many opinions about what "might" happen when a company uses free and open source software. I thought a little experience might broaden the view. A little developer education and some sound technical architecture go a long way.
Interix and the Windows Architecture
Softway Systems was founded in September 1995. Interix was our product; it allowed you to migrate UNIX applications to Windows NT. The NT
architecture was essentially a kernel, a collection of functional
subsystems (I/O, Security, etc.), and a set of environment subsystems
(Win32, POSIX, and originally an OS/2 one).
Every application process
runs as a client of an environment subsystem that provides the system
services interface, and the services are then provided by the
underlying functional subsystems via a fast local messaging system
provided by the kernel. Each environment subsystem is its own
executable and runs as a separate process. (If you look at the process
list on Windows XP, you will see CSRSS — the Win32 subsystem — as
well as PSXSS if you're running Interix.)
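To make the subsystem model concrete, here is a minimal sketch. fork() is a good illustration because the Win32 API has no equivalent: a trivial POSIX program like the one below, built against the Interix headers and libraries, runs as a client of PSXSS, which supplies the fork() and waitpid() semantics over that kernel messaging facility. The program itself is ordinary portable C; nothing Interix-specific appears in the source.

    /* A minimal POSIX program. Built for Interix, the fork() and
     * waitpid() services below are supplied by the PSXSS environment
     * subsystem. A Win32 program has no fork() at all; its system
     * services are supplied by CSRSS instead. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t child = fork();    /* serviced by the environment subsystem */
        if (child < 0)
            return 1;            /* fork failed */
        if (child == 0) {
            printf("child: pid %d\n", (int)getpid());
            _exit(0);
        }
        waitpid(child, NULL, 0); /* parent reaps the child */
        printf("parent: reaped child %d\n", (int)child);
        return 0;
    }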
Interix was a replacement POSIX environment subsystem, providing much
more of the system service interface of a UNIX system than was captured
in the original Microsoft subsystem, which was an exact
implementation of the original POSIX.1 standard. Interix included a
complete collection of the UNIX commands and utilities, and a full
development environment that included the GNU C Compiler suite and
libraries from a number of sources (including Sun). Interix also
included a complete port of the X11R6 suite (with the exception of the
X11 server itself).
The Collection of Licenses
Here's how the licenses worked. We had:
- A [unique] Microsoft source license for the original POSIX
subsystem code, and an ability to distribute derivatives (i.e. the
Interix subsystem).
- Our own assets: the modifications to the subsystem, a number of
utilities we created, and changes to the open source utilities and
libraries to internationalize them and otherwise bring them into
alignment with the relevant standards.
- All manner of free and open source software. We had 300+
utilities and libraries in the product covered by 20+ different
licenses, ranging from the GPL, MIT Athena, BSD, and Sun licenses (ONC
RPC code) to some more esoteric home grown licenses, to some true
public domain code. (We shipped a derivative of the Public Domain Korn
Shell, pdksh.)
We shipped our separate product as a collected work under our Interix
End User License Agreement. [Copyright law supports
the idea of collected works such as magazines and encyclopedias having
their own copyrights above and beyond the individual copyright
associated with each individual component.] We charged a license fee
for the product (collection), but some of the tools (covered by licenses
like the GPL) were provided as part of the distribution for customer
convenience.
Interix is the core of Microsoft's Services for UNIX (SFU) product
now. For the past year or so, the SFU product has been a free
download from Microsoft. Parts of it are to be integrated into Windows
Server 2003 R2 if I recall correctly. (I've a blog entry that
describes that plan as of last September).
If you download and unpack SFU 3.5, you should be able to see the
copyright notices (attached here for convenience) from all the open source software components in a single file
in the /docs
directory of the install root.
We respected the terms and conditions of each component or library's
license. Copyrights were maintained. Where a license required us to make the source to our derivatives available, we published it; our changes to the RCS tools and the gcc suite, for example, were published because they're licensed under the GPL. That's what the licenses required, but we also wanted the community engagement of collaboration and enhancements.
The Business Economics of Community (with a gcc example)
Using all this source code from
the open source world (and we didn't call it "open source" back when we
started the company in 1995) gave us a huge advantage in time to
market. The code was also robust as it had stood the test of time and
the real world across a host of architectures.
We didn't just take from the community, but also engaged with the
community. We worked a lot of changes back to the pdksh team, had a
developer who still maintained the ex/ed editors, and worked with the
gcc community to contribute our changes and bug fixes back. This is obviously not the same level of business engagement as MySQL running a community around its own project, or Red Hat working with the Linux community. At the time we had different goals and requirements.
Our use of gcc is interesting on a couple of levels: one from a community perspective and the other from a technical architecture and license obligation perspective. First we'll tackle the community economics.
When
you download gcc from the web, it's a bundle of the compilers, linker,
binary formats library, assembler, and debugger all in one tidy
package. Our compiler developer was an 18-year veteran in compilers and
operating systems (formerly of HP) and made a coherent set of changes
to the tools to get them to behave properly when developing debuggable
executables for the Interix subsystem. When we began to contribute
changes back, we discovered FIVE different communities hiding behind that single download with varying
degrees of interest in accepting our changes. It was quite the negotiation.
In the end, we tried to hire Cygnus (as the gcc experts) to make/facilitate the
changes, but in the late '90s this would have cost US$100K+ and they
couldn't start for at least 14 months as they were so backed up with
work. (This was prior to their acquisition by Red Hat.) We finally hired Ada Core Technologies, as they too employed a primary committer
on parts of gcc who could best facilitate a set of changes across the
tool set back into the core. It was considerably less expensive and
they could begin immediately.
The gcc compiler also presented technical architecture challenges that might be subtle for some. We used gcc to build
our own world (with the exception of the subsystem), because the gcc
compiler in those days was better than the Microsoft compiler, and we
needed to use gcc if we were to use gdb. (An artifact of the
environment subsystem world in the late '90s meant we could not use
Visual Studio and its debugger.) Because using the gcc headers and libraries would have effectively attached the GPL
to our own programs, we used the gcc compiler in conjunction with our
own libraries (derived from the Microsoft C library) and our own
headers. (We had a lot of experience building standards-based
portability headers.)
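Here's a minimal sketch of that separation in gcc terms. The -nostdinc and -nostdlib flags are real gcc options; the paths, library names, and startup object below are hypothetical stand-ins for our own headers and C library. The point is that the compiler driver lets you keep the GNU headers and runtime out of what you compile and link:

    /* hello.c -- built with gcc, but against our own headers and C
     * library rather than the GNU ones. Illustrative build lines
     * (paths and names are hypothetical):
     *
     *   gcc -nostdinc -I/opt/interix/include -c hello.c
     *   gcc -nostdlib /opt/interix/lib/crt0.o hello.o \
     *       -L/opt/interix/lib -lc -o hello
     *
     * -nostdinc keeps gcc's bundled headers out of the compile, and
     * -nostdlib keeps the GNU startup files and runtime libraries out
     * of the link, so no GPL-covered runtime code ends up in the
     * resulting executable. */
    #include <stdio.h>   /* resolved from our include tree, not gcc's */

    int main(void)
    {
        printf("compiled with gcc, linked against our own libc\n");
        return 0;
    }

Our own headers and the Microsoft-derived C library stood in for the GNU pieces, so the binaries we shipped contained only code we had the rights to license under our own terms.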
It wasn't unwillingness to work with the GPL, or to use it in conjunction with our own code, that made us cautious. First, we had the Microsoft asset we were required to protect, and second, we wanted any code we licensed out into the community to be there by choice and not through license obligations.
Recognize that these contributions back to the community weren't out of
altruism, especially considering the cost in engineering time, both our
own and that of the ACT developers we hired. This was a deliberate business decision. We wanted the engineering expediency of
working in a collaborative world. Despite the initial time-to-market
advantage of using open source, if we had continued to live on our own
code fork, our team would have needed to make months of changes when a new
version of gcc came out, rather than a few weeks of changes if the
Interix-related contributions were accepted.
This is the economic
strength of the open source collaborative development world — you are
distributing the cost of maintenance and development across a large
number of players while improving the software's robustness through a
test bed of users stretching the software in new ways. Considering all our investment, it was still cheap when compared to developing and maintaining a compiler from scratch, which wasn't a core competency (despite our talent), nor the primary value proposition to our customers. To us, this was no different than IBM's investment in the Apache web server.
(Frankly, in hindsight, I think we would have shared the subsystem
code as well, if the Microsoft license hadn't prevented us from doing
so. Our value proposition to our customer was in the distribution
build, test, and packaging, and not necessarily in the source code
itself.)
Developer Education and Oversight
As a set of developers, we were essentially responsible for our own
legal due diligence. Every Interix developer understood the
ramifications of the GPL. While überconservative lawyers at large
software companies may well fret about the legal ambiguity of some
clauses, spreading FUD before them, the average developer can read and
understand the license pretty quickly.
Every Interix developer knew that before using software in the product
covered by a new license, they needed to get approval from management
— which meant me and the core senior developers that had lived in this
space for a while. We always took the most conservative
interpretation. We had already seen what AT&T did to our friends
at BSDI with a lawsuit. (This was also why we began with the
4.4BSD-Lite distribution of the source for much of our early utility
base — it was the legally blessed clean version.)
I am not advocating that you skip getting legal advice. We did check licenses from time to time with our external counsel. But your developers need a passing familiarity with the licenses and copyright anyway, so you can actually do a lot of the work from an architectural point of view before involving lawyers, which both expedites the conversation and saves on legal costs. Few companies have the legal exposure of a Microsoft and require that level of legal due diligence before developers touch code. That's their problem.
Mortice Kern Systems was just up the road from us as well, and while we
considered our product to be in a different space from theirs (we were
about applications migration and they were the UNIX tools on Windows),
we didn't want claims of source code infringement as many of us had
worked there. We were religious in our source code control so as to be
able to demonstrate code pedigree and cleanliness should the need arise.
We also had the Free Software Foundation keeping us honest. They would have loved for us to have
mistakenly introduced software covered by the GPL into the Windows code
base. Our survival as a start-up depended upon the Microsoft license.
Once inside Microsoft after the acquisition, we were even more
careful. And
then you had the Microsoft legal team looking over our shoulders as
well. We had email discussions twice with the FSF when we received
notification that we hadn't yet published our changes. (There was
always a latency in our making the code available while trying to get
the product out the door.)
The FSF also maintains the inbound contribution assignments for gcc. I
signed these assignments twice for the work of my compiler developer,
once as vice-president, R&D at Softway Systems, and once as the
product unit manager at Microsoft. (And you can imagine the legal
oversight I had the second time from Microsoft Legal and Corporate
Affairs.)
In Closing
So GPL code isn't the end of the world as we know it. You can get
highly creative in how you consider its use in a technology
architecture. The economic benefits of free and open source collaborative development are huge. Contributing back is essential.
The investment in developer understanding is necessary regardless of your use of free and open source software. (Consider all the other opportunities your developers have to bring in code from third parties and the "outside", regardless of open source.) Scanning tools can't replace developer education and an understanding of the goals. Frankly, it's just not that difficult to educate developers.
Indeed, collaborating on and developing open source software may be a far more efficient and less risky way of rapidly solving your customers' problems as you develop products.