Updated [12-Jun-2006]: Added some subheadings and a little editorial clean-up based on suggestions from friends.
At the end of the USENIX panel, someone from the audience asked if one could use open source to provide a time-to-market advantage. In the past couple of days, I have also had a discussion with a friend regarding open source licensing, product architecture, and risks with respect to the GPL. I wanted to outline how "flexible" the GPL can be, and demonstrate a product architecture that certainly worked for us and never exposed the Windows source base. Far too many people have too many opinions about what "might" happen when a company uses free and open source software. I thought a little experience might open up the view. A little developer education and some sound technical architecture go a long way.
Interix and the Windows Architecture
Softway Systems was founded in September 1995. Interix was our product and allowed you to migrate UNIX applications to Windows NT. The NT architecture was essentially a kernel, a collection of functional subsystems (I/O, Security, etc.), and a set of environment subsystems (Win32, POSIX, and originally an OS/2 one).
Every application process runs as a client of an environment subsystem that provides the system services interface, and the services are then provided by the underlying functional subsystems via a fast local messaging system provided by the kernel. Each environment subsystem is its own executable and runs as a separate process. (If you look at the process list on Windows XP, you will see CSRSS — the Win32 subsystem — as well as PSXSS if you're running Interix.)
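To make the client/subsystem split concrete, here is a minimal sketch in C. It simulates both sides in a single program so it stays self-contained and runnable; in real NT the subsystem is a separate process and the transport is the kernel's fast local messaging, and all the names here (SyscallMsg, psx_getpid, and so on) are hypothetical illustrations, not actual NT or Interix interfaces.

```c
/* A minimal, purely illustrative sketch of the environment-subsystem
 * pattern. In real Windows NT the transport is the kernel's fast
 * local messaging and the subsystem (CSRSS, PSXSS) is a separate
 * process; here both sides live in one program so the sketch stays
 * self-contained. All names are hypothetical. */
#include <stdio.h>
#include <string.h>

enum { OP_GETPID = 1, OP_WRITE = 2 };

/* A "system call" request marshalled into a message. */
typedef struct {
    int  opcode;        /* which service the client wants */
    char arg[64];       /* marshalled argument            */
    int  result;        /* filled in by the subsystem     */
} SyscallMsg;

/* Stand-in for the subsystem process: receive a message, perform the
 * service via the underlying functional subsystems, and reply. */
static void subsystem_dispatch(SyscallMsg *msg)
{
    switch (msg->opcode) {
    case OP_GETPID: msg->result = 1234; break;  /* a fake pid */
    case OP_WRITE:  fputs(msg->arg, stdout);
                    msg->result = (int)strlen(msg->arg); break;
    default:        msg->result = -1; break;
    }
}

/* Client-side stubs: what a libc routine in the application process
 * looks like. An ordinary function call on the surface, but really a
 * message exchange with the subsystem underneath. */
static int psx_getpid(void)
{
    SyscallMsg msg = { OP_GETPID, "", 0 };
    subsystem_dispatch(&msg);   /* in NT: send the message and block */
    return msg.result;
}

static int psx_write(const char *text)
{
    SyscallMsg msg = { OP_WRITE, "", 0 };
    strncpy(msg.arg, text, sizeof msg.arg - 1);
    subsystem_dispatch(&msg);
    return msg.result;
}

int main(void)
{
    psx_write("hello from a subsystem client\n");
    printf("pid (via subsystem message): %d\n", psx_getpid());
    return 0;
}
```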
Interix was a replacement POSIX environment subsystem, providing much more of the system service interface of a UNIX system than was captured in Microsoft's original subsystem, which was an exact implementation of the first POSIX standard. Interix included a complete collection of the UNIX commands and utilities, and a complete development environment that included the GNU C Compiler suite and libraries from a number of sources (including Sun). Interix also included a complete port of the X11R6 suite (with the exception of the X11 server itself).
The Collection of Licenses
Here's how the licenses worked. We had:
- A [unique] Microsoft source license for the original POSIX subsystem code, and an ability to distribute derivatives (i.e. the Interix subsystem).
- Our own assets: the modifications to the subsystem, a number of utilities we created, and changes to the open source utilities and libraries to internationalize them and otherwise bring them into alignment with the relevant standards.
- All manner of free and open source software. We had 300+ utilities and libraries in the product covered by 20+ different licenses, ranging from the GPL, MIT Athena, BSD, and Sun licenses (ONC RPC code) to some more esoteric home-grown licenses, to some true public domain code. (We shipped a derivative of the Public Domain Korn Shell, pdksh.)
We shipped our separate product as a collected work under our Interix End User License Agreement. [Copyright law supports the idea of collected works such as magazines and encyclopedias having their own copyrights above and beyond the individual copyright associated with each individual component.] We charged a license fee for the product (collection), but some of the tools (covered by licenses like the GPL) were provided as part of the distribution for customer convenience.
Interix is the core of Microsoft's Services for UNIX (SFU) product now. For the past year or so, the SFU product has been a free download from Microsoft. Parts of it are to be integrated into Windows Server 2003 R2 if I recall correctly. (I have a blog entry that describes that plan as of last September.)
If you download and unpack SFU 3.5, you should be able to see all the copyright notices (attached here for convenience) from all the open source software components in a single file in the /docs directory of the install root.
We respected the terms and conditions of each component or library's license. Copyrights were maintained. Source code for our derivatives that we needed to make available was published; e.g., our changes to the RCS tools and the gcc suite were all published because they are licensed under the GPL. That's what the licenses required, but we also wanted the collaboration and enhancements that come with community engagement.
The Business Economics of Community (with a gcc example)
Using all this source code from the open source world (and we didn't call it "open source" back when we started the company in 1995) gave us a huge advantage in time to market. The code was also robust, as it had stood the test of time and the real world across a host of architectures.
We didn't just take from the community, but also engaged with it. We worked a lot of changes back to the pdksh team, had a developer who still maintained the ex/ed editors, and worked with the gcc community to contribute our changes and bug fixes back. This is obviously not the same level of business engagement as MySQL running its community around its own project, or Red Hat and its engagement with the Linux community. At the time we had different goals and requirements.
Our use of gcc is interesting on a couple of levels: one from a community perspective and the other from a technical architecture and license obligation perspective. First we'll tackle the community economics.
When you download gcc from the web, it's a bundle of the compilers, linker, binary formats library, assembler, and debugger all in one tidy package. Our compiler developer was an 18-year veteran in compilers and operating systems (formerly of HP) and made a coherent set of changes across the tools to get them to properly produce debuggable executables for the Interix subsystem. When we began to contribute changes back, we discovered FIVE different communities hiding behind that single download, with varying degrees of interest in accepting our changes. It was quite the negotiation.
In the end, we tried to hire Cygnus (as the gcc experts) to make/facilitate the changes, but in the late '90s this would have cost US$100K+ and they couldn't start for at least 14 months, as they were so backed up with work. (This was prior to their acquisition by Red Hat.) We finally hired Ada Core Technologies, as they too employed a primary committer on parts of gcc and could best facilitate a set of changes across the tool set back into the core. It was considerably less expensive and they could begin immediately.
The gcc compiler also provided technical architecture challenges that might be subtle for some. We used gcc to build our own world (with the exception of the subsystem), because the gcc compiler in those days was better than the Microsoft compiler, and we needed to use gcc if we were to use gdb. (An artifact of the environment subsystem world in the late '90s meant we could not use Visual Studio and its debugger.) Because using the gcc headers and libraries would have effectively attached the GPL to our own programs, we used the gcc compiler in conjunction with our own libraries (derived from the Microsoft C library) and our own headers. (We had a lot of experience building standards-based portability headers.)
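A minimal sketch of that separation is below. The gcc flags shown (-nostdinc, -nostdlib, -I, -L) are standard options for substituting your own headers, startup code, and libraries; the /opt/interix paths and the libpsx library name are purely hypothetical, for illustration only.

```c
/* hello.c -- compiled with gcc, but deliberately *not* against gcc's
 * bundled headers or runtime libraries. A hypothetical build line
 * (the flags are standard gcc options; the paths and the -lpsx
 * library name are illustrative):
 *
 *   gcc -nostdinc -I/opt/interix/include \
 *       -nostdlib /opt/interix/lib/crt0.o hello.c \
 *       -L/opt/interix/lib -lpsx -o hello
 *
 * -nostdinc keeps the compiler's own header tree out of the include
 * search path, and -nostdlib suppresses the default startup files and
 * libraries, so the resulting executable is built only from our own
 * headers and linked only against our own C library. */
#include <stdio.h>   /* resolved from our own header tree */

int main(void)
{
    printf("built with gcc, linked against our own libc\n");
    return 0;
}
```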
It wasn't unwillingness to work with the GPL, or to use GPL code in conjunction with our own, that made us cautious. We first had the Microsoft asset we were required to protect, and we also wanted any code we licensed out into the community to be released by choice and not through license obligations.
Recognize that these contributions back to the community weren't out of altruism, especially considering the cost in engineering time, both our own and that of the ACT developers we hired. This was a deliberate business decision. We wanted the engineering expediency of working in a collaborative world. Despite the initial time-to-market advantage of using open source, if we had continued to live on our own code fork, our team would have needed months of changes each time a new version of gcc came out, rather than the few weeks required if the Interix-related contributions were accepted.
This is the economic strength of the open source collaborative development world — you are distributing the cost of maintenance and development across a large number of players while improving the software's robustness through a test bed of users stretching the software in new ways. Considering all our investment, it was still cheap when compared to developing and maintaining a compiler from scratch, which wasn't a core competency (despite our talent), nor the primary value proposition to our customers. To us, this was no different than IBM's investment in the Apache web server.
(Frankly, in hindsight, I think we would have shared the subsystem code as well if the Microsoft license hadn't prevented us from doing so. Our value proposition to our customers was in the distribution build, test, and packaging, and not necessarily in the source code itself.)
Developer Education and Oversight
As a set of developers, we were essentially responsible for our own legal due diligence. Every Interix developer understood the ramifications of the GPL. While überconservative lawyers at large software companies may well fret about the legal ambiguity of some clauses and spread FUD, the average developer can read and understand the license pretty quickly.
Every Interix developer knew that before using software covered by a new license in the product, they needed to get approval from management, which meant me and the core senior developers who had lived in this space for a while. We always took the most conservative interpretation. We had already seen what AT&T did to our friends at BSDI with a lawsuit. (This was also why we began with the 4.4BSD-Lite distribution of the source for much of our early utility base; it was the legally blessed clean version.)
I am not advocating that you skip getting legal advice. We did check licenses from time to time with our external counsel. But your developers need passing familiarity with the licenses and copyright anyway, so you can actually do a lot of the work from an architectural point of view prior to involving lawyers, to both expedite the conversation and save on legal costs. Few companies have the legal exposure of a Microsoft and require that level of legal due diligence before developers touch code. That's their problem.
Mortice Kern Systems was just up the road from us as well, and while we considered our product to be in a different space from theirs (we were about applications migration and they were the UNIX tools on Windows), we didn't want claims of source code infringement as many of us had worked there. We were religious in our source code control so as to be able to demonstrate code pedigree and cleanliness should the need arise.
We also had the Free Software Foundation keeping us honest. They would have loved for us to have mistakenly introduced software covered by the GPL into the Windows code base. Our survival as a start-up depended upon the Microsoft license. Once inside Microsoft after the acquisition, we were even more careful. And then you had the Microsoft legal team looking over our shoulders as well. We had email discussions twice with the FSF when we received notification that we hadn't yet published our changes. (There was always a latency in our making the code available while trying to get the product out the door.)
The FSF also maintains the inbound contribution assignments for gcc. I signed these assignments twice for the work of my compiler developer, once as vice-president, R&D at Softway Systems, and once as the product unit manager at Microsoft. (And you can imagine the legal oversight I had the second time from Microsoft Legal and Corporate Affairs.)
In Closing
So GPL code isn't the end of the world as we know it. You can get highly creative in how you consider its use in a technology architecture. The economic benefits of free and open source collaborative development are huge. Contributing back is essential.
The investment in developer understanding is necessary regardless of your use of free and open source software. (Consider all the other opportunities your developers have to bring in code from third parties and the "outside", regardless of open source.) Scanning tools can't replace developer education and an understanding of the goals. Frankly, it's just not that difficult to educate developers.
Indeed, collaborating on and developing open source software may be a far more efficient, less risky way of rapidly solving your customers' problems as you develop products.