Editor's Column

15 February 2001

An Interview with Wilfredo Sánchez, former lead developer, project Darwin (OS X)

In a previous column, Reviewing the Others: Darwin, I briefly examined Apple's ambitious Darwin project. The article was the first in an intermittent series on large open-source projects; the series as a whole is meant to provide insight for the OpenOffice.org community. The interview below provides a lead developer's perspective on the issue of Open Source.

Wilfredo Sánchez was, until recently, the lead developer on project Darwin, Apple's open-source core to its new operating system, OS X ("ten"). Fred still maintains his links to Darwin, where he is an active developer, but he is no longer with Apple. Currently, as he phrases it, he is "the open source program manager at KnowNow, Inc.," where he is "working on technology to enable what's called '2-way web'." Fred very kindly allowed some time out of his schedule to respond to some questions I asked via e-mail this past week. My questions are in bold italics; his responses are below, in regular font.

As a glance at your résumé indicates, you seem to have been involved as a developer and project lead in just about every important open-source project for the last ten years. What got you interested in Open Source? And, do those same issues that got you started still exist?

My interest dates back to college. I was a freshman at MIT (Massachusetts Institute of Technology) in 1990, and all I knew about computers was how to use a few programs on IBM PC clones running DOS, and before that the Atari 800 and TRS-80. I didn't know much about programming. So I get to MIT with my pile of PC floppies (the actual floppy kind) and they had these weird computers with no floppy drives (DEC VAX and IBM PC/RT machines), and I was totally lost.

What I learned rather quickly is that the software that the vast majority of the world uses isn't all there is. In fact, in a lot of regards, it's not even the best there is. And what I learned much later was that almost all of it was enabled via collaborative work between people who were mostly just interested in seeing what they could do with software. It wasn't open source in the modern sense, I guess, but it was pretty similar, and the roots of what open source is today go right back to those projects.

I had also met RMS (Richard Stallman) a couple of times (which he won't remember) and a lot of people who contributed regularly to the GNU project, which was born at MIT, and I've since had a lot of respect for what Richard and the rest of that community have done to create a body of software that can be shared and collaborated on without the restrictions imposed on the Unix community, which you and your friends had to pay a lot of money to participate in.

So I was really into the idea of being able to contribute to all of this activity that was going on, but not being much of a programmer, I mostly just used the software and thought it was cool. It took me a long time before I found a way to do something substantial. In fact, I didn't have occasion to contribute to open source in a meaningful way until I got to Apple in 1997.

When I started at Apple, the NeXT folks had just moved into the Apple campus, and we were working on the original Rhapsody Developer Release. The first thing I noticed was that we had a boatload of software in the system which I was familiar with, but was mostly out-of-date or incomplete.

I had used NeXTStep at the MIT Media Lab and again at Disney Online, and I thought it was cool that it had a Unix environment underneath, but every time you wanted to do some standard Unix thing, you had to tweak it a bit. So you'd see things like special-case exceptions in autoconf scripts for NeXTStep, where it's almost the same as any other (4.3) BSD system, but it did it just a little different, or was missing something. And those little tweaks add up after a while. We had inherited that software at Apple, and I thought I could help by cleaning that up some. So I'd get a newer GNU tar, or add tcsh, because every good Unix should have tcsh... and then BASH... and then Perl... and then Apache, and it kept on going like that. And I talked to some friends at MIT who hooked me up with NetBSD and I took on the challenge of actually trying to sync up the BSD userland.

What became obvious to me at that point was that we needed to maintain an ongoing relationship with the upstream providers of the software, so that we could push our changes up and pull down newer versions and avoid the problem NeXT ran into, which was basically due to diverging code, which makes it much harder to keep up with what's going on elsewhere. I wrote up a paper on this issue using Rhapsody as a case study and presented it at the FREENIX track during USENIX 1999. As a result of the work I was doing, I got more involved with many of the projects from which I was drawing code. And that's the long of how I got involved with open source. It took me the better part of the 1990's.

Your involvement has given you an extraordinary perspective on the movement, its main players, and the people who actually make it happen: the developers. How would you characterize the changes (if any) that have taken place among and for developers since your start?

Well, I think when I was still learning about it, there were those who worked on what was fundamentally proprietary (though still open in a significant sense) software like the BSD Unix work at Cal (University of California, Berkeley); and those who wanted to make software that wasn't just open, but freely shared with everyone without the worry of a centralized group that could basically take that away, like the GNU project. The GNU project was a really important effort, because free software wasn't the norm, and they were working very hard to ensure that there would be enough free code available that people could actually make use of it to do real work and solve real problems, and they used a special license to help ensure the freedom of the code. That software remains important today, of course, particularly as demonstrated by Linux.

But I think today we know something we didn't necessarily know then: the code isn't the critical component of "open source." The real magic is in the community that uses and develops the software. If you take a good idea, and you build a good community around it, you can end up with some excellent software, and from what I've seen, no person or company will be able to take that away. The thesis of the paper I wrote for FREENIX was basically that if there exists a good open source project, any company which takes the code from that project, extends it, and opts not to become a part of the community will, in the long run, end up with a product that is less compelling than what the community comes up with, because the Internet facilitates very large and diverse communities that can invariably out-pace what a single business can do, at least in those problem spaces that have become commoditized by open source projects.

The flip side of that is that open source projects can benefit greatly by getting companies to put their energies behind open software instead of inventing something in parallel, because those companies have a lot of resources to throw at problems of importance to them. The Apache Software Foundation is a marvelous example of this; we have seen truly significant effort put forth by companies like IBM, Sun and Covalent which have both broadened the problem space which our community excels at and greatly increased the quality and usefulness of the code. They have not only contributed software, but they fund ongoing full-time manpower to our projects, and while there have certainly been bumps in the road, I know that all parties involved--including, of course, our combined user base--are far better off as a result of this collaboration.

On the other hand, I think that some of the best assets in the open source space--the GNU stuff--are still of the mindset that the license is what protects the software. That's not so bad in and of itself, but the GNU license has what I think are legitimate problems for a person or company that isn't, for whatever reasons, able to accept those terms for software other than the original GNU code, and I don't think that excluding such companies is the best thing for an open community. That is, in this context, the GPL (in its current form) actually inhibits code sharing, instead of facilitating it.

Actually, it's more complicated than that: the GNU project is a proponent of their particular ideas about freedom in software, and is not so much about ensuring that the best software in the world is created in this collaborative form which is what I think open source is really about. So it's not accidental that things work out this way; I just think it's unfortunate, because my interest lies in sharing code and building communities. I also think a lot of GPL authors haven't thought this through. What I think I'm seeing now is a sort of "growing up" of the open source development model, and it's fascinating to see how the software industry can fit into it. It's an exciting time to be a software developer.

Open Source relies, famously, on the "community." But it's unclear who this community is, who comprises it, and how it is constituted. Can you give any insight into the open-source community? That is, into how the developer community forms, what are its key elements, and how it works together?

As I said earlier, I think the community is the fundamental component of an open-source project, which makes this a very good question, if not an easy one to answer. The way I think about it, a community is a group of people with some interest in a given problem space and a vision for how to tackle it and what to do with the result. They tend to rally around some code, because code sets some nice boundaries for the scope of that problem space, but the code might be tossed out and re-written or otherwise modified and extended to reflect the vision of the group, which evolves as the members discover new ideas, or the membership changes.

The successful communities I know about have some common characteristics: They tend to have a core group of some manageable size, which is generally comprised of the most active developers, but also of other people who are known to understand the problem space well. The core group and the code are what provide continuity and direction, and they roughly represent the vision of the whole group.

There are many different organizational models in the various open projects, and I'm not really an expert on which works best, since I've only ever tried to help create one. For Darwin, we had a pretty obvious core group to begin with, that being the tech leads in the Core OS team at Apple which is already responsible for the code; and the initial vision is clearly defined by the Mac OS product: Darwin's primary role is to support Mac OS going forward. That may seem too narrow, but I think the vision is a fine one; over 100,000 people bought Mac OS X Public Beta, and I imagine the upcoming release will have a sizeable installed base by year-end. That will mean a lot of people with some interest in seeing the system improve, and that's not a bad itch to scratch.

The interesting thing to see will be whether the Darwin core group will grow to include people outside of Apple, thereby allowing the vision to grow beyond the bounds of what Apple is doing with Mac OS. I don't just mean committers, which already exist, but whether Apple can let other community members on the outside of the company drive the vision as well.

The best example of a new community forming that I've seen is the Subversion project at CollabNet. Karl Fogel and a few other gurus in the area of source control got together and started designing something fundamentally better than the current standard, CVS (Concurrent Versions System). And they did a great job at it. The group was intentionally kept small initially by not making it very public, not to be secretive, but to make sure that the vision was well defined before too many cooks jumped into the kitchen. Since then I've seen all sorts of progress, some truly wonderful ideas, and I fully believe that when they are done, I'll be switching to Subversion and I'll never look back at CVS again.

There's an element of magic pixie dust in how that all came together which I don't fully understand, so I don't know that I could ever replicate it, but it was cool to watch, and I'm keeping an eye out for other examples so as to learn from them.

Another problem facing an open-source project is bringing in new members. How have you been able to bring in new members? What strategies do you use?

This goes back to the economics of open source, and developers are a finite resource. Communities are fed by need. If you can't draw members, it's probably because there isn't that much demand. And I don't think that growth is always a requirement. I don't know how many people maintain fetchmail today, but I don't suppose they really need 20 more developers working on it.

It can be tough for new projects, because there is a bootstrap problem: You have to have the code so you can have the users who depend on it. This feeds into your developer pool, which writes the code. So someone has to be the first developer to write enough code to churn the cycle the first time around and help other people jump in. But once that's done, a good project feeds itself.

Some people think that open source means you make a tarball and a reasonable license, put that up on an ftp server and you have open source. Perhaps that's enough for buzzword compliance, but that's completely false if you think the community is the important thing. You have to clean up the code (for readability, docs, the invariable naughty language, legal entanglements, internal code names, etc.). You have to let people know it's there. You have to facilitate the people who take enough interest to contribute patches, so they don't fall through the cracks.

So you need things like mail lists, CVS servers and whatnot; but much more important, it means an ongoing commitment to working with the community you form to help them help you make the code better, and that includes convincing your internal developers that they should spend some of their time helping people they don't know out on occasion. You can get some excellent infrastructure from SourceForge, but that alone doesn't create a community.

And that's the rub. It's not a trivial operation to open source a project. It takes real work and real people on an ongoing basis. People wonder why some company doesn't just open source the code they just shelved, and this is basically why. If you have no resources to put into it, it's not always a justifiable thing to do. Honest, it's not easy to do; I've talked to people at other companies as well to confirm this.

Previous columns

9 February 2001 Organizing Open Source

1 February 2001 Open Source and Its Culture

23 January 2001 Community Action

16 January 2001 Quo Vadis OpenOffice.org?

9 January 2001 The 613 build: problems and opportunities

3 January 2001 Sun's open door

E-mail: Louis at collab.net