30 April 2008

A Standards Primer

Picture of Sundials
Photo by Dauvit Alexander

I have recently had several long discussions about the motivations and machinations that surround the development of technology interoperability standards. Over the past few years, I've also captured a lot of ideas and experience on the blog. I pulled it all together into one place in the following paper, "Understanding Technology Standardization Efforts" (PDF 86.2K).

For the record, I was a long term participant in the POSIX and UNIX standardization efforts. I was a working group participant, balloted many pieces of the standards and their amendments, and participated in the management of the standards effort at the IEEE as both an inaugural member of the Project Management Committee and a voting member of the Sponsor Executive Committee. I was an international participant at ISO, as document editor, and participated on behalf of three different national body delegations (Canada, U.S., UK) over a number of years. I began my participation in 1989 as a customer (working for EDS with GM and the U.S. government as their primary POSIX-interested customers), but quickly ended up as a vendor, working for MKS developing a conforming POSIX.2 implementation that formed the basis of implementations from IBM, DEC, HP, UNISYS and Sun. In 1995, I put my money where my mouth was on the importance of applications portability, standards and the coming juggernaut of NT and co-founded Softway Systems, implementing the POSIX and UNIX standards on NT to enable UNIX applications to be directly migrated to the platform. A large amount of free and open source software was incorporated into the product. Softway Systems was acquired by Microsoft in 1999, and I worked there for five years. Over the years I've been in regular contact with people standardizing C#/CLI, the Linux Standards Base, and ODF.

Several friends and colleagues from the standards world have reviewed the paper and provided excellent comments. The paper is much better for it. All mistakes obviously remain my own.


28 April 2008

Microsoft Office 2007 and Open XML: Lies, Damned Lies, and Statistics

Last week Joe Wilcox (Microsoft Watch) observed that Microsoft Office 2007 apparently doesn't conform to the Open XML standard (ISO/IEC 29500) that Microsoft has rammed through the system. Alex Brown has the full test here. No surprise. I've argued for the past year that the product must have diverged from the standard under construction. It's a normal thing in the standards world as Joe and Alex observe. They each challenge Microsoft to declare itself with respect to the standard and the future of the product.

But here's the problem: Microsoft already has declared itself. Last August Microsoft commissioned a study from IDC on the adoption of document standards. The "study" names Office Open XML as the obvious favourite. "Among the XML-based document standards, Office Open XML seems to be creating the most traction in the market." In the PR push leading up to the September 2007 votes on ISO/IEC 29500, Microsoft was already equating the standard with Microsoft Office 2007. That's what the sales field will be telling customers, with graphs culled from the "report". [srw — If you really want to read the report, follow the link from Mary Jo Foley's editorial. I still refuse to give the paid report link cred, small as it may be.]

Here's more writing on the ISO adoption and next steps:
Microsoft Claims Success with ISO and Open XML Standard


01 April 2008

Microsoft Claims Success with ISO and Open XML Standard

Picture of partially built Railroad
Copyright © 2007 by Kordite

"Another key factor is the fact that people recognize the broad use of Open XML in the market as seen by the hundreds of independent implementations of Ecma 376." [Jason Matusow, Microsoft Director of Standards]

Think of the confusion if we only partially implemented the HTML standard. Okay — bad example. What if we only partially implemented a railroad standard? The track gauge would be correct, but the rail width was incorrect, or there was only one rail? Or maybe the track stopped before reaching its destination. Microsoft continues to maintain the Rovian perspective that a standard with "support" (their language is improving to "implementations") rather than complete conformance is good news for the industry. In this particular case it even ignores the very conformance statement in their own standard. It's only good news for Microsoft. It means lots of people are encouraged to do partial things around documents produced by Microsoft Office 2008. The economics is in the vendor's favour, not the consumer's. It defeats the actual purpose of de jure standardization. [In the industry, we call it a vendor specification regardless of standards body imprimatur.]

We now enter the next phase of the dance. Customers will discover they don't get the benefits that they thought they bought. A customer of note [likely government] or a consortium will put together a conformance certification program around the standards in the space. Brands and certifications will be the rule of the day. Microsoft will discover it needs to actually ensure their own products adhere [formally] to the standards they produced. The Microsoft Office team will discover that conformance testing to a specification is (i) hard work and (ii) different from normal product testing, and (iii) that their product is drifting off the very standard they launched. (The .NET runtime team learned this a few years ago and I'm betting there are still conformance bugs logged against the product as "won't fix".) Implementation conformance will become important.

"You keep using that word. I do not think it means what you think it means." — Inigo Montoya, in the Princess Bride

Other writing I've done in this space:


25 February 2008

The OOXML Ballot Resolution

[Update (2008-2-25 13:45): There's an excellent press release from ISO that outlines exact history and next steps and requirements for this ballot.]

I have long maintained that technology standardization is commercial diplomacy and the purpose of individual participants (as with all diplomats) is to expand one's area of economic influence while defending sovereign territory. This week a lot of people are gathering in Geneva for the ISO ballot resolution meeting for Office Open XML (OOXML), Microsoft's Office product specification. The debate no doubt will be contentious.

Microsoft had a perfect opportunity to participate in the Open Document Format (ODF) standard's development at OASIS. They ignored that opportunity. The best time for technology standardization arises when a problem space is well understood, with sufficient real implementation knowledge to discern what works and what doesn't. Microsoft had arguably the best experience to contribute. They chose not to participate. Standardized document formats with multiple product implementations posed a threat to their Office business.

That threat became real when the Commonwealth of Massachusetts chose ODF as a basis for product procurement to best serve its citizens. Microsoft's response was not to adopt the ODF standard that already existed with multiple implementations (and continues to act as a hub for alignment with other international work like China's UOF standard), but to rush their own product specification into the standardization process.

Over the two-year process, they have done a remarkable amount of work to bring the specification through ECMA to ISO, and have made great gestures to enable others to support the Microsoft specification.

But there's a problem.

Microsoft is an adjudicated monopoly in the United States. The EU continues to investigate possible abuse of their market dominance. (Market leadership and innovation are not what's being punished, but rather the abuse of a dominant position.) Microsoft can complain all they want, but the practices that enabled their success continue to plague them. We cannot collectively rewrite history. Microsoft is indeed held to a different measure. They have forfeited some of the freedoms that other companies enjoy. In many ways, they have lost our trust.

One cannot judge Microsoft's newly declared preference for "openness" by the work they've done promoting their own product specification, but rather by their continued refusal to adopt ODF. In the end, OOXML as an ISO standard (with its attendant market confusion) will best serve the needs of Microsoft over its customers, and that's a shame.

Andy Updegrove has an excellent essay on his blog as we go into this week's ballot resolution deliberations. He takes a different approach. In it he argues that a particular class of standards should be held to a higher bar for acceptance, because they enable fundamental technology access in the world going forward. He makes a compelling case for why OOXML should be flunked out of the ISO process.

This promises to be a fascinating week.


18 January 2008

GOSCON Discussion on Open Document Formats

Deb Bryant is blogging, which is great news. Deb is of course the creator and executive director of the Government Open Source Conference (GOSCON) that is held each year in Portland, OR.

The closing session of last Fall's conference was an executive panel on open document formats that included representatives from Sun, Microsoft, IBM, and Adobe. Deb's latest post points to the video of the panel, as well as the ongoing GOSCON forum discussion between the panellists. If you're interested in either the open document standards debate or government involvement in free and open source software, I would encourage you to have a read.

GOSCON Open Document Format Panel


11 September 2007

IBM Joins OpenOffice.org (The Quick Analysis)


It's official: IBM has joined the OpenOffice.org project. [There's good reporting and analysis from Andy Updegrove and Redmonk's Stephen O'Grady. Update (12 Sep): Here's Andy's interview with IBM's Doug Heintzman, Director of Strategy for the Lotus division.]

Here's the back-of-the-envelope analysis.

From the OpenOffice.org community perspective, I'm guessing Louis Suarez-Potts (OO.o Community Manager) is feeling good to get a new injection of code/energy.  This is great for the community.  The OpenOffice suite keeps getting better and better, but new blood with new code could provide a much needed boost.

Overall Sun Microsystems is probably [very] happy IBM is supporting OpenOffice.org directly.  This is a much better situation than IBM building some form of ODF development platform inside Eclipse.org to enable ODF over OOXML, with OpenOffice.org hit as collateral damage.  [This would be sort of ironic since Eclipse helped to pull the Java centre-of-gravity away from Sun, and Visual Studio was collateral damage (or icing depending upon one's perspective).]  Collaboration is the much stronger market play here for Sun and IBM, and most importantly OO.o users and customers.

From the IBM perspective, this is brilliant business as usual.  ODF is the global leverage they need to crack open the Microsoft Office marketplace.  (I've written ad nauseam that ODF and Microsoft Office is just another example of Christensen economics in motion.  Microsoft has over-delivered on Office.  They mistakenly think more innovation faster is the answer.  Let the chips fall where they may.)  IBM will likely use OpenOffice to front-end Lotus and the Domino server product lines, and anchor their business messages to their customers' needs around standards and open source software, much the same as they do with Eclipse and the WebSphere developer world.  Their claims are that much stronger with this announcement.

Sun gave Gnome a huge leg up about four years ago when they contributed a wealth of their accessibility technology R+D.  IBM will now contribute the same into OpenOffice.org.  It means they can easily manage their way through U.S. government procurement regulation in this space.  Once again brilliant IP management from IBM, and good for OO.o users and customers.  [For those that have heard me present, this is exactly what I mean about having a mature intellectual asset strategy, and being generous exactly in order to play to win.]

A strengthened OpenOffice.org will help Novell immeasurably to keep their distance with Microsoft on the desktop.  Novell has done a lot of work with OO.o in the past.  They have a great desktop Linux product.  They can simply take a ride on this one and eat the benefits.  There's really nothing Microsoft can say here.  Regardless of any agreements around OOXML that Novell may have with Microsoft, Novell comes out clean on the ODF front as customers demand it.

I noticed the press release includes a quote from Beijing's Redflag Chinese 2000 Software Co., Ltd., the makers of Redflag Linux and RedOffice.  This is significant.  Apparently last November I was one of the first people to blog about the document format work in China that led to a Chinese national standard (UOF).  Redflag Chinese 2000 was implementing UOF in RedOffice (the Chinese packaging of OO.o).  There is work afoot to harmonize ODF and UOF.  And clearly Redflag Chinese 2000 remains committed to the OO.o effort.

So despite the bluff and bluster, the OOXML camp inside Microsoft should not be sleeping well at this point. 

"Don't blink.  Blink and you're dead.  Don't turn your back.  Don't look away.  And don't blink.  Good luck!" -- the Doctor


31 August 2007

Office Open XML Conformance (A Lesson in Claiming Standards Conformance)

So I'm a naive user of technology.  (No.  Really I am.  Ask anyone that's worked with me.) I am definitely not an expert in modern XML document standards.  (I have actually hacked troff escapes in a document production chain to insert commands in the PostScript output stream that would be recognized by the PDF generator to produce a hyper-linked document, so I know a little bit about the concepts involved, but that was also 10 years ago now.)

I have been a [marvelously happy] Mac user for the past two years.  That means I already have iWork 2008 loaded with the new improved Pages '08 (the Apple word processor).  On the Apple web site, if I search for "office open xml" then I end up on this page (31 Aug, 2007), which tells me all about Pages '08:

Widely compatible.

Pages ‘08 supports industry-standard formats, so you can easily open documents created in other word processing applications and share documents with others. Whether they’re using a Mac or a PC.

Pages

Open for business.

Import your Microsoft Word documents into Pages ’08 with ease. Whether they’re Microsoft Office 2007 (Office Open XML) or earlier Word files, Pages will open them. Pages imports not only the text, but also the styles, tables, inline and floating objects, charts, footnotes, endnotes, bookmarks, hyperlinks, lists, sections, change tracking, and other elements of your original Word document.

COOL!  I'm in!  This is awesome.  I want to see how well I can read interesting docx files.  As it happens, ECMA International makes the Office Open XML standard available as both PDF and as docx files.  Clever — it's a document format standard see, and so they've provided it in its own format.  Perfect.   

So I download the .docx version of ECMA-376.  All 5 parts of it.  And I open "Part 1 - Fundamentals" and immediately get told some warnings occurred:   

Screenshot of the OOXML file warnings

I choose to review and get:

Screenshot of the OOXML warnings

The file mostly looks good, but not quite as clean as the PDF image with the other font (Consolas?).  And clicking on the first warning (about the unsupported field) gives me NO additional information to understand what/where the error might be.  Now this is what we in the standards industry call "a quality of implementation issue".  Clearly Apple has not done a good job.   Get used to hearing this phrase a lot in the press — I'm predicting Microsoft will be forced to apply it liberally to their partners that helped them win votes and helped with the marketing message. 

Then I notice the paging problem.  I have no idea why, but there seems to be page drift between the PDF and .docx versions.  [More on THAT little problem in a minute.]  The paging problem does NOT mean there's necessarily a problem with the standard itself but rather the document production machine ECMA was using — we don't know what the definitive source and tool chain was that produced the PDF.  (Serious document production is the same as serious software production, something most word processor users fortunately don't get to experience.) Oh, and there are line numbers in the PDF that don't appear in the .docx as opened by Pages '08. 

Ignoring the document version skew problem, I decide to see what happens when I throw an even bigger docx file at Pages '08.  So I open "Part 3 - Primer"  and ...

Screenshot of the Part 3 warnings

A few more "warnings" to deal with here. More missing font problems.  Things were "removed".  No helpful information as to what or how. 

I asked a friend with Office 2007 to download and open the two .docx files.  You guessed it — no warnings.  So we're now on the slippery slope.  Apparently I can create files in Office 2007 that Microsoft marketing claims are "standard" Office Open XML that may (or may not) use proprietary extensions.  Or maybe Apple did a really bad job.  How would a government customer interested in preserving documents know?  But it gets worse.  The Office 2007 pagination perfectly matched the PDF version.  And there are line numbers in the Office 2007 version just like the PDF version. 

Ooops.

I'm betting the average business or government office person saving a file won't think twice about it.  You see, Office 2007 gives you no way to save something as "strict" Office Open XML.  Not merely not by default, but not at all.  Microsoft's definition of "Office Open XML" appears to be .docx itself.

Indeed, even Apple's Pages '08 will only EXPORT to old Microsoft Office format (.doc) and not the standard Office Open XML (.docx) format.  So I appear to have no way to generate an OOXML file from Pages '08.  [Yes, yes, yes.  Microsoft will again point out it's a quality of implementation problem.  Or they'll point out that Pages '08 is a "consumer" of OOXML only, which is allowed by the standard.  I get it.  It's not Microsoft's fault.  I'm beginning, however, to wonder at the quality of implementation on the Novell platform.  There's a business partnership under duress.]

So as an adjudicated monopoly of desktop operating systems, supplying an office productivity suite with 95+% market share, they will be able to claim instant victory for the adoption of their international standard because .docx files equal Office Open XML standard files.  Oh, wait — that's what was essentially done in the IDC study published this week that was "sponsored by Microsoft".

[Now we're about to get a wee bit tedious and exact as standards wonks are prone to be.  I'm going to try to explain the conformance game.  It can be subtle.  Apologies in advance for perhaps getting too ... well boring.  If you're not interested in standards mechanics, you can safely stop reading.]

So OOXML defines a couple of types of conformance.  There is Document Conformance, and Application Conformance.  And conforming applications can be producers (i.e. OOXML document writers) or consumers (i.e. OOXML document readers) or both.  Here's the text from the standard [Part 1, PDF edition, p. 3, lines 8-30]:

2.3 What this Standard Specifies
To address the issues listed above, this Standard constrains both syntax and semantics, but it is not intended to predefine application behavior. Therefore, it includes, among others, the following three types of information:

  1. Schemas and an associated validation procedure for validating document syntax against those schemas.  (The validation procedure includes un-zipping, locating files, processing the extensibility elements and attributes, and XML Schema validation.)
  2. Additional syntax constraints in written form, wherever these constraints cannot feasibly be expressed in the schema language.
  3. Descriptions of element semantics. The semantics of an element refers to its intended interpretation by a human being.

2.4 Document Conformance
Document conformance is purely syntactic; it involves only Items 1 and 2 in §2.3 above.

  • A conforming document shall conform to the schema (Item 1) and any additional syntax constraints (Item 2).
  • The document character set shall conform to the Unicode Standard and ISO/IEC 10646-1, with either the UTF-8 or UTF-16 encoding form, as required by the XML 1.0 standard.
  • Any XML element or attribute not explicitly included in this Standard shall use the extensibility mechanisms described by Parts 4 and 5 of this Standard.

2.5 Application Conformance
Application conformance is purely syntactic; it also involves only Items 1 and 2 in §2.3 above.

  • A conforming consumer shall not reject any conforming documents of the document type (§4) expected by that application.
  • A conforming producer shall be able to produce conforming documents.

This is the traditional way things are done with programming language standards as well.  The concept of a strictly conforming C-language program is defined in the ISO/ANSI C standard so as then to define conformance of an actual implementation (i.e. C-language compilers).  In the OOXML standard, document conformance exists to be able to talk about implementation conformance, i.e. what readers/writers need to produce or accept if they conform to the standard.
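To make the dependency concrete, here is a minimal Python sketch of how application conformance can be phrased in terms of document conformance.  Everything in it is an assumption for illustration: the "documents" are toy strings, `toy_is_conforming` stands in for the real schema-plus-constraints check, and the two consumers are hypothetical.  It is not a test suite for ECMA-376.

```python
from typing import Callable, Iterable


def consumer_conforms(accepts: Callable[[str], bool],
                      conforming_docs: Iterable[str]) -> bool:
    """A consumer conforms only if it rejects none of the conforming documents."""
    return all(accepts(doc) for doc in conforming_docs)


def producer_conforms(produce: Callable[[], str],
                      is_conforming: Callable[[str], bool]) -> bool:
    """A producer conforms if it is able to produce a conforming document."""
    return is_conforming(produce())


def toy_is_conforming(doc: str) -> bool:
    # Stand-in for the real document conformance check (hypothetical rule).
    return doc.startswith("<doc>") and doc.endswith("</doc>")


def tolerant_consumer(doc: str) -> bool:
    # Accepts anything that passes the document conformance check.
    return toy_is_conforming(doc)


def picky_consumer(doc: str) -> bool:
    # Accepts only conforming documents that also carry a vendor marker.
    return toy_is_conforming(doc) and "vendor" in doc


corpus = ["<doc>plain</doc>", "<doc>vendor extras</doc>"]
print(consumer_conforms(tolerant_consumer, corpus))
print(consumer_conforms(picky_consumer, corpus))
print(producer_conforms(lambda: "<doc>hello</doc>", toy_is_conforming))
```

The point of the sketch is only that the consumer and producer rules cannot be evaluated until someone has pinned down which documents conform.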

For completeness' sake, the "document type" reference in 2.5 above is described in section §4 as [Part 1, PDF Edition, p. 6, lines 16-26]:

document type — One of the three types of Office Open XML documents: Wordprocessing, Spreadsheet, and Presentation, defined as follows:

  • A document whose package-relationship item contains a relationship to a Main Document part (§11.3.10) is a document of type Wordprocessing.
  • A document whose package-relationship item contains a relationship to a Workbook part (§12.3.23) is a document of type Spreadsheet.
  • A document whose package-relationship item contains a relationship to a Presentation part (§13.3.6) is a document of type Presentation.

An Office Open XML document can contain one or more embedded Office Open XML packages (§15.2.10) with each embedded package having any of the three document types. However, the presence of these embedded packages does not change the type of the document.
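The document-type rule quoted above can be sketched in a few lines of Python.  This is a rough illustration under stated assumptions, not a conformance check: it assumes the package is a ZIP file, that the namespace and content-type strings below match those published with ECMA-376, and that the main part is declared with an `Override` entry in `[Content_Types].xml`.  The helper `make_package` is hypothetical and only builds a toy package for the demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
OFFICE_DOC_REL = ("http://schemas.openxmlformats.org/officeDocument/2006/"
                  "relationships/officeDocument")
PREFIX = "application/vnd.openxmlformats-officedocument."
MAIN_PART_TYPES = {
    PREFIX + "wordprocessingml.document.main+xml": "Wordprocessing",
    PREFIX + "spreadsheetml.sheet.main+xml": "Spreadsheet",
    PREFIX + "presentationml.presentation.main+xml": "Presentation",
}


def document_type(package: bytes):
    """Return the document type named by the package-relationship item, or None."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        rels = ET.fromstring(pkg.read("_rels/.rels"))
        types = ET.fromstring(pkg.read("[Content_Types].xml"))
    # Find the relationship that points at the main part of the package.
    target = None
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOC_REL:
            target = "/" + rel.get("Target").lstrip("/")
            break
    if target is None:
        return None
    # The content type of that part decides the document type.
    for override in types.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == target:
            return MAIN_PART_TYPES.get(override.get("ContentType"))
    return None


def make_package(part_name: str, content_type: str) -> bytes:
    """Build a toy package with a single main part (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr(
            "[Content_Types].xml",
            f'<Types xmlns="{CT_NS}"><Override PartName="/{part_name}" '
            f'ContentType="{content_type}"/></Types>')
        pkg.writestr(
            "_rels/.rels",
            f'<Relationships xmlns="{REL_NS}"><Relationship Id="rId1" '
            f'Type="{OFFICE_DOC_REL}" Target="{part_name}"/></Relationships>')
        pkg.writestr(part_name, "<placeholder/>")
    return buf.getvalue()


toy = make_package("xl/workbook.xml", PREFIX + "spreadsheetml.sheet.main+xml")
print(document_type(toy))
```

Embedded packages are ignored here, which is consistent with the rule that they do not change the type of the containing document.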

Now there is no statement of conformance to Office Open XML on the Apple web site beyond the above statement of "support".  A search in Pages '08 Help for "office open xml" finds no reference at all.  So Apple appears not to actually claim conformance to the OOXML standard anywhere.  They simply "support" it.  So they're not really guilty of not reading a conforming OOXML document. 

But the Microsoft standards and marketing machines are claiming "support" for their standard with the assured tones that "support" = "conformance".  Aside from the successful "adoption" claims in the aforementioned IDC report (where Office 2007 market share apparently equates to Office Open XML adoption) we have Tom Robertson (Microsoft General Manager of Standards and Interoperability) "citing support in products from Novell, Corel, Apple and others."    Disingenuous at best.

Jason Matusow points out on his blog:

A real litmus test for the viability of the ISO/IEC DIS (draft international standard) 29500 (Open XML) is whether or not there are independent implementations. The answer to this question for Open XML is an unequivocal yes. There are independent Open XML implementations based on the existing specification in applications that run on Linux, Mac, Palm OS, iPhone, and Windows. 

Again note the complete lack of reference to actual conformance per the definitions in the standard they have driven through the process.  These are the people that are responsible for standards management and messaging at Microsoft.  They are by definition the folks that should be defending the strict conformance of the standards in which they participate, and not merely suggesting that partial implementations are a "great start".

So where does this leave the government customer that thought they were buying an open document format for document exchange and interop?   It is indeed finally time to roll out the certification machine — for everybody.  Let the games continue.


30 August 2007

Microsoft's Failures with the OOXML Standard

The ISO "fast track" vote on approval of Microsoft's OOXML document specification happens next Monday (2 Sep.), and news is breaking fast and furious as various countries report out early.  An interesting bit of technical experimentation was published in the past week in the shadow of the vote.  It shows something more pragmatically damning than all of Rob Weir's hard work digging through faults in the OOXML specification. 

The work is published as a set of experiments.  The first experiment takes a trivial spreadsheet created with Microsoft Excel 2007.  The next step is to unpack the Excel-generated "standard OOXML" and make a trivial edit to it and repack it.  Excel then complains violently about the result (i.e. to the point of not reading the document).
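For readers who want to try the same kind of round trip, the unpack, edit and repack steps can be sketched in Python.  This is a minimal sketch under assumptions: the post does not say which tools or which edit the experimenters used, `make_toy_package` is a hypothetical stand-in for a real Excel file, and the part name and cell markup are illustrative only.

```python
import io
import zipfile


def edit_part(package: bytes, part_name: str, old: bytes, new: bytes) -> bytes:
    """Return a copy of a ZIP package with one byte string replaced in one part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == part_name:
                    # The "trivial edit": replace the first occurrence only.
                    data = data.replace(old, new, 1)
                dst.writestr(info.filename, data)
    return out.getvalue()


def make_toy_package() -> bytes:
    """Build a tiny ZIP that merely resembles a spreadsheet package (hypothetical)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("xl/worksheets/sheet1.xml",
                     "<worksheet><c><v>1</v></c></worksheet>")
        pkg.writestr("xl/workbook.xml", "<workbook/>")
    return buf.getvalue()


edited = edit_part(make_toy_package(), "xl/worksheets/sheet1.xml",
                   b"<v>1</v>", b"<v>2</v>")
with zipfile.ZipFile(io.BytesIO(edited)) as pkg:
    print(pkg.read("xl/worksheets/sheet1.xml").decode())
```

With a real file you would read the bytes from disk, apply the edit, write the result back out and then try to open it in the application under test.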

This is a catastrophic failure on two fronts for Microsoft's "standard": 

  • It means Microsoft Office DOESN'T ACCEPT WELL FORMED OOXML documents not produced by Microsoft Office.  This would be very bad for all those customers that believe the marketing and think they're buying a product implementing a "standard" that will [someday maybe] be supported by multiple implementations.
  • It means the Microsoft OOXML specification has sufficient problems in what it leaves unsaid that implementors can't implement it.  (Changing a one character field in human readable XML with an editor is about as basic as it gets for implementation.)

Game over.  A standard that can't be implemented is WORSE THAN USELESS.  It really demonstrates that this standard they rammed through ECMA is nothing more than a vendor's product specification.   The rest of the experiments are equally telling in terms of the apparent use of product features outside the specification, etc.

This is not to say that Microsoft won't keep the steam roller crashing right along.  From Mary Jo Foley this week we see that Microsoft is still "buying research" from IDC in a study declaring great uptake for OOXML.  [I'm not linking to it because frankly it deserves to be buried.  I won't even give it the small link credit this blog engenders.]  While the IDC folks try to avoid complicity in the biased survey by openly declaring under the title that it was "Sponsored by Microsoft", we know the Microsoft marketing machine will happily be using the numbers and graphs in slides, attributing it all to an IDC study and failing to mention they paid for the "opinions".  Too bad for the credibility of the IDC researchers when this one comes home to roost. 

The incredible adoption for "OOXML" is likely a measure of market share on Office 2007 since it would be the only product claiming to implement the standard.  The way the "study" keeps conflating PDF with ODF/OOXML is broken as well.  So WHEN customers discover the truth of the implementation difficulties and continued lock-in of their document world, they'll respond the way customers always do when a vendor pulls a fast one on them.

Perhaps the most disheartening failure however is the way Microsoft has abused the machine that is the world's standards development organizations.  Andy Updegrove has a brilliant if painful read on the damage that Microsoft has caused in its "race to win."  Standards are needed and important.  In the days of film cameras, you wouldn't have been able to grab a roll of 35mm ASA 400 film from any store anywhere on the planet (produced by multiple different companies with their own value added services and technologies) and have it just work in your camera without such standards from such organizations. 

Microsoft is pushing organizational rules to the breaking point, ballot stuffing and buying its way to a win at ISO.  They have complained from the beginning of the process that they're doing nothing different than the likes of their competitors who are "aligned against them."  But this shows as much ignorance and naivete as the quality and implementability of their specification.  Companies like IBM and Sun have invested deep and long and globally in their standards participation activities.  They don't need to stuff the ballot box in a sudden end game -- they're invested in the long term infrastructure itself.   They absolutely play to win and sometimes surprise one in their creative use of the rules and organizational structures.  But they win because of a long term commitment to the machine itself, not to winning a particular battle at all costs. 

Standards at Microsoft has been run by lawyers instead of technology standards practitioners for too long.  It's been demonstrated time and again from the very beginning of the ODF working group formation in a succession of tactical and strategic failures by Microsoft.  To those lawyers that have never actually participated in standards working groups, or the standards management process, or implemented to the specification, or designed a certification process to demonstrate conformance, this is just one more battle to win and hang the consequences.  (Indeed, if we can "win" we can use it as precedent to "win" again!)

The sad part is that even if the ISO vote actually goes in Microsoft's favour, it still won't matter.  It buys them a few years of market ignorance at best.  This entire two-year event is one for the standards textbooks on how not to respond to a business-threatening standard.  In the end, Microsoft will need to implement ODF natively.  They don't know it yet, nor do they understand why, but it is just a matter of time.


07 June 2007

Openness, ODF, and OOXML from Sam Hiser

Sam Hiser has put together a comprehensive essay comparing the Open Document Format (ODF) standard and Microsoft's Office Open XML (OOXML) standard.  Sam is vice-president and director of business affairs for the OpenDocument Foundation, so has an obvious bias.  It is, however, a timely and very useful document considering the amount of government lobbying Microsoft is doing.  Sam compares the two standards based on:

  • Specification development practices
  • Usefulness of the specification from an implementor's perspective
  • Clarity of IPR
  • Breadth of implementations (many ODF implementations on many platforms versus one partial OOXML implementation)

The document is an excellent summary.


14 February 2007

Microsoft Whining for Sympathy about OOXML

So what's the real message with Microsoft's open letter whining about IBM and ISO and the Microsoft driven OOXML standard? 

  • IBM is outperforming us?
  • We need sympathy after our record quarter? 
  • People should ignore our anti-ODF lobbying in Massachusetts because that was just business as usual? 

This is professionally embarrassing.  It is certainly not the company for which I used to work.  (CNet article is here.)

The Gapingvoid Blue Monster