31 August 2007
Office Open XML Conformance (A Lesson in Claiming Standards Conformance)
So I'm a naive user of technology. (No. Really I am. Ask anyone that's worked with me.) I am definitely not an expert in modern XML document standards. (I have actually hacked troff escapes in a document production chain to insert commands in the PostScript output stream that would be recognized by the PDF generator to produce a hyper-linked document, so I know a little bit about the concepts involved, but that was also 10 years ago now.)
I have been a [marvelously happy] Mac user for the past two years. That means I already have iWork 2008 loaded with the new improved Pages '08 (the Apple word processor). On the Apple web site, if I search for "office open xml" then I end up on this page (31 Aug, 2007), which tells me all about Pages '08:
Widely compatible.
Pages ‘08 supports industry-standard formats, so you can easily open documents created in other word processing applications and share documents with others. Whether they’re using a Mac or a PC.
Open for business.
Import your Microsoft Word documents into Pages ’08 with ease. Whether they’re Microsoft Office 2007 (Office Open XML) or earlier Word files, Pages will open them. Pages imports not only the text, but also the styles, tables, inline and floating objects, charts, footnotes, endnotes, bookmarks, hyperlinks, lists, sections, change tracking, and other elements of your original Word document.
COOL! I'm in! This is awesome. I want to see how well I can read interesting docx files. As it happens, ECMA International makes the Office Open XML standard available as both PDF and as docx files. Clever — it's a document format standard see, and so they've provided it in its own format. Perfect.
So I download the .docx version of ECMA-376. All 5 parts of it. And I open "Part 1 - Fundamentals" and immediately get told some warnings occurred:
I choose to review and get:
The file mostly looks good, but not quite as clean as the PDF image with the other font (Consolas?). And clicking on the first warning (about the unsupported field) gives me NO additional information to understand what/where the error might be. Now this is what we in the standards industry call "a quality of implementation issue". Clearly Apple has not done a good job. Get used to hearing this phrase a lot in the press — I'm predicting Microsoft will be forced to apply it liberally to their partners that helped them win votes and helped with the marketing message.
Then I notice the paging problem. I have no idea why, but there seems to be page drift between the PDF and .docx versions. [More on THAT little problem in a minute.] The paging problem does NOT mean there's necessarily a problem with the standard itself but rather the document production machine ECMA was using — we don't know what the definitive source and tool chain was that produced the PDF. (Serious document production is the same as serious software production, something most word processor users fortunately don't get to experience.) Oh, and there are line numbers in the PDF that don't appear in the .docx as opened by Pages '08.
Ignoring the document version skew problem, I decide to see what happens when I throw an even bigger docx file at Pages '08. So I open "Part 3 - Primer" and ...
A few more "warnings" to deal with here. More missing font problems. Things were "removed". No helpful information as to what or how.
I asked a friend with Office 2007 to download and open the two .docx files. You guessed it — no warnings. So we're now on the slippery slope. Apparently I can create files in Office 2007 that Microsoft marketing claims are "standard" Office Open XML that may (or may not) use proprietary extensions. Or maybe Apple did a really bad job. How would a government customer interested in preserving documents know? But it gets worse. The Office 2007 pagination perfectly matched the PDF version. And there are line numbers in the Office 2007 version just like the PDF version.
Ooops.
I'm betting the average business or government office person saving a file won't think twice about it. You see, Office 2007 gives you no way to save something as "strict" Office Open XML. Not merely not by default, but not at all. Microsoft's definition of "Office Open XML" appears to be .docx itself.
Indeed, even Apple's Pages '08 will only EXPORT to old Microsoft Office format (.doc) and not the standard Office Open XML (.docx) format. So I appear to have no way to generate an OOXML file from Pages '08. [Yes, yes, yes: Microsoft will again point out it's a quality of implementation problem. Or they'll point out that Pages '08 is a "consumer" of OOXML only, which is allowed by the standard. I get it. It's not Microsoft's fault. I'm beginning, however, to wonder at the quality of implementation on the Novell platform. There's a business partnership under duress.]
So as an adjudicated monopoly of desktop operating systems, supplying an office productivity suite with 95+% market share, they will be able to claim instant victory for the adoption of their international standard because .docx files equal Office Open XML standard files. Oh, wait — that's what was essentially done in the IDC study published this week that was "sponsored by Microsoft".
[Now we're about to get a wee bit tedious and exact as standards wonks are prone to be. I'm going to try to explain the conformance game. It can be subtle. Apologies in advance for perhaps getting too ... well boring. If you're not interested in standards mechanics, you can safely stop reading.]
So OOXML defines a couple of types of conformance. There is Document Conformance, and Application Conformance. And conforming applications can be producers (i.e. OOXML document writers) or consumers (i.e. OOXML document readers) or both. Here's the text from the standard [Part 1, PDF edition, p. 3, lines 8-30]:
2.3 What this Standard Specifies
To address the issues listed above, this Standard constrains both syntax and semantics, but it is not intended to predefine application behavior. Therefore, it includes, among others, the following three types of information:
1. Schemas and an associated validation procedure for validating document syntax against those schemas. (The validation procedure includes un-zipping, locating files, processing the extensibility elements and attributes, and XML Schema validation.)
2. Additional syntax constraints in written form, wherever these constraints cannot feasibly be expressed in the schema language.
3. Descriptions of element semantics. The semantics of an element refers to its intended interpretation by a human being.
2.4 Document Conformance
Document conformance is purely syntactic; it involves only Items 1 and 2 in §2.3 above.
- A conforming document shall conform to the schema (Item 1) and any additional syntax constraints (Item 2).
- The document character set shall conform to the Unicode Standard and ISO/IEC 10646-1, with either the UTF-8 or UTF-16 encoding form, as required by the XML 1.0 standard.
- Any XML element or attribute not explicitly included in this Standard shall use the extensibility mechanisms described by Parts 4 and 5 of this Standard.
2.5 Application Conformance
Application conformance is purely syntactic; it also involves only Items 1 and 2 in §2.3 above.
- A conforming consumer shall not reject any conforming documents of the document type (§4) expected by that application.
- A conforming producer shall be able to produce conforming documents.
This is the traditional way things are done with programming language standards as well. The ISO/ANSI C standard defines the concept of a strictly conforming C-language program in order to then define conformance of an actual implementation (i.e. a C-language compiler). In the OOXML standard, document conformance exists to be able to talk about implementation conformance, i.e. what readers/writers need to produce or accept if they conform to the standard.
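To make the first steps of the quoted validation procedure concrete, here is a minimal Python sketch (an illustration, not anything taken from the standard). It un-zips a package, locates the XML parts, and checks only that each one parses. It assumes parts can be picked out by a `.xml` or `.rels` file extension, and the function name `malformed_xml_parts` is hypothetical. It does no XML Schema validation and no processing of the extensibility elements, so a clean result says nothing about document conformance as defined in 2.4.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def malformed_xml_parts(package):
    """Return {part name: parser error} for XML parts that fail to parse.

    `package` is a file path or a binary file-like object.
    Assumption: XML parts are recognised by file extension only.
    """
    problems = {}
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            # Skip images and other binary parts.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(zf.read(name))
            except ET.ParseError as exc:
                problems[name] = str(exc)
    return problems
```

An empty dictionary means only that every XML part is well formed. Schema validation would need an XSD-capable library and the schemas published with the standard, which this sketch leaves out.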
For completeness' sake, the "document type" reference in 2.5 above is described in §4 as [Part 1, PDF Edition, p. 6, lines 16-26]:
document type — One of the three types of Office Open XML documents: Wordprocessing, Spreadsheet, and Presentation, defined as follows:
- A document whose package-relationship item contains a relationship to a Main Document part (§11.3.10) is a document of type Wordprocessing.
- A document whose package-relationship item contains a relationship to a Workbook part (§12.3.23) is a document of type Spreadsheet.
- A document whose package-relationship item contains a relationship to a Presentation part (§13.3.6) is a document of type Presentation.
An Office Open XML document can contain one or more embedded Office Open XML packages (§15.2.10) with each embedded package having any of the three document types. However, the presence of these embedded packages does not change the type of the document.
Now there is no statement of conformance to Office Open XML on the Apple web site beyond the above statement of "support". A search in Pages '08 Help for "office open xml" finds no reference at all. So Apple appears not to actually claim conformance to the OOXML standard anywhere. They simply "support" it. So they're not really guilty of not reading a conforming OOXML document.
But the Microsoft standards and marketing machines are claiming "support" for their standard with the assured tones that "support" = "conformance". Aside from the successful "adoption" claims in the aforementioned IDC report (where Office 2007 market share apparently equates to Office Open XML adoption) we have Tom Robertson (Microsoft General Manager of Standards and Interoperability) "citing support in products from Novell, Corel, Apple and others." Disingenuous at best.
Jason Matusow points out on his blog:
A real litmus test for the viability of the ISO/IEC DIS (draft international standard) 29500 (Open XML) is whether or not there are independent implementations. The answer to this question for Open XML is an unequivocal yes. There are independent Open XML implementations based on the existing specification in applications that run on Linux, Mac, Palm OS, iPhone, and Windows.
Again note the complete lack of reference to actual conformance per the definitions in the standard they have driven through the process. These are the people that are responsible for standards management and messaging at Microsoft. They are by definition the folks that should be defending the strict conformance of the standards in which they participate, and not merely suggesting that partial implementations are a "great start".
So where does this leave the government customer that thought they were buying an open document format for document exchange and interop? It is indeed finally time to roll out the certification machine — for everybody. Let the games continue.
Posted August 31, 2007 at 11:17 PM
30 August 2007
Microsoft's Failures with the OOXML Standard
The ISO "fast track" vote on approval of Microsoft's OOXML document specification happens this Sunday (2 Sep.), and news is breaking fast and furious as various countries report out early. An interesting bit of technical experimentation was published in the past week in the shadow of the vote. It shows something more pragmatically damning than all of Rob Weir's hard work digging through faults in the OOXML specification.
The work is published as a set of experiments. The first experiment takes a trivial spreadsheet created with Microsoft Excel 2007. The next step is to unpack the Excel-generated "standard OOXML", make a trivial edit to it, and repack it. Excel then complains violently about the result (i.e. to the point of not reading the document).
This is a catastrophic failure on two fronts for Microsoft's "standard":
- It means Microsoft Office DOESN'T ACCEPT WELL FORMED OOXML documents not produced by Microsoft Office. This would be very bad for all those customers that believe the marketing and think they're buying a product implementing a "standard" that will [someday maybe] be supported by multiple implementations.
- It means the Microsoft OOXML specification leaves enough unsaid that implementors can't implement it. (Changing a one-character field in human-readable XML with an editor is about as basic as it gets for implementation.)
Game over. A standard that can't be implemented is WORSE THAN USELESS. It really demonstrates that this standard they rammed through ECMA is nothing more than a vendor's product specification. The rest of the experiments are equally telling in terms of the apparent use of product features outside the specification, etc.
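For readers who want to try the unpack, edit, repack cycle from the first experiment themselves, here is a minimal Python sketch. It is not the experimenters' own procedure: the function name `rewrite_part` is hypothetical, and the part name and edit in the usage note are illustrative assumptions, since the exact change made in the experiment is not reproduced here.

```python
import io
import zipfile


def rewrite_part(src, dst, part_name, transform):
    """Copy the package `src` to `dst`, passing one part through `transform`.

    `src` and `dst` are paths or binary file-like objects; `transform`
    maps the part's bytes to new bytes. Every other part is copied as is.
    Raises KeyError if `part_name` is not in the package.
    """
    found = False
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == part_name:
                data = transform(data)
                found = True
            # Reusing the ZipInfo keeps the original name and compression.
            zout.writestr(info, data)
    if not found:
        raise KeyError(part_name)
```

A hypothetical call such as `rewrite_part("book.xlsx", "edited.xlsx", "xl/worksheets/sheet1.xml", lambda b: b.replace(b"<v>1</v>", b"<v>2</v>"))` changes a single cell value in a copy of the file, which can then be opened in any consumer to see whether it is still accepted.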
This is not to say that Microsoft won't keep the steam roller crashing right along. From Mary Jo Foley this week we see that Microsoft is still "buying research" from IDC in a study declaring great uptake for OOXML. [I'm not linking to it because frankly it deserves to be buried. I won't even give it the small link credit this blog engenders.] While the IDC folks try to avoid complicity in the biased survey by openly declaring under the title that it was "Sponsored by Microsoft", we know the Microsoft marketing machine will happily be using the numbers and graphs in slides, attributing it all to an IDC study and failing to mention they paid for the "opinions". Too bad for the credibility of the IDC researchers when this one comes home to roost.
The incredible adoption for "OOXML" is likely a measure of market share on Office 2007 since it would be the only product claiming to implement the standard. The way the "study" keeps conflating PDF with ODF/OOXML is broken as well. So WHEN customers discover the truth of the implementation difficulties and continued lock-in of their document world, they'll respond the way customers always do when a vendor pulls a fast one on them.
Perhaps the most disheartening failure however is the way Microsoft has abused the machine that is the world's standards development organizations. Andy Updegrove has a brilliant if painful read on the damage that Microsoft has caused in its "race to win." Standards are needed and important. In the days of film cameras, you wouldn't have been able to grab a roll of 35mm ASA 400 film from any store anywhere on the planet (produced by multiple different companies with their own value added services and technologies) and have it just work in your camera without such standards from such organizations.
Microsoft is pushing organizational rules to the breaking point, ballot stuffing and buying its way to a win at ISO. They have complained from the beginning of the process that they're doing nothing different than the likes of their competitors who are "aligned against them." But this shows as much ignorance and naivete as the quality and implementability of their specification. Companies like IBM and Sun have invested deep and long and globally in their standards participation activities. They don't need to stuff the ballot box in a sudden end game -- they're invested in the long term infrastructure itself. They absolutely play to win and sometimes surprise one in their creative use of the rules and organizational structures. But they win because of a long term commitment to the machine itself, not to winning a particular battle at all costs.
Standards at Microsoft has been run by lawyers instead of technology standards practitioners for too long. It's been demonstrated time and again from the very beginning of the ODF working group formation in a succession of tactical and strategic failures by Microsoft. To those lawyers that have never actually participated in standards working groups, or the standards management process, or implemented to the specification, or designed a certification process to demonstrate conformance, this is just one more battle to win and hang the consequences. (Indeed, if we can "win" we can use it as precedent to "win" again!)
The sad part is that even if the ISO vote actually goes in Microsoft's favour, it still won't matter. It buys them a few years of market ignorance at best. This entire two year event is one for the standards text books on how not to respond to a business threatening standard. In the end, Microsoft will need to implement ODF natively. They don't know it yet, nor do they understand why, but it is just a matter of time.
Posted August 30, 2007 at 04:51 PM
22 August 2007
Open Source Business Innovation and the Subscription Model
Stephen O'Grady has a great post (as always) on how companies using free and open source software in their customer solutions can make more money for their investors. Companies like MySQL, JBoss, and Red Hat, and now Canonical have all developed subscription offerings based on value added networks. There's good commentary from Javier Soltero (Hyperic CEO) and Zack Urlocker (Product VP at MySQL) as well — companies that are exactly engaged in this business. One of the business model innovations from the FOSS world that often goes unsung is exactly this subscription network model.
Traditional software companies have dreamed of "subscription revenues" for at least the last 7-8 years. These packaged proprietary software companies all claimed to want the smoothing of their revenue curves and its attendant predictability. They all speak in terms of "regular utility payments like your cable or phone or water", as if they actually want to be treated like commodity regulated utilities. Of course they don't want to be REGULATED like essential services. Of course they also don't want their pricing structures to reflect commodity pricing because of the unique value their technology represents. ("Commodity software? That's what open source is!") So they're left innovating small variations on maintenance and support agreements marketed as "Enterprise Agreements". And in large enterprises, these often end in Faustian deals negotiated with the friendly Draconian procurement departments.
Companies using FOSS in their solutions are delivering on a real subscription for value model. In each case they make it easier to use their solution (convenience = value) and easier to derive more value out of the solution (better ROI and better enabled new business opportunities), not simply easier to pay for it. Their networks each provide unique product related service beyond "support and maintenance".
These network subscription models also have another side that perfectly fits the "conversion problem" faced by so many open source software providers. [Insert classic Marten Mickos time/money trade-off quote about the early/late community here.] Subscription services are something that might scale well in the DOWNWARDS direction to enable more community users to become company customers.
Traditional packaged proprietary companies invariably end up caught when trying to reach new "low end" markets. (And that's when they even perceive they're missing a market opportunity that will evolve into their technology space or profitability profiles over time.) Their only perceived option is to hobble existing functional products, but there's no easy way to do this from an engineering perspective (most of the time), and there's invariably an internecine war fought in product marketing over market cannibalization and lost profits. See any of the struggles Microsoft has gone through in emerging markets or in the Vista menu of price/function options for examples of this.
With a little forethought given to this problem, the network offering can be designed as a branded, scaled set of offerings. Such a set may well serve the low-end user today, yielding a small value-add revenue stream spread across the thousands of users that are not yet customers. It will scale as these new customers evolve, and it can also serve the existing larger enterprise customer base.
It's not that traditional packaged/proprietary companies couldn't think about such businesses, but their existing margins and profit profiles don't typically allow them to work this way. New companies with open source software focused solutions have the advantage here, necessity being the Mother of invention and all.
Posted August 22, 2007 at 03:26 PM
03 August 2007
Microsoft's Open Source Tipping Point Approaches
Mary Jo Foley invited me to guest post on the ZDNet blogs. "Frost sightings in Hell" gives my thoughts on last week's announcements from Microsoft at OSCON.
I would not presume to suggest that Microsoft is embracing openness and transparency and forgetting its competitive roots. I still believe ODF will be to Microsoft what the Internet bubble bursting was to Sun Microsystems. The ODF wars show lobbying at its ugliest and it will only get worse as they fight for survival on the other half of the revenue stream. It's a big company with lots of players and culture comes from the top of an organization.
As PJ points out, Ballmer recently announced during the financial analysts meeting, "we've worked very hard on making the value of a commercial company surpass what the open-source community can deliver, because frankly, it's not a business model we can embrace."
Ballmer still doesn't understand (perhaps deliberately) that it isn't "open source" or "software freedom" against which he's competing, but a collection of business models with different margins and different value propositions. So be it. It's not my problem.
I do however believe the company is learning. Larry Lessig rightly suggested that code is law; wearing my commercial hat I would also suggest code is currency. Some at Microsoft are finally participating in the coin of the realm.
Posted August 3, 2007 at 09:38 AM




