
31 August 2007

TrackBack


Listed below are links to weblogs that reference Office Open XML Conformance (A Lesson in Claiming Standards Conformance):

» Simon Phipps: links for 2007-09-02 from Mirror
Office Open XML Conformance (A Lesson in Claiming Standards Conformance) Stephe describes what most of [Read More]

» Issues with Document Standard Conformance from Bytesfree
Stephen Walli writes about his adventures with the OOXML format and his attempt to import OOXML documents into Pages 08, Apple's new word processing software that is part of the iWork 2008 suite. The long and short of it: multiple parties ... [Read More]

Comments

orcmid

Great analysis, especially about the document-production tool and transformation chain.

I suspect that font changes will change pagination, as will differences in printer metrics (with Office 200x) and these are probably beyond the scope of OOXML. The early ECMA drafts were stricter on conformance, as I recall, and the room for greater variation came later (it is now closer to the ultra-loose ODF language), as it usually does.

It is important to draw attention to the difficulty of interchange fidelity and to what "fidelity" actually means, as opposed to what is assumed or presumed, in one case or another.

In fairness, it would be good to conduct this same experiment with two ODF-supporting desktop products (OpenOffice.org claims ODF "support," not conformance). I think this is going to be eye-opening all the way around and should also calibrate people's expectations about whatever "fidelity" translators will be able to preserve and what round-tripping is unlikely to accomplish (although OOXML has built-in support for roundtripping and OpenOffice.org takes advantage of ODF alternate-rendition roundtripping provisions when interchanging to and from Microsoft Office formats).

Of course, neither ODF nor OOXML specifies presentation fidelity, unlike Adobe PostScript (and Microsoft XPS) and the much-ignored ODA (a genuine Open Document Architecture scheme with allowance for layout fidelity).

To be clear: I think this is a great analysis. The next step is to point out that all of the current standardization efforts for office-document formats suffer this problem with regard to the difficulty of fidelity preservation across implementations. This needs to be understood much more broadly. It is critical for understanding of collaboration, interchange, and preservation prospects moving into a document-standards based future.

I have been waiting to see how government procurement agencies learn to qualify products, and to see what happens when these practical difficulties are recognized. As far as I can tell in the Massachusetts poster-child case, ODF has simply come to mean whatever OpenOffice.org does, much as ANSI COBOL became whatever a particular IBM compiler did. I thought we'd do better here, but apparently not.

orcmid

Oh, a PS: You said "[Microsoft] are by definition the folks that should be defending the strict conformance of the standards in which they participate, and not merely suggesting that partial implementations are a 'great start'."

Once we move into the standards world, there might be competition around who is more compliant than who else, but it is no longer Microsoft's job. Turning governance over to a standards body relieves them of any ability to enforce compliance (e.g., the way Sun did with Java licenses).

Likewise, the standards bodies eschew enforcement and, for the most part, certification/qualification of vendor offerings. It is going to come down to procurement practices and any third-party arrangements for certifying the conformance of a product. Test suites will be good, although they might not deal with presentation fidelity issues in these cases. Introduction of NIST into this process would be useful, although my sense is that NIST has been rather defanged and defunded in this area over the past several years.

Vendors make the claims vendors make unless there is some penalty for their ingenuity. How many ODF-compliant products are there and what does it mean to say/claim that?

This is important, and it is important for all efforts to adopt standardized document formats. There are important lessons here.

stephe

Morning orcmid: Thanks for the excellent commentary. I completely agree that those with the economic need for certification need to put the model in place that works for them.

NIST did that for a long time for U.S. government procurement, and the commercial world gained the benefits as well. (I believe NIST was castrated when the head of NIST became a presidential appointment instead of the traditional civil service position. NIST's mission then began tracking White House policy instead of the boring stuff you want it to do with your tax dollars.)

I started on this meme of certification a while ago here:
Conformance and Certification: The ODF Standard and Microsoft's Office Open XML Specification

Ed Brill

Stephen,

Really interesting analysis. When the 'softies started crowing about how Apple was "supporting OOXML", I challenged them that I didn't really consider an import-only ability as even worthy of being labeled "support".
See comments: notes2self.net/archive/2007/08/14/iwork-08-supports-openxml.aspx

Their response (from Brian Jones): "I would think import is more challenging than export but I guess it really depends on how your application is modeled. ... I think the reason you see import built first isn't as much around difficulty as it is around scenarios."

Wesley Parish

I downloaded and somewhat force-fed Novell's Office Open XML Plugin to my Mandriva 2k7 OpenOffice.org 2.x install. I was pleased to see that it did at least appear in the Save As ... menu entry.

I have tried to open bona fide TC45*.docx files, with as much luck as you - OO.org tells me that the file is corrupt and I should allow OO.org to fix it for me. I don't think so ... I also saved an "almost-throw-away" half-started novella as odf, docx, sxw, and [MSO]xml, copied each to a .zip name so that konqueror knew what to do with them, and opened them.

The docx file is indistinguishable from the odf one and the sxw one.

So, either the Novell docx isn't working as an import filter, or it isn't connecting in any meaningful way to my version of OO.org. I don't know which.

If I could get a meaningful response from Microsoft to my application for an MS Office 2k7 Trial Edition serial number (i.e., one that recognizes that someone sent there by the APC reader's reference is by definition not in the US of A), I would find it worth testing further, to see just what is going on. I've also got a copy of Novell SLED 10.2, and so should be able to see whether saving as docx results in a file that is indistinguishable from a file saved as odf ...

But Microsoft being Microsoft, I doubt that they'll allow me this test - I made two posts about this sort of thing on Brian Jones' blog, and the second one got censored.

Wu MingShi

One thing I like about iWork is that it gives a list of "potential problems" during format conversion, which is better than the vague warnings given by OpenOffice.org and MSOffice.

The font problem is entrenched in ALL document formats, not OOXML alone. If you do not have the font on your computer, rendering will suffer. That's why, as with "number of zeros after the decimal point", I normally do not put fonts on the list of "must haves" in the document conversion process. Unfortunately, OOXML justifies its existence by claiming to faithfully represent all Microsoft document formats to date, which to me means getting applications to render the document exactly. That puts the "font" and "decimal point" issues into the "must have" category... and then it failed to deliver.

orcmid

@Stephen: Thanks for the link to your January post. We think much alike on this aspect. I will hone some sort of post about it eventually.

@Wesley: I'm not sure what the deal was with Brian Jones, but I can probably do a verification for you. It may be that the Novell plug-in is vetted with the Windows version of Novell's OO.o distro.

Here's what I can provide if you want to do some confirmation testing: On my Windows XP SP2 Machine, I have Office 2003 (SP2 not SP3 yet) with the Office Compatibility Kit and I work in OOXML almost exclusively now. I also have OO.o installed on that system and I will upgrade to the new OO.o 2.3 which I have just downloaded. I don't think the Sun translator comes with it. I don't remember why I decided not to install the Sun Translator beta, but if I do it goes here. On my Tablet PC I have Vista and Office 2007 and I can install the Novell OO.o here. I have the distro and their plug-in, though I should check for any updates before installing. I think I would install the Microsoft-sponsored translator here.

I'm willing to do reasonable experiments with this configuration and try various tests, and also report the results/problems to the appropriate parties. I just want to be careful of the mix and not destabilize anything very much. You (and anyone else interested in this kind of activity) can contact me at my e-mail address (see the contact information on my blog). Be sure to put ODF: or OOXML: in the subject so I will catch it when reviewing my junk mail folder for legitimate mail from unknown addresses.

del_piero

Usually when I work with Word files I use corrupt text recovery, because the tool has many pluses and, as far as I can see, is free. The tool also helped two of my good friends. The software can use a backup copy and restore all text files from scratch, but this possibility is not accessible to all users. The program for corrupt text recovery for Word 2007 is efficient at restoring damaged text files; repairing a Word 2007 file will not take a lot of time, and it can work on either the slowest computers or modern workstations to recover and help repair corrupt Word docs.

