OneAndOneIs2

Wed, Jul 04, 2007

Possibly the most nauseating story

...I've ever seen on the BBC web page.

The growing problem of accessing old digital file formats is a "ticking time bomb", they say.

Massachusetts, please take note.

How, just exactly how, did somebody manage to sit down and write about the problem caused by Microsoft's multiple incompatible, proprietary document formats - a problem they knew enough about to state that "The root cause ... is the range of propriatorial file formats which proliferated during the early digital revolution" - and manage to represent it as a situation in which Microsoft were not only blameless but the "knights in shining armour" charging in to the rescue?

The video at the top demonstrates one of MS's proposed solutions: Use virtualisation to run all the different versions of Windows and all the different versions of Office.

Oddly enough, they don't mention that you will of course need to buy licenses for each and every version of their "legacy" software that you use, should you wish to use this approach yourself.

And I'm almost tempted to write in and complain that they even published the line about OpenXML being an open international standard under independent control.

They do earn at least a minuscule amount of credit by mentioning ODF, I suppose, but it barely gets a passing mention and the whole issue is dismissed with "Well, MS released a translator, didn't they? What more do you want?"

Well, call me picky - but when the problem we're having is accessing documents that were saved in, say, Word97, because it used a proprietary format that we can't understand any more, I'd like a standard that doesn't have formatting tags like "useWord97LineBreakRules" - which you'll find in the OpenXML specification, if you can bring yourself to plough through the four-thousand-page description of it.

(I'd also like a translator that actually works before accepting it as an alternative, but maybe I'm just fussy.)

An open specification that anybody can use? Yes, if by "anybody" you mean "anybody with a complete understanding of all previous versions of Microsoft's proprietary formats", which would amount to... just Microsoft themselves.

Funny, that.

What I find worse than the sentiments in the article themselves is that the chief exec. of the National Archives, having lost some of their documents forever because of closed formats, can even be considering sticking with yet more closed standards instead of insisting unreservedly on open standards which already exist and are capable of doing the job. Even if they don't want to use ODF, what the hell is wrong with PDF? A genuinely open format designed specifically to preserve the precise appearance of documents.

The mind boggles. It really does. These people have nearly 600 TERAbytes of files that their job is to preserve. And yet they don't even seem to show any awareness of what an open standard is, and what advantages it brings with it.

Comments:

Comment from: hari [Member] · http://hari.literaryforums.org
SGML is the meta-document markup format of the future. Most of the digitization of documents will probably be in XML or SGML.

And of course, there's the good ole plain TXT format (probably Unicode encoding will take over from ASCII).
Permalink 04/07/07 @ 18:04
Comment from: oneandoneis2 [Member] · http://geekblog.oneandoneis2.org/
The biggest problem with XML is that too many people have this weird idea that "if it's XML, it must be open - it's just plain text, right?"

They don't seem to understand that you can have something like:

<proprietary>n # Y v C n $ } I / . ` b 0 J r n v 9 8 N % I : 3 ? = Y ` K c E b x x f W S p y \ g L l $ C ? ) , 8 k o O ! w | \ 7 2 v A i O I p w 5 v O k 1 \ I ` s T u a </proprietary>

Just because the markup itself is text doesn't mean that the content within it is in any way guaranteed to be legible.
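
To make that concrete, here's a rough Python sketch. Everything in it is made up for illustration: the byte values stand in for some vendor's private binary structure, and `is_well_formed` is just a hypothetical helper name, not part of any real format or tool. It only shows that an XML parser will happily accept a document whose actual content is an opaque blob.

```python
import base64
import xml.etree.ElementTree as ET

# Assumption: these bytes stand in for a vendor's undocumented binary data.
opaque_bytes = bytes([0x00, 0x9F, 0x12, 0xFE, 0x07, 0xC3])

# Wrap the blob in a tag, base64-encoded so it survives as text.
payload = base64.b64encode(opaque_bytes).decode("ascii")
document = f"<proprietary>{payload}</proprietary>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if an XML parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The parser is perfectly happy: the markup is valid XML.
root = ET.fromstring(document)
print(is_well_formed(document), root.tag)

# But all we can recover is the same raw bytes we started with.
# Without the vendor's private spec, they tell us nothing.
recovered = base64.b64decode(root.text)
print(recovered)
```

So "it parses as XML" and "anyone can read it" are two entirely separate claims.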
Permalink 04/07/07 @ 21:38
Comment from: hari [Member] · http://hari.literaryforums.org
I agree, SGML and XML are meta-formats - you can use them for whatever you want.

What I was talking about was human-readable XML and SGML formats. It's not going to be really hard.

Again, having a standardized meta-format is half the battle. XML is just the framework for a thousand new usable and portable document formats.
Permalink 06/07/07 @ 04:16
