Haiku way to do a document based file format

I was comparing the file formats of OpenOffice Writer and AbiWord. I made a simple text document in each to compare. I know AbiWord is lightweight, but surprisingly the OpenOffice output file was slightly smaller.

If you look into the OpenOffice content file, you see that it lists all the paragraph types first and then the document content. AbiWord lists the paragraph types alongside the content, more like HTML.

As the Haiku community is all about efficiency, if the Haiku community created its own document editor, what would be the Haiku way to do a file format?

The content would be a plain text file. All the formatting would be contained in file attributes.

Maybe I should have said markup language. I mean
open office
<style:style style:name="P1" style:family="paragraph" style:parent-style-name="Standard">
<style:text-properties fo:font-weight="bold" style:font-weight-asian="bold" style:font-weight-complex="bold" />
</style:style>

<text:p text:style-name="P1">This is bold text</text:p>

vs
abiword


This is normal text

vs any other xml based document format

I prefer this way; it seems simpler and more XML-like. So, using this simple format, I would do it this way:

  • It’s a folder compressed in 7-Zip format so that it appears as a single file
  • The folder contains a sort of XML file with our text, and an optional folder named Binaries containing images, audio, video, SWF and other such files
  • Our XML file has an optional id "binary" referencing the paths of the files embedded in the document
  • BFS metadata is used for the index, title, author, possibly a summary or subject of the document, last revision time and so on...

I suppose it should be done this way…

I’m with bbjimmy. The StyledEdit format should be extended and refined.

I’ve never seen a save file format like StyledEdit’s before. How does it work?

Like I said, the file is plain text, and all of the formatting is stored in file attributes. Move the file to a FAT32 filesystem and all the formatting data is lost, because it lives in BFS file attributes, but the plain text file is still readable.
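
A minimal sketch of that model, assuming a hypothetical layout in which the styles are a list of runs (character offset, length, style name) serialized into a single attribute value. The dict standing in for BFS attributes, the attribute name `styles`, the JSON encoding and the function names are all illustrative assumptions, not StyledEdit’s actual on-disk format.

```python
import json

def save_document(text, runs):
    """Split a styled document into plain-text content and an attribute dict.

    `runs` is a list of (offset, length, style) tuples. The returned dict
    stands in for BFS file attributes (hypothetical layout).
    """
    content = text.encode("utf-8")  # what any OS sees as the file
    attributes = {"styles": json.dumps(runs).encode("utf-8")}
    return content, attributes

def load_document(content, attributes=None):
    """Rebuild (text, runs); without attributes the text is still readable."""
    text = content.decode("utf-8")
    raw = (attributes or {}).get("styles")
    runs = [tuple(r) for r in json.loads(raw)] if raw else []
    return text, runs

content, attrs = save_document("This is bold text", [(8, 4, "bold")])
text_on_bfs, runs_on_bfs = load_document(content, attrs)
text_on_fat32, runs_on_fat32 = load_document(content)  # attributes dropped
```

On the simulated FAT32 copy the text comes back unchanged and the list of runs is simply empty, which is the trade-off described above.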

But you can zip it in Haiku and use FAT32 as a transport between two Haiku volumes (as in a USB stick) without loss of formatting. And I have no problems with StyledEditPlus (or whatever) being able to Export to and Import from other formats. But the plain-text with attributes model cannot be beaten for sheer elegance.

Seriously, apart from a UI update, what would StyledEdit need to be a serious word processor?

  • Footnotes
  • Cross-references
  • A page layout view
  • Graphics

Give me that and I’ll start writing books with it. Integrate a reference manager and I will delete all other OS’s from my laptop (well, once my wifi is working …).

Styles could come later (and if somebody wants to add styles to StyledEdit, do yourself a favour and look at the incredibly intuitive way Apple’s Pages implements styles. It is the first word processor in which I’ve ever actually used them).

Why would we want the formatting to be lost upon OS transfer? Wouldn’t we want the file to be readable on all platforms?

The formatting is all that is lost. Think about it. Right now, if I send someone a document written in Word, he has to have a version of Word that is at least as current as mine to read it. If, on the other hand, I send him a StyledEdit document, it does not matter what version his particular word processor is, or whether he is using Word, OpenOffice, AbiWord etc.: they can all read the text. I want a document format that takes advantage of Haiku’s strengths, and does not limit its capabilities for the sake of document interchange. Done right, I can send you any document and you can read it. I can also print it to a PDF if formatting is needed.

Seriously, the only standard document file format that would make sense for Haiku is an interoperable, preferably standardized one. There is simply no point in using anything else and annoying users with not being able to easily share files - including formatting - with others using Linux, Mac OS, Windows or any other system. In this light, the only sensible choice is OpenDocument, which can be opened somehow on any OS and is not proprietary to any program (like the Abiword format is).

[quote=bbjimmy]Why would we want the formatting to be lost upon OS transfer? Wouldn’t we want the file to be readable on all platforms?

The formatting is all that is lost. Think about it. Right now, if I send someone a document written in Word, he has to have a version of Word that is at least as current as mine to read it. If, on the other hand, I send him a StyledEdit document, it does not matter what version his particular word processor is, or whether he is using Word, OpenOffice, AbiWord etc.: they can all read the text. I want a document format that takes advantage of Haiku’s strengths, and does not limit its capabilities for the sake of document interchange. Done right, I can send you any document and you can read it. I can also print it to a PDF if formatting is needed.[/quote]

Yes, the truly interoperable format is PDF… only that way can you be sure the formatting is not lost…

By the way, returning to our imaginary document format:

  • An archived directory
  • In the top directory, two files: (1) a sort of XML file containing our text, and (2) a file containing a dump of our attributes. Haiku first sets the attributes on our archive, another OS uses this file :-))
  • A Binaries directory optionally containing images, audio, video, HTML5 fragments and so on...
  • As said above, it should have a way to print to PDF
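
The layout in this list could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses ZIP from Python’s standard library as a stand-in for whatever archive format is chosen, and the member names `content.xml` and `attributes.json`, the JSON attribute dump, the sample values and the function name are all hypothetical.

```python
import io
import json
import zipfile

def build_document_archive(xml_text, attributes, binaries=None):
    """Pack a document into one archive and return its bytes.

    Layout (hypothetical names):
      content.xml       the text with its markup
      attributes.json   dump of the file attributes for non-Haiku systems
      Binaries/<name>   optional embedded images, audio, video, ...
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", xml_text)
        archive.writestr("attributes.json", json.dumps(attributes))
        for name, data in (binaries or {}).items():
            archive.writestr("Binaries/" + name, data)
    return buffer.getvalue()

packed = build_document_archive(
    "<doc><p>This is bold text</p></doc>",
    {"title": "Example", "author": "Someone"},
    {"logo.png": b"placeholder image bytes"},
)

# Reading it back on any OS needs only an unzip tool and an XML parser.
with zipfile.ZipFile(io.BytesIO(packed)) as archive:
    members = sorted(archive.namelist())
    restored = json.loads(archive.read("attributes.json"))
```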

What do you think?

The big pro I see to the styled edit format is that text such as html, source code etc. can be formatted and the code can still be completely readable by compilers/web browsers.

The big con I see is that, since Haiku is not a widely used OS, I can’t put a document on my flash drive and go somewhere else to have it edited and printed. All the formatting would be lost.

As a format that could be formatted and read on all platforms with the same program, I like fano’s format idea.

One other thing, say I made a text document in StyledEdit and then edited it with a plain text editor such as Pe. How would StyledEdit keep track of which text should be formatted after the file had been edited with Pe?
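
To illustrate why this matters, here is a small sketch assuming the styles are stored as character offsets into the text (a hypothetical layout; the tuple format and the function name are made up for illustration). When an editor that knows nothing about the style data inserts text, the content shifts while the stored offsets stay where they were.

```python
def styled_spans(text, runs):
    """Return the substring that each (offset, length, style) run covers."""
    return [(text[offset:offset + length], style)
            for offset, length, style in runs]

runs = [(8, 4, "bold")]           # saved alongside the text
original = "This is bold text"
edited = "Well, " + original      # changed in a plain-text editor

before = styled_spans(original, runs)  # the run covers the word "bold"
after = styled_spans(edited, runs)     # same run, now on the wrong characters
```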

[quote=kidd106]
One other thing, say I made a text document in StyledEdit and then edited it with a plain text editor such as Pe. How would StyledEdit keep track of which text should be formatted after the file had been edited with Pe?[/quote]

First, our new format (let’s call it the StyledEdit extended format, for the sake of discussion) should have an extension different from that of a plain text file, so Pe wouldn’t open it (remember, it’s a 7-Zip compressed directory). If it could, I suppose you’d get the same result as opening a Word document in WordPad: the extras of Word are lost… not really something to do with your important degree thesis!

The alternative would be: Pe opens the file as simple XML text markup (that is, showing the tags in the clear), so if you want to edit the styles by hand you can. When StyledEdit, or whatever this new software will be, re-opens the file (the Zip archive), all the formatting is there again. But, well, better not to do this, IMHO :-)

If you want to be sure that no one can change the formatting then, I repeat, you have to save as PDF. So our imaginary program should save in 2 formats:

  • hxd (Haiku eXtensible Document, do you like it?)
  • PDF

What do you think?
Apart from «Cool! Let’s do it», obviously :-)))

Because Haiku isn’t yet a popular OS, I think our imaginary program should also save in .rtf, .doc, .odt, and .abw, but have .hxd as the default.

Yes… and in the future we should add .epub, .odx, .docx and so on…

But I’d start with .hxd as the default and .pdf as the export format… since .hxd is so simple (IMHO), an exporter for OpenOffice would be very simple to write, right?

In the end we’re talking about an extended (X)HTML!

Would it be practical to make one program that would do word processing, spreadsheets, and presentations?

If you think about it, a spreadsheet is just a page filled with a table. The word processor would just be a text box that takes up a whole page. The presentation would just be the document shown full screen, one page at a time, with slide transitions.

How does it sound?

It sounds right to me, too!

Yeah, once the formatted text editor is done, the rest is easy…