WordprocessingML schemas in the wild

InfostructureBase - facilitating integration

Caught the link to the WordML schemas in my aggregator this morning. Seems the Danish Government has published the full WordML schemas. Then Dave reminded me to blog it later in the day.

The schemas are seriously, wildly complex and I can't believe what I'm looking at… Makes the DocBook and OpenOffice schemas look like little kiddie schemas. Took a few minutes to generate the documentation out of XMLSpy. Check out the beastie: all 14.9 MB of HTML zipped into a 3.4 MB file.


XForms and Microsoft InfoPath

Straight down to the facts: XForms vs InfoPath. For those too busy, here's the conclusion. Both InfoPath and XForms are version 1.0 efforts, and both are likely to improve substantially in future revisions. For organizations that have already licensed Office System 2003, InfoPath will provide an excellent means to automate data collection tasks. For use on systems not running Office System 2003, including Mac and Linux desktops, phones, PDAs, and even some PCs, XForms remains a better path.

Oh, this isn't really related, but it's an interesting case study anyway.


Resin 3.04 upgrade

Just upgraded to Caucho's Resin 3.04 and there's no doubt it's super quick. I browsed my gallery of Berlin pictures extremely quickly, much faster than on Resin 2.x. Did some tests and boy oh boy it's quick. Caucho do claim that it's almost as quick as Apache 2.0, and I'm not doubting it. I did do a test against IIS and you can guess which one lost out by a long way. I actually couldn't believe the processing power needed for IIS compared to Resin 3, and Resin does JSP, XSP and Servlets.


Multi channel publishing

Oh my life, this is where I want to be. I can't believe some of the ideas coming out of HP's research labs in Bristol. It really makes me want to return to Bristol when I hear such forward-thinking ideas. Why am I not part of this? I will never know.

So yes, it all started with this feed. I started reading it and thought, yeah, tell us something we don't know. And I actually preferred Adobe's Network Publishing term, as it was slicker and seemed a lot more ubiquitous than multi channel publishing. But then I got near the end and realised that not only have HP Labs outlined the strategy, they have also created an open source tool which works on the same ideology: Formatting Objects Authoring Tool, or FOA for short.

Written by researcher Fabio Giannetti, FOA is a Java-based authoring tool that allows you to create document templates and styling information without having to write them in the XSLT or XSL-FO programming languages. (XSLT, or eXtensible Stylesheet Language Transformation, is used to convert XML to other formats, most commonly, to HTML for screen display. XSL-FO, or eXtensible Stylesheet Language Formatting Objects, is one component of the XSL language used to describe a format for XML documents.)

This comes at the same time as OpenOffice.org 1.1 going final, Microsoft releasing Office 2003 and the W3C finalising XForms.
I'm going to give FOA the full run through on the tablet while on holiday in Germany to see how good it is. One thing I did notice while browsing the FOA site: FOA can only open XSL files created by WH2FO or by itself? Hmm, doesn't sound good, but I shall see if that will be a problem or not. Oh, I've added myself as a tester for good measure.


Thinking Mindmaps again

Paul in 3rd Interaction pointed me at this nice topic map editor. I would say it's pretty much the best one I've seen so far. First thing, it's not written in Java. Java doesn't bother me that much, but it does sometimes have an effect when you have a large and complex diagram. It saves as SVG natively and imports/exports XTM and a range of image formats. It's just a shame it doesn't import or export RDF, but we all already know about the massive arguments happening in the semantic web community about RDF and XTM. So until that day I'm forced to transform between them and lose certain meaning, and use other editors for RDF such as this one I found a while ago. The other one worth a mention, which I used to use, is FreeMind, which promises to have topic map and SVG abilities soon.

Oh, by the way, I also saw this on plasticbag a moment ago, which is kind of related. It links to this PDF, which shows Tom Smith's thoughts about social software.


Xml for presentations

For ages now, I've been looking at better, more effective ways of creating presentation material without manually doing pages in Adobe Illustrator or using Microsoft PowerPoint. See, what I usually do is create a template in Illustrator, then edit that for each page before saving each one as an Acrobat PDF file. Then I put them together using Adobe Acrobat and tag the whole thing for internet and presentation use.
And it's been OK up to now. But now I want to start doing all presentations in XML format no matter what they may be, for example the same XML format for lectures, talks, teaching, etc.

I started looking around and decided that OpenOffice's presentation format (Impress) was as close as I was going to get to usable and open. It's written into a soup XML file, so using the new XML file filter I can write an XSL to turn it into anything I like. But let's not forget OpenOffice already lets you write to many formats, including the dreaded Flash and PowerPoint formats.

But saying all that, I found SlideML today. And it does have the XSL to turn SlideML into CSS XHTML and plain HTML. They seem interested in turning it into PDF, SVG and DocBook slides. So that would save me a lot of work.

At this moment I'm gonna stay with OpenOffice's Impress because it's simple and works right now, but I'll keep an eye on SlideML for the future.


RDF/Topic Maps and reification

Saw this while browsing around the OSCOM site: RDF/Topic Maps and reification.

On that same note, I've also been looking around the Extreme Markup conference site and wishing I could afford to go to these kinds of events. Reading the abstract from this year's keynote, William Kent's Data and Reality, really sends shivers up my spine. Kent says: “Many texts and reference works are available to keep you on the leading edge of data processing technology. That's not what this book is about. This book addresses timeless questions about how we as human beings perceive and process information about the world we operate in, and how we struggle to impose that view on our data processing machines.” Wow, what a keynote that would have been.


OpenOffice.org 1.1 final

OpenOffice.org 1.1 can be downloaded now.

I love the new XML file filter settings and can't wait to try them out. Maybe write one for some of the XML schemas I use. Now if it would only save back into a format of choice. Oh my, after one look at my installed version, I quickly realised the XML file filter does support export as well as import. And the biggest thing is that it's all done using XSL. Yeah, I love open source technology.


Office 2003 will ‘protect Microsoft’s monopoly’

Interesting news story about an internal document from Sun. Laurie Wong argues that Microsoft Office 2003's document rights management system will turn the office market into a monopoly. I kind of agree that the whole "Office 2003 is open" message is very mixed, but you need to buy into the whole Microsoft suite to take advantage of their DRM. So it's not a big deal, I feel, but I take the point about using PGP. Ideally, I would say DRM should be separate from the program. Office 2003 surely does take us into dangerous territory.


Xforms revisited



I've been having serious difficulties installing InfoPath 2003 on my wife's laptop. Keep on getting some error about the install server not responding. So I've been trawling around the web looking for an answer. I was looking around the InfoPath newsgroup and came across this site, and read this post about XForms.

Xml.com has a great review of some XForms engines. It's making me think again about recommending Microsoft InfoPath 2003. I need to try these other solutions, and look at the XForms spec once again.

Chibacon Chiba = open source and Java based, sounds great. Might give it a shot on my development server.

MobiForm SVGView Plus = SVG and XForms, the perfect combination I would say.

Mozquito DENG = Uses Flash but could be worth checking out.

Orbeon Open XML Framework = It uses XForms along with XSLT, XQuery, SQL, and web services interfaces as building blocks that together can compose an entire application. Looks good but maybe not right for us.

Can't believe I forgot X-Smiles. Also very good to see xport have their interests in the right place. Also spied this post about Adobe getting involved using Acrobat for forms; nothing new I guess. But I kind of forgot about Adobe. Some stirring words to end:
The only missing InfoPath ingredient is a forms designer that nonprogrammers can use to map between schema elements and form fields. That's just what the recently announced Adobe Forms Designer intends to be. I like where Adobe is going. The familiarity of paper forms matters to lots of people. And unless Microsoft's strategy changes radically, those folks are far likelier to have an Adobe reader than an InfoPath client.


A couple of days with Microsoft infopath

During my jury service, I took my Tablet PC and installed Office 2003 beta refresh 2, in the hope of creating a suitable authoring environment for people who know little about XML to write it.

So how did it go?
Well, I first tried Word 2003, and had no luck. Word 2003 does now support my more complex schemas, unlike the previous beta, which is progress at least, but you still have to understand XML to get Word to write XML. Working to a schema is a nightmare, to say the very least. Not quite as user friendly as it should be.
Please also note the WordML XML is a joke, to say the least. Check this monster out! The doc below is just the WordML for saying "This is a test". View source if you cannot see the XML tags.


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no" xml:space="preserve">
  <o:DocumentProperties>
    <o:Title>This is a test</o:Title>
    <o:Author>Ian forrester</o:Author>
    <o:LastAuthor>Ian forrester</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>1</o:TotalTime>
    <o:Created>2003-09-22T22:09:00Z</o:Created>
    <o:LastSaved>2003-09-22T22:10:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Words>2</o:Words>
    <o:Characters>13</o:Characters>
    <o:Company>Ravensbourne college</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:CharactersWithSpaces>14</o:CharactersWithSpaces>
    <o:Version>11.5329</o:Version>
  </o:DocumentProperties>
  <w:fonts>
    <w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/>
  </w:fonts>
  <w:styles>
    <w:versionOfBuiltInStylenames w:val="4"/>
    <w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/>
    <w:style w:type="paragraph" w:default="on" w:styleId="Normal">
      <w:name w:val="Normal"/>
      <w:rPr>
        <wx:font wx:val="Times New Roman"/>
        <w:sz w:val="24"/>
        <w:sz-cs w:val="24"/>
        <w:lang w:val="EN-GB" w:fareast="EN-GB" w:bidi="AR-SA"/>
      </w:rPr>
    </w:style>
    <w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont">
      <w:name w:val="Default Paragraph Font"/>
      <w:semiHidden/>
    </w:style>
    <w:style w:type="table" w:default="on" w:styleId="TableNormal">
      <w:name w:val="Normal Table"/>
      <wx:uiName wx:val="Table Normal"/>
      <w:semiHidden/>
      <w:rPr>

      </w:rPr>
      <w:tblPr>
        <w:tblInd w:w="0" w:type="dxa"/>
        <w:tblCellMar>
          <w:top w:w="0" w:type="dxa"/>
          <w:left w:w="108" w:type="dxa"/>
          <w:bottom w:w="0" w:type="dxa"/>
          <w:right w:w="108" w:type="dxa"/>
        </w:tblCellMar>
      </w:tblPr>
    </w:style>
    <w:style w:type="list" w:default="on" w:styleId="NoList">
      <w:name w:val="No List"/>
      <w:semiHidden/>
    </w:style>
  </w:styles>
  <w:docPr>
    <w:view w:val="web"/>
    <w:zoom w:percent="100"/>
    <w:displayBackgroundShape/>
    <w:doNotEmbedSystemFonts/>
    <w:proofState w:spelling="clean" w:grammar="clean"/>
    <w:attachedTemplate w:val=""/>
    <w:defaultTabStop w:val="720"/>
    <w:characterSpacingControl w:val="DontCompress"/>
    <w:optimizeForBrowser/>
    <w:validateAgainstSchema/>
    <w:saveInvalidXML w:val="off"/>
    <w:ignoreMixedContent w:val="off"/>
    <w:alwaysShowPlaceholderText w:val="off"/>
    <w:compat>
      <w:breakWrappedTables/>
      <w:snapToGridInCell/>
      <w:wrapTextWithPunct/>
      <w:useAsianBreakRules/>

    </w:compat>
  </w:docPr>
  <w:body>
    <wx:sect>
      <w:p>
        <w:r>
          <w:t>This is a test</w:t>
        </w:r>
      </w:p>
      <w:sectPr>
        <w:pgSz w:w="11906" w:h="16838"/>
        <w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800" w:header="708" w:footer="708" w:gutter="0"/>
        <w:cols w:space="708"/>
        <w:docGrid w:line-pitch="360"/>
      </w:sectPr>
    </wx:sect>
  </w:body>
</w:wordDocument>
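
To be fair to WordML, getting the actual words back out of all that is not much code. Here is a rough Python sketch of how it could be done, assuming the w: namespace URI shown on the root element above. The cut-down SAMPLE string and the extract_paragraphs name are my own stand-ins for illustration, not anything Word produces or ships.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the w:wordDocument root element in the sample.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Cut-down stand-in for the full document: just the body with one paragraph.
SAMPLE = (
    '<w:wordDocument xmlns:w="' + W_NS + '">'
    "<w:body><w:p><w:r><w:t>This is a test</w:t></w:r></w:p></w:body>"
    "</w:wordDocument>"
)


def extract_paragraphs(wordml_text):
    """Return the text of each w:p paragraph, joining its w:t runs."""
    root = ET.fromstring(wordml_text)
    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        runs = [t.text or "" for t in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(runs))
    return paragraphs


print(extract_paragraphs(SAMPLE))  # ['This is a test']
```

Everything else in the file (properties, fonts, styles, compat settings) is simply skipped, which says a lot about how much of the document is actual content.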

So that answers the question of using Word 2003.
I then looked at FrontPage 2003, which was no good because it is more suited to someone like me who already knows XML very well. It's basically an XHTML and XML editor, and not a good one at that.

So, moving on quickly: InfoPath.
At last, something which works for users who don't understand XML. It's how we imagined we would set up Cocoon XMLForms or, better still, XForms.

Anyway, like Word 2003, the XML schema parser now accepts my complex schemas without me modifying them a lot. It even accepts schemas which are linked to other schemas, an annoying problem which drove me crazy about InfoPath beforehand.

So I've created InfoPath documents for both course units and course vps's. I noticed the InfoPath documents are not open, as one would first expect. Instead Microsoft have opted for binary. I believe the document holds not only the forms but also the schemas. Correction:



Had another look at perfectxml.com, and it explains the XSN file format:
Even though InfoPath uses XSN as the file extension for form templates, these files are essentially CAB files that you can for instance open with WinZip and extract it to a folder. It consists of bunch of XML files, XSLT stylesheet, XSD schema file, script file, and manifest.xsf. For instance, if you wanted to update an InfoPath form, one option is to load the form in design mode and use File | Save As menu item; alternatively, you can unzip the files into a folder, update the files, create a text file containing list of files (with one filename on each line, enclosed in double quotes if contains spaces), and then run makecab command line utility to create a cab file. Finally simply change the file extension from cab to xsn.
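
The file-list step in that description is easy to script. This is only a sketch of my reading of it: one file name per line, double quotes around names with spaces. The file names other than manifest.xsf are made up, and the /F switch on the makecab command is my assumption from the makecab help rather than something the quoted text spells out, so the command is only built here, not run.

```python
def quote_if_needed(name):
    """Wrap a file name in double quotes when it contains spaces."""
    return '"%s"' % name if " " in name else name


def build_file_list(names):
    """Build the text file contents: one file name per line."""
    return "\n".join(quote_if_needed(n) for n in names) + "\n"


# Hypothetical contents of an unpacked form template folder.
FILES = ["manifest.xsf", "view1.xsl", "my schema.xsd"]

# Assumed invocation; makecab is Windows-only, so it is not executed here.
MAKECAB_COMMAND = ["makecab", "/F", "files.txt"]

print(build_file_list(FILES), end="")
```

After makecab has produced the cab file, the last step from the quote still applies: rename the extension from cab to xsn.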

The issues I have with InfoPath: well, how much will InfoPath cost standalone? Is InfoPath the only way you can open and edit the final form? Why aren't the editor and designer separate?
I would like to see options or scripts for saving files, so for example you can only save to certain places, via WebDAV, or it takes an element and uses that plus the date for a filename. I believe there are advanced examples which can be downloaded from the Microsoft Office site.
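
To show what I mean by the element-plus-date filename, here is a small Python sketch. It is not an InfoPath script or API, just the naming rule on its own. The unit and title element names and the suggest_filename function are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date


def suggest_filename(xml_text, element, on_date):
    """Build 'element-text-YYYY-MM-DD.xml' from the first matching element."""
    root = ET.fromstring(xml_text)
    node = root.find(".//" + element)
    text = (node.text or "") if node is not None else ""
    # Keep letters and digits, collapse everything else into single hyphens.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower() or "untitled"
    return "%s-%s.xml" % (slug, on_date.isoformat())


doc = "<unit><title>Digital Media Unit 1</title></unit>"
print(suggest_filename(doc, "title", date(2003, 9, 22)))
# digital-media-unit-1-2003-09-22.xml
```

Restricting where the file may be saved would sit on top of this, for instance by always joining the suggested name onto a fixed folder or WebDAV path.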

I do love the way you can drop in many XML files and InfoPath, in editor view, drops the elements into place. So I was able to combine many units together into one large unit XML document with a few clicks of a button. I looked at the final XML and it does do the job well. All the XML was valid and well formed, otherwise InfoPath won't let you save it as XML. It does however add this to the document.


<?mso-infoPathSolution solutionVersion="1.0.0.13" productVersion="11.0.5329" PIVersion="1.0.0.0" href="file:///intranetinfopathUnits.xsn" language="en-gb" ?><?mso-application progid="InfoPath.Document"?>


Which isn't half as bad as I thought would be added, especially after looking at WordML.
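
Those two lines are ordinary XML processing instructions, so any XML tool that does not care about them can skip or inspect them. A quick Python sketch of listing them with the standard library follows. The unit root element and its content are invented for the example; the two instructions are the ones InfoPath added above.

```python
from xml.dom import minidom, Node

# The two processing instructions from above, followed by a made-up document.
SAMPLE = (
    '<?mso-infoPathSolution solutionVersion="1.0.0.13" productVersion="11.0.5329" '
    'PIVersion="1.0.0.0" href="file:///intranetinfopathUnits.xsn" language="en-gb" ?>'
    '<?mso-application progid="InfoPath.Document"?>'
    "<unit><title>Example unit</title></unit>"
)


def list_processing_instructions(xml_text):
    """Return (target, data) pairs for instructions before or after the root."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


for target, data in list_processing_instructions(SAMPLE):
    print(target)
# mso-infoPathSolution
# mso-application
```

The data part of each pair holds the pseudo-attributes as one plain string, so pulling out the href to the xsn template would need a little extra parsing.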

So generally speaking, unless InfoPath is expensive or practically impossible to use, I would recommend it, as it turns out valid XML correct to the schema and it's easy enough for a non-XML user to use.


Tim Berners-Lee Royal Society Webcast after thoughts

Tim Berners-Lee Royal Society Webcast. And my previous post

OK, I'm live blogging this while I watch and listen from home. I unfortunately have to say I missed almost one hour of it. The live stream was difficult to get halfway through, but I caught the last 10 mins of Tim's talk and I am still listening to the questions.

Sam from Spiked asked a question about using the semantic web for alternative reasons like making money. Interesting question, and Tim made it clear RDF is like paper and can be moulded into anything you like. There will be those who do just that, while others will use it for pro-human reasons.

Someone in the crowd asked if there was something Tim wished he could have changed in the last 14 years. He replied yes, the slash slash. Great answer, Tim. Not quite what I would have thought he would have said.

An interesting question about XLink came up too. Tim talked about XLink in brief and touched on other areas of the W3C like SVG, SMIL and X3D, explaining how the semantic web was just one part of the W3C and the lecture had to be about something. Then he went back to RDF and touched on annotation in the Amaya browser. Surprisingly, it would seem only a few knew what he was talking about.

A good question came from the web: should the W3C have been involved in streaming media standards? Tim made it clear the W3C doesn't impose standards, but maybe, just maybe, they should have been involved in the discussions at an earlier stage. The question also made reference to the fact you needed IE 5+ with RealPlayer 8 to view the live stream. Cheeky, but a good point made.

Another cracking question came from the web.
Should there be unique IDs for web users to cut down on web fraud, etc.? Tim had a good think about this one for a while, then replied with a sensible answer. For small communities yes, but not on a larger scale like a nation. Everyone should be responsible for their words, but people have the right to be anonymous. He mentioned Slashdot's system as a balanced way of everyone being responsible but also allowing people to be anonymous.
Tim mentioned he will be involved in some talk about this issue somewhere in the UK very soon; it sounded like maybe within 2-3 weeks. Will have to check his blog for more details.

Then that was the end of the questions, which was a real shame. I'm hoping Fly on the Wall will put up a clean version for VOD very soon. I can't believe I missed almost an hour of the lecture. Shame on me; all I was doing was burning CDs and watching the Channel 4 news.
