Browser vendors now own the web?

On the face of it… W3C hands over development of HTML and DOM standards to browser vendors (WHATWG). Sounds like a good idea, right?

I mean the W3C was pushing for the semantic web, more RDF, more linked data and XML structuring.

Down with XML, down with linked data, RDF and the very idea of the semantic web – uggghhhh! (Or something like that, I can hear you all say!)

Well hold on, remember how the web started? Remember the foresight which kept the web free and open? Insights like SVG, when the proprietary alternative of Flash was ruling the web. I for one really liked XML and the suite of technologies which came along with it. XHTML was a joy to use once browser vendors got on board and sorted their act out.

I was there during the fight from HTML4 to XHTML 1.0. I still remember fighting about Microformats vs RDF at BarCampLondon2, and to be fair WHATWG was likely right at the time, but they didn’t have the foresight to look further into the future. The semantic web was a big vision, but what’s the big vision of WHATWG now?

My fear is that handing the web over to mainly browser vendors will lead us back to where the web was during HTML 4.0: a mix of unspecified bits and bobs which rely on native browser capabilities. Who’s fighting for accessibility, i18n, l10n, old systems, etc.? My only hope is that, because the W3C only handed over control of HTML and DOM, they will double down on CSS and ECMAScript.

I want the web to move forward, and I know there was a lot of tension between the W3C and WHATWG, but they kept each other honest. Handing the web over, I fear, will ultimately make things worse for all.

What happened to attribution friendly XPointer?

xpointer use for attribution

I was thinking while writing the last blog post: what happened to the XPointer standard?

XPointer (the XML Pointer language) allows hyperlinks to point to specific parts (fragments) of XML documents.
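To make that definition a bit more concrete, here is a rough sketch in Python of the simplest XPointer form, the shorthand pointer, where the fragment after the # names an element by its ID. The sample document, the example.org URL and the function name are all made up for illustration, and treating any attribute literally called "id" as the ID is a simplifying assumption (real processors decide what counts as an ID from the DTD, schema or xml:id).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical XML document, invented for this example.
DOC = """<post>
  <para id="intro">Attribution is the lifeblood of the creative industry.</para>
  <para id="detail">Bookmarking isn't the solution.</para>
</post>"""


def resolve_shorthand_pointer(xml_text, uri):
    """Return the element whose id equals the URI fragment, or None.

    Assumption: an attribute named "id" is treated as the element's ID.
    """
    _, fragment = urldefrag(uri)  # split "doc.xml#intro" into URL and fragment
    if not fragment:
        return None
    root = ET.fromstring(xml_text)
    for element in root.iter():  # walk every element in document order
        if element.get("id") == fragment:
            return element
    return None


target = resolve_shorthand_pointer(DOC, "http://example.org/post.xml#intro")
print(target.text)
```

The fuller xpointer() scheme could address ranges and even single words inside an element, which is the part that matters for attribution and which this sketch doesn't attempt.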

I guess in the rush to move away from XHTML in favour of HTML5, the whole idea of compound documents got shuffled into a back alley and stabbed to death by the XHTML haters. So even if browsers supported XPointer, the document simply wouldn’t parse as XML and therefore the pointer wouldn’t work.

Interestingly HTML 5.0 has embed, but it’s not solving the same problem XPointer was. For example, here’s WordPress creating an iframe in which Twitter (the third party) can choose to put whatever they like. I think originally it was oEmbed but it got changed.

I’m already slightly concerned that one day my blog will be full of ads, spam, malware, tracking cookies and worse. The day that happens, I’ll be removing all iframes using XSL or a WordPress plugin.

It’s a crying shame, because attribution is the lifeblood of the creative industry and without it, we’re pretty much screwed. It seems crazy that I can’t easily trace back my steps to how I found quotes, blog posts, etc. Right now this whole thing is broken, and bookmarking isn’t the solution. It needs to be at the word level. Personal annotation style?

I have to favourite things on Twitter, look through my play history and search my emails to find who actually recommended something to me. Maybe this can only be solved by the quantified self and lifestreams, but I think there are unexplored ways which XPointer was leaning towards.

Lightweight Attention Preference Markup

So this is the 2nd time I'm writing this, because I forgot to save the entry when I upgraded the memory on my Dell. Yep, 2GB of memory instead of 1GB now, but still no decent blogging tool for Linux. Wbloggar and Ecto would have automatically saved the entry every few minutes, or at least asked me what to do with the unsaved entry before terminating and throwing my words into a black hole. Anyway, enough moaning…

Previously I promised a couple of things in this entry.

First up, I'm going to standardise some way of linking FOAF, OPML, OpenID and APML together. I expect I'll keep this very simple, using the link element in (X)HTML or somehow combining this into an hCard profile. Next up, an APML microformat or APML lite for sure. I'll try it, as I've been studying the others and the general methodology of Microformats, and I think it could be done. So I'll suggest it, draw up how it works and submit it for lots of review. I'm now exploring how to get APML out of Amarok and RSS Owl.

So how far have I got so far?

One : So I have linked all three (APML, FOAF and OpenID) together using links on my blog. If you look at the source you will now see this, which is cool, but I think we can do better.

<link rel="openid.server" href=""/>
<link rel="openid.delegate" href=""/>
<link rel="meta" type="application/rdf+xml" title="FOAF" href=""/>
<link rel="meta" type="text+xml" title="APML" href=""/>

When I say do better, I've been looking at a couple of things. First up is a better way to do the basic link element so it can be turned into an RDF triple later. I found it while looking at RDF/A examples, which will be explained later.

When a meta or link is used within another meta or link, the internal triple has, as subject, the external triple. This is reification.

<link about="" rel="[cc:license]" href="">  <meta property="dc:date" content="2005-10-18" /> </link>

which yields:

[ rdf:subject <>; rdf:predicate cc:license ; rdf:object <> ] dc:date "2005-10-18".

Now I'm not that keen on the syntax, but it's not overly complex and I guess you could do something like this.

	<link about="." rel="[foaf:Person]" href="">
	<meta property="apml:profile" content="" />
	<meta property="openid.server" content=""/>
	<meta property="openid.delegate" content=""/>
	</link>

But I guess getting all those OpenID parsers to change now will be a nightmare, so to be honest I'm happy either way. But I think it does make sense to link everything in the HTML rather than rely on an OpenID parser to look at the HTML, find the URL for the FOAF file and then parse through that to find the OpenID URL. Yes, I already know you can put OpenID in FOAF; that's why I'm saying it's not a good idea to rely on it, but there is no harm in having it in the FOAF optionally. Which is what I'm going to do, but I've recently seen how out of date my FOAF file really is, so I'm going to try and update it soon. If anyone knows how to get FOAF out of Facebook, Flickr, Delicious, LinkedIn, Dopplr, Upcoming, etc., that would be useful. O'Reilly's Connections network used to allow for FOAF but somewhere along the line it seems to have died or closed down; I tried to find it and log in so I could at least start somewhere. So generally number one is done.

Two : So, the huge challenge of building a Microformat for APML, so people can easily put in their preferences without building a very complex XML file. Because let's be honest, like RDF and other XML formats, this stuff was never meant to be built by humans. Also I like the idea of using standard HTML elements and attributes so people can instantly try this stuff out. I saw recently on the microformats blog that there are almost 450 million(?) examples of Microformats now, and it's growing every day. It's not hard to see why when you consider how easy it is to try out some of them. For example, adding a tag is as simple as adding another attribute to a link. Some of the other microformats are a little more tricky, but generally with an example in front of them most people can work it out quickly.

So what's the W3C's answer to Microformats? Well, RDF/A, which is a unified framework built around putting semantic meaning into HTML. A while ago it was meant to be for XHTML 2.0 but it's been brought forward, which is great news, because the only other alternative seemed to be eRDF, which no one could work out was royalty free or not.

OK, I have to admit I'm writing this entry over a couple of days. So I found my way on to the O'Reilly Connections network again. So you should be able to see my public view here. Anyway, the point is that they already have FOAF, which makes my life slightly easier than starting from scratch again.

Going back to APML, I'll try modelling it with RDF/A and see what happens. So far I think my plan is to keep the explicit and implicit context and maybe attach it to an OpenID or unique ID. I'm not going to include stuff like the source because it's too complex and not that relevant for a lightweight version of APML. I mean, if you really want APML, just use APML. If you want something to indicate your preferences (beyond a link) in HTML, what I'm brewing up might just be right for you. I've also decided to call it LiteAPM, as in Lightweight Attention Preference Markup, for now.
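To show how little machinery the "tag is just another attribute on a link" idea needs, here is a minimal Python sketch of a rel-tag reader. It follows the rel-tag convention that the tag is the last path segment of the link's href. The class name, function name and the example.org links are invented for this example, and it is only a sketch, not a full microformats parser.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collects tags from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, e.g. rel="nofollow tag"
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # rel-tag takes the tag from the last path segment of the href
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_tags(html_text):
    parser = RelTagParser()
    parser.feed(html_text)
    return parser.tags


# Hypothetical markup: one tagged link and one ordinary link.
SAMPLE = (
    '<p>Filed under <a href="http://example.org/tags/semanticweb" rel="tag">'
    'semantic web</a>, see also <a href="http://example.org/about">about</a>.</p>'
)
print(extract_tags(SAMPLE))
```

Something like LiteAPM would need a bit more than this (a weighting for each preference, for a start), but the extraction side could stay about this small.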

Three : OK, I'm not being funny, but where the hell does Amarok store its configuration and database? I think I've found RSSOwl's basic configuration stuff, but the content I'm not so sure about yet. But then again I've not tried really hard yet. I can't find a mention of Amarok anywhere. So I hit the web and found a way to pull almost anything I want out of Amarok via the command line. So honestly all I really need now is to learn how to program Perl or install something like XMLStarlet, and learn how to use stuff like cron and Unix pipes. Wow, now I can do all that stuff I've been talking about for a long time. Stay tuned…


Time to get semantic


Something which I didn't mention but others have is the fight between the large-S Semantic Web guys and the small-s semantic web guys, aka Microformats vs RDF. You can see the video here. What us RDF-ish guys were suggesting was using eRDF instead of Microformats for extended semantic markup. We proposed to give RDF in XHTML a new name, Macroformats. Tom Morris, after a chat with some of the microformats guys like Tantek and Kevin Marks, changed the name. Tom has now set up a place where everyone writing semantic markup can get together and promote more semantic markup.

Wow, it's been an absolute mad panic of announcements. Firstly, “macroformats” is dead. It lasted all of a few days, but realism set in – assisted by some pissed off microformateers – and we ditched the name.
We've still got the domain names, but they will redirect and we aren't going to advertise them.
I'm just waiting for the Internet to catch up – specifically, DNS. Once the DNS machine has figured out what it's doing, then we can proceed to building the site.
I actually bought the licence for Snapz Pro X ($69!) because I feel that screencasts are going to be very important in what we are doing. Screencasts certainly helped with things like the Ruby on Rails project.
The plan is to help people understand the process of coming up with their own formats – which can be as simple as writing up a bunch of class names or as complex as coming up with a 3,000 item ontology. Of course, if they only want to do the first one, there'll be people who know how to do all the other steps and will do it for them.
I've sent out a sort of 'vision' statement to the people on the list, but I won't bore you with it here – my blog isn't the best place for it, after all. Once the site launches, something very much like it will be up there.
The first GetSemantic project I'm going to be pushing for is Embedded BibTeX. I use BibTeX a lot. The “citation” work at is suffering because there's no clear cowpath to be paved. But we have a BibTeX ontology written in DAML+OIL and it wouldn't be too hard to use eRDF to turn that in to HTML. I'm already writing academic essays in XHTML with CSS and having the tools to embed and extract those citations would rule.
The other thing that I might do is “hRSS”. hAtom is a great format, but not all web sites can be turned in to Atom – RSS 2.0 serves sites like mine better. I'll follow hAtom as closely as possible, but then move away when the RSS 2.0 specification differs from the Atom specification. Before I get flames, there are good reasons to choose RSS 2.0 if you have untitled blog entries. And, yes, there are good reasons for that too. You may not like the reasons, but they exist.
One of the key differences between GetSemantic and the more formalised microformats is that we're going to say “yes” more often. Think of them as science experiments – have fun, build something, see whether it works. We'll start herding cows down new paths and then if that works, then it might become a microformat. If it doesn't work, then we will learn why it doesn't work and try not to make that mistake in the future.

Anyway, I've graphed out where we're coming from, because it's easy to think we're suggesting Microformats are crap. Well, that's not what we're saying. We all love Microformats but sometimes we find them a little limiting. The example I always use is XFN vs FOAF. XFN has a limited number of relationships, while FOAF has tons. Because you can put FOAF in eRDF, this means eRDF is more extensible. But on the other side, this all adds to the complexity, and the number of people who actually want to do this drops a lot.

Semantic markup graph

Thanks to Sheila who forced me to draw this out a while ago, when trying to explain how eRDF, RDF, XML, etc all fit in the grander scope of things. I'm considering updating it with one including XHTML 2.0 and RDF/A. Oh great work Tom.


Openness in data formats

Me and Tantek

Tantek wrote this thought-provoking entry about data formats and openness, which I can't help but kind of agree and disagree with. So first, his entry.

  1. ASCII is dependable. Project Gutenberg insists on publishing their e-books as plain ASCII text as Mark Pilgrim noted, and their reasons are solid.
  2. Compatible XHTML is now also dependable. In the 15+ years since its public introduction, I believe that HTML has established itself sufficiently prominently worldwide that I feel quite comfortable declaring that HTML will be accepted to be as reliable as ASCII in coming years. In particular, authoring what I like to call Compatible XHTML, that is, valid XHTML 1.0 strict that conforms to Appendix C, is IMHO the way to author HTML that will have longevity as good as ASCII. Note that files in most file systems have no sense of “MIME-type”, thus the winged-mythological-creatures-on-the-head-of-a-pin style arguments about text/html vs. application/xhtml+xml that are often used to discredit either HTML or XHTML (or both) are irrelevant for the most common case of keeping archives of files in file systems.
  3. Plain old XML (POX) formats in the long run are no better than proprietary binary formats. XML, both in technology and as a “technical culture” is too biased towards Tower of Babel outcomes. I've spoken on this many times, but in short, the culture surrounding XML, especially the unquestioned faith in namespaces and misplaced assumed requirement thereof, leads to (has already lead to) Tower of Babel style interoperability failures. As this is a cultural bias (whether intentional or not) built into the very foundations of XML, I don't think it can be saved. There may be a few XML formats that survive and converge sufficiently to be dependable (maybe RSS, maybe Atom), but for now XHTML is IMHO the only longerm reliable XML format, and that has more to do with it being based on HTML than it being XML.
  4. Formats that are smaller (e.g. define fewer terms) tend to be more reliable.
  5. Formats that are simpler (e.g. define fewer restrictions/rules for publishers) tend to be more reliable.
  6. Formats that are more compatible with existing reliable formats tend to be more reliable, e.g. HTML worked well with existing systems that supported “plain text” (AKA ASCII)
  7. Formats that are easier to use, i.e. publish, and more immediately useful, rapidly become widely adopted, and thus become reliable as a breadth of software and services catches up with a breadth of published data in those formats.

The microformats principles were based on these observations. Now this doesn't mean I think microformats will replace existing reliable formats. Not at all. For example, I feel quite confident storing files in the following formats:

  • ASCII / “plain text” / .txt / (UTF8 only if necessary)
  • mbox
  • (X)HTML
  • JPEG
  • PNG
  • WAV
  • MP3
  • MPEG

So my take on Tantek's thoughts.

“Plain old XML (POX) formats in the long run are no better than proprietary binary formats.” See, I take issue with this. I understand what Tantek is getting at, but I would say plain XML without a schema isn't leaning towards the Tower of Babel. And like Tantek already mentioned, RSS and Atom are pretty close to the non-Tower of Babel direction. I would also add FOAF and OPML to the list. I would love for SVG to also be included in this, but alas it's not.

“Formats that are smaller (e.g. define fewer terms) tend to be more reliable.” Good point, hence why things should be broken down, like how XHTML and SVG got Modularization.

My list of formats is slightly different too.

  • XHTML (Unicode)
  • XML (Unicode)
  • JPEG
  • PNG
  • MP3 audio
  • MPEG-4 video
  • WAVE
  • SVG


Semantically changing cubicgarden

This page is xhtml 1.1 valid

It's been all of about a week since I wrote anything. I've been quite busy, but I've actually been working on this blog. I've changed the structure of the pages, which does cause some problems for those of you using Internet Explorer, but most of you are using the RSS/Atom feeds so it's low on my list of changes. I've also finally sorted out most of the issues with why the site didn't validate. As you can see, it now validates. This won't always be the case, due to that well talked about entity problem in copied and pasted URLs. I'm also going to try and use Microformats more than I have in the past. I've not dumped OPML for outlining, but I like XOXO and am actively looking for an application which supports it for quick editing. In the past I was using JOE (Java Outline Editor), which is great because it allows you to run Python scripts which can do many things. But it's not had many updates as of late. So can anyone suggest a XOXO editor besides the JavaScript one? If not, there are XSLs to convert between OPML and XOXO, so I'm not that worried.
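For a feel of what the OPML to XOXO conversion involves, here is a rough sketch. It uses Python rather than the XSLs mentioned above, simply because it is easier to read in a blog post. It assumes the outline text lives in each outline element's text attribute and writes nested ordered lists with class="xoxo" on the outermost one; the sample outline and the function names are made up.

```python
import xml.etree.ElementTree as ET


def outline_to_ol(outlines, is_root=False):
    """Turn a list of OPML <outline> elements into a nested <ol>."""
    ol = ET.Element("ol")
    if is_root:
        ol.set("class", "xoxo")  # XOXO marks the outermost list with this class
    for outline in outlines:
        li = ET.SubElement(ol, "li")
        li.text = outline.get("text", "")
        children = outline.findall("outline")
        if children:
            li.append(outline_to_ol(children))  # recurse into sub-outlines
    return ol


def opml_to_xoxo(opml_text):
    body = ET.fromstring(opml_text).find("body")
    top_level = body.findall("outline") if body is not None else []
    return ET.tostring(outline_to_ol(top_level, is_root=True), encoding="unicode")


# Hypothetical OPML outline, invented for this example.
OPML = """<opml version="1.0">
  <head><title>Notes</title></head>
  <body>
    <outline text="Microformats">
      <outline text="XOXO"/>
      <outline text="XFN"/>
    </outline>
    <outline text="RDF"/>
  </body>
</opml>"""
print(opml_to_xoxo(OPML))
```

Other OPML attributes (type, url and so on) are dropped here; a fuller converter would probably carry them across as links or definition lists inside each list item.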


The challenges of validating cubicgarden


It's one of the dirty little secrets of my blog: I've never been able to get it to validate as XHTML because of a combination of things. So first up, let's have a look at how many errors I currently receive. 127 validation errors to be exact at the moment, without this post. But it's honestly not that bad; well, it is, but let me show you the better side first. If I validate just one post with my current theme/style, you will see there are only 4 errors, and they all point towards my search box which actually links to

<form method="get" action="">
<input type="text" size="31" name="q" />
<input type="hidden" name="id" value="1065" />
<input type="hidden" name="sortby" value="date" />
</form>

So to solve this problem I need to wrap the input elements in another element first. This is simple: I just added a div with an id around the input elements.

OK, so moving on, let's try another single post entry. The errors are varied, but the first one is Error Line 125, column 167: there is no attribute “border”. Yeah, easy to fix, but why would I make such a schoolboy error? Well, I don't; it's actually my blogging application, which automatically adds it when I make an image element. I just keep forgetting to remove it. So the easy thing to do would be to change blogging client, especially seeing how I've been meaning to change to something more powerful for quite some time. I tried to notify the author but had no reply, and there's no forum or bug tracker. Worse still, I can't actually change the element properties in Wbloggar. So I'm going to try Performancing for Firefox and maybe even pay for Ecto. Till then I'm having to edit my posts to remove that border=0. Oh, by the way, Error Line 125, column 172: required attribute “alt” not specified is also because Wbloggar puts the alt text in a title attribute instead. Another reason to move away from Wbloggar.

My next error is my own fault. I'd forgotten that the blockquote element should not contain text content directly, only other block-level elements like a paragraph. So once again I need to go back through my entries and change that. I've also changed my Wbloggar custom tag to add a paragraph element inside the blockquote element. When I change to Ecto or something else, I hope it does this out of the box.

OK, so we're almost there now. But wait, here's the big problem. Let's take my last 5 entries, including this one, which I was typing at the same time as validating.

3. Warning Line 125, column 438: reference not terminated by REFC delimiter
If you meant to include an entity that starts with “&”, then you should terminate it with “;”. Another reason for this error message is that you inadvertently created an entity by failing to escape an “&” character just before this text.

4. Warning Line 125, column 438: reference to external entity in attribute value .
This is generally the sign of an ampersand that was not properly escaped for inclusion in an attribute, in a href for example. You will need to escape all instances of '&' into '&amp;'.

5. Error Line 125, column 438: reference to entity “charset” for which no system identifier could be generated .
This is usually a cascading error caused by an undefined entity reference or use of an unencoded ampersand (&) in an URL or body text. See the previous message for further details.

So basically all the URLs need to be converted to use escaped ampersands (&amp;), otherwise I will never be able to get a validating weblog. So I'm looking into my Velocity templates to see if there is anything which can be done. I thought I'd have a look around at other popular blojsom-based blogs to see if the problem is the same. First up David Czarnecki, same problem. Ravensbourne's Mobile learning blog, same problem. IRIS at VeriSign, yep, you guessed it, same problem. A quick look across the web and the problem seems to be hit and miss: Ben Metcalfe, Robert Scoble, Jeremy Zawodny, Consuming Experience, etc. Geez, there's got to be a way to solve this without actually recrafting URLs when blogging?
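One way this could be handled automatically is a small filter run over the markup before it is published. The sketch below is in Python purely for illustration (the blog itself runs on Velocity templates, so this is not how blojsom does it, and the function name is made up). It escapes any ampersand that does not already start an entity or character reference, so running it twice doesn't double-escape.

```python
import re

# An "&" that is NOT already the start of a named entity (&amp;),
# a decimal reference (&#38;) or a hex reference (&#x26;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace unescaped ampersands with &amp;, leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


# Hypothetical link of the kind that trips the validator.
link = '<a href="/search?q=xhtml&charset=utf-8&sortby=date">search</a>'
print(escape_bare_ampersands(link))
```

One caveat: the pattern only checks that something looks like an entity, not that the entity is actually defined, so a query string such as ?a=1&copy;=2 would be left alone.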


Using Microformats in blog entries

I'm going to start using Microformats a lot more from now on. I've set up Wbloggar with a load of custom tags and hope to use them when blogging. I want to use it as an experiment to see how practical it is to use Microformats in everyday life. I even looked back into XFN for describing relationships. I'll come back to how well it goes, but I'm considering using Ecto instead, as I heard it can run scripts, which means I could put in a real form instead of just code.
