del.icio.us vs. emailing

Michah Dubirko wrote this entry titled del.icio.us, blogging a while ago. I would take it slightly differently and compare del.icio.us to email. Since del.icio.us added the feature to send bookmarks to a friend's bookmark inbox, I've been really tempted to stop emailing links too, but I don't know whether friends are actually getting them.

Comments [Comments]
Trackbacks [0]

SVG support in IE 7.2?

Don't know how I missed this in my aggregator but…

Microsoft publicly stated IE will have core engine support for SVG in IE7.x (most likely 7.2)

Honestly, I'd like to think this will happen, but I've got a feeling there will be a catch. Something like SVG support only working inside a XAML wrapper. Or you will need to enable it in the preferences somewhere.

But worse than what I just wrote, it seems Microsoft's Chris Wilson is stating that the above claim is bogus.

Actually, I did not state that IE7.X will have SVG support. I did say that I think SVG is gaining momentum as part of the interoperable web standards platform, and as such I expect we will add support for it in the future.

As for “IE7.2” – I have not heard anyone inside or outside Microsoft say that, certainly not me. It’s a myth.

On the positive side, if Microsoft did somehow surprise us all with SVG support, they would be joining the 2D vector graphics party. Firefox has had SVG support for ages now, Opera 9 just launched with even better SVG support, and Safari dev builds, Konqueror, SeaMonkey, Camino and Amaya all have different levels of support for SVG.

Openness in data formats

Me and Tantek

Tantek wrote this thought-provoking entry about data formats and openness, which I can't help but kind of agree and disagree with at the same time. So first, his entry.

  1. ASCII is dependable. Project Gutenberg insists on publishing their e-books as plain ASCII text as Mark Pilgrim noted, and their reasons are solid.
  2. Compatible XHTML is now also dependable. In the 15+ years since its public introduction, I believe that HTML has established itself sufficiently prominently worldwide that I feel quite comfortable declaring that HTML will be accepted to be as reliable as ASCII in coming years. In particular, authoring what I like to call Compatible XHTML, that is, valid XHTML 1.0 strict that conforms to Appendix C, is IMHO the way to author HTML that will have longevity as good as ASCII. Note that files in most file systems have no sense of “MIME-type”, thus the winged-mythological-creatures-on-the-head-of-a-pin style arguments about text/html vs. application/xhtml+xml that are often used to discredit either HTML or XHTML (or both) are irrelevant for the most common case of keeping archives of files in file systems.
  3. Plain old XML (POX) formats in the long run are no better than proprietary binary formats. XML, both in technology and as a “technical culture” is too biased towards Tower of Babel outcomes. I've spoken on this many times, but in short, the culture surrounding XML, especially the unquestioned faith in namespaces and misplaced assumed requirement thereof, leads to (has already lead to) Tower of Babel style interoperability failures. As this is a cultural bias (whether intentional or not) built into the very foundations of XML, I don't think it can be saved. There may be a few XML formats that survive and converge sufficiently to be dependable (maybe RSS, maybe Atom), but for now XHTML is IMHO the only longerm reliable XML format, and that has more to do with it being based on HTML than it being XML.
  4. Formats that are smaller (e.g. define fewer terms) tend to be more reliable.
  5. Formats that are simpler (e.g. define fewer restrictions/rules for publishers) tend to be more reliable.
  6. Formats that are more compatible with existing reliable formats tend to be more reliable, e.g. HTML worked well with existing systems that supported “plain text” (AKA ASCII)
  7. Formats that are easier to use, i.e. publish, and more immediately useful, rapidly become widely adopted, and thus become reliable as a breadth of software and services catches up with a breadth of published data in those formats.

The microformats principles were based on these observations. Now this doesn't mean I think microformats will replace existing reliable formats. Not at all. For example, I feel quite confident storing files in the following formats:

  • ASCII / “plain text” / .txt / (UTF8 only if necessary)
  • mbox
  • (X)HTML
  • JPEG
  • PNG
  • WAV
  • MP3
  • MPEG

So my take on Tantek's thoughts.

“Plain old XML (POX) formats in the long run are no better than proprietary binary formats.” See, I take issue with this. I understand what Tantek is getting at, but I would say plain XML without a schema isn't leaning towards the Tower of Babel. And as Tantek already mentioned, RSS and Atom are pretty close to the non-Tower-of-Babel direction. I would also add FOAF and OPML to the list. I would love for SVG to be included too, but alas it's not.

“Formats that are smaller (e.g. define fewer terms) tend to be more reliable.” Good point, hence why things should be broken down, the way XHTML and SVG were with Modularization.

My list of formats is slightly different too.

  • XHTML (Unicode)
  • XML (Unicode)
  • JPEG
  • PNG
  • MP3 audio
  • MPEG-4 video
  • WAVE
  • SVG

Blogosphere is more international than ever before

I've been meaning to blog this for weeks now: Dave Sifry's latest report on the state of the blogosphere. Generally, the blogosphere has become a lot more international, with English slipping as the most used language in the blogosphere. It's actually better than you think too, because English now counts for less than 35% of the blogosphere. There are lots of other interesting things in the report, like the Chinese blogosphere growing a lot due to MSN Spaces and Bokee.com. Dave suggests that Japanese bloggers blog small posts from their phones, hence the huge jump. In the same post, but not really related, Dave talks about how tags and categories are now used by 47% of the blogosphere.

Talking about languages and blogs, the BBC has new additions to its own blog network: Spanish, Arabic and Persian blogs. The Chinese and new Urdu blogs are just around the corner too. I guess this fits perfectly with the latest report. I have yet to try out Native text (a free web service that translates RSS feeds from blogs and podcasts into foreign languages) but it certainly sounds useful. I hear the Persian blog already has a large audience visiting it.

The Chinese blog just launched yesterday in Simplified Chinese, which causes its own problems because it's all in UTF-8. It seems a lot of Chinese-reading people set their browsers to the GB2312 encoding, or to Big5 for Traditional Chinese.
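To make the encoding problem concrete, here is a minimal Python sketch of what happens when UTF-8 bytes are read with the wrong charset. The sample string and the `decode_as` helper are my own illustration, not anything taken from the BBC pages.

```python
# Sample Simplified Chinese text ("Simplified Chinese"); purely illustrative.
TEXT = "简体中文"

# The page is served as UTF-8 bytes.
UTF8_BYTES = TEXT.encode("utf-8")


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes the way a browser forced to `encoding` would,
    substituting a replacement character for anything invalid."""
    return data.decode(encoding, errors="replace")


if __name__ == "__main__":
    for charset in ("utf-8", "gb2312", "big5"):
        # Only the utf-8 line comes out readable; the others are mojibake.
        print(charset, "->", decode_as(UTF8_BYTES, charset))
```

Only the UTF-8 decode round-trips to the original text, which is roughly what a reader with a browser pinned to GB2312 or Big5 would run into.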

The BBC 2.0, just got slashdotted

It's been a while now, but Novus Ordo just submitted On The BBC 2.0 to Slashdot. It only went up about an hour ago but it's already received 80 comments. Quite a few sink into the usual BBC bias and BBC World vs BBC arguments. But there's an interesting related question about Slashdot's CSS redesign contest and the BBC's reboot.bbc.co.uk contest. There's lots of moaning about the fact you can't actually download or stream any clips or movies from the catalogue, in a thread called great resource but incomplete.

On the plus side I caught this comment by Lobais.

A thing I really think they should do 'to keep the BBC relevant in the digital age.' is to make xmltvfiles of all their tv and radio programme info. This would make them very useful for a lot of people, and sure wouldn't be very hard.

Although this only received a +1 and an insightful mark, it's easy to forget about the simple things we could be doing more of. Although there is an argument that the Programme Catalogue is just that. Plus, as Pldms pointed out, we provide 7 day listings for all channels in TV-Anytime XML format.
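For anyone wondering what Lobais is asking for, this is a rough Python sketch of building a tiny XMLTV listing. The channel id, programme title and times are made up for illustration and are not real BBC data; the element names (`tv`, `channel`, `display-name`, `programme`, `title`) follow the XMLTV format as I understand it.

```python
import xml.etree.ElementTree as ET


def build_xmltv(channel_id, channel_name, programmes):
    """Return an XMLTV-style document as a string.

    `programmes` is a list of dicts with 'start', 'stop' and 'title' keys;
    times use the XMLTV convention 'YYYYMMDDhhmmss +ZZZZ'.
    """
    tv = ET.Element("tv")
    channel = ET.SubElement(tv, "channel", {"id": channel_id})
    ET.SubElement(channel, "display-name").text = channel_name
    for prog in programmes:
        element = ET.SubElement(
            tv,
            "programme",
            {"start": prog["start"], "stop": prog["stop"], "channel": channel_id},
        )
        ET.SubElement(element, "title").text = prog["title"]
    return ET.tostring(tv, encoding="unicode")


# Hypothetical listing, only here to show the shape of the output.
SAMPLE = build_xmltv(
    "channel-one.example",
    "Channel One",
    [{"start": "20060401180000 +0000",
      "stop": "20060401183000 +0000",
      "title": "Evening News"}],
)

if __name__ == "__main__":
    print(SAMPLE)
```

A real feed would carry descriptions, categories and so on, but the basic structure really is this small.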

A comment which I couldn't help but agree with was this one by Larry Lightbulb.

The first and possibly only thing they should change about the BBC home page is the fact that it's designed to be viewed at a resolution of 800×600. Surely a company as big as the BBC is capable of producing a web site that utilizes all of the screen space available in a browser window?

See, I tend to strongly agree with this, but I understand the reasons why it's sticking to an 800 format. Personally I don't think there's any excuse for an 800 format when you're using XHTML+CSS (unless that's the desired effect). So when we move in that direction I would like to see the 800 constraint dropped.

I'll be keeping an eye on the incoming comments…

Blojsom 3.0 adds database storage and an even stronger API

My favourite blogging server, Blojsom, is shifting to database storage for its next version. David Czarnecki, the owner of the open source project, outlined its very active history.

  • 01/29/2003 – blojsom project was registered on SourceForge and development was started.
  • 02/02/2003 – blojsom 1.0 was officially released. 18 releases were made in the 1.x cycle.
  • 09/10/2003 – blojsom 2.0 was officially released.
  • 06/28/2004 – Apple officially announces Tiger Server wherein blojsom is bundled as Weblog Server.
  • 03/14/2006 – blojsom 2.30 was officially released. 30 releases have been made in the 2.x cycle.

I remember running Blojsom betas; I think I started at Blojsom 0.7, when it could only handle one blog at a time. Then Blojsom 2.x came around and gave the whole project a real boost because it could easily handle many blogs under one install. I think the record is still 25,000 by some university in Australia. During the 1.x life of Blojsom, lots of plugins were developed and Blojsom was seriously deconstructed by the guys at HP research labs as part of their semantic blogging project. It's one of the things which I loved about Blojsom: its nod towards something bigger than just simply blogging. Jon Udell did a talk about controlling our own data at Etech recently, and one of the snippets I heard was about how he would run XPath searches over his blog to pull out certain things. It's a step beyond tagging, but one of the things which Blojsom has had for quite some time (Q3 2003 actually). Blojsom also has some other great stuff going for it, like LDAP support!
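To give a flavour of what an XPath search over a blog looks like, here is a small Python sketch using the limited XPath support in the standard library. The `<blog>`/`<entry>` markup is a made-up example; it is not Blojsom's storage format, nor necessarily how Jon Udell does it.

```python
import xml.etree.ElementTree as ET

# Hypothetical blog export; element and attribute names are assumptions.
BLOG_XML = """<blog>
  <entry category="svg"><title>SVG support in IE?</title></entry>
  <entry category="blojsom"><title>Blojsom 3.0</title></entry>
  <entry category="svg"><title>Google using SVG</title></entry>
</blog>"""


def titles_in_category(xml_source, category):
    """Return the titles of all entries whose category attribute matches."""
    root = ET.fromstring(xml_source)
    # ElementTree understands a subset of XPath, including [@attr='value'].
    path = f".//entry[@category='{category}']/title"
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    print(titles_in_category(BLOG_XML, "svg"))
```

The point is that the query works on the structure of the posts rather than on tags someone remembered to add.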

Anyway, it's an awesome blogging server and I believe Blojsom 3.0 will be better than WordPress. It's outgrown its roots in Blosxom, which I believe is now struggling to stay around? And outgrown all the Java solutions like Roller and SnipSnap. Being Java based will keep it out of the mainstream, because most people have a LAMP setup at their host; otherwise Blojsom 3.0 would be a bigger deal. Anyway, more details about Blojsom 3.0:

The first major change has been in the way blojsom is “wired” together. I've rewritten blojsom to use Spring for its dependency injection and bean management. There were aspects of the blojsom 2.x codebase that were more “patchwork” with respect to how certain components used or referenced other components.

The second major change has been in the datastore. I don't necessarily think I've exhausted all that can be done using the filesystem as a content database, but I've been feeling like there's a lot of development energy into making relations between data in the filesystem that can be expressed very easy using a relational database.

In blojsom 3.0, I've settled on using a relational database for the datastore. I'm using Hibernate as the ORM library to manage the data. This means goodbye to all the .properties files for configuration! It was fun while it lasted. The templates and themes are still stored on the filesystem, but I'd envision also storing the template data within the database as well. I've already prototyped use of the Velocity database template loader. I imagine removing any filesystem dependency will allow blojsom to be used in a clustered environment more easily.

Ultimately I think this will allow blojsom to scale much more than I think it can using the filesystem as a content database. I don't believe there are any esoteric relationships among the data in blojsom as to require a full-time DBA to manage an installation of blojsom.

The last major change has been in evolving blojsom's API.

For awhile now there are aspects of the API that were a throwback to needing certain data or referring to elements a certain way. I just wanted a more self-documenting and less redundant API.

For example, I've renamed the BlojsomPlugin interface to Plugin. I felt that having the org.blojsom.plugin package was declarative enough, but that keeping BlojsomPlugin was too redundant. None of the APIs have gone away, they're just more simple and straightforward.

The long and short of it is that you can do all of the things in blojsom 3.0 that were done in previous releases of blojsom. There are a few more components and plugins to migrate to 3.0, but I'm happy with how far things have come in such a short time given the scope of the changes.

You're more than welcome to start playing with blojsom 3.0 right now. All that you need to do after setting up your database is to add a blog and a user for that blog and you'll be able to login through the administration console.

If any of this interests you, feel free to participate on the blojsom-developers mailing list.

Being hosted with Hub.org, it would be wrong for me not to choose PostgreSQL for my database backend. I would love to try other storage backends like an XML database, but I can't quite experiment with this blog till I've tested it fully. Maybe there will be a way to run one blog on a database and another on a filesystem or XML database? Because that would be great. If worst comes to worst I will just run another copy of Blojsom for testing purposes.

Microformatting ID

Doc Searls posted an entry about Jeremy Miller's MicroID proposal. It's a microformat of sorts which allows anyone to claim verifiable ownership of content they generate. You simply hash a communication ID like an email address, then hash the URI of where the content will be published. Then you hash the two hashes together to generate your unique MicroID. Don't worry, there's a generator on the MicroID site.

MicroID = sha1_hex( sha1_hex( "mailto:user@email.com" ) + sha1_hex( "http://website.com" ) );
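If you would rather generate it yourself, the formula above can be sketched in a few lines of Python. The function names `sha1_hex` and `micro_id` are just mine, mirroring the pseudocode, and I'm assuming the strings are hashed as UTF-8.

```python
import hashlib


def sha1_hex(value: str) -> str:
    """SHA-1 of a string (encoded as UTF-8, an assumption), as lowercase hex."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def micro_id(identity_uri: str, content_uri: str) -> str:
    """Hash both URIs, concatenate the two hex digests, then hash again."""
    return sha1_hex(sha1_hex(identity_uri) + sha1_hex(content_uri))


if __name__ == "__main__":
    # Placeholder values taken from the formula above.
    print(micro_id("mailto:user@email.com", "http://website.com"))
```

Anyone who knows both the email address and the page URI can recompute the value and compare it, which is what makes the claim checkable.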

The important thing to remember is that MicroID is just a way to claim ownership, not a form of authentication. It's also very simple to add anywhere. One of the examples is to put the MicroID in your meta tags, which I have just done. You can also stick the MicroID in a div tag using the class attribute. I'm not so keen on this method; I think semantically it would be better if it was attached in the id attribute. But I guess it would break if you had more than one piece of content from the same author in the page, since id values have to be unique within a page.

I do like the idea of generating a MicroID for every comment which gets published to a blog. Maybe this is one for the Blojsom groups.

Semantically changing cubicgarden

This page is xhtml 1.1 valid

It's been all of about a week since I wrote anything. I've been quite busy, but I've actually been working on this blog. I've changed the structure of the pages, which does cause some problems for those of you using Internet Explorer, but most of you are using the RSS/Atom feeds so it's low on my list of changes. I've also finally sorted out most of the issues with why the site didn't validate. As you can see, it now validates. This won't always be the case, due to that well talked about entity problem in copied and pasted URLs. I'm also going to try and use microformats more than I have in the past. I've not dumped OPML for outlining, but I like XOXO and am actively looking for an application which supports it for quick editing. In the past I was using JOE (Java Outline Editor), which is great because it allows you to run Python scripts which can do many things. But it's not had many updates of late. So can anyone suggest a XOXO editor besides the JavaScript one? If not, there are XSLs to convert between OPML and XOXO, so I'm not that worried.
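The OPML to XOXO conversion is simple enough that it can also be sketched without XSL. Here is a minimal Python version, assuming a plain OPML outline where each `outline` element carries a `text` attribute; it ignores links and any other attributes, and the function names are my own.

```python
import xml.etree.ElementTree as ET


def outline_to_li(outline):
    """Turn one OPML <outline> element (and its children) into an <li>."""
    li = ET.Element("li")
    li.text = outline.get("text", "")
    children = outline.findall("outline")
    if children:
        # Nested outlines become a nested ordered list.
        ol = ET.SubElement(li, "ol")
        for child in children:
            ol.append(outline_to_li(child))
    return li


def opml_to_xoxo(opml_source):
    """Convert an OPML document string into a XOXO <ol class="xoxo"> string."""
    root = ET.fromstring(opml_source)
    xoxo = ET.Element("ol", {"class": "xoxo"})
    body = root.find("body")
    if body is not None:
        for outline in body.findall("outline"):
            xoxo.append(outline_to_li(outline))
    return ET.tostring(xoxo, encoding="unicode")


# Tiny hand-written outline, for illustration only.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Formats</title></head>
  <body>
    <outline text="Outlining">
      <outline text="OPML"/>
      <outline text="XOXO"/>
    </outline>
  </body>
</opml>"""

if __name__ == "__main__":
    print(opml_to_xoxo(SAMPLE_OPML))
```

Going the other way is the same walk in reverse, which is why I'm not too worried about being locked into either format.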

Tim Berners-Lee Semantic web lecture

Tim Berners Lee in Oxford

After the mad panic trying to get the train up to Oxford due to the Trainline machine at work not working, we arrived at the Oxford University venue well before the start time and picked a great spot for the lecture. Tim Berners-Lee was good to see live; you could see he certainly was no Steve Jobs. He was more like Bill Gates, a little uneasy with public speaking but happy to talk about his vision and his work towards that vision. That vision is the Semantic Web. Rather than me explaining every aspect of the talk, it's best I point you towards Tim's S5 presentation, a webcast (coming soon), this blog and my notes. I've also added my photos from the lecture to Flickr.

So generally I'm even more sure that the semantic web is happening, but within certain domains. Will the semantic web happen across the whole web? Doubtful at best. Recent developments in web 2.0 have really pushed the web towards a richer semantic web, but away from top-down ontologies and rules.

Oh, and believe it or not, me and Miles were quoted in the New Statesman blog.

Live clipboard from Microsoft

Before I've even had the chance to play with Microsoft's Simple Sharing Extensions, Ray Ozzie has shared a prototype they have been playing with internally. It's called Live Clipboard and is basically a clipboard for the semantic web.

It's a JavaScript-based solution which works in most browsers, including Internet Explorer and Firefox. It stores data on the page as actual XML data trees which can be copied and pasted without having to select the text content. It's a difficult concept to explain, but luckily Ray's got tons of screencasts to show how it works. The interesting thing is that Live Clipboard works not only in the browser domain but also in the desktop domain. Thanks to 25hours a day for the Etech trip report, which alerted me to Live Clipboard in my RSS reader today.

Honestly, when I first read the post, I did think this would be perfect as a Firefox extension or even a Greasemonkey script, but you would miss out on the desktop side of things. I'll be interested to know how flexible Live Clipboard is. For example, will it read all types of microformats? How about FOAF and XFN? Hmm, I wonder if you could do something between a Firefox extension and a Yahoo Widget?

An XSL transformation mindset

Someone asks on Metafilter.

When you imagine XSLT transformations happening in your mind's eye, what does it look like?

It's a really good question and opens up a whole range of thinking about the differences in people's thought processes. So first Jeff talks about the question.

This is a very powerful question to ask, because ancient, procedurally oriented developers like me sometimes have trouble following the non-linear, pattern-driven processing that takes place when an XSLT template is applied to a tree of XML elements. In fact I have noticed that non-developers sometimes have an easier time with XSLT than do experienced developers, because they don't try as hard to figure out what is happening beneath the covers.

I would kind of agree with that statement. There's something about XSL and XML which just makes sense in my head. I'm not from a traditional software or computer science background, so I still find it weird to be called a programmer by some of my peers. John wrote this fantastic comment.

My first project with XSLT a few years back was to actually generate XSLT *from* XML and XSLT and forced me to break my ideas of how it worked. When I finally got the whole “it happens all at once” approach, it started to make sense. However, every programmer that I've brought on board to an XSLT project since has had trouble getting out of the procedural thinking and that ends up being the biggest source for their mistakes.

Unfortunately, like MagicEye images, some people just aren't able to unfocus their minds in the right way to really grok XSLT beyond the simplest examples.

I have heard of programmers comparing XSLT to Prolog and even Lisp. I'm not sure how true this is, but it's certain that you can't approach XSLT in a regular way. Recursion is one of those things which seems to drive people mad. In XSL there's a lot of recursion and declaration, which seems to fit the way I think. I always wanted to create an SVG of an XSLT process, so you can see in lines and boxes what templates are being called and add some kind of dimension to XSL. I'm sure it's not that hard, and even my experiments with transforming Cocoon's sitemap file into SVG didn't require too much work. Talking about recursion, someone posted this nice animated gif of how it all works. There's no doubt that XSL requires a different mindset, and working with a programming language like Java or Perl will be more of a hindrance than an advantage.
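For the procedural folks, one way to picture the pattern-driven recursion is a toy imitation in Python: rules are registered per element name, and a single `apply_templates` function picks whichever rule matches each node as it walks the tree. This is only an analogy of my own, not how a real XSLT processor is built, and all the names in it are invented for the sketch.

```python
import xml.etree.ElementTree as ET

# Maps an element name to the function that handles it.
TEMPLATES = {}


def template(match):
    """Register a rule for elements named `match`."""
    def register(func):
        TEMPLATES[match] = func
        return func
    return register


def default_rule(node):
    """Fallback rule: copy the text and recurse into the children."""
    parts = [node.text or ""]
    for child in node:
        parts.append(apply_templates(child))
        parts.append(child.tail or "")
    return "".join(parts)


def apply_templates(node):
    """Pick the rule matching this node; the tree drives the control flow."""
    rule = TEMPLATES.get(node.tag, default_rule)
    return rule(node)


@template("post")
def post_rule(node):
    return "<div>" + default_rule(node) + "</div>"


@template("title")
def title_rule(node):
    return "<h2>" + default_rule(node) + "</h2>"


@template("em")
def em_rule(node):
    return "<i>" + default_rule(node) + "</i>"


def transform(xml_source):
    return apply_templates(ET.fromstring(xml_source))


if __name__ == "__main__":
    source = "<post><title>Hello</title><body>Some <em>text</em> here</body></post>"
    print(transform(source))  # <div><h2>Hello</h2>Some <i>text</i> here</div>
```

Notice there is no main loop deciding what happens next; you only say what each kind of node turns into, and the recursion does the rest.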

I posted this question to a few of the XSL developers I know and got a variety of answers. In my own mind I see lots of lines and trees which get broken into branches.

It's all about Yahoo, again?

The one to look out for is certainly Yahoo. This week they released their UI library to the public under a BSD licence and then showed off their design patterns, which I sent around to our designers for consideration. I also got the chance to read through Tom Coates' fantastic presentation at the Carson Workshops Future of Web Apps summit. And to top everything off, Yahoo is now hiring semantic web developers? Yahoo is once again on a roll. No wonder Tom Coates moved. Oh, by the way, don't forget to check out Simon's notes, which are great when flicking through the PDF presentation offline.

Forgot to mention Jeremy Zawodny has taken the main points and broken them down into something which can be translated for product managers. Yahoo are certainly on a run!

Tagging which way? How about my way?

Story telling fest

Looking through my "to read at some point in the future" tagged category in Great News, I found this useful summary of the problem with tagging online at the moment. Tag formats: Can’t we all just get along? covers the main tagging applications online and shows the confusion between space-separated keywords and the comma-separated method.

So where do I fall on this issue? Well, although I use Flickr and Del.icio.us almost every day, I think they could both benefit from using commas to separate tags. All the latest services I've used which support tagging have used commas, because they make a lot more sense. As Victor says in the comments,

commas are faster than quotes.

as i see it (in my own experience) tags can be annoying if you don’t really care about them when you have to enter them. Usually you care about them later on, when you cannot find what you’re looking for. but they’re still a(nother) time-consuming task.

i’d use fast, thus i’d use commas.

The only thing which puts me off commas is the language issue, which is that some languages use commas for other things. There was a suggestion to use semicolons, but I feel that would go down like listening to your iPod in a church service. Other solutions which I've seen around the web include autosensing spaces or commas and the Amazon box model type thing, which I personally think sucks because it takes too long to fill the boxes in. I wonder why no one's written a Greasemonkey script to allow people to pick a method which will be translated across all tagging services. So I can type commas into Flickr and it just translates them into spaces for me. Yeah, it's very lazyweb stuff. But as FataL points out, this can't be that hard.

Computer now smart enough to parse them all:
south asia, africa = [south asia] [africa]
“south asia” africa = [south asia] [africa]
‘south asia’ africa = [south asia] [africa]
(south asia) africa = [south asia] [africa]
south asia – africa = [south asia] [africa]
It’s not so hard to program all this I believe.
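FataL's examples are small enough to try out. Below is a rough Python sketch that handles the five inputs listed above; the function names and the exact rules (commas or a spaced dash win, otherwise quoted or bracketed groups, otherwise spaces) are my own guess at a sensible reading, not any service's real behaviour. The last helper does the lazyweb translation back to a space-separated style with quoted multi-word tags, which is an assumption about what the target service expects.

```python
import re

# Delimiter pairs that group several words into one tag.
GROUP_PATTERNS = [
    r'"([^"]+)"',    # straight double quotes
    r"'([^']+)'",    # straight single quotes
    "“([^”]+)”",     # curly double quotes
    "‘([^’]+)’",     # curly single quotes
    r"\(([^)]+)\)",  # parentheses
]
GROUPED = re.compile("|".join(GROUP_PATTERNS))


def parse_tags(raw):
    """Split a tag string written in any of the styles above into tags."""
    raw = raw.strip()
    # An explicit separator (a comma, or a dash with spaces around it) wins.
    if re.search(r",|\s[-–]\s", raw):
        parts = re.split(r"\s*,\s*|\s+[-–]\s+", raw)
        return [part.strip() for part in parts if part.strip()]
    tags = []
    position = 0
    for match in GROUPED.finditer(raw):
        # Loose words before the group are single-word tags.
        tags.extend(raw[position:match.start()].split())
        grouped = next(g for g in match.groups() if g).strip()
        if grouped:
            tags.append(grouped)
        position = match.end()
    tags.extend(raw[position:].split())
    return tags


def to_space_separated(tags):
    """Render tags space-separated, quoting any tag that contains a space."""
    return " ".join(f'"{tag}"' if " " in tag else tag for tag in tags)


if __name__ == "__main__":
    for example in ("south asia, africa", "(south asia) africa"):
        print(parse_tags(example))
```

It falls over on tags containing apostrophes or commas of their own, so the language issue is still there, but the common cases really aren't hard.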

Google Using SVG

A few days ago Google released a series of statistics on the way in which HTML (and a few other things, such as HTTP and scripting) is used in the wild, wild Web. As in any good statistics report they have accompanying graphical charts. The interesting aspect in this instance is that those charts are available only in SVG.

You will need a recent version of either Firefox, Opera, an SVG-enabled Safari build or Konqueror to see them, apparently due to minor markup issues that prevent IE and/or ASV from working. It certainly is interesting to see a major web site such as Google use SVG for live Web content. SVG support moving away from plugins and into browsers does appear to have the effect of helping it edge its way into the mainstream.

I thought about this the other day when looking through the fantastic series of Google statistics. Good to see SVG used by a huge company like Google. I mean, it makes sense to put the graphs in SVG format, but it's a calculated risk on Google's part, and it looks to have paid off, because I've not seen many people make a fuss about not seeing the graphs. Actually, looking around the web, SVG is really starting to become a reality for general web use. It reminds me to check out the Canvas element and HTML 5, which were both mentioned at the last geekdinner with Dave Shea.
