Tiger RSS screensaver

Engadget RSS Screensaver

Kelvin was talking to me about RSS in the new Safari RSS browser, and I mentioned the buzz about the new RSS screensaver, which I forgot to ask him about last time. Well, he sent me a link to a QuickTime movie of the screensaver in action. Although it's quite wanky, I love it! My Pocket PC has a passive RSS section on the today screen which works well for grabbing your eye once in a while with new news. Well, this is the logical conclusion of the same idea. There has been lots of talk about its usability, but you know what, I've got an RSS reader on my Xbox and it works in the same way: just passively grabbing your eye when something interesting comes along.

So I'm not getting a Mac anytime soon, so what can I do? Well, with some searching I found this RSS screensaver, which is quite nice. Accenture also has one, which I have yet to try. This tutorial also looks quite useful. I'll give Apple this much: these screensavers are nowhere near as juicy as theirs, but I'm sure someone will do something similar soon. I was actually thinking, if you combine a VRML screensaver with an XMLHttpRequest and some well-thought-out XSL, you could build something quite striking.

Comments [Comments]
Trackbacks [0]

Structured blogging, semantics and tags

I have been meaning to blog about the structured blogging idea for quite some time now. The idea is this…

Structured blogging is about making a movie review look different from a calendar entry. On the surface, it’s as simple as that – formatting blog entries around their content.

On another level, it’s a bit more complicated – what we want to do is create structure (in the form of XML) around each of these types of entries, to organize the data inside and to let machine readers – other programs, sites, and aggregators – better understand the content.

Yep, great, can't disagree with that. Greg from Blogdigger has been thinking about this too. The overall concept is an interesting one for me, but I see lots of connections with the semantic blogging idea, which has been pioneered by the great guys at HP Research Labs for quite some time now. Semantic blogging, to me, generally means adding machine-readable metadata to the content of the blog entry. Blojsom has had meta capability for years now, which I have not really made full use of yet. But there's lots of work already moving ahead in this area around FOAF, XFN and tagging or folksonomy.

With Friend of a Friend (FOAF) it's easy to put a link between a person's name and their FOAF profile. I'm sure it would be trivial to add a rel attribute to the link for more context, which leads nicely on to the XFN idea, which is exactly that: using rel attributes and a classification of types to represent human relationships using hyperlinks. It works well, but I would like to get rid of the classification and go more fuzzy, like tagging. The best examples of tagging, full stop, have to be Del.icio.us, Flickr and others, but tagging on personal blogs is still quite new. O'Reilly's Radar points to what's possible with a little time and lots of metadata in entries. Technorati makes this possible via their site. For example, I have tagged this entry and sent them a ping to make sure they pick it up. There's nothing special in the links besides the rel attribute, which is set to tag.
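A Technorati-style tag really is just an ordinary hyperlink with rel set to tag, so generating them is trivial. Here's a throwaway Python sketch; the tag word and base URL are just examples, not anything Technorati mandates.

```python
def tag_link(tag, base="http://technorati.com/tag/"):
    """Build a Technorati-style tag link: a plain anchor with rel="tag"."""
    return '<a href="%s%s" rel="tag">%s</a>' % (base, tag, tag)

print(tag_link("folksonomy"))
# <a href="http://technorati.com/tag/folksonomy" rel="tag">folksonomy</a>
```

That rel="tag" attribute is the only thing aggregators need to spot in the markup.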

I also have to agree with Greg about what happens to the metadata beyond the blog. I really want to see RSS feeds with the same metadata making its way towards aggregators via namespaces and RDF. GRDDL is good and well worth looking into for the future. But once again, this is all covered in the semantic blogging ideas. It has been a long time in coming, but I disagree that the time is right. Blogging clients need to change to reflect these changes, otherwise no one except a handful of people are going to manually put metadata into their posts. I will, because I'm jazzed about metadata and semantic blogging; others will not, even if the benefit is different-looking entries on their own blogs.

Comments [Comments]
Trackbacks [0]

The importance of open data

Xtech 2005 conference

Miles sent me the link to the O'Reilly Radar, which I had never seen before. Anyhow, Tim O'Reilly talks about the XTech conference, which I'm also involved in, and how the open data track is a wee bit more experimental than the other tracks. The BBC and RSS are mentioned, which is good for myself, Kevin and Joel.

If you are going to XTech 2005, please let me know. I'm really hoping to soak as much in as humanly possible. Edd's been doing some interesting things already in the name of open data. On Saturday he turned the XTech 2005 schedule into XML and asked people to remix it. The file can be found here as grid.xml. Dan Connolly was quick on the take and converted the raw XML into an iCal file which can be pulled into a modern calendar application. So cool, and all the XSLs are available for anyone to pull apart and learn from. I love remix culture.
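Dan did the conversion with XSLT, and I don't know the exact shape of Edd's grid.xml, but the remix idea itself is easy to sketch: read the schedule XML, spit out iCal events. Here's a rough Python equivalent with a completely made-up schema.

```python
import xml.etree.ElementTree as ET

# Invented grid.xml shape -- the real file will differ.
schedule = """<grid>
  <session start="20050525T090000" end="20050525T094500"
           title="Open Data: RSS Syndication For A Worldwide Audience"/>
</grid>"""

def to_ical(xml_text):
    """Turn the (hypothetical) schedule XML into a minimal iCal document."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0"]
    for s in ET.fromstring(xml_text).findall("session"):
        lines += ["BEGIN:VEVENT",
                  "DTSTART:" + s.get("start"),
                  "DTEND:" + s.get("end"),
                  "SUMMARY:" + s.get("title"),
                  "END:VEVENT"]
    lines.append("END:VCALENDAR")
    return "\n".join(lines)

print(to_ical(schedule))
```

Once the schedule is plain XML, any number of remixes like this fall out of a few lines of transformation.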

Comments [Comments]
Trackbacks [0]

Syndication for a world wide audience

People have slowly caught on to the problems with RSS syndication and languages. If you follow the links back from the blogdigger blog entry
you will start to notice a pattern of people not quite being able to put their finger on the problem. And the reason is that it's not actually a single problem; it's more a muddle of problems. Andy puts it well, but I may have the killer paragraph which explains it all.

It is a chicken and egg problem. If the content publishers do not provide RSS feeds with correctly structured language metadata which software engineers can cut their teeth and applications on, then the stalemate will proceed as it does today. Certainly this is one way of looking at it. The other viewpoint is that software engineers need to put language features into their software, otherwise there is no point in content providers using correctly structured language metadata and modules to describe language content…

This is taken from my draft paper, which I am currently finishing, on the same subject of RSS and languages. See, Blogdigger are right, but how many feeds do they get from non-Latin languages which have language metadata they can actually use? This quote comes from Mark Fletcher of Bloglines:

But the more important question is, are the majority of feeds accurately labeled in terms of language. And in our experience, the answer is unfortunately a resounding no.

I would echo that too: when looking for examples of non-Latin RSS feeds, they tended to have little language metadata (some were actually still marked as English!). Is this a limitation of the RSS standards or something else? Well, in my paper it would seem no one gets away clean. For a quick taste of what I mean, look at the complete (you call that complete?) list of language codes which can be used in the RSS 0.91 spec. Yes, I know it's old, but it's still quite scary for 2000. Try and find Arabic, Hebrew and other non-Latin languages.
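How thin the labelling is takes only a few lines to check. This Python sketch pulls the channel-level language element out of a feed (the feed snippet is invented); in my experience most non-Latin feeds come back with nothing at all, or worse, "en".

```python
import xml.etree.ElementTree as ET

# Invented example: an Arabic-content feed mislabelled as English.
feed = """<rss version="2.0"><channel>
  <title>Example Arabic feed</title>
  <language>en</language>
</channel></rss>"""

def feed_language(xml_text):
    """Return the channel <language> value, or None if the feed never says."""
    el = ET.fromstring(xml_text).find("channel/language")
    return el.text if el is not None else None

print(feed_language(feed))  # prints: en
```

Run that over a pile of real feeds and you quickly see Mark Fletcher's "resounding no" for yourself.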

If you're interested in more information in this area, please keep an eye on this blog, where I will post my paper sometime in late May or early June. Or even better, come and listen to my presentation on the paper at XTECH 2005 in late May.

Comments [Comments]
Trackbacks [0]

Ajax? asynchronous JavaScript and XML better known as Remote Scripting

Well, it looks like there's some push behind the AJAX naming now. I believe it stands for Asynchronous JavaScript and XML and translates into this easy-to-understand list.

  • standards-based presentation using XHTML and Cascading Style Sheets (CSS);
  • dynamic display and interaction using the Document Object Model;
  • data interchange and manipulation using XML and XSLT;
  • asynchronous data retrieval using XMLHttpRequest;
  • and JavaScript binding everything together.

I couldn't give a crap what it's called, but it's certainly the way to do rich interaction in the browser without using Flash or anything non-standard. It's all interesting because about 3-5 years ago standard CSS and XHTML were becoming accepted in the modern browsers, while a standard browser DOM or JavaScript was a long way off. Now it seems we're hitting the point where you can use standard JavaScript across all modern browsers. Which is indeed a good step forward.

Sam Ruby considers Ajax harmful, but seems to have more of a problem with bad uses of it. Paul vents his frustration with others on the naming of Ajax. Dare covers the whole Ajax issue once and for all.

Comments [Comments]
Trackbacks [0]

The joy of Cocoon, XML technologies and beyond

For quite some time I've been talking about and pushing the use of open RESTful APIs. It's so easy now: sign up, get an API, development or session key, etc., and you're pretty much away. The only question then is which framework you choose to develop in.
I personally use Apache Cocoon because it allows me to do almost anything I like, and it's built completely on XML technologies like XSL. One of my aims back in 2002 was to use XSL as a tool for almost anything I needed to do again and again. Well, that aim is very real now. For example, I was recently creating quite a few emails with slightly different information in each one. I thought about using copy, paste, find and replace. But there was no need: I just wrote some really simple XML files which contain parts of the information, then a sitemap in Cocoon with a little bit of simple logic to read the URL string and pass certain parts as variables to the XSL. Before you knew it (I timed myself: it took only 45 minutes, with instant message interruptions) I had what I wanted. The only thing missing was for the pipeline to send an email at the end; I still had to copy and paste the result into an email. So how would I solve such a problem?

Well, from my understanding of Servlets, it shouldn't be too difficult to send messages/streams from Cocoon to an SMTP JAR which sits in the same servlet container? But I'm still a little unsure of how this is exactly done. The other thing I could do is create an email file which Thunderbird would pick up and send, or at least put in the outbox with little user interaction on my part.
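My pipeline is Cocoon and Java, but the missing last step — wrapping the transformed result in a message and handing it to SMTP — is the same idea in any language. Here's a Python sketch of it; the addresses and server are placeholders, and the actual smtplib send is commented out so nothing gets sent.

```python
from email.mime.text import MIMEText
# import smtplib  # only needed if you actually want to send

def build_email(body, subject, to_addr, from_addr):
    """Wrap a pipeline-generated body in a ready-to-send MIME message."""
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["To"] = to_addr
    msg["From"] = from_addr
    return msg

msg = build_email("Generated by the XSL pipeline", "Hello",
                  "someone@example.com", "me@example.com")
print(msg["Subject"])  # prints: Hello

# To really send it, something like:
# with smtplib.SMTP("localhost") as s:
#     s.send_message(msg)
```

In the Cocoon world this would presumably be a serializer or action talking to JavaMail, but the shape of the solution is the same.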

Anyhow, one of the issues with Cocoon for the longest time was getting content into it. Yes, you have many different neat readers: the directory reader, Zip reader, standard XML reader, image reader, JSP reader — hey, there was even a database reader which would create a connection to any SQL database. But there's so much more needed in this area. For example, a while ago I was looking for a way to analyse lots of CSS, and in the end we had to depend on a Perl lib which understands CSS, whose output then had to be written to XML before it could be pulled into Cocoon. Now there's nothing wrong with this method, but you would think a CSS reader would be useful. Along with an EXIF reader, which I swear would be so great. All the metadata is sooo useful!! However, it's all changed now thanks to web services. I could upload all my pictures to Flickr and use their REST API to pull the metadata out and into Cocoon. And this is the thing: it's pretty true of a lot of things now. Want to get the weather in Tokyo? Just grab it from somewhere else; I bet it's also more effective than doing it yourself. And I wonder where things are going with this. Will we get to a point where it will be more effective to get the date or even the time from a web service?

So the input side is covered, and the internal transformation is pretty much there now with XSL 2.0 and Cocoon. But what about the output? Cocoon can output or serialise to almost anything you can think of, including SVG, PDF, OpenOffice, Zip, text, XML, HTML, XLS, etc. But this is the thing which needs the most development at the moment: the user interfaces for services are still pretty poor. So what options do you have? Well, my favourite is still SVG, but there are still very few browsers which support it natively, which will always be a problem. Then there's the whole Flash thing, which I still really, really hate but have accepted a tiny bit in the absence of SVG for some things. There are also tons of JavaScript + XHTML solutions being used now, which I'm actually thinking are not so bad (a lot of these solutions use the standard DOM, which makes them work across all new and coming browsers). XML.com has a really good piece about client-side processing using Sarissa, which I have been messing with recently. XMLHttpRequest has made a huge difference to what can be done on the client side and has ushered in things like Google Maps and Google Suggest.
Then of course, if you want to get really rich, there's a whole host of technologies just on the horizon including XUL, XAML, XForms, WebForms2, sXBL, etc. Being an open-source kinda guy, I'm gunning for XUL with XForms, and sXBL with SVG, for my own applications. Can't help being interested in XAML though…

Following on, I've been reading a couple of thoughts from others out there. http://www.Peej.co.uk dispels the meme that REST is not ready. There's a great little quote from Mark Baker in the same entry.
The same tools that create Java servlets could be used to build REST-based Web services, Baker says. “They follow the HTTP specification, and by following it, they implicitly are following the constraints of the REST style,”
Bang on the money! I was explaining to David the other day over a drink why PHP is great but Servlets are just, well, different. You can't really compare them; it would be better to compare JSP with PHP, but servlets are logically different and simple, just like RESTful services. It fits perfectly with the next generation of the web, Web 2.0.

Comments [Comments]
Trackbacks [0]

Living in the long tail and the emergence of tagging

I have been meaning to blog about Stephen Downes' community blogging presentation for quite some time now. I've already touched on the long tail stuff through the blog, and recently in the Why I still listen to Dave Slusher's podcast entry. And Stephen's presentation was the spark for me adding more metadata to my RDF RSS feed. Anyhow, here are some great quotes which should spur you to listen to or read the presentation.

in Canada we have socialists and socialists always say, “We represent the working class” and that's kind of like the socio-economic way of saying “We represent the long tail.” And they come out with these platforms and these policies that identify with the working people. Ask any of the working people, they don't want to be working people. And so, they're more likely to choose policies that support the rich people, because they all want to be rich, and when they're rich, they don't want to be pushed back into that long tail again. So I don't see a virtue in the long tail.

Because the meaning of a post is not simply contained in the post. And this is where we have lots of trouble with meaning, because we all speak a language and we all understand words and sentences and paragraphs, and we think we've got a pretty good handle on how to say something about something else, and we have a pretty good handle on how to determine the meaning of a word. What does the word 'Paris' mean? Oh, no problem, right? 'Capital of France.' Right? But, you know, it might also be, 'Where I went last summer.' Or it might also be, 'Where they speak French.'

When we push what we think of as the meaning of a word, the concepts, the understanding that we have, falls apart pretty quickly. And the meaning of the word, or the meaning of a post, is not inherent in the word, or in the post, but is distributed.

We can't just blast four million blogs, eight quadrillion blog posts, out there, and hope Technorati will do the job, because Technorati won't do the job, because Technorati represents the whole four million things and I'm not interested in three million nine hundred and ninety-nine of those. What has to happen is this mass of posts has to self-organize in some way. Which means there has to be a process of filtering. But filtering that is not just random. And filtering that isn't like spam blocking. Filtering has to be a mechanism of determining what it is we want, because it's a lot easier to determine what we want than what we don't want.

So how do we do this? We create a representation of the connections between people and the connections between resources. The first pass at this I described in a paper a couple of years ago called “The Semantic Social Network” and the idea, very simply, is we actually attach author information to RSS about blog posts. It kills me that this hasn't happened. Because this is a huge source of information. And all you need to do is, in the 'item', in, say, the 'dc:creator' tag, put a link to a FOAF file. And all of a sudden we've connected people with resources, people with each other and therefore, resources with each other. And that gives me a mechanism for finding resources that is not based on taxonomies, is not based on existing knowledge and existing patterns, but is based on my placement within a community of like-minded individuals.

Great stuff, well worth reading, and there are tons of links to learn more from in the page. Very cool presentation, even though I don't totally agree with everything said. The emergence of tagging is something well worth considering into the future. Even Miles has talked at great length about community-driven tagging, with aggregation playing a role in bringing sense or even meaning to resources. Honestly, we're not that far off the semantic web in my eyes.

Comments [Comments]
Trackbacks [0]

Why I still listen to Dave Slusher’s podcast

I stopped listening to Adam Curry's Daily Source Code quite a while ago. Tell a lie: I do still download the podcasts, but Blogmatrix's Sparks usually deletes the files before I get around to listening to them. At first it was interesting, well produced and a great chance to get a feel for what was going on in the podcast world. However, podcasting has moved on; there's a lot more choice and there's no need to know what's going on as such. (It's a bit like a blog about the blogosphere; however, I do listen to Blogosphere Radio now and then.) Anyhow, around the same time as listening to the Daily Source Code, I was listening to Dave Slusher's Evil Genius Chronicles.

But why am I still listening? Well, simply, Dave Slusher's podcasts have a much higher level of quality and narrowing than Adam's. I mean, he knows who's listening and does not do that general radio style which I and others tend to hate. The Daily Source Code is a radio show as a podcast; it's so general and does not take advantage of the nature of podcasts. Someone said recently, "It's NOT everybody" (mimicking Adam's voice), and that statement says it all. Dave Slusher plays music he loves and talks about subjects which interest him. Adam serves more like a radio DJ, reporting things which he has heard and been given. Yes, he has a huge audience. Yes, I do not like the music Dave plays, but screw it: Dave has a quality audience and the narrowband idea tied up.
Dave actually explores this further in this post and this podcast. And honestly, I've been thinking about this whole area myself…

It's all about metrics, and Dave took the words out of my mouth.

The Podcast Alley fracas is mostly culture clash between the old methods and the new context. The more I think about this, the more I think the focus on the sheer size of listenership is taking the worst of the old situation and applying it to the new world. We don't need to think in channel-limited scarcity mode any more. It made sense when you could only have so many FM or AM channels max in any market, but it doesn't make sense when you have a nearly infinite variety of channels.

I don't really care who's number one on Podcast Alley; it makes no difference to who I listen to. But I do understand that old/dead media still does metrics by quantity, not quality. This is echoed by Doug Kaye, the owner and creator of IT Conversations, who has a couple of times asked listeners to vote on Podcast Alley, saying IT Conversations should be in the top 50 at least. While he and others (like myself) who listen may not care about what position it's at, advertisers will be more interested if it's closer to #1 at Podcast Alley. It's just the way they do metrics at this moment. The question is, what can be done about it? Well, there's hope from Doug Kaye. But in his answer lies the actual issue…

I pitched the idea of a ratings system like Amazon, Netflix or IT Conversations, but as he pointed out, that doesn't work for his site. Chris can't just publish an 'average' rating for each podcast, even with some minimum number of votes required. Why? Because a podcast with five votes of “five stars” each, would then be rated higher than one with one thousand five-star votes and just one four-star vote. It's not a problem for IT Conversations and these other sites because 'ranking' isn't as important as the how-good-is-it rating for each item.
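Doug's point is easy to show with numbers. A plain average ranks five perfect votes above a thousand-and-one near-perfect ones. One crude fix (purely my illustration, not anything Podcast Alley or IT Conversations actually does) is to pad every item with a few imaginary average votes before averaging, so tiny samples can't dominate the ranking.

```python
def plain_average(votes):
    """Naive mean rating."""
    return sum(votes) / len(votes)

def padded_average(votes, prior=3.0, weight=10):
    """Pad with `weight` imaginary votes at `prior` before averaging."""
    return (sum(votes) + prior * weight) / (len(votes) + weight)

small = [5] * 5           # five 5-star votes
big = [5] * 1000 + [4]    # a thousand 5-star votes and one 4-star

print(plain_average(small) > plain_average(big))    # True -- the broken ranking
print(padded_average(small) > padded_average(big))  # False -- padding fixes it
```

The padded version is just one of many schemes (a confidence lower bound would do too); the point is that ranking needs more than the raw mean.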

Why is the rating system on IT Conversations, Amazon, IMDB, Netflix, etc. not as important as the one on Podcast Alley? Is it because people realise that you cannot compare one thing against another? That views are subjective and relative? What if the Daily Source Code is number one? Does that actually mean it's better than IT Conversations, or vice versa? What does being number one actually mean?

I blame the old/dead mediums for not growing up and moving on. THERE IS NO SCARCITY: anyone can podcast or write a blog, and the abundance of the internet through networking keeps that statement true. It's time to reconsider your metrics, because once again THERE IS NO SCARCITY, and it's no good trying to create an artificial scarcity. And the other point worth making…

The podcast infrastructure is very open to narrowcasting (I'd go as far as to say it is optimized for it). The popular podcasts in sheer volume of “units shifted” will always be the more general ones. However, a podcast that serves a small niche audience and serves it superbly well will always be lower in total downloads but could be very high in the axis of serving the needs of the listeners.

This was made very clear the other day when Doug Kaye asked listeners to send emails to people who could or would be interested in underwriting with IT Conversations. With IT Conversations narrowcasting to its target audience, the underwriting campaign was a good success because of a quality audience. What more could an advertiser in the IT world want? Dave agrees…

People keep talking about how advertisers and sponsors want to see “big numbers.” I'm not so sure that is the best way. It is certainly not the only way. If a company has a product or service that is related to that niche interest, they might be getting a much better deal in sponsoring that podcast. The high affinity the listeners have for the show coupled with the focus of the interest may make it a great deal and a more efficient use of sponsor dollars that a general purpose show with a huge listenership.

There are no simple metrics to measure the relative affinity your audience has, or to determine the aggregate influence your listeners wield. In contrast, it is fairly easy to count concurent streams or determine download numbers so that will be what things are based on. This focus on volume, on popularity, on being the top in some ordered list – it all reflects vestigial thinking from the old way of doing things.

And in that lies the problem: it's hard work. It's not something you can just count and be done with. I would go as far as to say this is exactly what the long tail is all about. Of course, large, easy-to-count figures work well at the start of the tail, but as it spreads into the long tail you need to start thinking differently. Start thinking quality conversations with your audience, not the old everybody-style broadcasts of yesterday. I know there's been some reaction to the long tail idea. One I heard recently was from Stephen Downes' talk at Northern Voice, where he asked: who really wants to live in the long tail?

So people talk, and people have talked a lot, about the long tail and they've said "Worship the long tail, mine the long tail, the long tail is where the action is." And all of these people who are talking about the value and the virtue of the long tail have the unique quality of not being part of it. I live in the long tail. And I can say from my own personal perspective that people who are in the long tail would probably rather not be part of it. They simply want to be read.

Stephen certainly has a point, but I don't believe it's as simple as wanting to be read. For example, if I simply wanted to be read I could host cubicgarden.com on a dedicated server and spam all my friends, family and their friends about it. Yes, I would be read, but honestly, knowing I'm read by people who are my peers, and also my worst enemies as such, is much more interesting and also much more manageable. Imagine getting hundreds of comments per entry. Is that better than receiving the one which points you in a direction you never considered before? I certainly think so, and it's the reason why I listen to Dave's podcast (even with the music I don't really like) over and beyond Adam's.

Comments [Comments]
Trackbacks [0]

Adding more metadata to RSS

I just added a whole load of extra metadata and content to my RSS 1.0 feed (RDF). The first thing you will notice is that comments and trackbacks are included, so there is no need to visit the XHTML site unless you actually want to comment back, and even then you could use the wfw:comment system, which I honestly have never really looked into. I also added a comment count before the actual comments, so you can skip around the RSS easily. This also means I need to get harsh with spam, as I don't want spam in my RSS too. So I'm turning off trackback autodiscovery if I get any more spam.

Then I added the Creative Commons licence to my RSS feed, which was easy enough; Ben Hammersley has a complete guide on how to do this with RSS 1.0. But this is where I got stuck. I was listening to some podcasts from Northern Voice and heard Stephen Downes' Community Blogging session, which talked about a lot of things to do with the long tail, tags, metadata and emergence. But he also suggested people should link or put FOAF content into their RSS feeds. And I just thought this was fantastic! It makes so much sense, so I went about trying to link my FOAF profile into my RSS 1.0 feed. Well, it's not as easy as it would first seem. I tried to work out how to link rather than just add FOAF directly into the RSS, and in the end came up with this.

<foaf:Person rdf:ID="IanForrester">
  <rdfs:seeAlso rdf:resource="http://adrenalin-online.demon.co.uk/profile/foaf.rdf"/>
</foaf:Person>

This sits in the channel block. While it may not be ideal, it seems better than anything else I've seen. Now if I can only get FeedBurner to pass on the feed without interfering with the metadata.
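As a sanity check that aggregators can actually get at the link, here's a quick Python sketch that parses a stripped-down version of that channel block and pulls out the seeAlso URL. The namespace URIs are the standard RDF/RDFS/FOAF ones; the wrapping rdf:RDF element is just there to make the fragment parse on its own.

```python
import xml.etree.ElementTree as ET

channel = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:ID="IanForrester">
    <rdfs:seeAlso rdf:resource="http://adrenalin-online.demon.co.uk/profile/foaf.rdf"/>
  </foaf:Person>
</rdf:RDF>"""

ns = {"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
      "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
      "foaf": "http://xmlns.com/foaf/0.1/"}

# Find the person and read the rdf:resource attribute off the seeAlso link.
see_also = ET.fromstring(channel).find("foaf:Person/rdfs:seeAlso", ns)
print(see_also.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"))
```

Anything that can walk namespaced XML can follow the link from the feed to the full FOAF profile, which is the whole point of linking rather than embedding.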

Update – My friend David showed me evidence that my new changes to the RSS feeds are causing problems. So I've decided I'm going to remove the extra comment and trackback functionality from the RSS 2.0 and Atom versions. This means if you subscribe to the RSS 1.0 version, you will get trackbacks, comments and lots of other juicy bits and bobs. The change back should happen tonight, which will be the morning of the 2nd March for most of you. Till then, my FeedBurner one is still turning out standard RSS combined with my bookmarks. Thanks David; if anyone else is having problems please leave a comment…

Comments [Comments]
Trackbacks [0]

XTECH 2005 schedule is now up

This year's conference looks to be a very good one. There are already a couple of surprise presentations which brought a wry smile to my face. The first one has to be Dodds presenting just before me. I usually read Leigh Dodds' blog at least once a week on my way into work, and I've nicknamed him Lost Boy ever since.
Anyhow, people keep asking me for my paper. While I'm not going to post the whole lot, I'm going to take some cues from Lost Boy and post bits up as and when they come. Till then, here's the basic description or abstract.

Open Data: RSS Syndication For A Worldwide Audience

The challenges faced while syndicating RSS to a global audience of people and machines. Can we syndicate in every single language, how does internationalization work in meta-data, and what does this all mean for the semantic web?

Some other interesting presentations I spotted.

And I swear that's only Wednesday, and does not include people like Michael Kay and Tom Loosemore. I'm really hoping they do a good job of recording these sessions and hopefully pass them on to IT Conversations for podcasting and archiving after the event. There are so many I would love to hear and see; I'm really looking forward to speaking and hearing from others in the world of XML.

Comments [Comments]
Trackbacks [0]