The perfect desktop aggregator

It feels like I'm on a never-ending quest to find the perfect aggregator. I cannot even remember how many I have tested and played with on my computers.

blogbridge
Rss owl
Great news
Fire ant
Blogmatrix sparks!
Blogmatrix jaeger
feed reader
AmpheteDesk
Flock
Sharp reader
Newsgator
Feed demon
Straw
News Monster
aKregator
Netnewswire
and more which I can't remember right now…

None of them has quite all the features needed to make me stick.
Up till recently I was using Blogmatrix Jaeger, but the lack of support from the authors has me worried, plus there are some really silly bugs which are very annoying. Let's not go there right now. Before that I was using RSSOwl, which was good even back in the older versions, but its lack of synchronization drove me spare. There's nothing worse than reading through a ton of interesting news on one client, then going to another and finding all of it highlighted as unread again! Well, there is, but trust me, it becomes a pain after a while when you switch back and forth between different machines.
Just recently I tried Blogbridge, which is OK but resource hungry and not all that useful with 225 RSS feeds split into about 11 categories. Geez, I'm not even close to Robert Scoble and others who have over 400 RSS feeds which need to be sorted, filtered and aggregated. And honestly I have not even started with my search feeds from Yahoo, PubSub and Blogdigger. So I expect the perfect aggregator will need to support almost 300 feeds without breaking into a sweat when I go for a search.

Attachment support

Some other things are needed in the perfect aggregator. Podcasting support: it needs to be as flexible as Jaeger, which doesn't always need the enclosure tag to get stuff. And when I say stuff I mean not just MP3s but videos like those on Channel 9, etc. BitTorrent support would be cool but not essential. At least passing the torrent files on to Azureus would be very useful.
Currently I'm trying FireANT to deal with all my attachment support, but it doesn't do a very good job with anything else. It actually kept crashing when loading my OPML. Yes, I could just input the RSS feeds which support podcasting, but what happens when someone I usually read starts podcasting? I don't fancy having to copy and paste the URL just for a couple of entries. Jaeger works well because it highlights that an RSS feed also has attachments (which can be enclosures or simply links to rich media in the post). This was great, for example, when EPIC 2015 came out. Someone I read regularly did a review of the changes between EPIC 2014 and EPIC 2015 and linked directly to the QuickTime version. Jaeger automatically downloaded it because I have an option set to download everything without asking me. So there was no need to download the movie, as it had already been done. This doesn't always work so well, however. I'm subscribed to the TED feed, and that usually has a lot of PDFs attached with badly thought-out file names. So I end up deleting a load of PDFs every week.
Talking of cleaning up files, the perfect aggregator will need a user-defined time limit like in Jaeger. This basically removes all downloaded files unless they're marked (maybe tagged, which I'll go into later; the idea being that if you tag it, you care enough to keep it). This stops the hard drive filling up with junk, and makes synchronization on to a mobile/portable audio device a lot easier. At the moment I've got Jaeger deleting stuff after 2 weeks, which works well for me, but you can set it anywhere from never to every day. I expect a smarter way would be to have the time limit but also prioritise files by tagging, categories, read/unread status or even file size. I mean, seriously, if I haven't read or even glanced at the entry, will I be interested in the attachment? I would say no.
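To make the clean-up idea a bit more concrete, here is a minimal sketch of the rule in Python. It is not how Jaeger does it; the `Attachment` record and the `files_to_delete` function are names made up for illustration, and the ordering (attachments of unread entries first, then biggest files first) is just one way to read the prioritising suggested above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Attachment:
    # Hypothetical record of one downloaded file; not taken from any real aggregator.
    path: str
    downloaded: datetime
    size_bytes: int
    entry_read: bool = False
    tags: set = field(default_factory=set)


def files_to_delete(attachments, now, max_age=timedelta(weeks=2)):
    """Return expired attachments in the order they should be removed.

    Tagged files are always kept. Among the rest, attachments whose entry
    was never read go first, and larger files go before smaller ones.
    """
    expired = [
        a for a in attachments
        if not a.tags and now - a.downloaded > max_age
    ]
    return sorted(expired, key=lambda a: (a.entry_read, -a.size_bytes))
```

Setting `max_age` to something huge gives the "never" end of the scale, and `timedelta(days=1)` the "every day" end.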
On this same thread, I would like to see more TVRSS features in the perfect aggregator. The plugin for Azureus is good for downloading media content, but I think that would be better done in an aggregator, with the plugin simply interfacing with it, either by taking a list of the files to download or by picking up torrent files from a defined directory. So basically the perfect aggregator would need a regular expressions engine, or hey, why not just use the search and smart search (almost playlist) feature, which would then also generate RSS of the output for the TVRSS plugin to use. So for example if Kevin Rose releases another Systm or thebroken, I would have that feed set to automatically download the media or torrent file (which would be passed on to Azureus). If I'm using one of the many RSS feeds with lots of torrent media links as items, I simply run a regex or simple search on it, pop the results into a smart search folder and tell the aggregator to give me RSS of whatever's in that folder. Azureus can do the rest.
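A rough sketch of that "regex in, RSS out" smart folder, using only the Python standard library. The function name `smart_folder_rss` is hypothetical, the channel link and description are placeholders, and matching is done on item titles only, which is an assumption; a real smart search would probably look at descriptions and enclosures too.

```python
import re
import xml.etree.ElementTree as ET


def smart_folder_rss(feed_xml, pattern, title="Smart search folder"):
    """Return a new RSS 2.0 document holding only the items whose title matches `pattern`."""
    matcher = re.compile(pattern, re.IGNORECASE)
    source = ET.fromstring(feed_xml)

    out = ET.Element("rss", version="2.0")
    channel = ET.SubElement(out, "channel")
    ET.SubElement(channel, "title").text = title
    # Placeholder values: a real aggregator would fill in its own address here.
    ET.SubElement(channel, "link").text = "http://localhost/"
    ET.SubElement(channel, "description").text = "Items matching " + pattern

    for item in source.iter("item"):
        if matcher.search(item.findtext("title", default="")):
            channel.append(item)  # reuse the original item element unchanged
    return ET.tostring(out, encoding="unicode")
```

Something like `smart_folder_rss(torrent_feed, r"systm|thebroken")` would then give a small feed that a plugin on the Azureus side could subscribe to instead of the full firehose.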


RSS in and RSS out

Another thing which I rarely see is RSS output. Now this may sound odd, but I really want to be able to subscribe to the output of an aggregated source. For example, I have an RSS screensaver on my machines. Rather than it going and pulling the exact same feeds once again, wouldn't it be great if it pulled RSS from the perfect aggregator? Yeah, starts to make sense, right? There are also other applications, like widgets, which I would rather have pull from my perfect aggregator than from the web again. Currently I have a Jabber bot which tells me what's new on Slashdot. That's good, but a little widget which just runs the latest headline in the corner of my screen would be highly useful. So generally a local RSS aggregator which also serves XHTML and RSS would be ideal. I have found ways to make Jaeger do this, but you can only have one or the other and it only serves to the local machine, i.e. no one else on the same network can access my aggregator. I know AmphetaDesk does this well, but it isn't very usable and looks kind of crap in my opinion. Maybe I should revisit it, because it's been a while now and I do believe it supports themes or some kind of skinning.
But on the same front, this becomes a problem when you consider the speed issue.

Operating speed and cross platform capabilities

Blogbridge was so laggy when I had 225 feeds in it. RSSOwl copes well for a Java application, while Jaeger and my new favourite, Great News, have no problems with 225 feeds and more. There seems to be a whole rash of .NET-based aggregators which work OK but tend to be quite slow too. I do not know why this is; maybe there's some odd configuration on my laptop and desktop machine? They don't run as slowly as some Java aggregators, but when loading 225 feeds they're not far off. The compiled Windows versions obviously run very quickly, and weirdly, having an internal parser seems to be quicker still.
I have another odd requirement which I do not believe the perfect aggregator should service. I own an HP iPAQ (Pocket PC) and I read news on it quite a lot. Pocket PC RSS readers are slowly getting much better but are running into the same problems as their desktop counterparts. But I don't believe the perfect aggregator should have a Pocket PC version. No no, that would be too much. Plus, through the steps talked about in this entry, any decent RSS reader on a Pocket PC, Palm, Symbian device or smartphone should be able to suck down an OPML file using synchronization. That's not ideal, because proper synchronization would work both ways, letting you upload read and unread state as well as download it. This feature seems to be at least a couple of versions away in PocketRSS. I did notice one of its rivals now supports Bloglines synchronization, which is what Great News also supports well (more on this later).
But generally the perfect aggregator would support all the main platforms and be open source, so at least people can port it to platforms like Solaris if needed. By main platforms I mean Windows, Linux and OS X, if you're really pushing me. I like the idea of BeOS, but how many people still use it? If it's open then it's at least somewhat portable.

While I'm talking about the nitty-gritty of the application, it would also support ALL languages in all types of encoding. Most aggregators fall back to IE or a Mozilla core for display, which is OK, but this would be best served as a choice by the user. On the Mac, obviously, Safari would be a choice too. Ben Metcalfe recently linked to a stress test for RSS readers to check if they were susceptible to common dirty HTML tricks. This includes things like iframes, remote exploits via JavaScript and other nasty things. I did a test with Jaeger and it passed all the tests with no problem, because whenever there was any HTML-type stuff it would pass that on to Firefox, which has all the ad blocking, popup checking, etc. features I need and have defined already. I had yet to try this on Great News when I first wrote this. I have now finally tried it and it crashed Great News. I submitted a bug report, but I'm worried, because Great News uses IE for its internal browsing and there seems to be no way to change it to the Firefox core. So I assume a lot of those dirty HTML tricks will work in Great News? Unless Great News doesn't pass any of that crap on to the IE core renderer?
This may seem like the overboard comments of an open source fan, but it is important if, like me, you haven't really set up IE to your own preferences because you're using something else such as Firefox. Security is a large problem and will get worse if the perfect aggregator also becomes a good way to exploit systems. RSS spam is only the start of things.

Tagging and Categorization

Talking of Great News, I'm really starting to love the ability to tag and categorise RSS items. Jaeger has this but it's not quite the same. Both also support a feature which I can only call search folders: basically you define a set search and the aggregator will mark or highlight new items which match the criteria. This is essential and very useful. But tagging is great for reminding yourself of useful things which you may not have searched for. For example, I have set up a tag in Great News which is simply called blogthis. When I have time during lunch, I quickly run through the entries tagged blogthis and sometimes blog them. I guess the perfect aggregator would be compatible with del.icio.us, Technorati, etc., so I could use the same tags to categorise entries and maybe store them in del.icio.us using its public APIs. I personally think we are only scratching the surface with tagging and categories, and maybe something combining Thunderbird's straightforward search and smart search folders with the tagging ability of Technorati and del.icio.us would be useful for making sense of content in an aggregator.

Synchronization

Ah, my new favourite talking point when it comes to RSS aggregators. Just as most RSS readers only read RSS and don't allow any RSS out, most aggregators let you import OPML but not export it with equal ease. But let's get the basics right before considering advanced features.
The perfect aggregator should support at least:
OPML input via file and URL.
OPML output via file.
On the next step up in perfection:
OPML input and output via FTP.
If you next consider the usage of iDisk and .Mac drives (better known as WebDAV/DeltaV), the natural next step up would be:
OPML input and output via WebDAV (unsecured, and secured over HTTPS).
Honestly, I do not know of any which have WebDAV support, but it will make sense with more services like iDisk and .Mac drives. Some, like Jaeger, support FTP for synchronization and storage.
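The basic level, OPML in and OPML out via a file, is small enough to sketch with the Python standard library. The function names `read_opml` and `write_opml` are made up for illustration, and the sketch assumes subscriptions are `outline` elements carrying an `xmlUrl` attribute, which is the common convention for subscription lists. It flattens any category folders into one list, so it is a starting point rather than a faithful round trip.

```python
import xml.etree.ElementTree as ET


def read_opml(opml_text):
    """Return (title, feed_url) pairs for every outline that carries an xmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (o.get("title") or o.get("text") or "", o.get("xmlUrl"))
        for o in root.iter("outline")
        if o.get("xmlUrl")
    ]


def write_opml(feeds, title="Subscriptions"):
    """Build a flat OPML subscription list from (title, feed_url) pairs."""
    root = ET.Element("opml", version="1.1")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(root, "body")
    for name, url in feeds:
        ET.SubElement(body, "outline", type="rss", text=name, title=name, xmlUrl=url)
    return ET.tostring(root, encoding="unicode")
```

Input via URL is then just a matter of fetching the text first (for example with `urllib.request.urlopen`) and handing it to `read_opml`. The FTP and WebDAV steps would move the same text over a different transport.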
The final and preferred way of synchronization is via a web service. Generally this means using something else to send messages back and forth. There are a growing number of proposals in this area; one I like is attention.xml, which has backing from Technorati. At the moment most people are using some modified OPML to do synchronization, which works but is so ad hoc it's untrue. I have never seen OPML carry what's modified and what's unread across from one application to another. These files tend to work only in their own application domain, which sucks. The most used web service synchronization is done through Bloglines' open API. For example, in Great News all I needed to do was enter the email address and password I use to log on to Bloglines, and it downloaded not only my subscriptions but also which items are unread and read. I do not believe the API supports locked items or any other states, but read and unread is usually all you need. When you're done with a feed in Great News it sends the changes to Bloglines. Perfect, you would say, but what if I do not want to use Bloglines? Well, at the moment you could adopt the Newsgator package, which is an RSS aggregator and a web service that support syncing between the two of them, but outside of that you're kind of stuck.

This may be an area where Yahoo could really blow away the market, by allowing synchronization with your My Yahoo subscribed feeds and maybe further with the Yahoo! 360 service. But it strikes me that My Yahoo isn't a Bloglines and isn't meant to be either. There are few online aggregators as popular and as powerful as Bloglines at this moment, so unless My MSN or My Yahoo step up and give real aggregation features, Bloglines will be the one and only service to sync with. Maybe in the future others will come along and even adopt the Bloglines API in an attempt to take a shortcut into the synchronization market?
So in the case of the perfect aggregator, Bloglines synchronization is key and needs to be there above syncing over OPML.

Other features

Here's a list of key features which are highly recommended
Automatically generated search feeds and Pubsub support (also known as Watch feeds). PocketRSS has this nice feature which allows you to enter a search term and it automatically makes a new RSS search feed and subscribes to it. Practically this is usually a slight url change using a RESTful api but its dead handy when doing research.
The ability to filter a large feed (see the case I made for torrents and regex). Filtering on top of a search RSS feed (like those of Yahoo, Technorati and Blogdigger) is essential because not everything that comes through is of interest. I would almost go as far as to say search and torrent feeds should be treated slightly differently. They tend to update quickly and contain many items. It would be good to automatically remove old items, or archive them away like the Azureus RSS plugin does.
Comment support. The ability to automatically see the comments on a blog entry, or at least click through to see them, is handy and wouldn't go amiss in the perfect aggregator.
Blog this support. This is usually just a link which sends the permalink URL and maybe the content to an external blogging application. I don't think this is quite standardized, but it shouldn't be overlooked in the perfect aggregator. I would also suggest Email this be included in this same space.
Search this item. The ability to search all blogs related to, or at least linking to, an item is vital when doing research or gauging views on that item. Blogdigger and Technorati do this well, allowing you to search for all blogs which link to the item or to search via related terms used in it. Having the feedback in the aggregator is tricky, but at least having that feature in the perfect aggregator, which then sends the results to a browser, is good enough. Like the search feed, this is usually just a RESTful method call with the correct URL.
All types of autodiscovery supported. It should support drag and drop of both the RSS link and the page link, the latter causing the aggregator to search for all alternate feed links in the head of the HTML page. It should also support feed:// and that weird USM (Universal Subscription Mechanism) feature.
For display as mentioned before, it should support RSS output but also some simple templating system for displaying the HTML to the local browser or network client. CSS for style should be used while something like VM (velocity), XSLT or Groovy should be used for layout. I'm favouring XSLT. But it should be trivial to add different flavors or templates depending on a simple url query string. Yep this borrows from Blojsom's flavor idea but it works well.
Social aggregation. The idea of seeing what your friends tag and recommend is not exactly new, but it would be great to see what your friends recommend directly in the aggregator. AmphetaRate was one way of doing this but it requires too much commitment from the user. Attention.xml has ways to deal with this which are well worth checking out.
Microformat and RSS extension support via a community. I was originally thinking just of RSS extension and microformat support, but I'm now thinking that if the core RSS parser engine simply outputs what it gets in, the template engine could deal with the extra RSS support. However, this is not strictly true. For example, Amazon's A9 OpenSearch is of little use after it has passed the parser; it would be good for the parser to understand its attributes and elements. And while it could be argued that Microsoft's list extensions should also be handled at the parser, most of the grunt work would be done in the display/template engine. I'm not sure how easy it would be to patch the parser, but having templates which can be shared around a community certainly makes a lot of sense. Saying that, Flock, a decent but underdeveloped aggregator, uses XSLT for all its parsing, so it's possible I guess.
Aggregator engine separated from the front end. Memory is a worry no matter what operating system you may be using. It would be really nice to have the actual simple RSS catcher/fetcher and parser separate from the actual output and GUI. So, for example, you could set up all your feeds and just let it run every couple of hours and collect a store of RSS and attachments. You could then run the GUI to search/view the feeds, add/remove feeds or do anything else like this. I know there's one application which actually does this quite well already. Blogwave is not your typical RSS aggregator; it has a front end, but that's mainly to set up the actions and the like. Once it's set up it will just download RSS and run a set process on it if specified. It's good at what it does, and I believe it is possible to get another RSS reader/aggregator to read the downloaded feeds, but I have not tried this combination yet. I assume it's possible to do more with Blogwave than I'm suggesting, but why bother? Let something like Jaeger, Great News or Feedreader do the GUI stuff and just build links and remote calls to Blogwave.
But in the same vein I would like to see more RSS screensavers, widgets, etc. which make use of the RSS output. I currently also read RSS news via Bloglines on my Xbox using Xbox Media Center's Bloglines Python plugin. It's pretty cool and allows for syncing back and forth as it's based on the Bloglines API (see the synchronization section), but is it really necessary for each mini application to make calls to Bloglines each time? As mentioned, I have a Bloglines script on the Xbox, Great News installed on 2 machines so far and an RSS screensaver on 2 machines, and I haven't even started on the widgets yet. Each one of those is making calls to Bloglines and to individual blogs/news sites. Wouldn't it be so much more effective if they all made calls to a proxy of some kind instead? I expect Blogwave isn't quite enough for this, but maybe it is?
One, two and three pane support. Yeah, this is always a conversational firestarter. I like Jaeger's one-pane solution, but I've grown up on three-pane RSS readers with Feedreader, RSSOwl and now Great News. Sparks! has the ability to switch to a two-pane view if needed. So I pose the question: why not support all three types of RSS reading?
So you can use your browser for RSS reading using the templating engine ala Jaeger and Amphetedesk.
Two and Three pane reading via the browser core ala Sharpreader, feedreader, Feed demon, etc, etc. If anyone knows any other types of mass RSS reading outside of these three methods please let me know.
Automatically change items to read when selected. Unbelievable but I have encountered aggregators which do not support this feature. So you have to manually select read or wait anything up to 10 secs before it counts it as read. Standard feature which should be included.
Alternative grouping. This reflects the tagging and categories section, but the thrust of this feature is to get away from using hierarchical folders to categorise RSS feeds. Folders work OK, but honestly it gets annoying trying to manage them. Tagging the feeds means they can come together in groups whenever needed. This should apply not only to the feed but also to the individual items.
User selectable views or stylesheets. Great News does this so well using CSS that I can't praise the feature enough; it's hard to go back to anything different now.
User configurable keyboard navigation. This should allow anyone to set up their own key combinations for going to the next/previous unread/read items.
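As an illustration of the autodiscovery item in the list above, here is a minimal Python sketch that looks for alternate feed links in a page and tidies up feed: style URLs. The names `FeedLinkFinder`, `discover_feeds` and `normalise_feed_scheme` are hypothetical, only RSS and Atom MIME types are checked, and USM is left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects the href of every <link rel="alternate"> that points at a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {k: (v or "") for k, v in attrs}
        rels = a.get("rel", "").lower().split()
        if "alternate" in rels and a.get("type", "").lower() in FEED_TYPES and a.get("href"):
            # Resolve relative links against the page they were found on.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)  # HTMLParser.feed() pushes the markup through the parser
    return finder.feeds


def normalise_feed_scheme(url):
    """Turn feed://host/path or feed:http://host/path into a plain http URL."""
    if url.startswith("feed://"):
        return "http://" + url[len("feed://"):]
    if url.startswith("feed:"):
        return url[len("feed:"):]
    return url
```

Dropping a page link on the aggregator would then mean fetching the page, calling `discover_feeds` on it and offering whatever comes back as subscriptions.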

Some may ask why I don't just build this magic aggregator which supports all these great features. Well, honestly I would if I could. I was planning to build something using combinations of other aggregators out there. For example, the ability to have just an aggregator engine which simply parses RSS without a front end is kind of almost there if I use something like Blogwave or even something more complex like Apache Cocoon. Yeah, overkill for a lot of programmers, but a step in the right direction.
The main purpose of this mini essay/long blog entry was to share my growing conclusions about RSS aggregators. I'm actually hoping that within a year most of the things I and many others have suggested will be there as standard in most RSS aggregators. I expect that, like the iTunes 4.9 integration of podcasting, Longhorn/Windows Vista will not cover all the bases well first time, but it will get better. However, there is a burning need for advanced RSS aggregation beyond the usual RSS reader. Even in the naming I have started to make the distinction between the two. RSS readers would include the XBMC Bloglines plugin, my Pocket PC reader (PocketRSS), widgets, screensavers and tons of simple but effective desktop readers. Aggregators, on the other hand, must have the ability to output content too. So I would include AmphetaDesk, Flock, FeedDemon, Jaeger and Sparks! in this list. I'm also thinking aggregators are much more hackable or remixable using standard technologies such as templating, scripts and other things.


Note taking and outlining

I thought I had it covered. Joe on the Tablet PC and Pocket Thinker on the iPAQ both support OPML natively, but Pocket Thinker does not import OPML from Joe. So I'm back to square one with note taking.

I'm seriously doubting whether OPML is the right thing for the task. Uche goes one step further and suggests XML formats for outlining are complete rubbish. Danny Ayers also puts the boot in on OPML. Honestly he has a point, but he offers up a couple of alternatives which I had not looked into before. OPL is a reaction to the ugliness of OPML; looking at the spec, I'm not sure it goes quite far enough. XBEL, on the other hand, looks wildly different but useful for outlining. Uche also did a follow-up review. I like the idea of XOXO but prefer the idea of using XHTML or RDF, which is easily parsed and integrated into other processes.

Then I found Wikipad… and had high hopes for a pocketpc version like this palm version or even this mobile phone type version. Wikipad doesn't have the name of something like Voodoopad but it certainly does do a good job of notetaking for now…


Talking to Microsoft

I don't believe there's anything wrong with mentioning the conversations I had today: one with Mike Munn from Apple and the other with Sean Lyndersay from Microsoft. I didn't plan it that way, it just turned out that way. Anyhow, I wanted to say the phone conversation with Sean was very interesting in regards to Microsoft's love of RSS. So I had to blog a couple of things.

Simple List Extensions may not be the only module from Microsoft, and although written with RSS 2.0 in mind it's not exclusively for RSS 2.0. Sean's fully aware of rdf:Bag and rdf:Seq and all the beautiful functionality of RDF, but suggests that the marketplace has spoken when it comes to RSS 1.0. However, their parser will support all the extensions/modules which are practical and bring some benefit to the end user, including the well known syndication, Dublin Core, etc. modules. I asked about commercial modules like Yahoo's and now Apple's iTunes Media module and Amazon's OpenSearch modules. Sean was clear that the same benefits need to be met as with all modules, but they need to be very careful about the licences of commercial modules. Microsoft putting out their module under a Creative Commons licence was mind-blowing, and Sean suggests that's only the start of things, but he also hopes it pushes other commercial module makers to think much more about how they license their modules. He actually hopes Microsoft has set the standard and that all modules will be very clear about their licensing from now on. xml:lang at the item level was discussed and may make it into the Microsoft RSS parser as a way to tell language and, to some extent, directionality. We talked about the smaller language bases which tend to be ignored or at least missed by the mainstream media outlets, and how we could foster RSS usage and subscription within these languages with IE7 and World Service content. Interestingly, IE7 is coming out in public beta quicker than I first imagined. Network/bandwidth usage was also discussed, and Sean was serious about the huge, undisclosed number of users IE7's release could send to our or anyone else's website. It's certain: this is the year when RSS grows up and hits the mainstream.


While Microsoft gets RSS, others ponder the future of the web

IE7 with RSS support

So while Microsoft showed its RSS hand at Gnomedex 5.0 just recently, Miles was posing a question about the future of the web (and I mean the web, not the internet).

How often do you look at the web?

It was posed after I talked with him about an idea floating around recently and we got sidetracked into a talk about Jabber and XHTML 2.0. Some of the justification came from the lack of interest in moving the XHTML standard forward and the move of internet content and services on to the desktop and beyond. Miles showed some of the clients he's been using, including Netnewswire and an experimental one with only 4 letters in its name (which I couldn't find or remember at first): Flow. He explained how accessing the web was not needed as much as it used to be. I had to agree; I can even do Google, Yahoo, etc. searches directly from Blogmatrix Jaeger alongside RSS search. Then if you add watchlists, marking and categories to the mix, you've got a lot of the features which make browsing the web hardly needed anymore. I mean, generally you've got everything except services and commerce.

But there is nothing to stop even the services and commerce sites from also serving the RSS marketplace. For example, my online bank could supply less sensitive information over a secure HTTP connection to my RSS reader. They already supply updates and bank statements over SMS, so RSS isn't that far off, realistically. But then, hey, why not skip the reader and go straight to the application? A secure RSS feed which goes straight into Microsoft Money or Quicken cannot be far off. I'm sure Microsoft is well aware of the possibilities within this idea and may provide a bridge between your service and their application.

I'm not a fan of huge applications but check out Flow's interface.

I can't imagine there being anything this advanced on the PC platform. Actually, Blogwave is an attempt at taking RSS beyond the pure reading point of view, but it relies on a hacker/developer mindset and on applications around it (which is not a bad thing).

I actually quite like Blogwave because, like Cocoon, it can serve as a great pipeline architecture for directing structured content around without human interaction. For example, it would be great to have my watchlists not only in Blogmatrix Jaeger but also redirected to my email or instant messenger. I know it's possible but it would take some time to do; Flow seems to be working towards making this happen without development effort.

Anyhow, the point I think Miles was making is that the internet is evolving and RSS is a huge part of this. With RSS being structured content, it's easy to take advantage of different feeds to do different things. Why not a Meetup feed going straight into my calendar? Flickr feeds into a screensaver or wallpaper/background changer? Local government debates appearing in my email? Software updates via RSS? Etc., etc. Some people disagree, as in "RSS the next plague?". But you only have to look at Apple's RSS screensaver to get a feel for how great timely and relevant information can be in the correct context. With widgets, RSS at the OS level and applications which are RSS aware all coming or almost here, will we all be using the web less and less?

There is something else very interesting about the Microsoft announcement, and it reflects on the use of the web. Microsoft has released the Simple List Extensions for RSS under a, dare we say it, ShareAlike, copyleft-type licence. Prof Lessig can't help but be perfectly balanced about the move, while I can't help but say we're all communists now, Bill Gates. But yeah, this is quite mind-blowing and could actually be the start of a Microsoft which can share, contribute back and play fair while still making a profit? Only time will tell. It's also interesting that between Yahoo, Amazon and Microsoft there have been several proposed extensions to RSS, while in the HTML world things have come to a standstill. Yes, XHTML 2.0 is around the corner, but how many browser developers are using it? I bet there are lots of developers keen to integrate OpenSearch, Media RSS and Simple List Extensions. The only worry now is if people start pushing RSS into a place where it competes with X/HTML, adding forms, CSS, etc. It's going to happen, because all the innovation is happening in RSS, not in XHTML, at this moment, even with all the remote scripting (Ajax) stuff. So the question remains: how often do you look at the web?


The problem with Language RSS

Shoshannah Forbes and I have been sending emails back and forth about the issue of RSS adoption in right-to-left languages like Hebrew and Arabic. It fits so closely with what I'm going to say tomorrow at XTECH (where I happen to be right now, actually) that it's almost uncanny. I asked Shoshannah if I could blog her reply to my question about her RSS feeds. Basically, her RSS feeds include the HTML attribute dir to indicate the direction of the text, which makes them invalid and may break quite a few of the RSS readers out there. Anyhow, here is the email, complete with my agreements and additional comments. Please remember, as usual, that these comments are my own views and not those of the BBC World Service (my employer).

Shoshannah Forbes wrote:

> The problem I am facing is simple:
> If I use valid RSS with no dir=rtl, then 99% of the RSS readers will display the text block as LTR, with punctuation digits and English in wrong locations, making the whole thing unreadable.
> When adding dir=rtl, at least I can get about 50% of the RSS readers to display the post body properly (titles are still a mess).

Agreed, but I feel there are two ways of looking at the problem. From your point of view it makes sense to include dir="rtl", because very few software developers are going to change their code to take this into consideration. We (the BBC World Service) have the might to speak to developers and get them to change their code. Even if we do not do it for ourselves, we owe it to our audience (my own feelings).

> I don't use unicode control characters for a few reasons:
> * They are a real pain to input- it is like entering the control characters for CR/LF or a <font> tag manually (but worse)- there are just too many places to enter them.

Yep totally agree

> * Most keyboard layouts do not have a direct way to enter them.

Yeah, we're using virtual keyboards for some languages and they're a nightmare!

> * They make a mess of the text- they are only used for the RSS, and unneeded for the editing or the html display, and can produce unexpected results when entered into the text.

Yep, agreed

> * There are many clients that incorrectly display them as visible characters in the text.

Yeah, it's a shame, and that will change, but it's too much trouble at the moment.

> * They make the text much more difficult to edit- if you change the text, you need to go back and change them as well. And since they are invisible, you get an awful lot of trial and error.

Indeed! You really need to understand them to edit with them. This would require extra training for our language services

> * They force me to use explicit directionality, which complicates things and makes the text less portable.

Yeah, there is an idea of reuse throughout our language services. This is tricky already; who knows how much more tricky it would be if the text was Unicode directional too.

> * My web app that creates the RSS from my HTML does not know how to add them automatically.

Yep. I know my blog app (Blojsom) supports Unicode directionality IF I put the characters in at the start, but then we're back to the editor problem of virtual keyboards and sticking in hidden characters! The same is true of the BBC World Service systems. We use XSL with Saxon, so if the characters are there, they should (not tested by myself) pass through to the RSS.

> * Since they are rarely used in other contexts, I can't focus on the content when writing, and have to start thinking more closely about the presentation.

Yeah, indeed! Our language services are already busy as hell; Unicode directionality would just add a level of complexity on top of an already stressful job.

> * Moving from me to other users- most Hebrew/Arabic users don't know about them, and don't want to know. You try to explain to your mother that when she is writing in her weblog, she can't write in her usual manner, but has to enter these strange codes in a foreign language which have complicated rules (I have seen many pros get confused with these characters, I don't expect laypeople to understand them).

Right on the nail! One of my points for tomorrow is that Unicode directionality is too damn difficult and very confusing! I expect some will challenge me about this tomorrow, and honestly I will just admit that if it's too difficult for me, it's even more difficult for others. Plus we should be making things easier for people, not harder. The barrier to entry should be at a level where your mum or my mum could use it and write it.

> * It doesn't scale- think about an Israeli blog hosting service- they want to offer RSS feeds for all the blogs, with minimum work for the users. Relying on unicode control characters just doesn't do it.

Yeah, plus from the Israeli blog hosting point of view, you want to get people going quickly and easily, not put them off with complex editing. It's the reason why Blogger does so well: 3 steps and you've got your own blog.

> * Since they are complex, it is difficult to create a GUI for entering them (unlike general RTL/LTR controls, which are available everywhere).

Yeah, it almost needs to be just like the direction attribute in HTML. I'm suggesting tomorrow an attribute like this for RSS.

> Not having the dir attribute in RSS gets rid of some markup- in favor of lower level much more complex control characters. A bad deal, IMO, and one which is a major cause for the problems when dealing with Hebrew/Arabic RSS.

Indeed, it was an ideal solution, but the real-world use is too painful.

> I think that the root of the problem is that bidi is part presentation and part structure. And since even in the best of cases (for example, the automatic bidi control in recent QT or GTK applications on Linux) there are still many many cases that can *not* be covered reliably by the display algorithms of the software, I tend to think that for practical purposes, bidi is more structure than presentation.

Yeah, agreed. There's lots of push to put bidi information inside CSS instead of HTML even, which is correct if you see bidi as presentation.

> I sure wish there was a way in RSS to tell the client “this element is RTL” or “this area is LTR” without resorting to HTML hacks. But at the moment, those hacks are the only practical tool I have to get at least *some* of the readers out there to display the text properly (more like “mostly properly”).

I feel your pain and I'm not even writing my own content in a right-to-left language! It's such a shame that HTML hacks are the only way we can move forward on this. The crux of my presentation and paper is that developers and content providers need to work much closer together, and the RSS specification needs to make full use of attributes like xml:lang and maybe some other kind of attribute for direction.
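For anyone who has not met the Unicode control characters discussed in the email, here is a minimal Python sketch of what using them by hand involves. The code points are the real Unicode bidi controls; the function names and the sample string are my own, made up for illustration, and this is not how Blojsom or the World Service systems actually do it.

```python
# Invisible Unicode bidi control characters (real code points).
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING: start a right-to-left run
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING: end the most recent embedding
RLM = "\u200F"  # RIGHT-TO-LEFT MARK: zero-width character with RTL direction


def wrap_with_controls(text: str) -> str:
    """Hypothetical helper: mark a whole string as a right-to-left run."""
    return f"{RLE}{text}{PDF}"


def strip_controls(text: str) -> str:
    """Hypothetical helper: remove the controls again, e.g. before editing."""
    return text.translate({ord(c): None for c in (RLE, PDF, RLM)})


if __name__ == "__main__":
    title = "שלום RSS"
    marked = wrap_with_controls(title)
    # The two strings look the same in most editors, but they are not equal.
    print(len(title), len(marked))
    print(strip_controls(marked) == title)
```

The point of the sketch is the editing pain: the marked string carries two extra characters you cannot see, and every edit has to keep them balanced.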


Planning for XTech

My setup for the XTECH conference

Edd Dumbill has used some type of aggregation system to pull together all the various Xtech information into Planet Xtech. It works pretty well except for the wiki changes, but they're short and easily scannable. I'm subscribed to the RSS feed as I prefer it to reading the page. The good thing is my global watch lists still apply, so I'm able to be notified if something of interest comes up. Technorati is also a pain because it brings up older posts, says Edd, and you can certainly see that. I'm also wondering why Technorati is not showing any of my posts. This one has the Xtech tag attached like previous ones. Anyhow, there's some more in-depth thought here. To me it's all a no-brainer: I had set up a special group in Jaeger and added Technorati, Flickr, recent wiki entries and del.icio.us entries to it. So I pretty much had Planet Xtech on my own machine. But this is exactly the kind of thing Cocoon is great at doing, and I guess if I was asked to do the same, I would have used Cocoon and XSL.

Anyhow, I'm all set up and ready for Xtech now, as you can see in the picture above. I've got a feeling that if Miles hadn't bought me a Flickr Pro account for my birthday (thanks again for that Miles, and what a good idea for a birthday present) I would have had to buy one by the end of Xtech. Pixory is still doing me well at home, but is crippled by my 256k upload on the ADSL, plus it doesn't have the social aspects of Flickr of course. Like the decentralised nature of blogging, that will come in time via methods like trackback, aggregation and tagging, but not quite yet.


XTECH 2005, less than a week off now…

Xtech Conference 2005

I noticed a new comment on the blogdigger development blog from Xslf.

Re: On Language

Hey, that was fun- seeing some Hebrew here 🙂 This issue indeed is painful. My blog publishes feeds in utf-8/Hebrew, and half the RSS readers out there have problems displaying them (esp. post titles) in proper RTL (even though my feed has dir=rtl in the right places). Speaking of Hebrew problems- I just attempted to create a blogdigger group with a Hebrew name/description, and all the system gave me after I submitted the form was question marks 🙁 Seems that the server here did not like Hebrew input (the group can be found at: http://groups.blogdigger.com/groups.jsp?id=2051 ) An English language group that I had opened ( http://groups.blogdigger.com/groups.jsp?id=2044 ) works fine. Sigh 🙁

This is almost exactly the crux of my presentation at Xtech. Right-to-left language RSS is so painful. Why is Hebrew so difficult to work with for, seriously, most of the RSS readers out there? Xslf is one of a larger group of people who are puzzled why they cannot communicate in their own native language with the modern tools and applications around them. One of my points is that Unicode is an enabler, and it really is! But Unicode is not some magic bullet; much greater language consideration needs to go into the whole process.

I had a quick peek around Shoshannah Forbes' blog, http://www.Xslf.com. I cannot read Hebrew but I know a friend who can (who's coming around tomorrow). Anyhow, I looked at her RSS feeds just out of interest to see if she was doing anything interesting, as her HTML meta-data was neat and considered. Shoshannah is basically using RSS 2.0 with added modules which are common in RSS 1.0. She describes Hebrew with one head-level dc:language element and then puts a div with HTML directional code inside, like this: <div dir="rtl">. This is what we tried to avoid at the World Service because we felt it broke RSS validation, caused presentation vs structure issues and generally did not work in most of the RSS readers on the fragmented RSS market. I recommend Shoshannah read some of the blogs I linked to in Languages in RSS a while ago. It would be great to hear what she makes of the whole Language RSS debate.
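To picture the kind of feed described above, here is a rough Python sketch that builds an RSS 2.0 document with a channel-level dc:language element and a description wrapped in a div carrying dir="rtl". The function name build_feed and the sample text are my own assumptions for illustration; this is not Shoshannah's actual publishing code, just the general shape of the approach.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, used for the dc:language module element.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_feed(title: str, body_html: str, lang: str = "he") -> str:
    """Hypothetical helper: one-item RSS 2.0 feed using the dir="rtl" hack."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    # One head-level language declaration for the whole feed.
    ET.SubElement(channel, f"{{{DC_NS}}}language").text = lang
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    # The HTML hack: direction travels as escaped markup inside description.
    ET.SubElement(item, "description").text = f'<div dir="rtl">{body_html}</div>'
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_feed("שלום", "<p>שלום עולם</p>"))
```

Notice that the title element gets no direction hint at all in this sketch, which fits the complaint that titles are where readers struggle most.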

Do not forget XTECH is next week and I will be presenting alongside the other 3 BBC presentations (BBC News, Radio and Music, Backstage BBC). XTECH looks to be a great conference this year; it seems a crying shame that no one is going to be seriously podcasting or even recording the event and speakers. But I may be wrong.

Anyhow, here's my plan for the conference, remixed to show my choices using XSLT of course. All I did, quickly, was add an attribute named choice, then slot in 1st, 2nd, 3rd or even 4th. Here's my modified XML and XSL. For some reason I couldn't get the external CSS stylesheet working, so I just inlined it for the short term. Do not forget there are many ways to get involved in .

I know there's free wireless at the conference, and I'm bringing my pocket wireless hub to help extend the range, but I don't know if electricity will be a problem, because my laptop only lasts 1 hour without being plugged in. If you happen to be sitting near me in the conference with a laptop, please tap me on the hand, as I will have an English 3-way power adapter plugged in where I can.


Backstage.BBC.co.uk launches into public beta

BBC Logo

So many things happen when you go away on holiday for one week. Yep, this is kind of late news, but that secret project which I was not really able to talk about on my blog has gone live now.

backstage.bbc.co.uk is the BBC's new developer network, providing content feeds for anyone to build with. Alternatively, share your ideas on new ways to use BBC content. This is your BBC. We want to help you play.

Taken from the about page.

Backstage is part of the BBC’s wider remit to “build public value” by sharing our content for others to use creatively. How do you “build public value”? One of the ways is through supporting innovation as the BBC Governors response to the Graf report of BBC online makes clear:

“The BBC will support social innovation by encouraging users’ efforts to build sites and projects that meet their needs and those of their communities … The BBC will also be committed to using open standards that will enable users to find and repurpose BBC content in more flexible ways”.

backstage.bbc.co.uk aims to promote innovation amongst the design and developer community: if people are able to do interesting, productive things with the content then we’d like to support them. Finally and as a useful by-product of the above, backstage.bbc.co.uk is an opportunity to identify talent in the online community.

I have been aware of backstage.bbc.co.uk for quite some time, but didn't take part in the closed beta due to workload. I urge everyone else to check it out and join the email discussion list, which should be a friendly place for developers and designers to suggest ideas and team up with like-minded people. I certainly will be on there with my designer/developer hats on.

I have to give a lot of credit to the backstage team. WELL DONE, Ben Metcalfe, James Boardwell and Tom Loosemore, who all worked really hard to make this happen, and without the concerns and conditions which could have plagued the whole project and idea. I know lots more people were involved but these guys lived and died by this project.

Looking around so far, backstage got mentioned on Channel9, the Guardian Unlimited Online blog, the O'Reilly Radar (cool!), Boing Boing (double cool!), P2Pweblog (odd?) and even BBC News. The question still remains whether they are ready for a slashdotting. Too late, they already were, via Stefan Magdalinski of course.

It's time to crank out my Cocoon book and get working with the tons of open APIs and RSS feeds which now cover the web 2.0 landscape.


I’m loving Konfabulator

Konfabulator is one of those things which I have been meaning to check out for quite some time. Since seeing the Dashboard idea in OSX 10.4, I have thought it was a pretty good idea. For example, there are so many times I'm sitting at my laptop and want to quickly do a complex sum with the built-in calculator. Well, I have to click the start menu, find Accessories, find Calculator. Yes, I know I could assign a keyboard shortcut, but come on. So using widgets in Konfabulator, I am able to hit a hot key and get my calculator without doing any navigating or finding. But that's only the start of things… There's this really nice widget (UK Train Timetable) which grabs the latest train times from the official train timetable website and displays any delays or cancellations.

So from my own understanding, Konfabulator is using the Mozilla JavaScript engine on top of the operating system, which allows widgets to be built in the same way as you create applications on the web. This is very interesting when you consider the real push behind dynamic web applications like Google Maps. Ajax, or as I prefer, remote scripting, creates an intuitive user interface to underlying web services. With widgets in Konfabulator it's easy to imagine taking applications right into the operating system. I understand this is exactly what the Avalon/XAML idea in Longhorn is meant to do. But you know what, it's available now, and being built on standard JavaScript/DOM is a good idea. I have been meaning to improve my JavaScript/DOM skills for quite some time now, and this is the perfect reason why. I wonder how easy it is to connect to the web and the file system? Does it use xmlhttp or something else? There are already plans in my head to talk to Cocoon and other web applications. How hard would it be to read an Xvid file on my filesystem and search for information about it on IMDB.com? I'm sure the interaction with the user interface and filesystem is limited, but not that limited, surely? I mean, for example, there is a nice wireless strength indicator which can be downloaded. For it to work, it must plug into the operating system and read that information in some way. There are also many system monitors, which to me indicates that access to operating system information must be easy or at least flexible.

There are performance issues with layering a JavaScript engine over an operating system, sure. On my 1.3GHz Tablet PC, Konfabulator realistically takes about 2 secs to switch on when I hit my button. I have about 8-10 widgets running. While on my desktop 1.8GHz Athlon box it takes no more than half a second at the very most. Bar Avalon and somewhat Dashboard, what else could be used instead of the JavaScript engine? Java? Python? What else? See, this is the thing: JavaScript drops the entry level down real low, you do not need to learn a whole programming language to create widgets. This is good! Yes, performance is one of the trade-offs, but come on, a widget is meant to be small and simple. For example, there is a nice little search box which I have as a widget; it simply converts my string entry and passes it on to the locally installed web browser. Makes sense. It wouldn't be too difficult to send the request to the Google search web service and get results back in the widget as a small list. Cool, but why bother? Unless you're building a browser as a widget, just pass the query to the browser at the start.
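The search box idea is small enough to sketch. Real Konfabulator widgets are written in JavaScript, so the Python below is only an illustration of the logic: encode the typed string into a query URL and hand it to whatever browser is installed. The search URL and the function names are my own assumptions, not anything taken from the actual widget.

```python
import webbrowser
from urllib.parse import quote_plus

# Assumption: a plain query-string search URL; swap in any engine you like.
SEARCH_URL = "https://www.google.com/search?q="


def build_search_url(query: str) -> str:
    """Turn the typed string into a URL-safe search address."""
    return SEARCH_URL + quote_plus(query.strip())


def search(query: str) -> bool:
    """Pass the query on to the locally installed default browser."""
    return webbrowser.open(build_search_url(query))


if __name__ == "__main__":
    search("cocoon xsl")
```

All the widget itself has to do is the string conversion; everything else is left to the browser, which is why the "why bother?" argument holds.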

Not trying to do a lazy web, but here are some thoughts for future widgets.
Del.icio.us adder – Whatever webpage you happen to be on can be added to del.icio.us through a widget instead of the browser. There is already a Firefox extension which I use which does the same, but it would be cool to see a widget too.
Upload to Flickr – Uploading to Flickr is simple and there are 2 ways to do so right now: download an application or do it via a web form. What about a widget which allows you to drag in folders of files to be uploaded? Once it's finished, it opens a webpage where you add the metadata. Or you could do it from the same application too, I guess (I'd prefer not to).
Blogging it – I like wbloggar and other blogging clients but come on, it's all heavyweight for posting to a blog. My blog software already has a JavaScript bookmarklet for posting, so why not a widget too? Actually it would work quite well, because maybe it could accept drag and drop of URLs and files too?
Tell me a route – I use Transport for London's Journey Planner all the time. It would be so cool if I could just put in where I am and where I'm going and it could send the request to TfL, then chart a route or send the response to the locally installed browser. This would also work with Google Maps.

Following on from my last post about RSS and Azureus, I'm thinking that once I finally get XML or something structured out of Azureus and its completed or in-progress queue, I should build a widget which shows me the latest TV shows downloaded or in progress. Yeah, I know Azureus has a little download bar but it's too abstract, and through Cocoon I could get updates in the same way even when I'm not on my local network but at work or even roaming. Pretty powerful, you have to agree? Yes, all I'm doing is reformatting what comes out of Cocoon, but hey, it can be a lot more with a lot more time and coding. Instead of just a list of what's downloading and what's done with percentages, how about a TV lookup so you get an image and a nice small percentage instead? Simple and effective.

So generally, yeah, I do not think Konfabulator is the best thing since sliced bread, but I do rate it. I will have a better feel for how I rate it once I get into writing widgets of my own. I may be so wrong: widgets could be based on a twisted version of JavaScript and require superhuman knowledge to write. Maybe connecting to the web is difficult and reading machine or operating system information almost impossible. But on first judgement it's good and effective. I'm happy with it for now. There was another program which I downloaded but didn't end up testing called Samurize. It's open source, which is good because Konfabulator is payware, though quite cheap at only 30 dollars.


Safari RSS automatic discovery woes

What did I learn today? Well, thanks to Kelvin, I learned that if you want Safari RSS (Safari 2.0) to automatically discover the RSS feeds which you have put into the head of an HTML page, you will need something like this…

<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://www.cubicgarden.com/blojsom/blog/cubicgarden/?flavor=rss2" />

Important things to notice: if you use a type of application/rdf+xml, Safari will ignore it. Do not put alternative as the rel attribute if you want it to work. The funny thing is every single other RSS reader I have used and seen finds RDF feeds using application/rdf+xml no problem. Even Firefox does not have a problem with this type. I have yet to test Safari RSS with Atom feeds, but I'm deeply concerned that application/atom+xml will not be acceptable to Safari either. Well, it's only days till OSX 10.4 officially launches here, so we shall find out more then. Apple, the little things you've always got to do differently…
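If you want to check what a reader doing autodiscovery would see on your own pages, something like the following Python sketch will do. It uses only the standard library; the class and function names are mine, and the set of accepted types is a lenient assumption (it includes application/rdf+xml, which, as noted above, Safari RSS ignores).

```python
from html.parser import HTMLParser

# Assumption: a lenient reader accepts all three of these feed types.
FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/rdf+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (type, href) pairs from feed autodiscovery link elements."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        feed_type = attr.get("type", "").lower()
        # rel must contain "alternate"; "alternative" is not the same thing.
        if "alternate" in rels and feed_type in FEED_TYPES and attr.get("href"):
            self.feeds.append((feed_type, attr["href"]))


def find_feeds(html: str):
    """Hypothetical helper: return every advertised feed in an HTML page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" title="RSS 2.0" '
        'href="http://www.cubicgarden.com/blojsom/blog/cubicgarden/?flavor=rss2" />'
        '</head><body></body></html>'
    )
    for feed_type, href in find_feeds(page):
        print(feed_type, href)
```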


Tiger RSS screensaver

Engadget RSS Screensaver

Kelvin was talking to me about RSS in the new Safari RSS browser, and I mentioned the buzz about the new RSS screensaver, which I forgot to ask him about last time. Well, he sent me a link to a QuickTime movie of the screensaver in action. Although it's quite wanky, I love it! My Pocket PC has a passive RSS section for the Today screen which works for grabbing your eye once in a while with new news. Well, this is the logical conclusion of the same idea. There has been lots of talk about its usability, but you know what, I've got an RSS reader on my Xbox and it works in the same way, just passively grabbing your eye when something interesting comes along.

So I'm not getting a Mac anytime soon, so what can I do? Well, with some searching I found this RSS screensaver, which is quite nice. Accenture also has one, which I have yet to try. This tutorial also looks quite useful. I'll give Apple the fact that these screensavers are nowhere near as juicy as theirs, but I'm sure someone will do something similar soon. I was actually thinking, if you combine a VRML screensaver with an XMLHttpRequest and some well thought out XSL, you could build something quite striking.


Structured blogging, semantics and tags

I have been meaning to blog about the structured blogging idea for quite some time now. The idea is this…

Structured blogging is about making a movie review look different from a calendar entry. On the surface, it’s as simple as that – formatting blog entries around their content.

On another level, it’s a bit more complicated – what we want to do is create structure (in the form of XML) around each of these types of entries, to organize the data inside and to let machine readers – other programs, sites, and aggregators – better understand the content.

Yep, great, cannot disagree with that. Greg from Blogdigger has been thinking about this too. The overall concept is an interesting one for me, but I see lots of connections with the semantic blogging idea which has been pioneered by the great guys at HP Research Labs for quite some time now. Generally, semantic blogging for me seems to be adding machine-readable meta-data to the content of the blog entry. Blojsom has had meta ability for years now, which I have not really made full use of yet. But there's lots of work already moving ahead in this area around FOAF, XFN and tagging or folksonomy.

With Friend of a Friend (FOAF) it's easy to put a link between a person's name and their FOAF profile. I'm sure it would be trivial to add a rel attribute to the link for more context. Which leads nicely on to the XFN idea, which is exactly that: using rel attributes and a classification of types to represent human relationships using hyperlinks. Works well, but I would like to get rid of the classification and go more fuzzy, like tagging. The best examples of tagging, full stop, have to be Del.icio.us, Flickr and others, but tagging on personal blogs is still quite new. O'Reilly's Radar points to what's possible with a little time and lots of meta-data in entries. Technorati makes this possible via their site. For example I have tagged and in this entry and sent them a ping to make sure they pick it up. There's nothing special in the links besides the rel attribute, which is set to tag.
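Since the only special thing about a tag link is rel="tag", pulling tags back out of an entry is not much work. Below is a minimal Python sketch of how an aggregator might do it, assuming the usual convention that the tag name is the last path segment of the link's URL. The class name, the function name and the sample links are hypothetical; I have no idea how Technorati actually implements this.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class TagLinkFinder(HTMLParser):
    """Collect tag names from anchors whose rel attribute contains 'tag'."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href") or ""
        if "tag" in rels and href:
            # Assumed convention: the tag is the last segment of the URL path.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_tags(html: str):
    """Hypothetical helper: list the tags declared in a blog entry."""
    finder = TagLinkFinder()
    finder.feed(html)
    return finder.tags


if __name__ == "__main__":
    entry = (
        '<p>Filed under <a href="http://technorati.com/tag/xtech" rel="tag">xtech</a> '
        'and <a href="http://technorati.com/tag/semantic%20blogging" rel="tag">'
        'semantic blogging</a>.</p>'
    )
    print(extract_tags(entry))
```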

I also have to agree with Greg about what happens to the meta-data beyond the blog. I really want to see RSS feeds with the same meta-data making its way towards aggregators via namespaces and RDF. GRDDL is good and well worth looking into for the future. But once again this is all covered in the semantic blogging ideas. It has been a long time in coming but I disagree that the time is right. Blogging clients need to change to reflect these changes, otherwise no one except a handful of people is going to manually put meta-data into their posts. I will, because I'm jazzed about meta-data and semantic blogging; others will not, even if the benefits are different-looking entries on their own blogs.


The importance of open data

Xtech 2005 conference

Miles sent me the link to the O'Reilly Radar, which I have never seen before. Anyhow, Tim O'Reilly talks about the Xtech conference, which I'm also involved in, and how open data is a wee bit more experimental than the other tracks. The BBC and RSS is mentioned, which is good for myself, Kevin and Joel.

If you are going to Xtech 2005, please let me know. I'm really hoping to soak in as much as humanly possible. Edd's been doing some interesting things already in the name of open data. On Saturday he turned the Xtech 2005 schedule into XML and asked people to remix it. The file can be found here as grid.xml. However, Dan Connolly was quick on the uptake and converted the raw XML into an iCal file which can be pulled into a modern calendar application. So cool, and all the XSLs are available for anyone to pull apart and learn more. I love remix culture.


Syndication for a world wide audience

People have slowly caught on to the problems with RSS syndication and languages. If you follow the links back from the Blogdigger blog entry you will start to notice a pattern of people not quite being able to put their finger on the problem. And the reason why is that it's actually not a single problem, it's more a muddle of problems. Andy puts it well, but I may have the killer paragraph which explains it all.

It is a chicken and egg problem. If the content publishers do not provide RSS feeds with correctly structured language meta-data which software engineers can cut their teeth and applications on, then the stalemate will proceed as it does today. Certainly this is one way of looking at it. The other viewpoint is that software engineers need to put language features into their software, otherwise there is no point in content providers using correctly structured language meta-data and modules to describe language content…

This is taken from my draft paper, which I am currently finishing, on the same subject of RSS and languages. See, Blogdigger are right, but how many feeds do they get from non-Latin languages which have language meta-data they can actually use? This quote comes from Mark Fletcher of Bloglines:

But the more important question is, are the majority of feeds accurately labeled in terms of language. And in our experience, the answer is unfortunately a resounding no.

I would echo that too. When I was looking for examples of non-Latin RSS feeds, they tended to have little language meta-data (some were actually still marked as English!). Is this a limitation of the RSS standards or something else? Well, in my paper it would seem no one gets away clean. For a quick taste of what I mean, look at the complete (you call that complete?) list of language codes which can be used in the RSS 0.91 spec. Yes, I know it's old, but it's still quite scary for 2000. Try to find Arabic, Hebrew and other non-Latin languages.
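Checking whether a feed declares its language at all is easy enough to automate. Here is a small Python sketch, assuming a plain RSS 2.0 or 0.91 style feed where channel sits directly under the root with no namespace (RSS 1.0 would need namespace handling, which I have left out). The function name declared_languages is hypothetical.

```python
import xml.etree.ElementTree as ET

# Clark-notation names for the namespaced attribute and element we look for.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
DC_LANGUAGE = "{http://purl.org/dc/elements/1.1/}language"


def declared_languages(feed_xml: str) -> dict:
    """Report every place a feed states its language; empty dict if none."""
    root = ET.fromstring(feed_xml)
    found = {}
    if root.get(XML_LANG):
        found["xml:lang"] = root.get(XML_LANG)
    channel = root.find("channel")
    if channel is not None:
        core = channel.findtext("language")  # the core RSS language element
        if core:
            found["language"] = core.strip()
        dublin = channel.findtext(DC_LANGUAGE)  # the Dublin Core module
        if dublin:
            found["dc:language"] = dublin.strip()
    return found


if __name__ == "__main__":
    labelled = (
        '<rss version="2.0"><channel><title>t</title>'
        '<language>he</language></channel></rss>'
    )
    unlabelled = '<rss version="2.0"><channel><title>t</title></channel></rss>'
    print(declared_languages(labelled))
    print(declared_languages(unlabelled))
```

An empty result is the common case being complained about here, and a result of "en" on a Hebrew or Arabic feed is the worse one, since software cannot tell the label is wrong.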

If you're interested in more information in this area, please keep an eye on this blog, where I will post my paper sometime in late May or early June. Or even better, come and listen to my presentation on the paper at XTECH 2005 in late May.


Ajax? asynchronous JavaScript and XML better known as Remote Scripting

Well, it looks like there's some push behind the AJAX naming now. I believe it stands for Asynchronous JavaScript and XML and translates into this easy-to-understand list.

  • standards-based presentation using XHTML and Cascading Style Sheets (CSS);
  • dynamic display and interaction using the Document Object Model;
  • data interchange and manipulation using XML and XSLT;
  • asynchronous data retrieval using XMLHttpRequest;
  • and JavaScript binding everything together.

I couldn't give a crap what it's called, but it's certainly the way to do rich interaction in the browser without using Flash or anything non-standard. It's all interesting because about 3-5 years ago standard CSS and XHTML were becoming accepted in the modern browsers, and a standard browser DOM or JavaScript was a long way off. Now it seems we're hitting the point where you can use standard JavaScript across all modern browsers, which is indeed a good step forward.

Sam Ruby considers Ajax harmful but seems to have more of a problem with bad use of it. Paul vents his frustration with others on the naming of Ajax. Dare covers the whole Ajax issue once and for all.
