The problem with Language RSS

Shoshannah Forbes and I have been sending emails back and forth about RSS adoption in right-to-left languages like Hebrew and Arabic. It fits so closely with what I'm going to say tomorrow at XTECH (where I happen to be right now) that it's almost uncanny. I asked Shoshannah if I could blog her reply to my question about her RSS feeds. Basically, her RSS feeds include the HTML attribute dir to indicate the direction of the text, which makes them invalid and may break quite a few of the RSS readers out there. Anyhow, here is the email, complete with my agreements and additional comments. Please remember that, as usual, these comments are my own views and not those of my employer, BBC World Service.

Shoshannah Forbes wrote:

> The problem I am facing is simple:
> If I use valid RSS with no dir=rtl, then 99% of the RSS readers will display the text block as LTR, with punctuation, digits and English in the wrong locations, making the whole thing unreadable.
> When adding dir=rtl, at least I can get about 50% of the RSS readers to display the post body properly (titles are still a mess).

Agreed, but I feel there are two ways of looking at the problem. From your point of view it makes sense to include dir="rtl", because very few software developers are going to change their code to take this into consideration. For us (the BBC World Service), we have the might to speak to developers and get them to change their code. Even if we do not do it for ourselves, we owe it to our audience (my own feelings).
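To make the trade-off concrete, here is a minimal Python sketch of the workaround Shoshannah describes: wrapping the post body in a div carrying dir="rtl" before it goes into the RSS description. The function name build_item and the sample text are assumptions for illustration only, not her actual web app.

```python
import xml.etree.ElementTree as ET


def build_item(title, body_html, rtl=True):
    """Build an RSS <item>; optionally wrap the body in <div dir="rtl">."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    if rtl:
        # The HTML hack: direction is carried by markup inside the description.
        body_html = '<div dir="rtl">' + body_html + "</div>"
    # ElementTree escapes the markup when the element is serialised.
    ET.SubElement(item, "description").text = body_html
    return item


item = build_item("שלום", "<p>שלום עולם, RSS 2.0!</p>")
print(ET.tostring(item, encoding="unicode"))
```

The markup ends up escaped inside the description, so whether the direction is honoured is entirely up to each reader's HTML renderer, and the title gets no direction hint at all.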

> I don't use unicode control characters for a few reasons:
> * They are a real pain to input- it is like entering the control characters for CR/LF or < font > tag manually (but worse)- there are just too many places to enter them.

Yep, totally agree.

> * Most keyboard layouts do not have a direct way to enter them.

Yeah, we're using virtual keyboards for some languages and they're a nightmare!

> * They make a mess of the text- they are only used for the RSS, and unneeded for the editing or the html display, and can produce unexpected results when entered into the text.

Yep, agreed

> * There are many clients that incorrectly display them as visible characters in the text.

Yeah, it's a shame. That will change, but it's too much trouble at the moment.

> * They make the text much more difficult to edit- if you change the text, you need to go back and change them as well. And since they are invisible, you get an awful lot of trial and error.

Indeed! You really need to understand them to edit with them. This would require extra training for our language services.
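For anyone who has not met these characters, a small Python sketch shows why they are so awkward to edit. It wraps a string in RIGHT-TO-LEFT EMBEDDING (U+202B) and POP DIRECTIONAL FORMATTING (U+202C), two of the Unicode control characters in question. The helper names embed_rtl and strip_bidi_controls are made up for illustration.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Explicit bidi controls: LRM, RLM, LRE, RLE, PDF, LRO, RLO.
BIDI_CONTROLS = {"\u200e", "\u200f", "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"}


def embed_rtl(text):
    """Mark a run of text as right-to-left using invisible control characters."""
    return RLE + text + PDF


def strip_bidi_controls(text):
    """Remove the explicit bidi controls, e.g. before editing or HTML display."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


title = "שלום"
marked = embed_rtl(title)

# The two strings look identical in most editors, but differ in length.
print(len(title), len(marked))
for ch in marked:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The invisible characters are part of the string like any other, which is exactly why changing the text means hunting for them again.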

> * They force me to use explicit directionality, which complicates things and makes the text less portable.

Yeah, there is an idea of reuse throughout our language services. This is tricky already; who knows how much trickier it would be if the text carried Unicode directional characters too.

> * My web app that creates the RSS from my HTML does not know how to add them automatically.

Yep, I know my blogging app (Blojsom) supports Unicode directionality if I put the characters in at the start, but then we're back to the editor problem of virtual keyboards and sticking in hidden characters! The same is true of the BBC World Service systems. We use XSL with Saxon, so if the characters are there, they should (not tested by myself) pass through to the RSS.

> * Since they are rarely used in other contexts, I can't focus on the content when writing, and have to start thinking more closely about the presentation.

Yeah, indeed! Our language services are already busy as hell; Unicode directionality would just add a level of complexity on top of an already stressful job.

> * Moving from me to other users- most Hebrew/Arabic users don't know about them, and don't want to know. You try to explain to your mother that when she is writing in her weblog, she can't write in her usual manner, but has to enter these strange codes in a foreign language which have complicated rules (I have seen many pros get confused with these characters, I don't expect laypeople to understand them).

Right on the nail! One of my points for tomorrow is that Unicode directionality is too damn difficult and very confusing! I expect some will challenge me about this tomorrow, and honestly I will just admit that if it's too difficult for me, it's even more difficult for others. Plus, we should be making things easier for people, not harder. The barrier to entry should be at a level where your mum or my mum could use it and write with it.

> * It doesn't scale- think about an Israeli blog hosting service- they want to offer RSS feeds for all the blogs, with minimum work for the users. Relying on unicode control characters just doesn't do it.

Yeah, plus from the Israeli blog hosting point of view, you want to get people going quickly and easily, not put them off with complex editing. It's the reason why Blogger does so well: three steps and you've got your own blog.

> * Since they are complex, it is difficult to create a GUI for entering them (unlike general RTL/LTR controls, which are available everywhere).

Yeah, it almost needs to be just like the direction attribute in HTML. I'm suggesting an attribute like this for RSS tomorrow.

> Not having the dir attribute in RSS gets rid of some markup- in favor of lower level much more complex control characters. A bad deal, IMO, and one which is a major cause for the problems when dealing with Hebrew/Arabic RSS.

Indeed, it was an ideal solution, but real-world use is too painful.

> I think that the root of the problem is that bidi is part presentation and part structure. And since even in the best of cases (for example, the automatic bidi control in recent QT or GTK applications on Linux) there are still many many cases that can *not* be covered reliably by the display algorithms of the software, I tend to think that for practical purposes, bidi is more structure than presentation.

Yeah, agreed. There's lots of push to put bidi information inside CSS instead of HTML even, which is correct if you see bidi as presentation.

> I sure wish there was a way in RSS to tell the client “this element is RTL” or “this area is LTR” without resorting to HTML hacks. But at the moment, those hacks are the only practical tool I have to get at least *some* of the readers out there to display the text properly (more like “mostly properly”).

I feel your pain, and I'm not even writing my own content in a right-to-left language! It's such a shame that HTML hacks are the only way we can move forward on this. The crux of my presentation and paper is that developers and content providers need to work much more closely together, and the RSS specification needs to make full use of attributes like xml:lang and maybe some other kind of attribute for direction.
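As a rough sketch of what a reader could do if feeds carried language information consistently, the Python below infers a base direction from xml:lang. The list of right-to-left languages and the function name direction_for are assumptions for illustration; nothing like this is defined by the RSS specification, and it is no substitute for a real direction attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed, deliberately short list of primary language tags written right to left.
RTL_LANGUAGES = {"he", "ar", "fa", "ur"}


def direction_for(element, default="ltr"):
    """Guess 'rtl' or 'ltr' for an element from its own xml:lang attribute."""
    lang = element.get(XML_LANG)
    if not lang:
        return default
    primary = lang.split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


item = ET.fromstring('<item xml:lang="he-IL"><title>שלום</title></item>')
print(direction_for(item))
```

Guessing from the language only gives a default for a whole element. Mixed Hebrew and English content still needs something finer grained, which is the argument for an explicit attribute.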


Planning for XTech

My setup for the XTECH conference

Edd Dumbill has used some type of aggregation system to pull together all the various Xtech information into Planet Xtech. It works pretty well except for the wiki changes, but they're short and easily scannable. I'm subscribed to the RSS feed as I prefer it to reading the page. The good thing is my global watch lists still apply, so I'm able to be notified if something of interest comes up. Technorati is also a pain because it brings up older posts, says Edd, and you can certainly see that. I'm also wondering why Technorati is not showing any of my posts? This one has the Xtech tag attached like previous ones. Anyhow, there's some more in-depth thought here. To me it's all a no-brainer: I had set up a special group in Jager and added Technorati, Flickr, recent wiki entries and del.icio.us entries into it. So I pretty much had PlanetXtech on my own machine. But this is exactly the kind of thing Cocoon is great at doing, and I guess if I was asked to do the same, I would have used Cocoon and XSL.

Anyhow, I'm all set up and ready for Xtech now, as you can see in the picture above. I've got a feeling that if Miles hadn't bought me a Flickr Pro account for my birthday (thanks again for that Miles, and what a good idea for a birthday present) I would have had to buy one by the end of Xtech. Pixory is still doing me well at home, but is crippled by my 256k upload on the ADSL, plus it doesn't have the social aspects of Flickr of course. Like the decentralised nature of blogging, it will come in time via methods like trackback, aggregation and tagging, but not quite yet.


First night in Amsterdam

Hotel room from my Bathroom

I'm telling you, I love this city. Yes, it's full of shady people at night and there are hard drug pushers walking around offering tourists all types of anything, but you can't beat the vibe in this city. Anyway, I have to report that the Lloydhotel may be quite a way from Central Station (about 25 mins walking, 5 mins by taxi) but it's certainly worth it. I'm using their very fast and free broadband connection, which is plugged into every single guest room. Yep, on the wall there is a nice network connection just waiting for you to plug in. I can pick up the 4 different wireless nodes which are placed downstairs in public places like the bar and lounge areas, but being on the 3rd floor makes the signal a bit weak. Anyhow, for some reason my iPAQ seems to do a better job of picking it up while walking around the hotel. It seriously takes some getting used to. Being surrounded by free wireless while away from home does not quite compute in my own mind. But don't worry, I'll be taking full advantage. I have already noticed someone else's iTunes playlist is available via the local Lloyd network. But I cannot for the life of me connect to Jabber; Skype is fine but not Jabber. It's almost like the ports have been blocked… Email is also fine, so email me on my hotpop or rave address if you want to tell me something urgently.

The actual room and hotel are pretty good and, wireless aside, this has to be the best hotel I've been in. It's not the biggest (the Vegas hotel room at Lady Luck was huge), but this hotel is smooth, simple, clever and stylish. There are little touches like the lights, which can be moved around the wall using strong magnets, and a set of spare ethernet cables just in case you forget to bring or buy one. Sarah asked me to describe the room and hotel and I answered: simple, sensible and clever. You can tell the designers had a great time.

I have to spare a thought for Matt Biddulph, whose hotel did not have wireless, so he needed to go down to the RAI centre to get some. I now have my stuff together and should be ready for tomorrow's talk. Just like Dodds, I also extended the week into the weekend so I won't need to rush home after the conference. Sarah is coming over so we can spend the weekend together in Amsterdam.


XTECH 2005, less than a week off now…

Xtech Conference 2005

I noticed a new comment on the blogdigger development blog from Xslf.

Re: On Language

Hey, that was fun- seeing some Hebrew here 🙂 This issue indeed is painful. My blog publishes feeds in utf-8/Hebrew, and half the RSS readers out there have problems displaying them (esp. post titles) in proper RTL (even though my feed has dir=rtl in the right places). Speaking of Hebrew problems- I just attempted to create a blogdigger group with a Hebrew name/description, and all the system gave me after I submitted the form was question marks 🙁 Seems that the server here did not like Hebrew input (the group can be found at: http://groups.blogdigger.com/groups.jsp?id=2051 ) An English language group that I had opened ( http://groups.blogdigger.com/groups.jsp?id=2044 ) works fine. Sigh 🙁

This is almost exactly the crux of my presentation at Xtech. Right-to-left language RSS is so painful. Why is Hebrew so difficult for most of the RSS readers out there to work with? Xslf is one of a larger group of people who are puzzled why they cannot communicate in their own native language with the modern tools and applications around them. One of my points is that Unicode is an enabler, and it really is! But Unicode is not some magic bullet; much greater language consideration needs to go into the whole process.

I had a quick peek around Shoshannah Forbes' blog, http://www.Xslf.com. I cannot read Hebrew but I have a friend who does (who's coming around tomorrow). Anyhow, I looked at her RSS feeds just out of interest to see if she was doing anything interesting, as her HTML meta-data was neat and considered. Shoshannah is basically using RSS 2.0 with added modules which are common in RSS 1.0. She describes Hebrew with one head-level dc:language element and then puts in a div with HTML directional code inside, like this: <div dir="rtl">. This is what we tried to avoid at the World Service because we felt it broke RSS validation, caused presentation vs structure issues and generally did not work in most of the RSS readers on the fragmented RSS market. I recommend Shoshannah read some of the blogs I linked to in Languages in RSS a while ago. It would be great to hear what she makes of the whole Language RSS debate.

Do not forget XTECH is next week and I will be presenting alongside the other 3 BBC presentations (BBC News, Radio and Music, Backstage BBC). XTECH looks to be a great conference this year; it seems a crying shame that no one is going to be seriously podcasting or even recording the event and speakers. But I may be wrong?

Anyhow, here's my plan for the conference, remixed to show my choices using XSLT of course. All I quickly did was add an attribute named choice, then slot in 1st, 2nd, 3rd or even 4th. Here's my modified XML and XSL. For some reason I couldn't get the external CSS stylesheet working, so I just inlined it in the short term. Do not forget there are many ways to get involved in .
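The remix itself was done in XSLT. The Python below is only a stand-in sketching the same idea: mark sessions with a choice attribute, then list the marked ones in order of preference. The schedule layout (session elements with a title attribute) and the numeric choice values are hypothetical; only the attribute name choice comes from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical schedule; the real conference XML is not reproduced here.
SCHEDULE = """
<schedule>
  <session title="Talk A" choice="2"/>
  <session title="Talk B"/>
  <session title="Talk C" choice="1"/>
</schedule>
"""


def my_choices(xml_text):
    """Return (choice, title) pairs for marked sessions, best choice first."""
    root = ET.fromstring(xml_text.strip())
    chosen = [s for s in root.iter("session") if s.get("choice")]
    chosen.sort(key=lambda s: int(s.get("choice")))
    return [(int(s.get("choice")), s.get("title")) for s in chosen]


for rank, title in my_choices(SCHEDULE):
    print(rank, title)
```

Keeping the preference as an attribute on the source XML means the same file can still be rendered as the full schedule by simply ignoring it.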

I know there's free wireless at the conference and I'm bringing my pocket wireless hub to help extend the range, but I don't know if electricity will be a problem, because my laptop only lasts 1 hour without being plugged in. If you happen to be sitting near me in the conference with a laptop, please tap me on the hand, as I will have an English 3-way power adapter plugged in where I can.


Jabber support in OSX 10.4, may not be as ideal

Jabber lightbulb

Is it safe to say Jabber support in OSX 10.4 (Tiger) is not all it's cracked up to be? Apple did a good thing embedding Jabber support in not only the client (iChatAV) but also the OSX 10.4 server. But according to this article, it certainly sounds that way.

Apple chose to leave a few other pieces of Jabber functionality out of its client as well: Though it's able to use them if they've already been set up on another Jabber client, there's no option within iChat to do the service discovery needed to access Jabber gateways.

Off the back of this, iChat users have been sharing hacks around the gateway problem.

Whether iChat offers the ability to register with Jabber gateways or not, iChat users have been busy figuring out how to use third party clients to sign on to public servers, register with those gateways, then return to using iChat.

It was also noted by developer Missig that iChat diverges from the XHTML-IM specification. Apple are using some kind of rich text which will need to be hacked or reverse engineered to allow for compatible applications elsewhere.

Yes, before people start, most of these things are nit-picking, and yes, if Microsoft even dared to do anything like this it would likely be much further removed from the standard. Not that Jabber will ever come to the core of Windows, especially with Microsoft fully behind the SIP/SIMPLE specification.


New version of Konfabulator

After installing the new version and playing with it for a matter of minutes, I have noticed an increase in general speed and performance. But anyhow, here's the official list of what's new…

What's New in Konfabulator 2.0 for users
Multi-Pane Preferences dialogs for Widgets
New Widgets
Improved Proxy support for web resources (AutoProxy)

What's New in Konfabulator 2.0 for authors
COM support
Inter-Widget messaging
Image Tiling/Scaling
ClipRects on Images
vAlign on images and text areas
Colorization (colorize, hsl adjustment, hsl tinting)
Context menu addition support
filesystem.read/writeFile (utf-8 only)
filesystem.volumes array of currently mounted volumes
filesystem.move/copy
filesystem.getFileInfo
filesystem.getDisplayName
chooseFile/chooseFolder dialog functions
saveAs dialog function
chooseColor dialog function
Trash/Recycle Bin open/empty
Multiple Window support
Multi-Click handling
New Timer object


Geek dinner with the scoble

Robert Scoble

Quick note to all who contacted me after reading about the last Geek Dinner. There are only 24 places left for the next one, which is planned for 7th June at the Texas Embassy Cantina, near Leicester Square and the Mall. So if I were you I would seriously make up your minds and get your name down on the wiki sharpish. It should be a good night, with lots of bloggers, geeks and interesting people (not to say bloggers and geeks are not, of course).


The Play’s the thing…

Channel4 have launched a competition titled The play's the thing. It's an opportunity to write a play which may be performed in London's West End if it's good enough. Now I like theatre but love cinema, because I find theatre quite stuffy and out of touch (my thoughts). But I do like the idea of live theatre. So this strikes me as a chance to do something about my thoughts.

Me and Sarah have come up with a cracker of an idea for a play which brings it right up to date and sends a message out about the society we're in today and tomorrow. Yep, you bet your bottom dollar it's got something for the net generation, but it's also got something for people who just read about the internet in papers. Obviously, once me and Sarah thrash through the ideas and develop something concrete to submit, I'll open up the idea and its development on my blog. Maybe if it's a little too risqué for Channel4, someone else may be interested in the idea. Submission has to be done by 1st July 2005, which is the same deadline as the Microsoft IP video thing, hence I've only got time to do one or the other.


Backstage.BBC.co.uk launches into public beta

BBC Logo

So many things happen when you go away on holiday for one week. Yep, this is kind of late news, but that secret project which I was not really able to talk about on my blog has gone live now.

backstage.bbc.co.uk is the BBC's new developer network, providing content feeds for anyone to build with. Alternatively, share your ideas on new ways to use BBC content. This is your BBC. We want to help you play.

Taken from the about page.

Backstage is part of the BBC’s wider remit to “build public value” by sharing our content for others to use creatively. How do you “build public value”? One of the ways is through supporting innovation as the BBC Governors response to the Graf report of BBC online makes clear:

“The BBC will support social innovation by encouraging users’ efforts to build sites and projects that meet their needs and those of their communities … The BBC will also be committed to using open standards that will enable users to find and repurpose BBC content in more flexible ways”.

backstage.bbc.co.uk aims to promote innovation amongst the design and developer community: if people are able to do interesting, productive things with the content then we’d like to support them. Finally and as a useful by-product of the above, backstage.bbc.co.uk is an opportunity to identify talent in the online community.

I have been aware of backstage.bbc.co.uk for quite some time, but didn't take part in the closed beta due to workload. I urge everyone else to check it out and join the email discussion list, which should be a friendly place for developers and designers to suggest ideas and team up with like-minded people. I certainly will be on there with my designer/developer hats on.

I have to give a lot of credit to the backstage team. WELL DONE to Ben Metcalfe, James Boardwell and Tom Loosemore, who all worked really hard to make this happen without the concerns and conditions which could have plagued the whole project and idea. I know lots more people were involved, but these guys lived and died by this project.

Looking around so far, backstage got mentioned on Channel9, the Guardian Unlimited Online blog, the O'Reilly Radar (cool!), Boingboing (double cool!), P2Pweblog (odd?) and even BBC News. The question still remains whether they are ready for a slashdotting. Too late, they already were, via Stefan Magdalinski of course.

It's time to crank out my Cocoon book and get working with the tons of open APIs and RSS feeds which now cover the web 2.0 landscape.


…I come back and some torrents sites are gone?

MPAA - you can click but you can't hide propaganda

So yes, I come back from a nice holiday away expecting all my TV programmes to be downloaded, ready to watch throughout the rest of the week, but oh no, the bloody MPAA have targeted TV torrent sites. Damn you! I used to use ShunTV and BTefnet for all my American television fixes; now I'm going to have to look elsewhere. Some good news is that PQRT has changed to http://www.rokanova.com and http://www.seedler.org has just launched. Shame they're trying to be a jack of all suprnova trades, and seedler does not have RSS feeds. Oh well, as Sarah says, there will be others, and there are other ways to get TV shows than just these sites. It also seems BTefnet was sued, according to the IRC channel. Someone left a comment saying: "BT Website is currently down. Releases are on hold until we have a better understanding of the current situation. We have NOT been sued or been contacted by the MPAA!" More information about the closures can be found here at Slyck or the P2P website. It looks like either http://www.demonoid.com, http://www.torrentspy.com or http://www.zonatracker.com are the places to go for TV torrents now.
