March 16, 2006 – Cubicgarden.com…

I've been wanting to blog this for quite some time now. When we think of blocking and censorship, everyone goes on about China. Well, there are many other nations which have levels of censorship and blocking. So it started with the blockage of BBC Persian content in Iran, then we started to syndicate more via public RSS and email. Then Mario wrote an instant messenger bot which takes BBC Persian RSS feeds and republishes them on the MSN network if you subscribe to the bot (just add bbcpersian@hotmail.co.uk to your buddy list). Then Mario added support for the Jabber network (just add bbcpersian@menti.name to your buddy list) and tried to get YIM (Yahoo) working, as that's the most popular instant messaging tool in Iran. Now he's trying out JRS, which is a publishing tool for the XMPP (Jabber) network, as the Perl Yahoo module is broken and/or out of date. Then Hoder (Hossein Derakhshan) gave a good talk about censorship in Iran to the BBC.

Some observations along the way. Although right-to-left text should be easy with most Unicode-compliant instant messaging clients, this simply is not the case. The markup of right-to-left languages is still a very difficult thing to do. Dan Brickley sent a good email to the W3C internationalisation core group. I keep meaning to respond myself, but still have a draft ready which I keep rewriting. I'm happy Martin Duerst and others have read my paper from XTech 2005. But I would like a little more clarity on Martin's reply.

In Ian's article and in Mario's messages, there is also some extent of confusion with regards to bidi. If the text in a line or paragraph contains only rtl characters, or neutral characters such as punctuation, any application is supposed to display it in the correct order. No attributes are necessary, except for where to start the line (flush left or flush right), which can be considered a matter of taste (in mixed English/Farsi text, I wouldn't consider having all English messages flush left and all Farsi messages flush right necessarily always the best display) and which could be handled by a switch in the user agent.

It's only when a line or paragraph mixes both rtl and ltr text where having additional information becomes really necessary, to indicate whether the text is a (e.g.) Farsi sentence with some English embedded or the other way round (or even a more complicated structure).
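The distinction Martin draws here can be sketched in a few lines of Python. This is only a rough illustration, not how any particular client works: the function name `classify_direction` is made up for this example, and it only looks at the Unicode bidi class of each character rather than running the full bidirectional algorithm.

```python
import unicodedata

# Strong bidi classes: "R" and "AL" are right-to-left (Hebrew, Arabic script),
# "L" is left-to-right. Everything else (punctuation, spaces, digits) is
# treated here as neutral or weak and ignored.
RTL_CLASSES = {"R", "AL"}
LTR_CLASSES = {"L"}


def classify_direction(text):
    """Return 'rtl', 'ltr', 'mixed' or 'neutral' for a line of text.

    Hypothetical helper: 'mixed' is the case where extra markup
    (such as a dir attribute) is really needed.
    """
    has_rtl = False
    has_ltr = False
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in RTL_CLASSES:
            has_rtl = True
        elif bidi_class in LTR_CLASSES:
            has_ltr = True
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "rtl"
    if has_ltr:
        return "ltr"
    return "neutral"


if __name__ == "__main__":
    for sample in ["Hello, world!", "سلام، دنیا!", "BBC فارسی"]:
        print(classify_direction(sample))
```

A line that comes back as "rtl" or "ltr" should, in theory, display correctly without any attributes. It is the "mixed" case where a client needs to be told which language is the outer one.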

See, this is great in theory, but in practice applications don't do this correctly. It's good to see I was correct about Atom and RSS when it comes to language support.

It very clearly shows that more thought should go into supporting internationalization markup in all kinds of document or document-like (in the sense that they use free text rather than data items) formats.

The only blog format that got that right (sic!) from the start is Atom (http://www.ietf.org/rfc/rfc4287.txt). Elements such as title all allow for embedded XHTML markup, which then can take a dir attribute. RSS 1.0 has a content module that could do the same thing, but I'm not sure how well it is supported.

Certainly, it's hardly supported in the RSS space. Atom is the only one which had this from the start, so all the developers who build their readers have built in the ability to have markup inside the content, including directionality.
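To make the Atom point a bit more concrete, here is a minimal sketch of an entry whose title carries embedded XHTML with a dir attribute, as RFC 4287 allows for type="xhtml" (the content has to be a single XHTML div). The function name `build_entry` and the sample title are assumptions for illustration only; a real feed entry would also need an id, an updated date and so on.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialise Atom as the default namespace and XHTML with an "xhtml" prefix.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_entry(title_text, direction="rtl"):
    """Build a bare Atom entry whose title is XHTML with a dir attribute.

    Hypothetical helper: only the title is filled in, so this is not a
    complete, valid Atom entry.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    title = ET.SubElement(entry, f"{{{ATOM_NS}}}title", {"type": "xhtml"})
    # With type="xhtml" the title wraps its markup in a single XHTML div,
    # and that div is where the directionality goes.
    div = ET.SubElement(title, f"{{{XHTML_NS}}}div", {"dir": direction})
    div.text = title_text
    return entry


if __name__ == "__main__":
    entry = build_entry("BBC فارسی")
    print(ET.tostring(entry, encoding="unicode"))
```

Whether a given reader actually honours the dir attribute is a separate question, which is the whole problem.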

After the mad panic of trying to get the train up to Oxford, due to the Trainline machine at work not working, we arrived at the Oxford University venue well before the start time and picked a great spot for the lecture. Tim Berners-Lee was good to see live; you could see he certainly was no Steve Jobs. He was more like Bill Gates, a little uneasy with public speaking but happy to talk about his vision and his work towards that vision. That vision is the Semantic Web. Rather than me explaining every aspect of the talk, it's best I point you towards Tim's S5 presentation, a webcast (coming soon), this blog and my notes. I've also added my photos from the lecture to Flickr.

So generally I'm even more sure that the semantic web is happening, but within certain domains. Will the semantic web happen across the whole web? Doubtful at best. Recent developments in web 2.0 have really pushed the web towards a richer semantic web, but away from top-down ontologies and rules.

Oh, and believe it or not, me and Miles were quoted in the New Statesman blog.

Day: 16 March 2006

The censoring and blocking inside of Iran

Tim Berners-Lee Semantic web lecture