People have slowly caught on to the problems with RSS syndication and languages. If you follow the links back from the Blogdigger blog entry, you will start to notice a pattern: people cannot quite put their finger on the problem. The reason is that it's not a single problem but more of a muddle of problems. Andy puts it well, but I may have the killer paragraph which explains it all.
It is a chicken and egg problem. If content publishers do not provide RSS feeds with correctly structured language meta-data which software engineers can cut their teeth and applications on, then the stalemate will continue as it does today. Certainly this is one way of looking at it. The other viewpoint is that software engineers need to put language features into their software, otherwise there is no point in content providers using correctly structured language meta-data and modules to describe language content…
This is taken from the draft paper I am currently finishing on the same subject of RSS and languages. Blogdigger are right, but how many feeds do they get from non-Latin languages which have language meta-data they can actually use? This quote comes from Mark Fletcher of Bloglines:
But the more important question is, are the majority of feeds accurately labeled in terms of language. And in our experience, the answer is unfortunately a resounding no.
I would echo that too. When I was looking for examples of non-Latin RSS feeds, they tended to have little language meta-data (some were actually still marked as English!). Is this a limitation of the RSS standards or something else? Well, in my paper it would seem no one gets away clean. For a quick taste of what I mean, look at the complete (you call that complete?) list of language codes which can be used in the RSS 0.91 spec. Yes, I know it's old, but it is still quite scary for 2000. Try to find Arabic, Hebrew and other non-Latin languages.
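To make the mislabelling problem a bit more concrete, here is a rough sketch in Python of how an aggregator might read the language a feed declares and flag the "marked English but clearly not" case. It is only an illustration under some assumptions: the function names and the 0.5 threshold are made up for this example, it only looks at RSS 0.9x/2.0 style feeds (not RSS 1.0/RDF), and the "is this Latin script?" check is a crude heuristic based on Unicode character names, not real language detection.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Namespaces for the Dublin Core module and the xml:lang attribute.
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def declared_language(rss_text):
    """Return the channel-level language a feed declares, or None."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    if channel is None:
        return None
    # Try the core <language> element first, then dc:language.
    for tag in ("language", "{%s}language" % DC_NS):
        el = channel.find(tag)
        if el is not None and el.text and el.text.strip():
            return el.text.strip().lower()
    # Fall back to xml:lang on <channel> or <rss>.
    lang = channel.get(XML_LANG) or root.get(XML_LANG)
    return lang.lower() if lang else None


def latin_ratio(text):
    """Fraction of alphabetic characters whose Unicode name starts with LATIN."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 1.0
    latin = sum(1 for c in letters
                if unicodedata.name(c, "").startswith("LATIN"))
    return latin / len(letters)


def looks_mislabelled(rss_text, threshold=0.5):
    """Hypothetical check: feed says English, but titles are mostly non-Latin."""
    lang = declared_language(rss_text)
    if lang is None or not lang.startswith("en"):
        return False
    root = ET.fromstring(rss_text)
    titles = " ".join(t.text or "" for t in root.iter("title"))
    return latin_ratio(titles) < threshold


# A made-up feed: Arabic titles, but labelled as English.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>أخبار اليوم</title>
    <language>en</language>
    <item><title>مرحبا بالعالم</title></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(declared_language(SAMPLE))
    print(looks_mislabelled(SAMPLE))
```

A check like this only catches the most obvious mismatch. It says nothing about feeds with no language meta-data at all, or about telling apart languages that share a script, which is where properly labelled feeds would actually be needed.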
If you're interested in more information in this area, please keep an eye on this blog, where I will post my paper sometime in late May or early June. Or even better, come and listen to my presentation on the paper at XTECH 2005 in late May.