Batch transliterating names into Kannada using Google API
Sometimes work at Janaagraha throws up awesome challenges. For instance, as part of the BEST project we are cleaning up the voters list. The voters list in Karnataka has names in both English and Kannada. Most of the volunteers filled in only the English names, so we were left with transliterating the names into Kannada. I was thinking about automating it. After all, transliteration is not as complex as translation, right? Wrong. It's difficult to write a transliterator, especially when there are so many English spelling variations for the same Kannada name.
For example, both Sreenivas and Srinivas are ಶ್ರೀನಿವಾಸ್ in Kannada. I found that Google transliteration handles this pretty well. But they offer only JavaScript APIs for web pages and nothing for server-side code.
But a Google search worked, and I found a non-public endpoint of the Google Transliteration API that returns JSON output for a given English input. I cooked up a function in PHP to clean up the JSON and return an array of results. The code is on GitHub, for obvious reasons.
Usage:
$kn = transliterate("thejesh,ramesh, uthara,shreenivasa,reddy");
print $kn[0];
Probable drawbacks:
1. It's a non-public API provided by Google. Not sure when they will block it.
2. As of now it can transliterate only 5 words at a time.
3. There is no information about API rate limiting, so err on the safe side.
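Drawbacks 2 and 3 suggest splitting a long list of names into groups of five and pausing between calls. Here is a minimal sketch of that idea. The helper name transliterate_in_batches, the $pauseMs parameter and its 500 ms default are all hypothetical and not part of the original code. It also assumes that the function passed in (such as transliterate from the usage above) accepts a comma-separated string and returns one array entry per input word.

```php
<?php
// Hypothetical helper, not part of the original code.
// $transliterate: any callable that takes a comma-separated string of words
// and returns an array of results (assumed: one entry per input word).
function transliterate_in_batches(
    array $names,
    callable $transliterate,
    int $batchSize = 5,   // the 5-word limit mentioned in drawback 2
    int $pauseMs = 500    // arbitrary pause; the real rate limit is unknown
): array {
    $results = [];
    // Trim stray spaces, then split the list into groups of $batchSize.
    $batches = array_chunk(array_map('trim', $names), $batchSize);

    foreach ($batches as $i => $batch) {
        // Wait between calls (not before the first one) to stay on the safe side.
        if ($i > 0 && $pauseMs > 0) {
            usleep($pauseMs * 1000); // usleep() takes microseconds
        }
        // One call per batch; append its results in order.
        $results = array_merge($results, $transliterate(implode(',', $batch)));
    }

    return $results;
}
```

With the function from the usage above, a call could look like transliterate_in_batches($allNames, 'transliterate'), where $allNames is a plain array of English names.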
Let me know what you think.
Hi Thejesh, I am a regular reader of your blog. I wrote a Python module for the same. Sharing it with you as you recently started liking Python ;)
http://tech.pradeepnayak.in/?p=419