Web based Transliteration

The Transliteration Solution is a library that facilitates transliteration from English to Indian languages, viz. Hindi, Marathi, Gujarati, Bengali, Kannada, Tamil, Telugu, Malayalam, Punjabi and Oriya.

It also supports managing a user dictionary, whose entries are given higher preference during transliteration.

The library is built with .NET Framework 4 and uses MS SQL Server 2008 as its back end.

Thus, any web application or web page that can reference an assembly (a DLL) built with .NET Framework 4 and can connect to an MS SQL database can integrate Web based Transliteration very easily.

The APIs provided with the solution make transliteration easy and close to accurate. They cover Transliterate, AddToUserDict, RemoveFromUserDict, UpdateUserDict and, optionally, Suggest.

The process of Transliteration happens as follows:

  1. The input is split into individual words at delimiting characters such as commas and spaces.
  2. For each word:
    • The word is first looked up in the user dictionary.
    • If it is found, the corresponding target-language word is output.
    • If it is not found in the user dictionary, it is looked up in the main dictionary. (The main dictionary ships with the solution and comprises common words that do not follow the language rules.)
    • If the word is found in the main dictionary, the corresponding target-language word is output.
    • If the word is not found in the main dictionary either, the language rules are applied to create the output word. (The rules were framed with the guidance of language experts and give the desired output most of the time.)
  3. The output words are recombined, with the delimiters added back as in the input.
  4. Transliteration is done!
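
The lookup order described in the steps above can be illustrated with a minimal C# sketch. This is not the library's code: the class name TransliterationSketch, the in-memory dictionaries, their sample entries and the ApplyLanguageRules placeholder are all assumptions made for illustration. The actual solution keeps its dictionaries in the SQL back end and applies its own language rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical sketch of the lookup order only; not the MITTransAssembly implementation.
public static class TransliterationSketch
{
    // Assumed sample entries, held in memory for the sake of the example.
    static readonly Dictionary<string, string> UserDict =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { { "sai", "सई" } };

    static readonly Dictionary<string, string> MainDict =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "sai", "साई" },
            { "meena", "मीना" }
        };

    // Placeholder for the rule engine. It only marks the word so that the
    // fallback path is visible in the output; it performs no real conversion.
    static string ApplyLanguageRules(string word)
    {
        return "<rules:" + word + ">";
    }

    public static string Transliterate(string input)
    {
        StringBuilder output = new StringBuilder();

        // Step 1: split the input into words (runs of A-Z/a-z) and delimiter runs.
        foreach (Match token in Regex.Matches(input, "(?<word>[A-Za-z]+)|[^A-Za-z]+"))
        {
            if (!token.Groups["word"].Success)
            {
                // Step 3: delimiters are copied through unchanged.
                output.Append(token.Value);
                continue;
            }

            // Step 2: user dictionary first, then main dictionary, then language rules.
            string result;
            if (!UserDict.TryGetValue(token.Value, out result) &&
                !MainDict.TryGetValue(token.Value, out result))
            {
                result = ApplyLanguageRules(token.Value);
            }
            output.Append(result);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // "sai" comes from the user dictionary, "meena" from the main dictionary,
        // and "xyz" falls through to the placeholder rules.
        Console.WriteLine(Transliterate("sai, meena xyz"));
    }
}
```

With the sample entries above, Transliterate("sai, meena xyz") returns "सई, मीना <rules:xyz>": the user dictionary entry for "sai" wins over the main dictionary entry, and the comma and spaces are preserved as in the input.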

Why Transliteration?

When a user views names, addresses and similar data, it is extremely helpful to see and refer to them in the user's native script.

When the user wants to perform operations such as searching and indexing, only transliterated data can be retrieved in the desired order.

Today, data is stored in databases in English for reasons of simplicity and availability.
With the introduction of Unicode and growing awareness of it, the demand for making data available in native scripts is increasing.

In short, there is a growing demand for transliteration from English to Indian languages.

What is Transliteration?

Transliteration is the art of representing letters or words of one script with the corresponding letters or words of another script. It is a process of systematic conversion, concerned with representing characters accurately and unambiguously. The desired result is roughly phonetic for languages that use the target script.

A good transliteration is one that follows established standards, is complete, is predictable and, most important, is pronounceable.

Translation, Transcription and Transliteration

Translation converts words, sentences or thoughts from one language to another. The major concern is keeping the same meaning and conveying the right idea.
Transcription notes the sounds of characters, not the spelling. A word is written exactly as it will be pronounced.

e.g. The English word "Table" will be transcribed in English as "tebal", "You" as "u", "Dictionary" as "dikshanari", "Translation" as "transleshan", "Tough" as "taff",...

Transliteration converts words or phrases from the script of one language to that of another without losing their phonetic characteristics.

Note that translation is concerned with language (meaning), transcription with sound, and transliteration with script.

Challenges in Transliteration

The pronunciation of a letter, or of a certain sequence of letters, differs with context. Certain sequences of characters are pronounced differently from the same characters appearing individually. The same text can be represented in different ways. The final vowel is suppressed in many cases.

Smart Transliteration

Some phrases appear commonly in addresses. These read better when transliterated smartly. The Transliteration solution does smart transliteration for such phrases, for initials, for apostrophes, etc.

Input English Text    Transliteration Output
Near Post Office      पोष्ट ऑफीस के नज़दीक
L & T Park            एल अँड टी पार्क
D'cunha Joseph        डी'कुन्हा जोसेफ

Suggest API

This API call can be used to get suggestions for an English word.

Because the English character set is small in comparison with that of any Indian language, a character or sequence of characters in English may be represented in multiple ways in an Indian language.

The MITSuggest() API assists by producing all possible formations in the selected Indian language for a given English word.

The number of suggestions produced depends on the word length and the characters used in the word. If the English word is present in the Main or User dictionary, the corresponding language words from the dictionaries appear at the top of the suggestions list returned by the API.
If the English word contains any delimiting characters, i.e. ‘,’, ‘.’, … in fact anything beyond a-z, the output contains only one item: “Cannot Suggest”.
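
The ordering and the “Cannot Suggest” behaviour described above can be sketched in C# as follows. This is only an illustration under stated assumptions: SuggestSketch, the in-memory dictionaries and GenerateCandidates are hypothetical stand-ins, the three candidates are a subset of the sample output shown later for "sai", and placing user dictionary entries ahead of main dictionary entries is an assumption (the text above only says that dictionary words come first). Lower-casing the input before the a-z check is likewise an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of the documented Suggest behaviour; not the MITSuggest() implementation.
public static class SuggestSketch
{
    // Assumed sample data, held in memory for the sake of the example.
    static readonly Dictionary<string, string> UserDict =
        new Dictionary<string, string> { { "sai", "सई" } };
    static readonly Dictionary<string, string> MainDict =
        new Dictionary<string, string>();

    // Stand-in for rule-based candidate generation; it returns a fixed list for "sai".
    static string[] GenerateCandidates(string word)
    {
        return word == "sai" ? new[] { "साई", "सै", "सई" } : new string[0];
    }

    public static string[] Suggest(string englishWord)
    {
        // Assumption: input is treated case-insensitively.
        string word = englishWord.ToLowerInvariant();

        // Anything beyond a-z yields the single item "Cannot Suggest".
        foreach (char c in word)
        {
            if (c < 'a' || c > 'z')
                return new[] { "Cannot Suggest" };
        }

        List<string> suggestions = new List<string>();
        HashSet<string> seen = new HashSet<string>();

        // Dictionary words go to the top of the list (user before main is an assumption).
        string entry;
        if (UserDict.TryGetValue(word, out entry) && seen.Add(entry)) suggestions.Add(entry);
        if (MainDict.TryGetValue(word, out entry) && seen.Add(entry)) suggestions.Add(entry);

        // Generated candidates follow, skipping any that are already listed.
        foreach (string candidate in GenerateCandidates(word))
        {
            if (seen.Add(candidate)) suggestions.Add(candidate);
        }
        return suggestions.ToArray();
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(string.Join(", ", Suggest("sai")));
        Console.WriteLine(string.Join(", ", Suggest("l&t")));
    }
}
```

With this sample data, Suggest("sai") returns सई, साई, सै (the user dictionary word first, with the duplicate सई dropped from the generated candidates), and Suggest("l&t") returns the single item "Cannot Suggest".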

Examples of using Transliteration APIs

Add a reference to MITTransAssembly.dll in your application. By default, the DLL will be copied to the application/bin folder.

The exposed class MITTransMgr is responsible for all operations related to transliteration.

To transliterate data, update the user dictionary, etc., create an instance of MITTransMgr, then use the properties and methods available on this object.

To Transliterate

          using MITTransAssembly;
          //....
          // Note: "op" is the out argument; declare it beforehand with the type the library defines.
          MITTransMgr objTransMgr = new MITTransMgr("MAR", 1);
          String sMarWord = objTransMgr.MITTransliterate("meena", out op);
          // After execution, sMarWord will have the value "मीना"

e.g. To Add to User Dictionary


          MITTransMgr objTransMgr = new MITTransMgr("MAR", 1);
          String sMarWord = objTransMgr.MITTransliterate("sai", out op);
          // After execution, sMarWord will have the value "साई"
          objTransMgr.MITAddToUserDict("sai", "सई");
          // Reuse the existing variable; the user dictionary entry now takes precedence
          sMarWord = objTransMgr.MITTransliterate("sai", out op);
          // After execution, sMarWord will have the value "सई"

e.g. To use Suggest

          MITTransMgr objTransMgr = new MITTransMgr("MAR", 1);
          String[] saWords = objTransMgr.MITSuggest("sai");
          // After execution, saWords will be
          // साई, सऐ, शऐ, सै, शै, साइ, शाइ, सॅइ, शॅइ, शाई, सॅई, शॅई, सइ, शइ, सई, शई, सएय, शएय, सअय, शअय


Copyright 2000-18 Modular Infotech Pvt. Ltd.