Monday 18 February 2013

Sinhala Unicode with Java


     Welcome, all, to my new blog, The Life 0.5. I hope to meet you here with a collection of different kinds of topics at least twice a week. Actually, I planned to start this blog on the 23rd of last month, but time matters, as you know. I saw somewhere that there are two kinds of people: those who are really busy and have no time to do anything, and those who are really busy doing nothing. Unfortunately I’m in the latter group :(. And that’s the end of my welcome speech. Thank you all… ;)

     OK, as the first topic, I’m going to write about using Sinhala Unicode in Java. I had a lot of trouble with this subject, so I googled all over the internet, and the following is what I ended up with.

     Sinhala Unicode is pretty common these days. As we all know, ASCII uses a single byte to represent a character, but Unicode may need several bytes for one character. Unicode text can be stored in different flavors: UTF-8 uses one to four bytes per character, UTF-16 uses two or four, and UTF-32 always uses four. These flavors are known as character encodings.
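
     To make the difference between encodings concrete, here is a small sketch that counts how many bytes one Sinhala letter (0D85) takes in three common encodings. The class and method names are made up for this example; the big-endian variants are used so that no byte order mark is counted.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {

    // Number of bytes the given text occupies in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "\u0D85"; // one Sinhala letter
        System.out.println("UTF-8 : " + byteCount(letter, StandardCharsets.UTF_8));
        System.out.println("UTF-16: " + byteCount(letter, StandardCharsets.UTF_16BE));
        System.out.println("UTF-32: " + byteCount(letter, Charset.forName("UTF-32BE")));
        // A plain ASCII letter still needs only one byte in UTF-8.
        System.out.println("'A' in UTF-8: " + byteCount("A", StandardCharsets.UTF_8));
    }
}
```

     On a standard JDK this should report 3, 2 and 4 bytes for the Sinhala letter, and 1 byte for the ASCII letter.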

     Unicode maps Sinhala characters to the range 0D80 to 0DFF, so every Sinhala character can be written as a hex number. For example, “අ” is 0D85. In Java we can write a character by its hex number with a backslash and ‘u’ in front of it.

char c = '\u0D85';
or
char c = 'අ';
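
     If you want to check in code whether a character falls in the Sinhala range, Java's Character.UnicodeBlock class can do it. The following is only a sketch; the class name SinhalaBlockCheck and the helper isSinhala are my own names, not part of any library.

```java
class SinhalaBlockCheck {

    // True when the character lies in the Unicode Sinhala block (0D80 to 0DFF).
    static boolean isSinhala(char c) {
        return Character.UnicodeBlock.of(c) == Character.UnicodeBlock.SINHALA;
    }

    public static void main(String[] args) {
        char c = '\u0D85';
        // Show the hex value of the character and whether it is Sinhala.
        System.out.println(Integer.toHexString(c) + " Sinhala? " + isSinhala(c));
        System.out.println("A Sinhala? " + isSinhala('A'));
    }
}
```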

     But if you want to represent a letter with a vowel sign (ispilla, papilla, etc.), you have to use two characters: one for the letter and one for the sign. For example, to show the letter ‘කැ’ you need one character for ‘ක’ (0D9A) and another for the aelapilla (0DD0).

String s = "\u0D9A\u0DD0";

     The above string represents “කැ”. Some letters need more than two characters. For example, it takes four characters to represent ර්ම (repaya). A complete list of the characters and how they should be used can be found in the links at the bottom of the page. Following is a very simple program.

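public class Sinhala {

    public static void main(String[] args) {

        char c1 = '\u0D85'; // Sinhala letter written as a hex escape
        char c2 = 'අ';       // the same letter typed directly

        String s1 = "බි";
        String s2 = "\u0DB6\u0DD2"; // letter followed by its vowel sign

        System.out.println(c1 + " " + c2); // prints අ අ
        System.out.println(s1 + " " + s2); // prints බි බි

        දුවපන්(5, 10);
    }

    // Prints one line for each number from පටන් up to (not including) අවසානය.
    public static void දුවපන්(int පටන්, int අවසානය) {
        for (; පටන් < අවසානය; පටන්++) {
            System.out.println(පටන් + " - \u0DAF\u0DD4\u0DC0\u0DB1\u0DDD");
        }
    }
}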
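
     Coming back to the letters that need more than two characters: as far as I understand the Sinhala encoding, forms like the repaya are built by joining consonants with al-lakuna (0DCA) followed by a zero-width joiner (200D). The sketch below builds three such forms under that assumption. The class and method names are made up for this example, and whether the result shows up as a single joined letter depends on the font and the renderer.

```java
class SinhalaConjuncts {

    static final char AL_LAKUNA = '\u0DCA'; // removes the inherent vowel
    static final char ZWJ = '\u200D';       // zero-width joiner
    static final char RA = '\u0DBB';
    static final char YA = '\u0DBA';

    // Repaya: ra + al-lakuna + ZWJ + the consonant that carries it.
    static String repaya(char consonant) {
        return "" + RA + AL_LAKUNA + ZWJ + consonant;
    }

    // Rakaransaya: consonant + al-lakuna + ZWJ + ra.
    static String rakaransaya(char consonant) {
        return "" + consonant + AL_LAKUNA + ZWJ + RA;
    }

    // Yansaya: consonant + al-lakuna + ZWJ + ya.
    static String yansaya(char consonant) {
        return "" + consonant + AL_LAKUNA + ZWJ + YA;
    }

    public static void main(String[] args) {
        String repaya = repaya('\u0DB8');           // repaya on the letter 0DB8
        String rakaransaya = rakaransaya('\u0D9A'); // 0D9A with rakaransaya
        String yansaya = yansaya('\u0DC0');         // 0DC0 with yansaya
        System.out.println(repaya + " uses " + repaya.length() + " characters");
        System.out.println(rakaransaya + " uses " + rakaransaya.length() + " characters");
        System.out.println(yansaya + " uses " + yansaya.length() + " characters");
    }
}
```

     Each of these strings is four chars long in Java, which matches the count given for the repaya above.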

     If you run the above code in a command prompt, it may not show the correct output, since the console may not be set up to display Sinhala. It runs fine in Eclipse, though. Keep in mind to save the source file in a Unicode encoding such as UTF-8.
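
     One thing that may help with console output is to wrap System.out in a PrintStream that encodes text as UTF-8, instead of relying on the platform default. This is only a sketch: the class name Utf8Console and the helper utf8Stream are my own, and the console itself still needs a UTF-8 setting and a font with Sinhala glyphs, which this code cannot guarantee.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Console {

    // Wraps a stream so that text written to it is encoded as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        out.println("\u0DB6\u0DD2"); // same two characters as s2 in the program above
    }
}
```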

     Actually, it’s not recommended to type Unicode letters directly in code; it’s better to use their hex values. Apart from that, I used some Sinhala letters for method and variable names, and it works, because Java lets you use Unicode characters in variable names, method names, etc. But it’s funny, no? :D…
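
     Typing the hex values by hand gets tiring, so a small helper can generate them for you. The sketch below turns every non-ASCII character of a string into its escape form, ready to paste into source code. UnicodeEscaper and toEscapes are names I made up for this example.

```java
class UnicodeEscaper {

    // Replaces each non-ASCII char with its four-digit hex escape form.
    static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch < 128) {
                sb.append(ch); // plain ASCII stays as it is
            } else {
                sb.append(String.format("\\u%04X", (int) ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Pass the Sinhala text as a command line argument,
        // or fall back to a short built-in sample.
        String input = args.length > 0 ? args[0] : "\u0DB6\u0DD2";
        System.out.println(toEscapes(input));
    }
}
```

     Run without arguments, it prints the escape form of the two-character sample, which is \u0DB6\u0DD2.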

     In the next part of the article we’ll look at using Sinhala letters with Swing components. To do this, you have to change the font of the Swing component to a font that can display Sinhala, using the setFont method. As far as I know, the “Iskoola Pota” font comes with Windows 7 and sometimes with Windows Vista, so I will use it. Following is an example:

import java.awt.*;
import javax.swing.*;

public class SinhalaSwing extends JFrame {

    // One font shared by every component so that Sinhala text renders.
    private static final Font SINHALA_FONT = new Font("Iskoola Pota", Font.PLAIN, 14);

    public SinhalaSwing() {
        super("පාටතෝරන්න");
        setLayout(new FlowLayout());
        setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE); // quit when the window closes

        JButton button = new JButton("රතු");
        button.setFont(SINHALA_FONT);
        add(button);

        button = new JButton("කහ");
        button.setFont(SINHALA_FONT);
        add(button);

        JTextField field = new JTextField("තේරූ පාට - රතු");
        field.setFont(SINHALA_FONT);
        add(field);

        pack(); // size the window to fit its components
        setVisible(true);
    }

    public static void main(String[] args) {
        // Build the UI on the event dispatch thread.
        SwingUtilities.invokeLater(SinhalaSwing::new);
    }
}

     I was too lazy to type the Sinhala characters using \u escapes in the code, but that is the recommended way. If you are not sure whether the Iskoola Pota font is on your system and want to find a font you can use, you can list all the fonts available on your system with the following two lines of code. Then try each of them to find the right one.

 GraphicsEnvironment e = GraphicsEnvironment.getLocalGraphicsEnvironment();
 Font[] fonts = e.getAllFonts();  
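
     Instead of trying every font by hand, you can let Java do the filtering. The sketch below keeps only the fonts that report they can display a sample Sinhala string, using Font.canDisplayUpTo, which returns -1 when the whole string is displayable. SinhalaFontFinder and fontsSupporting are names I made up, and the result is only a hint: some fonts may claim support and still render the joined letters badly.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

class SinhalaFontFinder {

    // Names of the fonts that can display every character of the sample.
    static List<String> fontsSupporting(Font[] fonts, String sample) {
        List<String> names = new ArrayList<>();
        for (Font font : fonts) {
            if (font.canDisplayUpTo(sample) == -1) {
                names.add(font.getFontName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        GraphicsEnvironment e = GraphicsEnvironment.getLocalGraphicsEnvironment();
        Font[] fonts = e.getAllFonts();
        // Sample: one letter plus one letter with a vowel sign.
        for (String name : fontsSupporting(fonts, "\u0D85\u0D9A\u0DD0")) {
            System.out.println(name);
        }
    }
}
```

     Any name it prints can be passed to new Font(name, Font.PLAIN, 14) in the Swing example.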

     So that’s the end of my first article on The Life 0.5. In the next article I will introduce some Singlish-to-Sinhala conversion code, which should help developers build their applications. See ya all… :D


See these too…

http://www.nsrc.org/ASIA/LK/03-Jan-2003_UCSC-Paper-on-Unicode.pdf

https://docs.google.com/document/pub?id=1gaRbfdmt31W51Y6j2YVBbISQLll1x7mZd8NBEL9ftI8 

http://www.silumina.lk/punkalasa/20121216/_art.asp?fn=ar12121611

2 comments:

  1. Thanks... But. Doesn't type 'ක්‍ර' and 'ව්‍යා'. Please help me.
