UTF-8 is a variable-length encoding of Unicode characters (here we only care about code points that fit in 16 bits). Some characters (ASCII 0-127) are encoded as a single byte, but the others are split into 2 or 3 bytes. E.g. the character "é" in latin-1 is Unicode character 0x00e9. Using the 2-byte encoding, its bits are split as yyy.yyxx.xxxx == 000.1110.1001, so the encoded bytes 110y.yyyy 10xx.xxxx get the value 1100.0011 1010.1001 (which is 0xC3 0xA9).
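For example, here is a minimal C sketch of that split (not from the original post), done with shifts and masks:
Code:
unsigned code = 0x00e9;                  /* Unicode code point for 'é' */
unsigned char b1 = 0xC0 | (code >> 6);   /* 110y.yyyy: the top 5 bits  */
unsigned char b2 = 0x80 | (code & 0x3F); /* 10xx.xxxx: the low 6 bits  */
/* b1 == 0xC3, b2 == 0xA9 */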
Now when decoding the UTF-8 string, you want to get back the value 0x00e9 by calling
Code:
unsigned char utf8[] = {0xc3, 0xa9};
unsigned char *p = utf8;                   /* pointer that the decoder will advance */
unsigned unicode = get_utf_char(&p);       /* should return 0xe9 */
The decoder looks at the high bits of the current byte in the stream. If the top nibble matches "1110", you have a 3-byte encoding (that's the test [tt]if ((c&0xF0)==0xE0)[/tt]).
If the top bits match "110*", you have a 2-byte encoding (that's our case here; the corresponding test is [tt]if ((c&0xE0)==0xC0)[/tt]). You then extract the "interesting bits" from each byte of the sequence and combine them.
If they match "10**", it's no good: it means you are trying to decode the middle of a multi-byte character...
If they match "0***", it's a single-byte character which you can return without any extra work.
In any other case, it's not a valid UTF-8 stream... (All of these tests are combined in the sketch below.)
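Putting those tests together, a decoder along these lines would do the job (just a sketch: the name [tt]get_utf_char[/tt] and the pointer-to-pointer argument are assumptions made to match the call above, not necessarily your exact function):
Code:
/* Decode one UTF-8 sequence of 1 to 3 bytes and advance the pointer
   past the bytes that were consumed. Returns the code point, or
   0xFFFFFFFF if the stream is not valid UTF-8. */
unsigned get_utf_char(unsigned char **p)
{
    unsigned char c = **p;

    if ((c & 0x80) == 0x00) {                /* 0xxx.xxxx: single byte */
        *p += 1;
        return c;
    }
    if ((c & 0xE0) == 0xC0) {                /* 110y.yyyy 10xx.xxxx */
        unsigned char c2 = (*p)[1];
        if ((c2 & 0xC0) != 0x80) return 0xFFFFFFFFu;
        *p += 2;
        return ((c & 0x1F) << 6) | (c2 & 0x3F);
    }
    if ((c & 0xF0) == 0xE0) {                /* 1110.zzzz 10yy.yyyy 10xx.xxxx */
        unsigned char c2 = (*p)[1], c3 = (*p)[2];
        if ((c2 & 0xC0) != 0x80 || (c3 & 0xC0) != 0x80) return 0xFFFFFFFFu;
        *p += 3;
        return ((c & 0x0F) << 12) | ((c2 & 0x3F) << 6) | (c3 & 0x3F);
    }
    /* 10xx.xxxx (middle of a sequence) or anything else: invalid here */
    return 0xFFFFFFFFu;
}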
If that still doesn't help, I suggest you spend some time figuring out how masking and shifting work...