Thursday, May 24, 2007

Unicode support

I have had a hell of a time with it, but I have finally managed to get the MySQL database set up to use Unicode. Unicode is a character set, like ASCII, but where ASCII defines only 128 characters in 7 bits, Unicode has room for over a million code points, which are stored using encodings such as UTF-8 (one to four bytes per character). That lets it hold the foreign scripts and mathematical symbols that won't fit into an 8-bit scheme. It's essential for an application like this one that relies on being able to use foreign scripts.
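To make the difference between code points and bytes concrete, here is a small sketch. It assumes a modern Ruby (1.9 or later, where strings know their encoding), not the Ruby this app was running on at the time, and the `describe` helper is made up for illustration.

```ruby
# Hypothetical helper: report the code points and the UTF-8 byte count
# of a string, to show that one character is not always one byte.
def describe(text)
  utf8 = text.encode("UTF-8")
  {
    code_points: utf8.codepoints.map { |cp| format("U+%04X", cp) },
    byte_count: utf8.bytesize
  }
end

# A plain ASCII letter, an accented Latin letter, and the first code
# point of the Private Use Area.
["A", "\u00E9", "\uE000"].each do |sample|
  info = describe(sample)
  puts "#{info[:code_points].join(' ')} takes #{info[:byte_count]} byte(s) in UTF-8"
end
```

Each sample is a single character, but the UTF-8 byte count grows as the code point gets larger, which is why an 8-bit column or connection setting mangles anything beyond ASCII.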

This ought to be easy, but it isn't. The first problem is that the underlying language, Ruby, doesn't play nicely with Unicode yet. The second problem is that while Rails is a lovely agile framework, MySQL isn't really. Even using migrations, there doesn't seem to be any nice way of changing a database-wide setting like the character set.
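For the record, one way this kind of change can be done is with raw SQL run from a migration. The sketch below is not necessarily what I ended up with: the helper name, the database and table names, and the `utf8` / `utf8_general_ci` defaults are all assumptions for illustration. It only builds the statements; in a migration each string would be passed to `execute`.

```ruby
# Hypothetical helper: build the MySQL statements that switch a database
# and each of its tables to a Unicode character set. The defaults are
# assumptions; pick whatever charset and collation your server supports.
def unicode_conversion_sql(database, tables, charset: "utf8", collation: "utf8_general_ci")
  # Changes the default for tables created from now on.
  statements = ["ALTER DATABASE `#{database}` CHARACTER SET #{charset} COLLATE #{collation}"]

  # Existing tables keep their old charset unless converted one by one.
  tables.each do |table|
    statements << "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET #{charset} COLLATE #{collation}"
  end

  statements
end

# Example with made-up names.
unicode_conversion_sql("scripts_development", %w[pages squares]).each do |sql|
  puts sql
end
```

The per-table loop is the un-agile part: altering the database only changes the default, so every existing table has to be converted separately. The connection also has to speak the same encoding, which in Rails is usually an `encoding:` line in `database.yml`.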

Still, it does work. I also set up the database for the rest of the data structure. I did decide to include one more class, a Square, which holds one image, one sound file, and a transcript. That way, a page has a collection of squares, which seemed to me to be a nice, open, expandable data structure. I guess I'll find out how nice that is.
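Roughly, the shape of it looks like this. This is a plain-Ruby sketch rather than the actual Rails models (which would be ActiveRecord classes backed by tables), and the attribute and method names are placeholders I picked for the example.

```ruby
# A Square bundles one image, one sound file, and a transcript.
# File names stand in for however the real files are stored.
Square = Struct.new(:image, :sound, :transcript)

# A Page is just an ordered collection of squares.
class Page
  attr_reader :squares

  def initialize
    @squares = []
  end

  # Build a square and append it to this page.
  def add_square(image:, sound:, transcript:)
    square = Square.new(image, sound, transcript)
    @squares << square
    square
  end
end

page = Page.new
page.add_square(image: "greeting.png", sound: "greeting.mp3", transcript: "\u00A1Hola!")
puts "Page has #{page.squares.length} square(s)"
```

Because the page only knows it has a list of squares, adding a new kind of content later should mean changing Square and nothing else.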

Update: Tested it with Tengwar, the Elvish letters, which have code points in the unofficial ConScript Unicode Registry. If you have a font installed that supports it, you should see the following as script instead of question marks or something:

                             
