Thursday, May 24, 2007

Idea for a Transliteration Editor

Another neat little application would be a transliteration editor for foreign scripts. One window would allow you to type a foreign word in roman letters and the other window would produce the same thing in the script associated with that language.

As an example:
You type: konichiwa
You see: こにちわ
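
Here's a rough sketch of how the lookup might work, in plain Ruby. The table, the constant names, and the `to_hiragana` method are all made up for illustration, and the table only covers the handful of syllables needed for the example above. The idea is just to match the longest roman chunk it knows at each position and pass anything else through untouched.

```ruby
# Hypothetical romaji-to-hiragana table; a real one would need every syllable.
KANA = {
  "a" => "あ", "i" => "い", "u" => "う", "e" => "え", "o" => "お",
  "ka" => "か", "ki" => "き", "ku" => "く", "ke" => "け", "ko" => "こ",
  "na" => "な", "ni" => "に", "nu" => "ぬ", "ne" => "ね", "no" => "の",
  "chi" => "ち", "wa" => "わ", "n" => "ん"
}.freeze

# Length of the longest key, so "chi" is tried before any shorter chunk.
LONGEST = KANA.keys.map(&:length).max

def to_hiragana(roman)
  out = []
  i = 0
  while i < roman.length
    # Never look past the end of the input.
    max_len = [LONGEST, roman.length - i].min
    # Find the longest chunk starting at i that the table knows.
    len = max_len.downto(1).find { |n| KANA.key?(roman[i, n]) }
    if len
      out << KANA[roman[i, len]]
      i += len
    else
      out << roman[i] # unknown character: pass it through unchanged
      i += 1
    end
  end
  out.join
end

puts to_hiragana("konichiwa") # prints こにちわ
```

A real editor would run something like this on every keystroke and would need a separate table per script, which is probably where the customizing comes in.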

The major point is that it would allow you to write these scripts on a roman keyboard. I don't think the differences between, say, US and Spanish keyboards would be important.

I think it would work fairly easily for Arabic, Hebrew, Cyrillic, Hindi, Japanese hiragana and katakana, Klingon, tengwar, and other alphabetic or syllabic scripts. Mandarin, Cantonese, kanji, Ancient Egyptian, Sumerian, and such might be somewhat more difficult.

Update: Apparently, the Russian word processor Hieroglyph has this capability for Cyrillic.

Update 2: Someone at Microsoft has written a transliteration utility very like the one I described. It's free, has an odd selection of scripts, and appears to be fully customizable.

Update 3: A commenter pointed out to me that there's on-the-fly transliteration into Hindi on Blogger. It seems to work really nicely (although I don't read Hindi, so I can't really vouch), and the interface is really smooth and transparent, better, probably, than my idea (which is more or less identical to the Microsoft utility) for most languages.

Unicode support

I have had a hell of a time with it, but I have finally managed to get the MySQL database set up to use Unicode. Unicode is a character standard, like ASCII, but where ASCII defines only 128 characters (7 bits' worth), Unicode has room for over a million code points, so it can cover the foreign scripts and mathematical symbols that won't fit into an 8-bit scheme. Encodings such as UTF-8 then store each character in one to four bytes. It's essential for an application like this one that relies on being able to use foreign scripts.
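
To make the byte business concrete, here's a small sketch. It assumes a newer Ruby (1.9 or later) where strings know their own encoding and source files default to UTF-8, which is not the Ruby I'm fighting with here. The `utf8_info` helper is made up for illustration.

```ruby
# Report a character's Unicode code point and how many bytes UTF-8 needs for it.
def utf8_info(ch)
  { code_point: format("U+%04X", ch.ord), bytes: ch.bytesize }
end

# Latin, Cyrillic, hiragana, and a musical symbol from outside the basic plane.
["A", "я", "こ", "\u{1D11E}"].each do |ch|
  info = utf8_info(ch)
  puts "#{ch}  #{info[:code_point]}  #{info[:bytes]} byte(s) in UTF-8"
end
```

Plain ASCII letters still take one byte each, which is why old English-only data survives a switch to UTF-8 untouched, while everything else takes two to four.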

This ought to be easy, but it isn't. The first problem is that the underlying language, Ruby, doesn't play nice with Unicode yet. The second problem is that while Rails is a lovely agile framework, MySQL isn't really. Even using migrations, there doesn't seem to be any nice way of changing an aspect of the entire database like this.

Still, it does work. I also set up the database for the rest of the data structure. I did decide to include one more class, a Square, which holds one image, one sound file, and a transcript. That way, a page has a collection of squares, which seemed to me to be a nice, open, expandable data structure. I guess I'll find out how nice that is.

Update: Tested it with tengwar, elvish letters in the unofficial ConScript Unicode Registry. If you have a font that supports it installed, you should see the following as script instead of question marks or something:

                             

Wednesday, May 23, 2007

The user page



This is a rough setup for the opening user page -- what you'd see when you get to the site, more or less. Obviously the layout needs some work, but I'm not worrying too much about design at this point as long as the back end works.

Not much going on here, just a layout, a style sheet, a page, and the languages listed in alphabetical order. I suppose I ought to fix that on the admin page as well, neh?

I'm not sure about the name either. It just came into my head when I started the blog. I think it's a good name for the software, but maybe not for a site. Anyway, lots of time to think about that later.

CSS on Admin Page


Here's the screenshot of the opening admin page listing the languages. I know, it's very exciting, but this is my first time writing a CSS file so I'm happy with it. Took my time picking out the colors, so no lip now.

I've been putting off working for about two days, just absorbed by recent events, but I'm finally getting some work done tonight. I'm also listening to some ridiculous music, ironic covers of songs. Right now that means "Bitches Ain't Shit" by Dr Dre et al. covered by Ben Folds. It's ironic and funny, but I swear Ben Folds can make anything sound poignant.

Let's hope this productive period goes on for a bit.

Sunday, May 20, 2007

Baby Steps

Well, that took far longer than it was supposed to. I suppose that's not surprising, and it will take less time next time.

I've managed to get a basic Rails application together, created a development database, and migrated in a schema and some test data. I learned some things about CSS and ERb. Basically the A iterations in Agile Web Development with Rails, tailored to the Crude Language project.

The next steps will be to make a nice stylesheet and make a start on the user controller.

Controllers

As you might imagine, I'm the type to jump headfirst into things. Sometimes that doesn't work out so well, as my recent experiences can attest. Still, I think a person does learn and accomplish things that way, even if some efforts are wasted.

I don't really know the best way to set up my controllers, so I figure I'll end up doing a lot of recoding. (How do you deal with all of that generated code from Rails if you make a mistake? That's got me a bit worried.) Here's the plan.

I'm going to start with a classic admin and user model. The admin will be able to add and delete languages and modules, build and edit modules, and upload and delete files. The user will be able to select a language and module, and play the game.

Later I'll add a design section, which may be a separate controller or an extension of the user controller. The designer will be able to add modules, build and edit modules, and upload files. The designer won't be allowed to add languages or delete anything (except maybe their own files or modules). I'll also need login capabilities for users, designers, and admin.
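
The who-can-do-what plan above boils down to a lookup table. This is only a sketch in plain Ruby, not how the controllers will actually be wired up; the role names, action names, and the `allowed?` helper are all made up for illustration, and the designer's "maybe their own files" exception is left out.

```ruby
# Hypothetical permission table built from the plan above.
PERMISSIONS = {
  admin:    %i[add_language delete_language add_module delete_module
               edit_module upload_file delete_file],
  designer: %i[add_module edit_module upload_file],
  user:     %i[select_language select_module play_game]
}.freeze

# True if the role is listed as allowed to perform the action.
# Unknown roles get an empty list, so they are allowed nothing.
def allowed?(role, action)
  PERMISSIONS.fetch(role, []).include?(action)
end

puts allowed?(:admin, :delete_language)    # prints true
puts allowed?(:designer, :delete_language) # prints false
```

Keeping the rules in one table would mean that adding the designer later is a matter of adding a row, whichever controller ends up checking it.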

Data Diagrams

I'm feeling a small wave of confidence come over me after understanding something about migrations and how to work with multiple tables. I'm sure it will pass soon. (Note: wave of confidence passed before this entry was finished....) Here's my preliminary idea for the data structure.

Language
  • name
  • has_many: Modules
Module
  • language_id
  • name
  • level
  • number
  • has_many: Pages
Page
  • module_id
  • number
  • has_many: Units
Unit
  • page_id
  • transcription
  • speech_url
  • image_url
I think I'm also going to need tables for the images and the sound files. These will probably have to be more complicated later to allow for searches, but for now...

Speech
  • transcription # these are repeated from Unit because they don't store the same thing
  • sound_url # someone might want different text (eg, two different writing systems)
Image
  • image_url

Whew. I'm a little concerned about how I structured the main data, but I wanted to leave it as open as possible to make recoding easier later. This also seems to fall into the agile philosophy.
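
To see how the Language, Module, Page, Unit nesting hangs together, here's a sketch using plain Ruby Structs as stand-ins for the tables, with no Rails or database involved. The sample data is invented, and so is the name `LessonModule`: Ruby already has a core class called `Module`, so a model with that exact name would likely need renaming.

```ruby
# Plain-Ruby stand-ins for the tables; each "has_many" becomes an array field.
Unit         = Struct.new(:transcription, :speech_url, :image_url)
Page         = Struct.new(:number, :units)
LessonModule = Struct.new(:name, :level, :number, :pages) # "Module" is taken by Ruby
Language     = Struct.new(:name, :modules)

# Walk down the chain: language -> modules -> pages -> units.
def transcriptions(language)
  language.modules.flat_map(&:pages).flat_map(&:units).map(&:transcription)
end

# Made-up sample data, just to exercise the structure.
unit    = Unit.new("hola", "sounds/hola.mp3", "images/hola.png")
page    = Page.new(1, [unit])
lesson  = LessonModule.new("Greetings", 1, 1, [page])
spanish = Language.new("Spanish", [lesson])

p transcriptions(spanish) # prints ["hola"]
```

In the real thing the foreign keys (`language_id`, `module_id`, `page_id`) do the job the nested arrays do here.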