Thursday, May 24, 2007

Idea for a Transliteration Editor

Another neat little application would be a transliteration editor for foreign scripts. One window would allow you to type a foreign word in roman letters and the other window would produce the same thing in the script associated with that language.

As an example:
You type: konichiwa
You see: こにちわ
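
A rough sketch of how the lookup might work, in Ruby. The table is a tiny made-up subset (a real one would need the whole syllabary plus rules for things like doubled consonants), and the names HIRAGANA and transliterate are just placeholders for illustration. It tries the longest roman chunk first and passes through anything it doesn't recognize.

```ruby
# Hypothetical, deliberately tiny romaji-to-hiragana table (an assumption
# for this sketch, not a complete mapping).
HIRAGANA = {
  "ko" => "こ", "ni" => "に", "chi" => "ち", "wa" => "わ",
  "ha" => "は", "n" => "ん", "a" => "あ", "i" => "い"
}.freeze

# Longest keys first, so "chi" is tried before shorter keys like "i".
KEYS_BY_LENGTH = HIRAGANA.keys.sort_by { |key| -key.length }.freeze

def transliterate(roman)
  out = []
  pos = 0
  while pos < roman.length
    key = KEYS_BY_LENGTH.find { |k| roman[pos, k.length] == k }
    if key
      out << HIRAGANA[key]
      pos += key.length
    else
      out << roman[pos] # unknown character: pass it through unchanged
      pos += 1
    end
  end
  out.join
end

puts transliterate("konichiwa")
```

The same loop should work for any alphabetic or syllabic script, since only the table changes.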

The major point is that it would allow you to write these scripts on a roman keyboard. I don't think the differences between, say, US and Spanish keyboards would be important.

I think it would work fairly easily for Arabic, Hebrew, Cyrillic, Hindi, Japanese hiragana and katakana, Klingon, tengwar, and other alphabetic or syllabic scripts. Mandarin, Cantonese, kanji, Ancient Egyptian, Sumerian, and such might be somewhat more difficult.

Update: Apparently, the Russian word processor Hieroglyph has this capability in Cyrillic.

Update 2: Someone at Microsoft has written a transliteration utility very like the one I described. It's free, has an odd selection of scripts, and appears to be fully customizable.

Update 3: A commenter pointed out to me that there's on-the-fly transliteration into Hindi on Blogger. It seems to work really nicely (although I don't read Hindi, so I can't really vouch), and the interface is really smooth and transparent. For most languages it's probably better than my idea, which is more or less identical to the Microsoft utility.

Unicode support

I have had a hell of a time with it, but I have finally managed to get the MySQL database set up to use Unicode. Unicode is a character set, like ASCII, but where ASCII only has room for 128 characters (7 bits), Unicode has room for over a million code points, which get stored using an encoding such as UTF-8 or UTF-16. That lets it hold the foreign scripts and mathematical symbols that won't fit into an 8-bit scheme. It's essential for an application like this one that relies on being able to use foreign scripts.
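
To make the difference between a character's number and its stored bytes concrete, here's a small sketch. It assumes a Ruby with encoding-aware strings (1.9 or later), which is newer than what this project runs on, and char_info is just a name I made up for the example.

```ruby
# Assumes Ruby 1.9+ (encoding-aware strings). char_info is a hypothetical helper.
def char_info(ch)
  {
    code_point:  ch.ord,                          # the character's Unicode number
    utf8_bytes:  ch.bytesize,                     # bytes used when stored as UTF-8
    utf32_bytes: ch.encode("UTF-32BE").bytesize   # fixed-width encoding: 4 bytes
  }
end

# "A", e with an acute accent, and hiragana "ko", written as escapes so the
# example does not depend on how the file itself is saved.
["A", "\u00E9", "\u3053"].each do |ch|
  info = char_info(ch)
  puts format("U+%04X  UTF-8: %d byte(s)  UTF-32: %d bytes",
              info[:code_point], info[:utf8_bytes], info[:utf32_bytes])
end
```

The point is that the width depends on the encoding you pick: UTF-8 uses one to four bytes per character, while UTF-32 always uses four.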

This ought to be easy, but it isn't. The first problem is that the underlying language, Ruby, doesn't play nice with Unicode yet. The second problem is that while Rails is a lovely agile framework, MySQL isn't really agile at all. Even using migrations, there doesn't seem to be any nice way of changing a database-wide setting like the character set.
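
For what it's worth, the database-wide change comes down to one SQL statement for the database plus one per existing table, and those could be run from a migration with execute. The sketch below only builds the statements as strings. The database name, table names, and the charset_statements helper are all made up for illustration, and utf8 is assumed as the target character set.

```ruby
# Hypothetical helper: builds the MySQL statements that switch a database
# and its existing tables to a given character set. All names are placeholders.
def charset_statements(database, tables, charset = "utf8")
  statements = ["ALTER DATABASE `#{database}` CHARACTER SET #{charset}"]
  tables.each do |table|
    # CONVERT TO rewrites the existing text columns, not just the table default.
    statements << "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET #{charset}"
  end
  statements
end

puts charset_statements("crude_language_development", %w[languages modules])
```

New tables would still need the character set in their own create statements, which is part of why this feels clumsier than an ordinary migration.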

Still, it does work. I also set up the database for the rest of the data structure. I did decide to include one more class, a Square, which holds one image, one sound file, and a transcript. That way, a page has a collection of squares, which seemed to me to be a nice, open, expandable data structure. I guess I'll find out how nice that is.

Update: Tested it with tengwar, elvish letters in the unofficial ConScript Unicode Registry. If you have a font that supports it installed, you should see the following as script instead of question marks or something:

                             

Wednesday, May 23, 2007

The user page

This is a rough setup for the opening user page -- what you'd see when you get to the site, more or less. Obviously the layout needs some work, but I'm not worrying too much about design at this point as long as the back end works.

Not much going on here, just a layout, a style sheet, a page, and the languages listed in alphabetical order. I suppose I ought to fix that on the admin page as well, neh?

I'm not sure about the name either. It just came into my head when I started the blog. I think it's a good name for the software, but maybe not for a site. Anyway, lots of time to think about that later.

CSS on Admin Page

Here's the screenshot of the opening admin page listing the languages. I know, it's very exciting, but this is my first time writing a CSS file so I'm happy with it. Took my time picking out the colors, so no lip now.

I've been putting off working for about two days, just absorbed by recent events, but I'm finally getting some work done tonight. I'm also listening to some ridiculous music, ironic covers of songs. Right now that means "Bitches Ain't Shit" by Dr Dre et al. covered by Ben Folds. It's ironic and funny, but I swear Ben Folds can make anything sound poignant.

Let's hope this productive period goes on for a bit.

Sunday, May 20, 2007

Baby Steps

Well, that took far longer than it was supposed to. I suppose that's not surprising, and it will take less time next time.

I've managed to get a basic Rails application together, created a development database, and migrated in a schema and some test data. I learned some things about CSS and ERb. Basically, I've done the A iterations in Agile Web Development with Rails, tailored to the Crude Language project.

The next steps will be to make a nice stylesheet and make a start on the user controller.


As you might imagine, I'm the type to jump headfirst into things. Sometimes that doesn't work out so well, as my recent experiences can attest. Still, I think a person does learn and accomplish things that way, even if some efforts are wasted.

I don't really know the best way to set up my controllers, so I figure I'll end up doing a lot of recoding. (How do you deal with all of that generated code from Rails if you make a mistake? That's got me a bit worried.) Here's the plan.

I'm going to start with a classic admin and user model. The admin will be able to add and delete languages and modules, build and edit modules, and upload and delete files. The user will be able to select a language and module, and play the game.

Later I'll add a design section, which may be a separate controller or an extension of the user controller. The designer will be able to add modules, build and edit modules, and upload files. The designer won't be allowed to add languages or delete anything (except maybe their own files or modules). I'll also need login capabilities for users, designers, and admin.
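
Here's a minimal sketch of that plan as a lookup table, in plain Ruby rather than anything Rails-specific. The action names and the allowed? helper are invented for the example, and it leaves out the "except maybe their own files" case for designers.

```ruby
# Hypothetical permission table based on the plan above; the action names
# are made up for this sketch.
PERMISSIONS = {
  admin:    %i[add_language delete_language add_module edit_module
               delete_module upload_file delete_file],
  designer: %i[add_module edit_module upload_file],
  user:     %i[select_language select_module play]
}.freeze

# Unknown roles get an empty list, so they are allowed nothing.
def allowed?(role, action)
  PERMISSIONS.fetch(role, []).include?(action)
end

puts allowed?(:admin, :delete_language)    # admins may remove languages
puts allowed?(:designer, :delete_language) # designers may not
```

Keeping the rules in one table like this would make it easy to add the designer role later without touching the admin and user code.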

Data Diagrams

I'm feeling a small wave of confidence come over me after understanding something about migrations and how to work with multiple tables. I'm sure it will pass soon. (Note: wave of confidence passed before this entry was finished....) Here's my preliminary idea for the data structure.

Language
  • name
  • has_many: Modules

Module
  • language_id
  • name
  • level
  • number
  • has_many: Pages

Page
  • module_id
  • number
  • has_many: Units

Unit
  • page_id
  • transcription
  • speech_url
  • image_url

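
The nesting in the list above (a language has many modules, a module has many pages, a page has many units) can be sketched with plain Ruby Structs. These are stand-ins, not the actual Rails models. The sample data and the transcriptions helper are made up, and I've used the name LearningModule because Module is already a core Ruby class, which is worth keeping in mind when naming the real model.

```ruby
# Plain-Ruby stand-ins for the four tables; not ActiveRecord models.
# LearningModule is a hypothetical name, chosen because Module is taken.
Unit           = Struct.new(:transcription, :speech_url, :image_url)
Page           = Struct.new(:number, :units)
LearningModule = Struct.new(:name, :level, :number, :pages)
Language       = Struct.new(:name, :modules)

# Made-up sample data, one record at each level.
unit     = Unit.new("hola", "sounds/hola.mp3", "images/hola.png")
page     = Page.new(1, [unit])
greeting = LearningModule.new("Greetings", 1, 1, [page])
spanish  = Language.new("Spanish", [greeting])

# Walk from a language down to its units, the way has_many would let you.
def transcriptions(language)
  language.modules.flat_map(&:pages).flat_map(&:units).map(&:transcription)
end

puts transcriptions(spanish).inspect
```
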
I think I'm also going to need tables for the images and the sound files. These will probably have to be more complicated later to allow for searches, but for now...

  • transcription # these are repeated from Unit because they don't store the same thing
  • sound_url # someone might want different text (eg, two different writing systems)
  • image_url

Whew. I'm a little concerned about how I structured the main data, but I wanted to leave it as open as possible to make recoding easier later. This also seems to fit the agile philosophy.

Saturday, May 19, 2007

Ruby vs Python

I'm new to web programming.

Sure, I built my first web page more than ten years ago, but I haven't really progressed much from there. I have used many different languages: Python, C, C++, various flavors of Basic, Java, Fortran, whatever, but (almost) always for desktop or multiprocessor work. But this is a new beast for me. I'm essentially a C++ programmer, specializing in numerical processing, graphics, and physics modeling.

It took a while to learn enough to make some kind of decision on which languages and frameworks to use, and I'm quite certain that there were several different viable choices.

The first big choice was Ruby or Python. I didn't really consider Java because speed for the code isn't really much of an issue with this kind of application, but speed of development is. I'd done a little code in Python at various times, and I've liked it, especially as my basis of comparison was more or less C++. I have an interest in using Python or Ruby as a scripting language for C/C++ applications. I think I'm sold on Python for doing most scientific numerics because of SciPy and pyGlobus.

Still, I sat down with the Pickaxe, the canonical beginning Ruby book by Dave Thomas with Chad Fowler and Andy Hunt. You can get the entire first edition online for free. I really liked the design and philosophy of Ruby. One can argue about this kind of thing, but it seemed to me that Ruby was more beautiful, more powerful, and more object-oriented than Python. Of course, Ruby is still kind of a new language, and it doesn't have anywhere near the number of supporting libraries that Python has.

You know how this story turns out. I started looking into Ruby on Rails. The hype is incredible and not to be believed, I'm sure. I liked a lot of the methodology, especially the emphasis on agile development, which is part of what had drawn me to scripting before. It gave me an excuse to go with my bias toward Ruby, as well as being a potentially marketable skill. Not being a web programmer, I wasn't sure where to start with this, but then Dave Thomas (et al.) appeared with a new book, Agile Web Development with Rails, that looked like something I might be able to handle.

So here I go. I've built one practice Rails app, but this'll be my first real project in it. I know next to nothing. This will be a good test of how easy (and fun) Ruby on Rails can make web programming. I'll let you know here how it turns out.

How do you teach language online?

The real trick with an application that relies on user-generated material is to ensure some kind of overall quality without sacrificing individual creativity. My plan is to design the framework in such a way as to encourage certain general principles in the teaching while allowing more or less free choice of material.

In addition to the design, I'm planning on providing some examples for people to work from, which will also have a nice effect of making it easier for people to get started with a language. I'd also like, eventually, to set up moderators for some of the languages to make some modules and pick out the best from the user-generated content.

Through my experience with language classes, books, and computer programs, I've come up with some design principles for the app. Tell me what you think.

Language independent and immersive

The language module should ideally only refer to the language that's being learned. There are several advantages to doing it this way.
  • The application can be used by anyone, regardless of their native language.
  • The learner associates words and phrases directly with the idea rather than translating.
  • It's a more intuitive approach, as opposed to an analytical one.
  • It allows the learner to really discover the language, which leads to better retention. Plus it's more fun that way.

I want to incorporate the idea of flow used in games. The idea is to keep a person at the point where they can just reach the challenge in front of them. If it's too far out of reach, or if it's too easy, it isn't fun.

The presentation of the modules is as a game, and I want it to be fun. I'd like to encourage people to have fun and be creative with the modules. One idea I think would be nice is if the modules had distinct characters and voices so that you could follow along and discover more about them as you move through the modules.


One of the common criticisms of language learning applications is that they don't teach the kind of conversational language you need just to get around. This has been true in my own experience. I'd like the lowest level modules to give the script and this background.

Statement of Purpose

My goal here is to create an application that provides a framework for learning languages on the web. There are two major parts to the application: a section that presents the language modules in an interactive environment to learn a language, and a section that allows the user to construct those modules. The idea is to give people a framework that they can use to present their own language.

The major goal of the blog is to give myself some structure for the project. If I manage to get some good advice, or if my experiences can help some other person to develop a project, or if the blog can help promote the future site, that'd be great too.