xisto Community

Localizing Python: some hacking to make Python accept words from more natural languages


The overall idea for this topic was inspired by my children. Some time ago I started to teach them programming. As a base language I chose Python. The next step was finding an attractive subject for programming examples and exercises, so I created a "robot" program. There is nothing new in it: yet another Logo-style game. (See attachments.) The "robot" is actually a lorry which moves over square cells, concrete or ground, and can't move through walls. It understands commands like "forward", "backward" and so on. All sounds good so far. We started from a program like this:

#!/usr/bin/python
# -*- coding: utf-8 -*-
from robot import Robot
robot=Robot('z4-4.maz')
while robot.clear_forward():
    robot.forward(1)
robot.right()
while robot.ground():
    robot.backward(1)
robot.forward(1)
while robot.ground():
    robot.plant()
    robot.forward(1)
robot.forward(1)

But, as you have already noticed, I am not a native English speaker. When I talk about this program with my children I am forced to switch continuously between Ukrainian and English, back and forth. So the question arose: can Python understand Ukrainian? Or, more generally, any language other than English?

I divided the overall task into three parts of different size and effort:

1) make Python allow international symbols in identifiers: variable, class and function names.
2) translate the most used keywords and functions.
3) full Python localization.

Part One.

As everybody knows, characters are represented inside computers as one or more bytes. Historically, the first widespread charset was the ASCII coding standard. It defines the first 128 (values 0-127) of the 256 possible values of one byte. Later other 8-bit coding systems were introduced, such as latin1, koi8, win1252 etc. And finally Unicode was invented, most notably in its UTF-8 form. UTF-8 differs from the other Unicode encodings in one useful way: it is compatible with ASCII in the first 128 values, and every other character is encoded using only bytes in the 128-255 range, so no clash can occur with ASCII letters, digits and the like.

Let's start. Download the Python 2.5 source:

wget -c www.python.org/ftp/python/2.5/Python-2.5.tar.bz2

Extract the archive:

tar -xjf Python-2.5.tar.bz2
cd Python-2.5

Python checks for allowed symbols in the tokenizer module: Parser/tokenizer.c. Open it with your favorite editor and locate the first call of the function isalpha(). It looks like:

/* Identifier (most frequent token!) */
if (isalpha(c) || c == '_') {

All we need to do is allow byte values above 127 (hex 7F), so edit the line to get something like:

/* Identifier (most frequent token!) */
if (isalpha(c) || c == '_' || c > 0x7F) {
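To see why a single `c > 0x7F` test can be enough, it may help to look at the raw bytes. The sketch below is written in present-day Python 3 rather than the 2.5 being patched, and the helper name `utf8_bytes` and the sample words are made up for illustration. It checks that plain ASCII text never produces a byte above 0x7F, while every byte of a UTF-8 encoded Cyrillic word lies above it.

```python
def utf8_bytes(text):
    """Return the UTF-8 encoding of text as a list of integer byte values."""
    return list(bytearray(text.encode("utf-8")))


ascii_bytes = utf8_bytes("robot")
cyrillic_bytes = utf8_bytes("робот")  # "robot" in Ukrainian

# Every byte of plain ASCII text stays at or below 0x7F ...
all_ascii_low = all(b <= 0x7F for b in ascii_bytes)
# ... while every byte of a non-ASCII character is above 0x7F.
all_cyrillic_high = all(b > 0x7F for b in cyrillic_bytes)

# Each of the five Cyrillic letters takes two bytes in UTF-8.
print(all_ascii_low, all_cyrillic_high, len(cyrillic_bytes))  # True True 10
```

So a tokenizer that works byte by byte, as the one patched here appears to, sees a non-ASCII identifier simply as a run of "high" bytes.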

The next spot is about 20 lines further down. Search for the isalnum() function; the lines look like:

while (isalnum(c) || c == '_') {
    c = tok_nextc(tok);
}

Make exactly the same edit:

while (isalnum(c) || c == '_' || c > 0x7F) {
    c = tok_nextc(tok);
}
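The combined effect of the two edits can be modelled in a few lines of Python 3. This is only a rough sketch of the patched rule, not the tokenizer itself: the function names are invented here, and `isalpha()`/`isalnum()` are assumed to behave as in the plain C locale (ASCII letters and digits only).

```python
def is_ident_start(c):
    """Model of the patched test: isalpha(c) || c == '_' || c > 0x7F."""
    return (65 <= c <= 90) or (97 <= c <= 122) or c == ord("_") or c > 0x7F


def is_ident_char(c):
    """Model of the patched loop test: isalnum(c) || c == '_' || c > 0x7F."""
    return is_ident_start(c) or (48 <= c <= 57)


def scan_identifier(data):
    """Return the identifier at the start of the byte string, or b"" if none."""
    if not data or not is_ident_start(data[0]):
        return b""
    end = 1
    while end < len(data) and is_ident_char(data[end]):
        end += 1
    return data[:end]


source = "змінна1 = 5".encode("utf-8")  # "змінна" means "variable"
name = scan_identifier(source)
print(name.decode("utf-8"))  # змінна1
```

One consequence of such a simple rule is that any byte above 0x7F is accepted, so non-ASCII punctuation or a non-breaking space would also pass as part of a name.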

That's all! Just do the regular things: configure, make, install:

./configure
make
make install

Type /usr/local/bin/python and you'll get a Python interpreter which allows all umlauts, accents and Cyrillic letters in identifiers.

Part Two.

... to be continued: hack the Python grammar - translate keywords ("def", "while", "if") and builtin functions ("range")...
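For Part One, a quick smoke test of the rebuilt interpreter might look like the sketch below. The Ukrainian names are invented for this example ("вперед" = forward, "кроки" = steps, "позиція" = position, "результат" = result) and do not come from the robot module. The file is assumed to be saved as UTF-8, and the same script also runs unchanged on Python 3, which accepts such identifiers without any patch.

```python
# -*- coding: utf-8 -*-

# Hypothetical names, chosen only to exercise non-ASCII identifiers.
def вперед(кроки):
    """Return the position reached after moving `кроки` cells from cell 0."""
    позиція = 0
    for _ in range(кроки):
        позиція += 1
    return позиція


результат = вперед(3)
print(результат)  # 3
```

If the stock interpreter rejects this file with a syntax error and the patched one prints 3, the tokenizer change is doing its job.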

Edited by OpaQue (see edit history)


Cool, I am thinking about teaching myself Python, but I want to focus on getting the languages I know perfect before starting a new one. I'll give it a week before I start more :) The thing is, though, you have some code which won't work in other languages. In AS (ActionScript), for example, you have this code: on (press) gotoAndStop(2). And some languages don't have a word for "on" by itself, so it would come out as something like "and on" in that word.

