[ %0 | [1|2|3|4|5|6|7|8|9] [%0|1|2|3|4|5|6|7|8|9]* ]
denotes the infinite set {0,1,2, ... 10, 11, ... 1998, ...}. It
excludes expressions with leading zeros such as 01.
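As a quick cross-check outside xfst, the same language can be written as a conventional regular expression. The sketch below uses Python's `re` module; the names `DECIMAL` and `is_decimal` are illustrative and not part of the assignment.

```python
import re

# Same language as the xfst expression above: either a lone 0,
# or a nonzero digit followed by any number of digits.
DECIMAL = re.compile(r"0|[1-9][0-9]*")


def is_decimal(s: str) -> bool:
    """Return True if s is a decimal numeral without leading zeros."""
    return DECIMAL.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["0", "10", "1998", "01", ""]:
        print(repr(candidate), is_decimal(candidate))
```

`fullmatch` is used so that the whole string has to belong to the language; a plain `match` would accept `01` by matching only its first character.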
The corresponding set of numerals in English and in other languages, {zero, one, two, ... ten, eleven, ... one thousand nine hundred ninety-eight, ...}, is of course also a regular language.
This exercise is about defining the two numeral languages simultaneously as a relation so that each decimal number is linked to its counterpart in a natural language.
The attached script,
/mnt/linc/ftp/pub/cis639/public_html/assign/numbers.xfst
creates a transducer that maps decimal numbers (up to a billion) into English numerals, and vice versa. Run it with the command
xfst -l numbers.xfst
Here is an example of the kind of output it produces when applied:
apply down> 1998
one thousand nine hundred ninety-eight
apply up> twelve thousand three hundred forty-five
12345
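For checking expected outputs by hand, a plain Python function can serve as a reference for the downward direction only. This is a sketch under the assumption that the numerals follow the pattern of the two examples above (hyphen between tens and ones, no "and"); the table and function names are illustrative, and unlike the transducer the function does not run in reverse.

```python
ONES = ["", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def group_to_english(n: int) -> str:
    """Spell out one three-digit group, 0 < n < 1000."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if 10 <= rest <= 19:
        parts.append(TEENS[rest - 10])
    elif rest:
        tens, ones = divmod(rest, 10)
        if tens and ones:
            parts.append(TENS[tens] + "-" + ONES[ones])
        else:
            # Exactly one of the two digits is nonzero here.
            parts.append(TENS[tens] or ONES[ones])
    return " ".join(parts)


def to_english(n: int) -> str:
    """Spell out 0 <= n < one billion."""
    if not 0 <= n < 10**9:
        raise ValueError("only numbers below a billion are covered")
    if n == 0:
        return "zero"
    millions, rest = divmod(n, 10**6)
    thousands, units = divmod(rest, 1000)
    parts = []
    if millions:
        parts.append(group_to_english(millions) + " million")
    if thousands:
        parts.append(group_to_english(thousands) + " thousand")
    if units:
        parts.append(group_to_english(units))
    return " ".join(parts)


if __name__ == "__main__":
    print(to_english(1998))
    print(to_english(12345))
```

The function works on the same three-digit groups that the script separates with commas, which makes it easy to compare the two group by group.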
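Your task (by Monday) is to build a corresponding transducer for another language. An English/French transducer, for example, must map
ninety-eight
into
quatre vingt dix huit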
and vice versa. An English/German transducer must flip the
tens and the ones:
acht und neunzig
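The flip can be illustrated for two-digit numbers with a small Python sketch. It covers only 21 to 99 with a nonzero ones digit, writes the words separated by spaces as in the example above (standard German orthography runs them together), and the tables and the function name `german_two_digit` are illustrative assumptions rather than part of the assignment.

```python
DE_ONES = {1: "ein", 2: "zwei", 3: "drei", 4: "vier", 5: "fünf",
           6: "sechs", 7: "sieben", 8: "acht", 9: "neun"}
DE_TENS = {2: "zwanzig", 3: "dreißig", 4: "vierzig", 5: "fünfzig",
           6: "sechzig", 7: "siebzig", 8: "achtzig", 9: "neunzig"}


def german_two_digit(n: int) -> str:
    """Spell out 21..99 (ones digit nonzero) with ones before tens."""
    tens, ones = divmod(n, 10)
    if tens not in DE_TENS or ones not in DE_ONES:
        raise ValueError("sketch only covers 21..99 with a nonzero ones digit")
    # English order is tens-ones; German puts the ones first.
    return f"{DE_ONES[ones]} und {DE_TENS[tens]}"


if __name__ == "__main__":
    print(german_two_digit(98))
```

In a transducer the same reordering has to be expressed without arithmetic, for instance by treating each tens/ones pair as a unit.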
Here is a script for English:
########################################################################
# Copyright (c) 1998 Xerox Corporation Lauri Karttunen
#
# This script creates and leaves on the stack a transducer that
# maps decimal numbers (up to a billion) into English numerals
# and vice versa.
#
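# First define some useful classes.
define I [1|2|3|4|5|6|7|8|9];    # nonzero digits
define Z [%0 | I];               # any digit
# Decimal numbers without leading zeros. On the lower side a comma
# is inserted before each block of three digits (1998 <-> 1,998).
define N [ I (Z (Z)) [0:%, Z Z Z]* | %0 ];
# String boundary or comma, i.e. the edge of a digit group.
define Lim [.#. | %,] ;
# A group edge followed by a comma-free stretch that contains
# at least one nonzero digit.
define NonZero [Lim [~$[%,] & $[I]]];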
# Now define the component relations.
# Insert a space after a hundreds digit unless the next two digits are 00.
define Space [ [. .] -> " " || Lim I _ [[Z Z] - [%0 %0]]] ;
# Insert " hundred" after a hundreds digit.
define Hundred [ [. .] -> % h u n d r e d || Lim I _ (% ) Z Z];
# Relate 10-19 at the end of a group to their numerals.
define Teens [1 %0 <-> t e n,
              1 1 <-> e l e v e n,
              1 2 <-> t w e l v e,
              1 3 <-> t h i r t e e n,
              1 4 <-> f o u r t e e n,
              1 5 <-> f i f t e e n,
              1 6 <-> s i x t e e n,
              1 7 <-> s e v e n t e e n,
              1 8 <-> e i g h t e e n,
              1 9 <-> n i n e t e e n || _ Lim] ;
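# Relate a tens digit (2-9) to its numeral; the ones digit that
# follows, possibly after a dash, is left for Ones.
define Tens [2 <-> t w e n t y,
             3 <-> t h i r t y,
             4 <-> f o r t y,
             5 <-> f i f t y,
             6 <-> s i x t y,
             7 <-> s e v e n t y,
             8 <-> e i g h t y,
             9 <-> n i n e t y || _ (%-) Z Lim ] ;
# Insert a hyphen between a nonzero tens digit and a nonzero ones digit.
define Dash [ [. .] -> %- || I _ I Lim ];
# Spell out a single nonzero digit at the start or end of a group.
define Ones [1 -> o n e, 2 -> t w o,
             3 -> t h r e e, 4 -> f o u r,
             5 -> f i v e, 6 -> s i x,
             7 -> s e v e n, 8 -> e i g h t,
             9 -> n i n e || Lim _ , _ Lim ];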
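# A lone 0 is spelled out as zero.
define Zero [ %0 -> z e r o || .#. _ .#. ];
# Any other 0, i.e. one with a neighboring symbol, is deleted.
define AbsorbZeros [ %0 -> [ ] || ? _ , _ ? ];
# Expand the comma before the last three digits, but only if the
# preceding group is nonzero.
define Thousand [ %, -> %, t h o u s a n d %, || NonZero _ Z Z Z .#. ] ;
# Expand the comma before the last six digits.
define Million [ %, -> %, m i l l i o n %, || _ Z Z Z %, Z Z Z .#. ] ;
# Finally, turn every comma into a space.
define CommaToSpace [ %, -> " "] ;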
# Now build the transducer!
read regex [ N
.o.
Million
.o.
Thousand
.o.
Space
.o.
Hundred
.o.
Teens
.o.
Dash
.o.
Tens
.o.
Ones
.o.
Zero
.o.
AbsorbZeros
.o.
CommaToSpace ];
echo
echo >> Testing 'apply down 1998'
apply down 1998
echo
echo >> Testing 'apply up two hundred nineteen'
apply up two hundred nineteen
echo
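To see what the first stage of the cascade contributes, it may help to reproduce the lower side of N outside xfst. The Python sketch below groups the digits of a decimal number with commas, which is the form that Lim, Thousand and Million rely on. It is only an illustration of that one stage, and the function name `group_digits` is hypothetical.

```python
def group_digits(number: str) -> str:
    """Insert a comma before every block of three digits, counted from the right.

    Mirrors the lower side of N in the script: 1998 becomes 1,998.
    """
    if not number.isdigit():
        raise ValueError("expected a string of decimal digits")
    # The leading group has one to three digits; all later groups have three.
    head_len = len(number) % 3 or 3
    groups = [number[:head_len]]
    for start in range(head_len, len(number), 3):
        groups.append(number[start:start + 3])
    return ",".join(groups)


if __name__ == "__main__":
    for n in ["0", "1998", "12345", "123456789"]:
        print(n, "->", group_digits(n))
```

Once the commas are in place, every later rule in the script can treat a comma or a string boundary as the edge of a group, which is what the Lim contexts do.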