XFST (Xerox Finite State Tool) is a tool included with the book Finite State Morphology, by Beesley and Karttunen.
Information about the tool is available at http://www.fsmbook.com
From the XFST documentation:
XFST can read finite-state networks from binary files and compile them from regular expressions and text files. The networks can be simple networks or finite-state transducers. They can be combined by means of a variety of operations, such as union and composition. The resulting networks can be saved into a binary or a text file. The user may apply a network to strings to determine whether the string is accepted by the network or to transform it to another string if the network is a transducer. XFST provides many ways to get information about a network and to inspect and modify its structure.
- 26 Mar 2005
Unicode in XFST under WinXP
Use Notepad for all file editing. Be certain to save files as "UTF-8", not "Unicode" and not "Unicode big endian".
Start XFST from the command line with the "-utf8" flag.
This changes the char-encoding variable from ISO-8859-1 to UTF-8. If you need to switch between encodings during a session, use the set command as below:
xfst: set char-encoding utf-8
xfst: set char-encoding ISO-8859-1
I've not had any success outputting clean UTF-8 from XFST.
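Regarding the Notepad step above: Notepad's "UTF-8" option on Windows XP writes a byte order mark (the bytes EF BB BF) at the start of the file. Whether XFST tolerates that mark is not something tested here, so it may be worth checking source files before loading them. The following is a minimal Python sketch, not part of XFST; the function names check_utf8 and strip_bom are hypothetical helpers made up for illustration.

```python
import codecs


def check_utf8(data: bytes) -> dict:
    """Report whether raw bytes decode as UTF-8 and whether they start with a UTF-8 BOM."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        # Typical for files saved as "Unicode" (UTF-16) or as ISO-8859-1
        # with non-ASCII characters.
        valid = False
    return {"valid_utf8": valid, "has_bom": has_bom}


def strip_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

To use it, read a file in binary mode (for example open(path, "rb").read(), where path is whatever file you edited in Notepad) and pass the bytes to check_utf8. If has_bom is reported and the mark turns out to cause trouble, the result of strip_bom can be written back in binary mode.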
- 10 Dec 2004
Topic revision: r5 - 2005-03-26 - 00:11:39 - RyanMattson