Psion Word allows you to enter styled text on Psion palmtops. Back in the Series 3 palmtop’s heyday, the mid to late 1990s, Psion provided Windows software that converted Psion Word files to Microsoft Word. A separate DOS utility, `wrd2txt`, could also convert transferred files to plain text. The latter was the inspiration for `word2text`, which I wrote to perform exactly the same task, but on a modern macOS or Linux machine.
I write either in plain text or, when I require formatting, in Markdown. Psion Word, created long before Markdown’s conception, doesn’t support it. However, styled files created in Word can be transferred to a Mac or a Linux machine, where `word2text` strips away Psion Word’s embedded file, formatting and printing-related data and outputs the result as plain text.
Alternatively, `word2text` can use Psion Word files’ styling information to mark up the processed text with Markdown formatting tags. This is necessarily limited: Word has only two standard headline sizes, and the only emphasis options relevant to Markdown are bold and italic text. Where these apply, `word2text` will mark up the text accordingly. Text styled as a Word Bulleted List entry will be tagged as a Markdown unordered list.
For custom styles, `word2text` will apply bold and/or italic emphasis where it can. For custom headlines, `word2text` will apply a Markdown headline size based on the style’s font size.
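As a rough illustration of the idea, the sketch below picks a Markdown headline level from a style’s font size and wraps text in emphasis markers. It is written in Python rather than the tool’s own Swift, and it is not taken from the `word2text` source: the function names and the point-size thresholds (16 and 13) are assumptions chosen purely for the example.

```python
def heading_prefix(font_size_pt):
    """Return a Markdown headline prefix for a style's font size.

    The thresholds are hypothetical: larger type maps to a higher-level headline.
    """
    if font_size_pt >= 16:
        level = 1
    elif font_size_pt >= 13:
        level = 2
    else:
        level = 3
    return "#" * level + " "


def emphasise(text, bold=False, italic=False):
    """Wrap text in Markdown bold and/or italic markers."""
    if bold and italic:
        return f"***{text}***"
    if bold:
        return f"**{text}**"
    if italic:
        return f"*{text}*"
    return text


print(heading_prefix(20) + "Bid headline")           # "# Bid headline"
print("Bid text " + emphasise("to go here", bold=True) + "...")
```

Comparing against fixed sizes keeps the sketch simple; a real implementation might instead compare each style against the document’s body text size.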
In due course, I hope to support table formatting from Word tables and, ultimately, to parse custom Word styles more intelligently.
The Psion Series 3a uses the IBM Code Page 850 character set. This was used in the DOS days and was superseded first by Windows Code Page 1252 and ultimately by UTF-8. Code pages 850 and 1252 are not the same: they share the ASCII range but assign many of the characters above 127 to different values. Swift can convert from 1252 to UTF-8, but it doesn’t speak 850, and I have found that, when compiled on Linux, the conversion code has issues with 1252 too.
`word2text` converts the UK pound sign, `£`, from 850 value 156 to 1252’s 163. At some point, I may extend this to the full set of matchable characters. If you have a character that is not being correctly converted (`word2text` should report characters that have issues), please let me know.
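The mismatch is easy to see by decoding the same byte under both code pages. The following is a minimal Python sketch, not `word2text`’s own Swift code: it relies on Python’s built-in `cp850` and `cp1252` codecs, and the helper name `cp850_to_utf8` is made up for illustration.

```python
def cp850_to_utf8(data: bytes) -> bytes:
    """Decode Code Page 850 bytes and re-encode them as UTF-8."""
    return data.decode("cp850").encode("utf-8")


raw = b"Price: \x9c5"            # 0x9C is 156, the pound sign in Code Page 850

print(raw.decode("cp850"))       # Price: £5
print(raw.decode("cp1252"))      # Price: œ5  (156 is a different character in 1252)

# The same pound sign re-encoded for Code Page 1252 becomes byte 163.
print(list("£".encode("cp1252")))  # [163]

utf8 = cp850_to_utf8(raw)
print(utf8.decode("utf-8"))      # Price: £5
```

Reading byte 156 as 1252 yields `œ` rather than `£`, which is why the value has to be remapped before any 1252-based conversion is applied.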
Use `word2text` to convert into plain text any Psion Word documents that you have transferred to your Mac or Linux machine. Provide a `.WRD` file name as an argument and `word2text` will output a plain text version to stdout. This way you can pipe the result into other command line utilities or redirect output to a file. For example:
word2text $HOME/Psion/MAGOPUS.WRD > ~/Desktop/MyMagnumOpus.txt
If you include the `--file` flag, `word2text` will write the processed text to a file (`MAGOPUS.txt`, using the example above) instead of emitting it to stdout.
You can also pass a directory name (or a mix of file names and directories), in which case each `.WRD` file in the directory will be converted to a text file in that directory. Files generated this way (or with the `--file` flag) are named after the source file.
word2text $HOME/Psion
word2text $HOME/Psion $HOME/Desktop/MY_DOC.WRD
For convenience, files are written using UTF-8 encoding.
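The naming rule described above can be sketched as follows. This is an illustrative Python outline, not code from `word2text` itself: the function name `plan_outputs` is hypothetical, matching the `.WRD` extension case-insensitively is an assumption, and the sketch treats every input as one that will be written to a file (as with a directory argument or the `--file` flag).

```python
from pathlib import Path


def plan_outputs(arguments):
    """Pair each .WRD source with the .txt path it would be written to."""
    plan = []
    for argument in arguments:
        path = Path(argument)
        if path.is_dir():
            # A directory contributes every .WRD file directly inside it.
            # Case-insensitive matching is an assumption of this sketch.
            sources = sorted(
                p for p in path.iterdir()
                if p.is_file() and p.suffix.lower() == ".wrd"
            )
        else:
            sources = [path]
        for source in sources:
            # Same directory, same base name, .txt extension.
            plan.append((source, source.with_suffix(".txt")))
    return plan


for source, target in plan_outputs(["MAGOPUS.WRD"]):
    print(f"{source} -> {target}")   # MAGOPUS.WRD -> MAGOPUS.txt
```

Building the list of source and target pairs before converting anything makes it straightforward to report, or stop on, a problem with an individual file.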
`word2text` accepts the following modifiers:

- `-m` / `--markdown`: Output the body text in Markdown formatting. Default: `false`.
- `-o` / `--outer`: Include ‘outer’ text, i.e. header and footer text, in addition to the body text. Default: `false`.
- `-s` / `--stop`: Stop processing multiple files on the first error. Default: `false`.
- `-f` / `--file`: Output a single input file to a new file. Default: `false`.
- `-v` / `--verbose`: Show file and content discovery information during file processing.

For example:
word2text $HOME/Psion/BIDDRAFT.WRD
Bid headline
Bid text to go here...

word2text $HOME/Psion/BIDDRAFT.WRD -o
Our Bid for Major Project
-------------------------
Bid headline
Bid text to go here...
-------
Page %P

word2text $HOME/Psion/BIDDRAFT.WRD -m
# Bid headline
Bid text **to go here**...
You can view the `word2text` source code on GitHub.