by Marilyn Mason
for Haitian Creole - a Minority Language
Although several institutes have conducted research on processing written text for minority and vernacular languages, no academic research project thus far seems to have produced a usable, functional authoring or translation tool for end-user native speakers of these languages. Outside academia, however, a set of software programs has been in the making for twenty years. After working in Haitian Creole (HC) documentation contexts, Marilyn Mason began developing a software tool to convert texts written in earlier HC orthographies so that they conform to the Institut Pédagogique National (IPN) orthography (i.e., the legal standard established by the Orthography Law of 1979). The system she developed works in conjunction with this 1979 Law, which established a core of fixed phonemic-to-graphemic rules along with a set of other writing rules governing apostrophes, hyphens, contractions, punctuation, capitalization, proper names, and nasalization in HC.
To validate the prototype system, a benchmark test of this orthography conversion process was conducted in 1991 with the HC Bible, one of the largest texts in existence for this language (Allen & Hogan, 1998; Mason, 2000, forthcoming). Other reasons for choosing the Bible as the training text are explained in Mason (2000, forthcoming). Using the digitized HC Bible texts, which took several person-years of manual data entry to create, the initial orthography conversion experiments were developed within standard "over-the-counter" word processing applications. After several years of development and testing, from the prototype model to the current fully functional system called CreoleConvert, the process has matured from a semi-automated one taking 2 hours to convert a 250-page book to a fully automated one requiring less than 2 1/2 minutes to convert that same 250-page book (roughly a 48-fold speedup), without loss of formatting.
To make it truly user-friendly for computer novices (Mason, 1999), the process has been reduced to a single mouse click on a menu item, so that users can easily convert texts from one orthography to another (e.g., Pressoir-Faublas text to IPN text, or IPN text to McConnell text). Numerous examples of text conversion for HC using CreoleConvert can be found at the following Web site: http://members.aol.com/mit2haiti/Index4.html?mtbrand=AOL_US [URL updated as of 06 Nov 2001 - AvO]
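The article does not describe CreoleConvert's internals, so the following is only a minimal sketch of how a table of fixed grapheme-to-grapheme rules might be applied to a text in a single pass. The function name `build_converter` and the rule table `EXAMPLE_RULES` are hypothetical, and the rule pairs are illustrative placeholders rather than a verified rule set for any HC orthography.

```python
import re

# Hypothetical placeholder rules (source grapheme -> target grapheme).
# These pairs are for illustration only and are NOT a verified rule set
# for Pressoir-Faublas, McConnell, or IPN orthography.
EXAMPLE_RULES = {
    "in": "en",
    "é": "e",
}


def build_converter(rules):
    """Return a function that applies all rules in one left-to-right pass.

    Keys of `rules` are expected to be lowercase graphemes.
    """
    # Try longer graphemes first so a multi-letter grapheme is not
    # pre-empted by a shorter rule that matches its first letters.
    keys = sorted(rules, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys), re.IGNORECASE)

    def replace(match):
        source = match.group(0)
        target = rules[source.lower()]
        # Carry over an initial capital from the source grapheme.
        if source[0].isupper():
            target = target[:1].upper() + target[1:]
        return target

    def convert(text):
        # A single substitution pass: converted output is never re-scanned,
        # so one rule's result cannot be rewritten by another rule.
        return pattern.sub(replace, text)

    return convert


convert = build_converter(EXAMPLE_RULES)
print(convert("Matin é"))  # prints "Maten e"
```

Applying every rule in one pass, rather than running one search-and-replace per rule, keeps the result independent of rule order. A real converter would also need the context-sensitive rules the 1979 Law covers (apostrophes, hyphens, contractions, proper names), which this sketch does not attempt.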
Another necessary step in the documentation workflow was to extend the testing of CreoleConvert to new HC texts that had not been used to train the system. Also in 1991, research in optical character recognition (OCR) produced another prototype system, which has likewise been improved over the years and has resulted in the current tool called CreoleScan. This tool can be used to scan and computerize printed HC texts of varying age and print quality produced by various writers and authoring teams.
Both CreoleConvert and CreoleScan work within standard software applications in the Macintosh, DOS, and Microsoft Windows environments and have been demonstrated and test-marketed in Haiti, in Florida, in Seychelles, and elsewhere by Mason Integrated Technologies Ltd (MIT2). MIT2, based in Boston (Massachusetts, USA), is a start-up company formed to enable publishers, writers, educators, and governmental and non-governmental agencies within developing nations to standardize and computerize printed materials quickly and efficiently. The company fosters further research and development for broad-based delivery of such tools in Haiti, the Haitian Diaspora, and other French Creole-speaking nations, as well as for other languages to which this methodology has been shown to be applicable.
Orthography conversion tools are not necessary only for minority languages. Beyond vernacular languages, some international languages, notably German, Dutch, Norwegian, Swedish, Greenlandic, and Spanish, have recently undergone orthography modifications.
However, the majority of the world's languages, being minority and vernacular languages, have not been able to benefit from the advantages of modern computer technology. This is why Mason Integrated Technologies Ltd has been developing innovative technologies for minority languages, with plans to develop still more multilingual documentation technologies for HC and other French Creole languages. Without such modern computerized techniques and tools (e.g., spellcheckers), the minority languages of today and tomorrow will suffer greatly and will be unable to meet the needs of the authoring and translation sectors that are so critical in a modern globalizing world.