Merging with git

Author Eric Albright | 12.06.2008 | Category Developers, WeSay

Git still doesn’t have good Unicode support, so to merge Unicode files that git has labeled binary, I wanted to use a visual merge tool. I finally figured out how to do it: add the following lines to your git config:

[merge]
   tool = tortoise

[mergetool "tortoise"]
   cmd = \"TortoiseMerge.exe\" /base:\"$BASE\" /theirs:\"$REMOTE\" /mine:\"$LOCAL\" /merged:\"$MERGED\"

[mergetool "p4"]
   cmd = \"p4merge.exe\"  \"$BASE\" \"$REMOTE\" \"$LOCAL\" \"$MERGED\"

If you don’t have TortoiseMerge.exe in your path then you can replace that with the full path (c:/Program Files/TortoiseSVN/bin/TortoiseMerge.exe).

Upgrading user settings in C#

Author Tim | 10.06.2008 | Category Developers, WeSay

In the course of development we found it necessary to migrate an old user setting into a new one and to then remove it. This brought with it a few problems which I hope to shed some light on below.

In order to get the value of the old setting we used the GetPreviousVersion() method of Properties.Settings. Initially we were getting a SettingsPropertyNotFoundException although the setting was verifiably present in the user.config file. As it turns out, we had removed the property from the Settings designer, which also removed it from the Properties.Settings class. For a setting to be found, the class must have a property tagged with the [UserScopedSettingAttribute] attribute. This tells the GetPreviousVersion() method to look for the setting in user.config. So far so good…

At this point however, the base.Upgrade() method is called to move old settings into the new file. This causes the old, unwanted setting to be moved in right along with all the old settings that we want to keep around. In order to avoid this behavior the [NoSettingsVersionUpgrade] attribute must also be used for the unwanted Property.

public override void Upgrade()
{
    string lastConfigFilePath = (string) GetPreviousVersion("LastConfigFilePath");
    base.Upgrade(); // bring forward our properties that are the
                    // same (but also will bring forward LastConfigFilePath)
}

[UserScopedSettingAttribute]
[DebuggerNonUserCode]
[DefaultSettingValueAttribute("")]
[Obsolete("Please use MruConfigFilePaths instead")]
[NoSettingsVersionUpgrade]
public string LastConfigFilePath
{
    get
    {
        throw new NotSupportedException("LastConfigFilePath is obsolete");
    }
    set
    {
        throw new NotSupportedException("LastConfigFilePath is obsolete");
    }
}

An enchant provider for LIFT

Author Eric Albright | 13.05.2008 | Category Developers, WeSay

We wanted to allow users to edit their dictionary and use that same dictionary for spell checking. Since WeSay uses LIFT as the file format for the dictionary and keeps that file up to date, all we needed was an enchant provider that can read LIFT files.

I took the spell checking engine I had written a while back, Ascens, and refactored it so that it could read files of various formats. Currently it supports line based and XML based formats. For line based formats, the words are entered one per line. For XML based formats, an XPath expression determines what text from within the file should be selected to constitute correctly spelled words.

Ascens looks for a settings file with the same name as the language identifier that is passed to Enchant. The settings file specifies the location and the type of the dictionary. If the type is xml, the XPath expression must also be defined.

The following is an example settings file for Ascens referring to a LIFT file:

# This is the settings file for Ascens
[Dictionary]
# Type is either xml or line
# for xml you also need to set the XPath
#Type=line
Type=xml

# path to the dictionary
# (can be absolute or relative to the directory that this file is in)
#Path=c:\documents and settings\user\my documents\dictionaries\fr_FR.dic
#Path=fr_FR.dic
Path=..\..\..\My Documents\WeSay\French\French.lift

# XPath gives the Xpath that selects the words to be used as dictionary
# it must all be on a single line
XPath=//entry[not(citation-form/form[@lang='fr'])]/lexical-unit/form[@lang='fr']/text | //entry/citation-form/form[@lang='fr']/text
# this xpath selects the forms with the language id of 'fr' from the
# citation form when there is one and from the lexical unit when
# there is no citation form (it will not select both)
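
To make the effect of that XPath concrete, here is a minimal Python sketch of the same selection rule. It is not how Ascens is implemented; the function name select_words and the sample entries are made up for illustration, and the sample only mirrors the element names that appear in the XPath above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, shaped after the element names used in the XPath.
SAMPLE_LIFT = """
<lift>
  <entry>
    <lexical-unit><form lang="fr"><text>chat</text></form></lexical-unit>
  </entry>
  <entry>
    <lexical-unit><form lang="fr"><text>chevaux</text></form></lexical-unit>
    <citation-form><form lang="fr"><text>cheval</text></form></citation-form>
  </entry>
  <entry>
    <lexical-unit><form lang="en"><text>dog</text></form></lexical-unit>
  </entry>
</lift>
"""


def select_words(lift_xml, lang):
    """Pick the citation form when an entry has one for lang, else the lexical unit."""
    root = ET.fromstring(lift_xml)
    words = []
    for entry in root.iter("entry"):
        has_citation = entry.find(f"citation-form/form[@lang='{lang}']") is not None
        source = "citation-form" if has_citation else "lexical-unit"
        for node in entry.findall(f"{source}/form[@lang='{lang}']/text"):
            if node.text:
                words.append(node.text)
    return words


words = select_words(SAMPLE_LIFT, "fr")
print(words)
```

The second entry contributes only its citation form, which is the "it will not select both" behavior described in the comments of the settings file.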

Enchant looks for user Ascens settings files in the following locations:

  1. The ascens subdirectory of the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.
  2. %APPDATA%\enchant\ascens, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. The enchant\ascens subdirectory of the directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir, if there is one.
  4. %USERPROFILE%\enchant\ascens, where %USERPROFILE% is shorthand for the C:\Users\<username> folder (Windows Vista) or the C:\Documents and Settings\<username> folder (Windows XP/2000).

Enchant looks for shared Ascens settings files in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\ascens\Data_Dir, if there is one. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\ascens\Data_Dir, if there is one.
  2. <enchant>\share\enchant\ascens, where <enchant> is the location of libenchant.dll.

WeSay Tests on Mono Status

Author Eric Albright | 22.04.2008 | Category Developers, WeSay

One step toward getting WeSay to run on the OLPC is to verify that it can run with Mono. We already reported all the System.Windows.Forms bugs that we could find by running MWF on Windows as documented here. The next step has been to run all the tests under Mono. As you can see from the diagram (that actually lives on our whiteboard) at left, we have found, fixed, and reported quite a few bugs, which has made the number of failing tests plummet. We’re not there yet, but we’re making good progress.

Solid 0.8.5 Released

Author cambell | 07.03.2008 | Category Solid

A new release of Solid is now available for download.

The main changes are related to the default templates and the Lift export.

Full details of changes made in this release are available on the project site.

WeSay on OLPC

Author Eric Albright | 07.03.2008 | Category WeSay

We have gotten a lot of fixes into Mono recently and have now successfully got WeSay running on our OLPC.

There are definitely still visual issues. I will be checking to see if they are related to the dpi. So far, in normal operation, it seems responsive enough.

Formatting dictionaries with CSS

Author Eric Albright | 26.02.2008 | Category Typesetting, Dictionary, Developers

In evaluating CSS as a stylesheet language for formatting dictionaries, I started putting PrinceXML through its paces. I tried what I considered to be the hardest dictionary layout, the Cobuild dictionary, and I think I have matched many of its features. The sidenotes, however, are just not going to happen without specialized support for them in CSS. (The closest I could get was a float, but of course if you have more than one within a line, they just write on top of each other.) That result is here. I then switched to a more typical layout, which had no problems at all. That result is here. You can get all the files to reproduce this exercise here.

Types of style

There are really a number of items which contribute to the style of a dictionary:
  1. Selection of fields
  2. Order of fields
  3. Textual markup - characters or text that is added before, after, or around items to distinguish a field from surrounding text
  4. Character styles - font changes
  5. Paragraph styles
  6. Page layout - columns
CSS actually allows us to handle items 1, 3-5. (Selection of fields can be handled by setting the display property to none.) All the textual markup in the examples was done using the CSS content property.

CSS3 Selectors

Another interesting behavior of CSS 3 is that you cannot select the first element having a class containing the word ‘pronunciation’ with .pronunciation:first-of-type. The :first-of-type selector only selects the first element with a particular element name, so generic div and span elements with class attributes would have to be converted to named XML elements instead. There is a way around this, given that our document will be generated from another format: add the classes first-of-type and last-of-type explicitly. Then the data becomes:

<span class="pronunciation first-of-type">...</span><span class="pronunciation">...</span><span class="pronunciation last-of-type">...</span>

and

<span class="pronunciation first-of-type last-of-type">...</span>
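
Since the document is generated from another format anyway, the generator can add these classes itself. The following is a minimal Python sketch of that step; the helper name pronunciation_spans is hypothetical and not part of any existing export tool.

```python
from html import escape


def pronunciation_spans(values, base_class="pronunciation"):
    """Render one span per value, tagging the first and last explicitly."""
    spans = []
    last_index = len(values) - 1
    for index, value in enumerate(values):
        classes = [base_class]
        if index == 0:
            classes.append("first-of-type")
        if index == last_index:
            classes.append("last-of-type")
        class_attr = " ".join(classes)
        spans.append(f'<span class="{class_attr}">{escape(value)}</span>')
    return "".join(spans)


print(pronunciation_spans(["a", "b", "c"]))
print(pronunciation_spans(["a"]))
```

A stylesheet can then use ordinary class selectors such as .pronunciation.first-of-type in place of the :first-of-type pseudo-class.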

Playing with both the XML and XHTML varieties in IE7 and Firefox 2 shows that both browsers do a much better job with XHTML than with XML.

Column-span

The only other problem I ran into was that Prince does not yet support the column-span property. This ended up not being a big problem since I just wanted the heading to span both columns, and I was able to work around it by giving the first page of the section a 12cm top margin and floating the heading into this space.

Configuring where Enchant looks for files

Author albright | 22.02.2008 | Category Spelling, Developers

So far, I have covered how to get started using Enchant and how to set up dictionaries. This post will cover more advanced concepts that let an application developer or a user take more control over Enchant.


Where Enchant looks for providers


Enchant looks for which providers are available when the enchant_broker_init function is called.


Providers can be installed on the machine for all users to use on the system or can be installed for only one user. If Enchant finds a particular provider as a system provider and as a user provider, the user provider is used.


Enchant looks for system providers in the following locations:



  1. The value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Module_Dir, if any

  2. Otherwise, the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Module_Dir, if any

  3. Otherwise, in %enchant%\lib\enchant, where %enchant% is the location of libenchant.dll.


The provider location for the user is determined by:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.

  2. Otherwise, in %APPDATA%\enchant, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
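
The lookup rules above amount to "take the first location that is set, otherwise fall back to a default". Here is a minimal Python sketch of that logic, not Enchant's actual code: the registry is modeled as a plain dict so the sketch runs on any platform, and all function names and sample paths are hypothetical.

```python
import ntpath  # Windows path rules on any OS, so the sketch runs anywhere


def first_set(*candidates):
    """Return the first candidate that is neither None nor empty."""
    for value in candidates:
        if value:
            return value
    return None


def system_provider_dir(registry, enchant_dir):
    """Resolve the system provider directory in the order listed above."""
    return first_set(
        registry.get(r"HKEY_CURRENT_USER\Software\Enchant\Config\Module_Dir"),
        registry.get(r"HKEY_LOCAL_MACHINE\Software\Enchant\Config\Module_Dir"),
        ntpath.join(enchant_dir, "lib", "enchant"),
    )


def user_provider_dir(registry, appdata):
    """Resolve the user provider directory in the order listed above."""
    return first_set(
        registry.get(r"HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir"),
        ntpath.join(appdata, "enchant"),
    )


def effective_providers(system_providers, user_providers):
    """A user provider replaces a system provider with the same name."""
    merged = dict(system_providers)
    merged.update(user_providers)
    return merged


# Hypothetical machine: only the HKEY_LOCAL_MACHINE Module_Dir value is set.
registry = {r"HKEY_LOCAL_MACHINE\Software\Enchant\Config\Module_Dir": r"D:\providers"}
system_dir = system_provider_dir(registry, r"C:\Enchant")
user_dir = user_provider_dir(registry, r"C:\Users\me\AppData\Roaming")
providers = effective_providers(
    {"myspell": ntpath.join(system_dir, "libenchant_myspell.dll")},
    {"myspell": ntpath.join(user_dir, "libenchant_myspell.dll")},
)
print(system_dir)
print(user_dir)
print(providers["myspell"])
```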


How Enchant decides which provider to load for a given language


The provider that is used for a given language is determined by the provider ordering. This can be set programmatically by using the enchant_broker_set_ordering function. Enchant initializes the ordering by looking in the enchant.ordering file. There is a system ordering file as well as a user ordering file. A user entry overrides a system entry.


Enchant looks for the system enchant.ordering file in the following locations:



  1. The value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Data_Dir, if any

  2. Otherwise, in %enchant%\share\enchant, where %enchant% is the location of libenchant.dll.


Enchant looks for the user enchant.ordering file in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.

  2. Otherwise, in %APPDATA%\enchant, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).


If Enchant doesn’t find any ordering files and the ordering is not overridden programmatically, then the ordering is system dependent (but I think that means the providers will be ordered alphabetically by filename).
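
The rule that a user entry overrides a system entry can be sketched in a few lines of Python. This is an illustration only, not Enchant's parser: it assumes each line of an ordering file has the form tag:provider,provider (with * as a catch-all tag), which this post does not spell out, and the function names are hypothetical.

```python
def parse_ordering(text):
    """Parse lines of the assumed form 'tag:provider1,provider2'."""
    ordering = {}
    for line in text.splitlines():
        tag, separator, providers = line.strip().partition(":")
        if not separator:
            continue  # skip blank lines and anything without a colon
        ordering[tag.strip()] = [p.strip() for p in providers.split(",") if p.strip()]
    return ordering


def merge_orderings(system_text, user_text):
    """Start from the system entries, then let user entries replace them per tag."""
    merged = parse_ordering(system_text)
    merged.update(parse_ordering(user_text))
    return merged


# Hypothetical file contents.
system_file = "*:aspell,myspell,ispell\nfr:myspell,aspell\n"
user_file = "fr:ispell\n"
merged = merge_orderings(system_file, user_file)
print(merged)
```

In this example the user entry for fr replaces the system entry for fr, while the catch-all system entry is left untouched.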


Where Enchant looks for Ispell dictionaries


Enchant looks for user Ispell dictionaries in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.

  2. Otherwise, in %APPDATA%\enchant\ispell, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).


Enchant looks for system Ispell dictionaries in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Ispell\Data_Dir, if there is one.

  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Ispell\Data_Dir, if there is one.

  3. Otherwise, in %enchant%\share\enchant\ispell, where %enchant% is the location of libenchant.dll.


Where Enchant looks for MySpell dictionaries


Enchant looks for user MySpell dictionaries in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Myspell\Data_Dir, if there is one.

  2. Otherwise, in %APPDATA%\enchant\myspell, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).


Enchant looks for system MySpell dictionaries in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Myspell\Data_Dir, if there is one.

  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Myspell\Data_Dir, if there is one.

  3. Otherwise, in %enchant%\share\enchant\myspell, where %enchant% is the location of libenchant.dll.


Where Enchant looks for the Aspell library


Enchant looks for aspell-15.dll in the following locations:



  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Aspell\Module, if there is one (this value should include the filename and not just the path).

  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Aspell\Module, if there is one (this value should include the filename and not just the path).

  3. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Aspell\Path, if there is one, as the path to find aspell-15.dll (this is set by the Aspell installer for Windows).

  4. Otherwise, in the same directory as libenchant_aspell.dll.

  5. Otherwise, it uses the normal Windows search strategy, which includes looking in the path.

Setting up dictionaries for Enchant

Author albright | 21.02.2008 | Category Spelling, Developers

In my last post, I gave some tips for getting started with Enchant but you really can’t get anywhere until you have properly configured the providers and installed some dictionaries.


ASpell


The ASpell provider for Enchant requires aspell-15.dll. The easiest way to get started with ASpell is to use the installer for ASpell and for dictionaries.



  1. Be sure you have the ASpell provider, libenchant_aspell.dll (you can list it with enchant-lsmod).

  2. Download the installer and run it to install ASpell.

  3. Download a dictionary installer from here and run the installer.

  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (aspell) but with the language code for the language you installed instead of en_US

  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.



It is possible to use ASpell by including aspell-15.dll in the same directory as libenchant_aspell.dll, or it can be somewhere in the path. If you install ASpell using the Windows installer, it will write a registry entry that points to where it was installed and Enchant will use that to find the dependency.

MySpell/Hunspell (OpenOffice format)

Enchant doesn’t require any additional dependencies other than the MySpell provider for MySpell dictionaries, but it does require you to copy the dictionary files to the right place.

  1. Be sure you have the MySpell provider, libenchant_myspell.dll (you can list it with enchant-lsmod).

  2. Download a dictionary that you want: You can get any of the dictionaries from OpenOffice.org.

  3. Unzip (or otherwise uncompress) the package and copy the contents into %APPDATA%\enchant\myspell (you may need to create the enchant and myspell directories the first time).

     %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000). You can type %APPDATA% in the Explorer address bar and it will go to the right place.

  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (myspell) but with the language code for the language you installed instead of en_US

  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.

Note: if you install MySpell and ASpell dictionaries for the same language, the ASpell dictionaries will be used instead of the MySpell dictionaries (this can be changed but I’ll leave that for another post).

If you are feeling really adventurous and would like to create your own, you can see the directions here.


ISpell

Enchant’s ISpell provider also doesn’t have any dependencies (the dictionaries are read directly by Enchant).

  1. Be sure you have the ISpell provider, libenchant_ispell.dll (you can list it with enchant-lsmod).

  2. Download a dictionary from here (at the bottom of the page).

  3. Unzip (or otherwise uncompress) the package and copy the contents into %APPDATA%\enchant\ispell (you may need to create the enchant and ispell directories the first time).

  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (ispell) but with the language code for the language you installed instead of en_US

  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.


Empty dictionaries

An easy way to get spell checking for a language that doesn’t have a dictionary is to create an empty MySpell dictionary. First, decide on the language code to be used. (You should use the ISO 639 code or the IETF language tag; for our example we will use qaa, the first of the private use language codes.) Two files are required: the affix file, qaa.aff, and the dictionary file, qaa.dic. They should both be put in %APPDATA%\enchant\myspell.

The qaa.aff file should contain the following line:

SET UTF-8

The qaa.dic file should contain the following line (it’s a zero, the number of items in the dictionary):

0

Of course, you won’t have any items in your empty dictionary so all the words will be marked as misspelled. As you add items to the dictionary using Enchant, the words will be stored in %APPDATA%\enchant\qaa.dic.
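
If you need to do this for several languages, the two files are easy to generate. Below is a minimal Python sketch; the function name create_empty_myspell_dictionary is hypothetical, and the demonstration writes into a temporary directory so that it does not touch your real settings.

```python
import tempfile
from pathlib import Path


def create_empty_myspell_dictionary(language_code, target_dir):
    """Write <code>.aff and <code>.dic for an empty MySpell dictionary."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    aff_path = target / f"{language_code}.aff"
    dic_path = target / f"{language_code}.dic"
    aff_path.write_text("SET UTF-8\n", encoding="ascii")  # the affix file
    dic_path.write_text("0\n", encoding="ascii")  # zero items in the dictionary
    return aff_path, dic_path


# Demonstration in a throwaway directory that mimics the enchant\myspell layout.
demo_dir = Path(tempfile.mkdtemp()) / "enchant" / "myspell"
aff, dic = create_empty_myspell_dictionary("qaa", demo_dir)
print(aff)
print(dic)
```

For real use, pass Path(os.environ["APPDATA"]) / "enchant" / "myspell" as the target directory (after importing os) instead of the temporary one.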

Using Enchant in a Windows App: Getting Started

Author albright | 20.02.2008 | Category Spelling, Developers

The following are notes toward getting started with incorporating Enchant into a Windows app.

Enchant is a spell-checking framework that allows you to use many different spell-checking backends, including Aspell, Hunspell, and Ispell.

You can get the source here. Building using MSVC is not difficult once all the dependencies are provided. The full build notes are here.

If you don’t want to bother with building it yourself, you can get binaries here.

libenchant.dll is the main library. It uses backend adapters for the providers (libenchant_aspell.dll, libenchant_ispell.dll, and libenchant_myspell.dll) to proxy spell-checking requests. (There are others available, but if you want them, you will have to build them yourself.) There is also a .Net binding (Enchant.Net.dll) that can sit on top of libenchant.dll. libenchant_aspell.dll only works if you have Aspell installed as well. If aspell-15.dll is not in your path, you must specify the dll file location in the registry key: HKCU or HKLM \Software\Enchant\Aspell\Module

By default, the providers (the backend adapters) are put into the subdirectory lib\enchant underneath the location of libenchant.dll.

By default, you put dictionaries (like Ispell and MySpell) into the user’s appdata\enchant\[Provider Name], where [Provider Name] is MySpell or ISpell. (Aspell gets its dictionaries from its installation location.)

You can check your setup by running enchant-lsmod.exe. It will list the providers it finds and the dictionaries as well.

I’ll add more later.