Sentry Spelling Checker Engine - Support: Most or all words reported as misspellings
Product: Sentry Spelling Checker Engine Source SDK (and Windows SDK with Core API only)
Problem: When checking text, the Sentry engine reports all or most words as misspellings.
Note: If your application uses the Sentry Windows SDK and does not call SSCE_OpenLex directly, please see the support article "Most or all words reported as misspelled (Windows SDK)" instead.
The main American English dictionaries are shipped as two files: ssceam.tlx and ssceam2.clx (substitute "br" for "am" in the file name for British English, and "ca" for "am" for Canadian English). The first file contains about 1,000 of the most commonly used words. The second file contains about 100,000 of the most commonly used words. Words in the first file are also present in the second file.
If the Sentry engine accepts words such as "the" or "of" but doesn't accept less common words such as "engine" or "improvement", the most likely cause is failure to open the ssceam2.clx file. If Sentry rejects all words including very common words such as "the" and "of", the most likely cause is failure to open any dictionary files.
This problem is almost always the result of a configuration error on your system. Your application passes the name of the dictionary file to open to SSCE_OpenLex. In response, the Sentry engine asks the operating system to open the file. If the operating system is unable to open the file for any reason (i.e., it returns an error code), the Sentry engine will not be able to open the file, either. Because the dictionary cannot be opened, its contents cannot be loaded into memory, and any words contained within the dictionary file will be reported as misspelled. Typical reasons the operating system will be unable to open the dictionary file include:
The dictionary file name or part of the path name passed to SSCE_OpenLex is misspelled. For example, ssceam2.clx is contained in directory /myapp/dicts on the run-time system. The path name your application passes to SSCE_OpenLex is "/myapp/dict/ssceam2.clx", or "/myapp/dicts/ssecam2.clx" (look carefully to spot the misspellings!).
The path name passed to SSCE_OpenLex is syntactically incorrect. For example, the path passed to SSCE_OpenLex is "/myapp/dicts//ssceam2.clx", and the operating system considers the "//" in the path name to be a syntax error. This can happen when the dictionary file name is concatenated onto a path name, the dictionary file name begins with "/", and the path name ends with "/". Another possible hard-to-spot cause is spaces at the beginning or end of the path name passed to SSCE_OpenLex. Some operating systems treat the passed file name literally, spaces included, resulting in a file-not-found error.
The indicated dictionary file is not contained in the directory indicated by the path name passed to SSCE_OpenLex. For example, ssceam2.clx is contained in directory /myapp/dicts on the run-time system. The path name your application passes to SSCE_OpenLex is "/myapp/dicts/am/ssceam2.clx". Another possibility: Your application passes "/myapp/dicts/am/ssceam2.clx" to SSCE_OpenLex, but you forgot to actually copy the dictionary files to that directory, or one or more of the dictionary files were subsequently deleted for some reason.
If the operating system supports permissions, the permissions assigned to the dictionary files or the directories in the dictionary files' path may prevent the application from opening them. The Sentry engine is a software library used by an application. When used by your application, it essentially becomes part of your application, and has exactly the same capabilities and authority. If your application can't open a file, the Sentry engine won't be able to, either.
An important point to consider is that the Sentry engine passes the file name you pass to SSCE_OpenLex to the operating system verbatim. If you pass "/myapp/dict/ssceam2.clx" to SSCE_OpenLex, the Sentry engine asks the operating system to open "/myapp/dict/ssceam2.clx". If the operating system returns an error code, the Sentry engine returns an error code to your application and stops attempting to open the dictionary file. It doesn't search for the dictionary file in other directories, try different spellings of the dictionary file name, or magically blast through the operating system's permissions to open the file at all costs. The Sentry engine merely tries to open the file your application asks it to open, and if it can't, it fails.
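Because the path is used verbatim, one way to avoid the doubled "/" and stray-space mistakes described above is to build the path with a small helper before calling SSCE_OpenLex. The following is only a minimal sketch: build_lex_path is a hypothetical helper (not part of the Sentry API), and it assumes "/" is the path separator.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Sentry API: join a directory and a
 * dictionary file name with exactly one '/' between them, ignoring leading
 * and trailing whitespace on both parts.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int build_lex_path(const char *dir, const char *file,
                          char *out, size_t outSize)
{
    size_t dirLen, fileLen, sep;

    /* Skip leading whitespace on both parts. */
    while (isspace((unsigned char)*dir)) dir++;
    while (isspace((unsigned char)*file)) file++;

    /* Ignore trailing whitespace on both parts. */
    dirLen = strlen(dir);
    while (dirLen > 0 && isspace((unsigned char)dir[dirLen - 1])) dirLen--;
    fileLen = strlen(file);
    while (fileLen > 0 && isspace((unsigned char)file[fileLen - 1])) fileLen--;

    /* Drop trailing '/' from the directory, but keep a lone "/" (root). */
    while (dirLen > 1 && dir[dirLen - 1] == '/') dirLen--;

    /* Drop leading '/' from the file name. */
    while (fileLen > 0 && *file == '/') { file++; fileLen--; }

    /* A separator is needed unless the directory is empty or is "/". */
    sep = (dirLen > 0 && dir[dirLen - 1] != '/') ? 1 : 0;

    if (dirLen + sep + fileLen + 1 > outSize) return -1;

    memcpy(out, dir, dirLen);
    if (sep) out[dirLen] = '/';
    memcpy(out + dirLen + sep, file, fileLen);
    out[dirLen + sep + fileLen] = '\0';
    return 0;
}
```

With this sketch, joining "/myapp/dicts/" and "/ssceam2.clx" yields "/myapp/dicts/ssceam2.clx". The helper cannot catch a misspelled directory or file name; it only removes the two syntactic problems.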
Another, though less common, cause of dictionary open failure is file corruption. ssceam2.clx, like all dictionary files with a .clx extension, is a binary file. If these files are transferred from one system to another using "ASCII" or "Text Mode" file transfer, they will be corrupted. If the dictionaries were once openable but the Sentry engine suddenly started returning errors on attempts to open them, and no other changes took place on the run-time system, it's possible that the dictionary files became corrupted through file system corruption or some other cause. Replacing the files should solve the problem.
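One way to check for transfer corruption is to compute a checksum of the dictionary file on the system where it is known to be good and again on the run-time system, then compare the two values. The sketch below uses the 32-bit FNV-1a hash purely as an illustration; this is a choice made for this example, not something the Sentry engine uses, and any checksum utility available on both systems would serve equally well. The function name fnv1a_file is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Compute the 32-bit FNV-1a hash of a file's raw bytes.
 * Returns 0 and stores the hash in *hashOut on success,
 * -1 if the file cannot be opened or read. */
static int fnv1a_file(const char *path, uint32_t *hashOut)
{
    unsigned char buf[4096];
    uint32_t h = 2166136261u;        /* FNV-1a 32-bit offset basis */
    size_t n, i;
    int failed;
    FILE *fp = fopen(path, "rb");    /* binary mode: no newline translation */

    if (fp == NULL) return -1;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 16777619u;          /* FNV 32-bit prime */
        }
    }

    failed = ferror(fp);
    fclose(fp);
    if (failed) return -1;

    *hashOut = h;
    return 0;
}
```

If the two hashes differ, the copy on the run-time system is not byte-for-byte identical to the original and should be replaced using a binary-mode transfer.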
A problem related to corruption is attempting to open a dictionary file which is in a format inappropriate for your version of the Sentry engine. This usually happens only with compressed dictionary files (which have a .clx extension). The "2" in the dictionary file name (e.g., ssceam2.clx) indicates the internal format version of the dictionary file. A "2" format version is compatible with the Sentry engine version 5.1 and later. If you are using, for example, version 5.15 of the Sentry engine, the .clx files you use should have names like sscexx2.clx. When you purchase supplemental dictionaries from Wintertree Software, dictionary files in both the current "2" format version and the older "1" format version will be included. For example, Wintertree Software's French dictionary product includes files sscefr1.clx and sscefr2.clx. You must use the file appropriate to your version of the Sentry engine. If you attempt to open sscefr1.clx with Sentry engine version 5.1 or later, the open will fail. Similarly, if you attempt to open sscefr2.clx with Sentry engine version 4.22 or earlier, the open will fail.
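The version rule above can be written down as a small lookup. This is a sketch only: both function names are hypothetical, the minor version is assumed to be the integer after the dot (so 5.15 has minor 15 and 4.22 has minor 22), and versions that fall between 4.22 and 5.1 are reported as undetermined because the rule above does not cover them.

```c
#include <stddef.h>
#include <stdio.h>

/* Return the .clx format digit for an engine version, following the rule
 * stated above: 2 for version 5.1 and later, 1 for version 4.22 and
 * earlier, 0 if the rule does not say. */
static int clx_format_for_engine(int major, int minor)
{
    if (major > 5 || (major == 5 && minor >= 1)) return 2;
    if (major < 4 || (major == 4 && minor <= 22)) return 1;
    return 0;
}

/* Build a dictionary file name such as "sscefr2.clx" from a two-letter
 * language code ("am", "br", "ca", "fr", ...) and an engine version.
 * Returns 0 on success, -1 if the format is undetermined or the name
 * does not fit in out. */
static int clx_file_name(const char *lang, int major, int minor,
                         char *out, size_t outSize)
{
    int fmt = clx_format_for_engine(major, minor);
    int n;

    if (fmt == 0) return -1;
    n = snprintf(out, outSize, "ssce%s%d.clx", lang, fmt);
    return (n < 0 || (size_t)n >= outSize) ? -1 : 0;
}
```

For engine version 5.15 and language code "fr", this produces "sscefr2.clx"; for version 4.22 it produces "sscefr1.clx".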
If the dictionary files open successfully (SSCE_OpenLex returns >= 0) but many words are still reported as misspelled, a potential cause is passing a relative path name to SSCE_OpenLex. Under some circumstances, the Sentry engine will re-open a compressed (.clx) lexicon file while checking spelling or looking up suggestions to load parts of the dictionary not cached in memory. To re-open the file, it uses the file name passed to SSCE_OpenLex. If the path to the dictionary file is relative, and the calling application's current directory has changed, the attempt to re-open the file may fail. As a result, the needed parts of the dictionary are not loaded into memory, and the words contained within those parts are reported as misspelled. The path names passed to SSCE_OpenLex should therefore always be absolute (i.e., should start from the file system's root directory).
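If your application only knows the dictionary location relative to its starting directory, it can convert that location to an absolute path once, before the current directory has a chance to change, and pass the result to SSCE_OpenLex. The following is a minimal sketch for POSIX-style systems; make_absolute_path is a hypothetical helper, and it assumes getcwd is available and that "/" is the path separator.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* getcwd (POSIX) */

/* Hypothetical helper: write an absolute version of path into out.
 * A path that already starts with '/' is copied unchanged; otherwise the
 * current working directory is prepended.
 * Returns 0 on success, -1 on failure or if the result does not fit. */
static int make_absolute_path(const char *path, char *out, size_t outSize)
{
    char cwd[4096];
    int n;

    if (path[0] == '/') {
        n = snprintf(out, outSize, "%s", path);
    } else {
        if (getcwd(cwd, sizeof cwd) == NULL) return -1;
        /* Avoid producing "//" when the current directory is the root. */
        n = snprintf(out, outSize, "%s%s%s",
                     cwd, strcmp(cwd, "/") == 0 ? "" : "/", path);
    }
    return (n < 0 || (size_t)n >= outSize) ? -1 : 0;
}
```

Call the helper early in start-up and keep the resulting string for every later SSCE_OpenLex call, so that a subsequent change of directory cannot affect re-opening the lexicon.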
If SSCE_OpenLex returns SSCE_BUFFER_TOO_SMALL_ERR, the memBudget parameter is too small. Try increasing it. Unfortunately, there is no easy way to determine the minimum allowable memBudget parameter for a given dictionary file.
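Since the minimum budget cannot be computed in advance, one pragmatic approach is to retry with a doubled budget until the open stops failing with the too-small error or a cap is reached. The sketch below is deliberately generic: it does not call the Sentry API itself but takes a callback, which in a real application would wrap your SSCE_OpenLex call and pass the budget as the memBudget argument. The callback signature, the use of long for the budget, and the fake_open stand-in are all assumptions made for illustration.

```c
#include <stddef.h>

/* Callback type (an assumption for this sketch): attempt one open with the
 * given budget and return the engine's result code. */
typedef int (*try_open_fn)(long memBudget, void *ctx);

/* Retry tryOpen with a doubling budget while it returns tooSmallErr.
 * Stops at the first other result, or when doubling again would exceed
 * maxBudget. Returns the last result code; if that result was not
 * tooSmallErr, *usedBudget receives the budget that produced it. */
static int open_with_growing_budget(try_open_fn tryOpen, void *ctx,
                                    long startBudget, long maxBudget,
                                    int tooSmallErr, long *usedBudget)
{
    long budget;
    int rv = tooSmallErr;

    if (startBudget <= 0 || startBudget > maxBudget) return tooSmallErr;

    for (budget = startBudget; ; budget *= 2) {
        rv = tryOpen(budget, ctx);
        if (rv != tooSmallErr) {
            if (usedBudget != NULL) *usedBudget = budget;
            break;
        }
        if (budget > maxBudget / 2) break;   /* next doubling would pass the cap */
    }
    return rv;
}

/* Stand-in opener for demonstration only: returns 0 once the budget reaches
 * the threshold stored in ctx, otherwise -1 as a placeholder for the
 * engine's too-small error code. */
static int fake_open(long memBudget, void *ctx)
{
    long needed = *(const long *)ctx;
    return memBudget >= needed ? 0 : -1;
}
```

Logging the budget that finally worked lets you hard-code a suitable starting value for that dictionary file in later builds.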
Here are some things you can try to diagnose the cause and solve the problem:
Always test the return value from SSCE_OpenLex. This should help you to narrow down the specific dictionary files which could not be opened.
In your application, call SSCE_SetDebugFile before opening any lexicons. The Sentry engine will record diagnostic information in the file you pass to SSCE_SetDebugFile, including the names of dictionary files it tries to open and the results of those attempts. Sometimes by examining the contents of the diagnostic file you may be able to spot the problem, such as a malformed or misspelled path.
Try opening the lexicon file yourself from within your application to confirm that it can be opened. This is particularly relevant if SSCE_OpenLex returns a file-oriented error code such as SSCE_FILE_NOT_FOUND_ERR or SSCE_FILE_OPEN_ERR. For example, if your application calls SSCE_OpenLex like this:
rv = SSCE_OpenLex(sid, lexFile, 0);
insert the following call (or something similar) right after:
fp = fopen(lexFile, "rb");
If your attempt to open the file fails, the Sentry engine will not be able to open the file either, so you must solve that problem first. If your attempt to open the file succeeds, you know that the file exists, the path is correct, and permissions permit access, so the problem may be caused by file corruption, wrong dictionary file version, etc.
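The explicit open test can be extended slightly so that it also reports why the operating system refused the file. The following is a minimal sketch; check_lex_readable is a hypothetical helper, and it relies on fopen setting errno on failure, which POSIX systems do but the C standard itself does not guarantee.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open path for binary reading, as in the
 * fopen call above, and log the operating system's reason on failure.
 * The path is printed inside quotes so that stray leading or trailing
 * spaces become visible.
 * Returns 0 if the file could be opened, a nonzero value otherwise. */
static int check_lex_readable(const char *path, FILE *log)
{
    FILE *fp;
    int err;

    errno = 0;
    fp = fopen(path, "rb");
    if (fp == NULL) {
        err = errno;
        fprintf(log, "cannot open \"%s\": %s\n", path, strerror(err));
        return err != 0 ? err : -1;
    }
    fclose(fp);
    return 0;
}
```

A message such as "No such file or directory" points to a misspelled or missing path, while "Permission denied" points to the permissions on the file or one of its parent directories.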
Source SDK version 4.22 and earlier: If you have compiled the Sentry source code and are attempting to use the compressed lexicon (*.clx) files which ship with the Sentry Windows SDK, you may run into problems. The lexicons may open successfully, but attempts to look up words in them may fail. The compressed lexicons contain binary numbers and data structures. The binary numbers were written using "little endian" (least significant byte first) ordering. The data structures were written with 2-byte member alignment. If you run the Sentry engine on a computer which uses "big endian" (most significant byte first) byte ordering, the binary numbers in the lexicon will be garbled. If you compiled the Sentry engine with structure member alignment set to something other than 2 bytes, the structures in the compressed lexicon will be corrupted when read. The most straightforward solution is to compile SQLEX or WINSQLEX for your target computer, then recompress the lexicons. If your target computer uses little-endian byte ordering (e.g., Intel), you can also recompile the SSCE source code using 2-byte structure member alignment.
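To see which of the two conditions applies to your target, you can check the host byte order and the effect of the alignment setting directly. The sketch below is illustrative only: the example structures are hypothetical and are not the actual .clx layout, and #pragma pack is a compiler extension (supported by GCC, Clang, and Microsoft compilers, among others) rather than standard C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 on a little-endian host, 0 otherwise, by looking at which byte
 * of a 16-bit value is stored first in memory. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Decode a 32-bit little-endian number from raw bytes. This works the same
 * on any host because it never reinterprets memory as an integer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical record with 2-byte member alignment: value sits at
 * offset 2 and the structure occupies 6 bytes. */
#pragma pack(push, 2)
struct packed2_example {
    char    tag;
    int32_t value;
};
#pragma pack(pop)

/* The same members with the compiler's default alignment; on most
 * compilers value is placed at offset 4 instead. */
struct default_example {
    char    tag;
    int32_t value;
};
```

Comparing sizeof and offsetof for the two structures on your compiler shows how a different alignment setting shifts every member after the first, which is why structures read from a lexicon written with 2-byte alignment come out corrupted.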
Copyright © 2015 Wintertree Software Inc.