Sentry Spelling Checker Engine for Java
Problems with locating suggestions
Product: Sentry Spelling Checker Engine Java SDK
No suggestions are given by the SpellingSession.suggestion() method or displayed in the spelling dialog when a misspelled word is detected.
The correct spelling does not appear in the list of suggestions.
A number of factors affect the success of Sentry's suggestion feature, and it can be difficult to determine in advance what conditions are most likely to yield success. Some of the factors that influence the suggestion feature are listed below.
Suggestion depth: The suggestion depth is an integer value ranging from 0 to 100. The value determines the tradeoff between the time taken to locate suggestions and the quality of the suggestions produced. A value of 100 produces the best (most accurate) suggestions but takes the most time. Unfortunately, the other factors listed below affect the depth needed to locate the best suggestion, so the right depth is difficult to determine in advance. For some misspellings, the correct suggestion will be located at depth 10; for others, depth 100 is needed. The example dialog boxes in the AWT and Swing examples included with the Sentry Java SDK deal with this by deepening the search on request. Initially, the suggestion list in the spelling dialog box is populated using the depth indicated by the MinSuggestDepth value, which is typically fairly low (e.g., 30). This produces suggestions quickly and, in many cases, produces the correct suggestion. The spelling dialog boxes contain a "Suggest" button. When this button is pressed, the suggestion depth is increased by 10 and a new set of suggestions is obtained. The user can press the Suggest button repeatedly to obtain suggestions at increasingly higher depths until the maximum depth, 100, is reached, at which point the Suggest button is disabled.
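The Suggest-button behavior described above can be sketched in a few lines. This is a minimal illustration, not code from the SDK examples: the class SuggestDepthStepper and its methods are hypothetical, and the lookup function merely stands in for whatever call the application makes to obtain suggestions at a given depth.

```java
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical helper for illustration; not part of the Sentry API. */
class SuggestDepthStepper {
    static final int MAX_DEPTH = 100; // maximum suggestion depth
    static final int STEP = 10;       // increase per press of the Suggest button

    private int depth;
    // Assumption: stands in for the engine call that returns suggestions at a depth.
    private final IntFunction<List<String>> lookup;

    SuggestDepthStepper(int minSuggestDepth, IntFunction<List<String>> lookup) {
        // Clamp the starting depth into the valid 0..100 range.
        this.depth = Math.max(0, Math.min(minSuggestDepth, MAX_DEPTH));
        this.lookup = lookup;
    }

    /** Suggestions at the current depth (the initial population of the list). */
    List<String> current() {
        return lookup.apply(depth);
    }

    /** What a press of Suggest would do: raise the depth by 10, capped at 100, and look again. */
    List<String> suggestDeeper() {
        depth = Math.min(depth + STEP, MAX_DEPTH);
        return lookup.apply(depth);
    }

    /** The button stays enabled only while a deeper search is still possible. */
    boolean canSuggestDeeper() {
        return depth < MAX_DEPTH;
    }

    int depth() {
        return depth;
    }
}
```

Starting from a MinSuggestDepth of 30, seven presses take the depth to 100, after which canSuggestDeeper() returns false and the button would be disabled.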
Word length: Locating suggestions is a pattern-matching problem. Short words provide less information with which to make a successful match. Furthermore, the small number of letters means less letter diversity, so short words tend to look more alike than longer words do, which decreases the probability that the correct match can be made.
Number of errors: The more incorrect letters a word has, the less it looks like the correct word and the more it may look like other words.
Position of errors: When errors occur near the beginning of a word, a higher suggestion depth may be needed to locate the correct spelling.
Similarity to other words: Words that have letters in common with many other words may result in so many suggestions of equal value that the correct spelling is discarded. This can happen because the Sentry engine keeps only the best set of suggestions, and the size of that set is determined by the calling application (the value passed to the SuggestionSet constructor). For example, if the set can hold 16 words, but 17 words are found that make equally good replacements for the misspelled word, it's possible that the word left out will be the correct word.
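The 16-versus-17 example can be made concrete with a small fixed-capacity collection. The sketch below is an assumption-laden illustration: BoundedSuggestions is a hypothetical class, not Sentry's SuggestionSet, and the rule that a tied candidate never displaces an existing entry is assumed purely for demonstration. How the engine actually breaks ties is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fixed-capacity suggestion list; not Sentry's SuggestionSet. */
class BoundedSuggestions {
    private final int capacity;
    private final List<String> words = new ArrayList<>();
    private final List<Integer> scores = new ArrayList<>();

    BoundedSuggestions(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Offers a candidate with a quality score; returns false if it was not kept. */
    boolean offer(String word, int score) {
        if (words.size() < capacity) {
            words.add(word);
            scores.add(score);
            return true;
        }
        // Find the weakest entry currently held.
        int worst = 0;
        for (int i = 1; i < scores.size(); i++) {
            if (scores.get(i) < scores.get(worst)) {
                worst = i;
            }
        }
        // Assumed tie rule: only a strictly better candidate displaces it.
        if (score > scores.get(worst)) {
            words.set(worst, word);
            scores.set(worst, score);
            return true;
        }
        return false;
    }

    boolean contains(String word) {
        return words.contains(word);
    }

    int size() {
        return words.size();
    }
}
```

With a capacity of 16 and sixteen candidates already held at the same score, a seventeenth candidate with an equal score is rejected, even if it happens to be the correct spelling. A larger capacity makes that outcome less likely, because fewer equally good candidates have to be dropped.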
Algorithm: The phonetic suggestion algorithm (implemented in EnglishPhoneticComparator) has a better chance of finding the correct spelling for badly misspelled words (e.g., where half of the letters are incorrect) than the typographical algorithm (implemented in TypographicalComparator). However, the phonetic algorithm is sometimes confused by certain combinations of letters. The typographical algorithm works very well with words containing one or two errors. Better results may be achieved by using both the phonetic and typographical algorithms (with some increase in the time required to locate suggestions).
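To illustrate why using both algorithms can help, the sketch below merges two ranked suggestion lists, one from each algorithm, into a single list without duplicates. It does not show how the SDK itself enables both comparators; SuggestionMerger and its merge method are hypothetical, and interleaving is just one possible merge policy chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical merge helper for illustration; not part of the Sentry API. */
class SuggestionMerger {
    /** Interleaves two ranked lists, drops duplicates, and keeps at most limit words. */
    static List<String> merge(List<String> typographical, List<String> phonetic, int limit) {
        // LinkedHashSet keeps first-seen order and silently ignores repeats.
        Set<String> merged = new LinkedHashSet<>();
        int longest = Math.max(typographical.size(), phonetic.size());
        for (int i = 0; i < longest && merged.size() < limit; i++) {
            if (i < typographical.size()) {
                merged.add(typographical.get(i));
            }
            if (i < phonetic.size() && merged.size() < limit) {
                merged.add(phonetic.get(i));
            }
        }
        return new ArrayList<>(merged);
    }
}
```

A word that only one algorithm finds still reaches the merged list, which is the benefit; the cost is that both lookups have to run before the merge can happen.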
Copyright © 2015 Wintertree Software Inc.