Word List

Posted: Thu Jan 24, 2013 10:04 pm
by logan
New feature: you can now browse a list of every word (5-25 letters long) in every source indexed by Genealogy Indexer, by clicking on the "Words" link in the top right or going directly to genealogyindexer.org/words. You can browse by letter (Latin, Cyrillic, Hebrew) or enter a word in the search box there to jump to that part of the word list. When viewing the word list, you can click on any word to search all sources for that word (with the default search options).

But why would you want to do this, you might wonder, when you can already search everything? The search engine relies on OCR, so the indexed text contains inaccuracies. Browsing the word list can help you find words that differ from your search term only because of OCR errors. The most common OCR errors are caught by the "OCR-Adjusted" search option, but the word list can sometimes turn up others.
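To illustrate the idea of spotting near-matches, here is a minimal Python sketch. It is not how the site works internally. It assumes you have copied a handful of neighbouring words from the word list into a Python list yourself; the sample words, the function name ocr_variants and the 0.85 cutoff are all made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical sample: words copied by hand from one stretch of the word list.
SAMPLE_WORDS = [
    "WOLOWELSKI",
    "WOLOWFLSKI",   # made-up OCR-style variant (E read as F)
    "WOLOWELSKL",   # made-up OCR-style variant (I read as L)
    "WOLINSKI",
    "KOWALSKI",
]

def ocr_variants(term, words, cutoff=0.85):
    """Return words that are very similar to `term` but not identical.

    The cutoff is an arbitrary illustrative threshold on difflib's
    similarity ratio (0.0 to 1.0), not a value used by the site.
    """
    matches = get_close_matches(term, words, n=len(words), cutoff=cutoff)
    return [w for w in matches if w != term]

if __name__ == "__main__":
    for candidate in ocr_variants("WOLOWELSKI", SAMPLE_WORDS):
        print(candidate)
```

Each candidate printed this way is just a word worth clicking on in the word list to check whether it is really your name with an OCR error in it.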

As a by-product of creating this word browser, the following link syntax will now work:

genealogyindexer.org/words/WOLOWELSKI will browse the word list starting at the word WOLOWELSKI
genealogyindexer.org/search/WOLOWELSKI will search all sources with the default search options for the word WOLOWELSKI
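
If you want to generate these links from a script, something like the following Python sketch should do. The https:// scheme, the function names and the percent-encoding of non-Latin words are my assumptions for illustration; only the /words/ and /search/ path patterns come from the examples above.

```python
from urllib.parse import quote

# Assumption: the site is reachable over https at this host.
BASE_URL = "https://genealogyindexer.org"

def word_list_url(word):
    """Link that browses the word list starting at `word`."""
    # quote() percent-encodes non-ASCII letters (e.g. Cyrillic, Hebrew) as UTF-8;
    # whether the site requires this encoding is an assumption.
    return f"{BASE_URL}/words/{quote(word, safe='')}"

def search_url(word):
    """Link that searches all sources for `word` with the default options."""
    return f"{BASE_URL}/search/{quote(word, safe='')}"

if __name__ == "__main__":
    print(word_list_url("WOLOWELSKI"))
    print(search_url("WOLOWELSKI"))
```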

The list currently contains a little over 9 million distinct words, 5-25 letters long. Some of these words might appear thousands of times in the sources, some only once. Because of our focus on directories, most of the words are proper nouns or (abbreviations of) occupations.

Logan