
Word Lists


SCOWL (Spell Checker Oriented Word Lists) is a collection of word lists split up by size and by other categories, intended to be suitable for use in spell checkers. However, I am sure it will have numerous other uses as well.

readme, tar.gz, zip (rev 6, 2.2 MB) (release notes & changelog)
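As a rough illustration of how a word list might be used in a spell checker, here is a minimal Python sketch. It assumes the list is plain text with one word per line; that layout, the function names, and the sample words are assumptions made for this example, not something taken from the SCOWL distribution.

```python
def load_words(lines):
    """Build a lookup set from an iterable of lines, one word per line."""
    return {line.strip() for line in lines if line.strip()}


def unknown_words(text, words):
    """Return the tokens of `text` that are not in `words`.

    A token also counts as known if its lowercase form is listed.
    """
    result = []
    for token in text.split():
        token = token.strip(".,;:!?\"'()")
        if token and token not in words and token.lower() not in words:
            result.append(token)
    return result


# Hypothetical sample standing in for the lines of a real word list file.
sample = ["and", "apple", "banana", "cherry"]
words = load_words(sample)
print(unknown_words("Apple, banana and chery.", words))
```

With a real file, the same function could be fed an open file object instead of the sample list, since iterating a text file yields its lines.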


AGID is an Automatically Generated Inflection Database from an insanely large word list. My goal is for the non-questionable entries to be 100% accurate.

readme, tar.gz, zip (rev 4, 0.9 MB) (release notes & changelog)



VarCon (Variant Conversion Info) contains tables to convert between American, British (both "ise" and "ize" spellings), and Canadian spellings and vocabulary, as well as a table listing the equivalent forms of other variants.

readme, tar.gz, zip (rev 4.1, 0.1 MB) (release notes & changelog)
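To make the idea of a conversion table concrete, here is a small Python sketch of a table-driven spelling conversion. The mapping below is a hypothetical three-entry table written for this example; it does not reproduce VarCon's file format or its data, and the `convert` helper is likewise an assumption.

```python
import re

# Hypothetical American -> British mapping; NOT taken from VarCon's tables.
AMERICAN_TO_BRITISH = {
    "color": "colour",
    "center": "centre",
    "organize": "organise",
}


def convert(text, table):
    """Replace whole words found in `table`, preserving a leading capital."""
    def swap(match):
        word = match.group(0)
        target = table.get(word.lower())
        if target is None:
            return word  # not in the table: leave the word untouched
        return target.capitalize() if word[0].isupper() else target

    return re.sub(r"[A-Za-z]+", swap, text)


print(convert("Color the center.", AMERICAN_TO_BRITISH))
```

The sketch only handles lowercase and initial-capital words; an all-caps word would come back with just its first letter capitalized, which a fuller implementation would need to address.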

Part Of Speech Database

The Part Of Speech Database is a combination of "Moby (tm) Part-of-Speech II" and the WordNet database.

readme, tar.gz, zip (1.2 MB)

Unofficial Jargon File Word Lists

The Unofficial Jargon File Word Lists is a collection of useful Word Lists created from the Jargon file.

readme, tar.gz

Ispell English Word Lists

This package contains the contents of the Ispell (ver 3.1.20) word list after being expanded from the affix-compressed form used by Ispell.

readme, tar.gz, zip (0.4 MB)
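For readers unfamiliar with affix compression, the following Python sketch shows the general idea of expanding a compressed entry into full words. It assumes entries of the shape stem/FLAGS and uses three made-up suffix rules; neither the rules nor the flag letters are taken from Ispell's actual affix table.

```python
# Hypothetical suffix rules keyed by flag letter; NOT Ispell's real affix table.
SUFFIX_RULES = {
    "S": lambda stem: stem + "s",
    "D": lambda stem: stem + "d" if stem.endswith("e") else stem + "ed",
    "G": lambda stem: stem[:-1] + "ing" if stem.endswith("e") else stem + "ing",
}


def expand(entry, rules):
    """Expand one 'stem/FLAGS' entry into the stem plus every derived form."""
    stem, _, flags = entry.partition("/")
    forms = [stem]
    for flag in flags:
        rule = rules.get(flag)
        if rule is not None:  # unknown flags are skipped in this sketch
            forms.append(rule(stem))
    return forms


print(expand("walk/SDG", SUFFIX_RULES))
print(expand("bake/DG", SUFFIX_RULES))
```

An expanded list trades disk space for simplicity: every form appears as its own line, so no affix rules are needed to read it.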

Unofficial Alternate 12 Dicts Package

The Unofficial Alternate 12 Dicts Package contains almost all the information in the official 12Dicts package, but in a different format, as well as a good deal of additional information. However, it is not meant as a replacement for the official 12Dicts package. It simply offers the information in a different way.

readme, readme-infl, tar.gz, zip (rev 4, 1.0 MB) (release notes & changelog)



Official 12Dicts Package

12dicts is a collection of English word lists. It differs in several important ways from most of the other free word lists you can download.

  • The 12dicts lists are oriented towards common words. If you're looking for myriads of archaic, scientific or computer jargon words, you should look elsewhere.
  • The 12dicts lists have been rigorously checked for errors. (This is not to say that they are error-free, merely that enough care has been taken that errors are rather infrequent.)
  • 12dicts contains a variety of lists, of different sizes and characteristics. One size does not fit all. Because each list has different characteristics, I do not recommend combining them, except as noted below.

Originally, 12dicts was composed of lists derived from a specific set of 12 source dictionaries. In addition to these "classic" lists, 12dicts now includes lists derived from other sources. It would perhaps be appropriate to rename 12dicts to something more generic, such as BAWL (Beale's Assorted Word Lists), but I have not done so in order to preserve continuity.

readme, (ver 5.0, 1419 KB)


Spell Checking Dictionaries

These word lists are the basis for the official English dictionary in Aspell and the en_US and en_CA dictionaries in Hunspell.


Additional Information

Please submit all bug reports regarding these word lists or the speller dictionaries to the Issue Tracker.

The latest version of SCOWL and friends can be found in svn. In order to build SCOWL you will need to check out the trunk directory:

   svn co wordlist

Then, to build SCOWL:

  cd wordlist

To build the aspell and hunspell dictionaries (you will need Aspell 0.60 installed):

  cd scowl/speller
  make aspell
  make hunspell

General discussion should be directed to the wordlist-devel mailing list.

Sourceforge Project Page

Links to other General Word Lists

Links to Online Dictionaries

Links to Slang Word Lists and Dictionaries

Links to Specialty Word Lists and Dictionaries

Other Links

Posted Mon, Aug 23 2010 9:24 PM by
