Looking for a downloadable hyphenation dictionary

Alan
Guest

Looking for a downloadable hyphenation dictionary

Post by Alan »

I'm looking for a list of (English) words with hyphens between syllables, the same way they appear in printed dictionaries.

I want to write a password generator program which will output, in random order, two random syllables from the list and two random numerical digits. I think this will create passwords that are easy to remember yet reasonably secure. The passwords would look something like 1add2form or bot28sec or viv8can6.
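A minimal sketch of that scheme in Python, for illustration only. The six entries in `SYLLABLES` are placeholders taken from the example passwords above, and `make_password` is a hypothetical name; a real list would come from a hyphenation dictionary. The `secrets` module is used because the output is meant to be a password.

```python
import secrets

# Hypothetical stand-in list; a real one would hold far more syllables.
SYLLABLES = ["add", "form", "bot", "sec", "viv", "can"]

def make_password(syllables=SYLLABLES):
    """Return two random syllables and two random digits in random order."""
    parts = [secrets.choice(syllables) for _ in range(2)]
    parts += [secrets.choice("0123456789") for _ in range(2)]
    # Shuffle with the OS-backed generator rather than the default PRNG.
    secrets.SystemRandom().shuffle(parts)
    return "".join(parts)
```

Calling `make_password()` repeatedly yields strings shaped like the examples above, each with exactly two digits and two whole syllables.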

If I can get a list I can split the words on the hyphens, sort, cull duplicates, and get started. This is for non-commercial use. I would probably write it to be used as a CGI application from a web browser and have it generate something like 100 passwords at a time, then the user could pick one they think they might remember. That way even if someone intercepts the page containing them they won't know which one the user chose.
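The split, sort, and cull steps could look roughly like this in Python. The input format (one hyphenated word per entry) and the name `syllables_from_hyphenated` are assumptions, since no actual word list has been found yet. Note that a genuinely hyphenated compound such as "well-known" would be split the same way as a syllable break.

```python
def syllables_from_hyphenated(words):
    """Split hyphenated words into fragments, drop duplicates, and sort."""
    fragments = set()
    for word in words:
        for piece in word.strip().lower().split("-"):
            # Skip empty pieces and anything containing digits or punctuation.
            if piece.isalpha():
                fragments.add(piece)
    return sorted(fragments)

# Made-up sample entries, only to show the shape of the data.
sample = ["ad-di-tion", "for-mat", "ad-mit", "Bot-tle"]
syllables = syllables_from_hyphenated(sample)
```

Using a set removes the duplicate "ad" before sorting, so the resulting list can be embedded in the program or written to a small text file.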

Such a program probably exists, but I program both for fun and for a living. It would probably be released as open source or freeware.

Alan
pc2
Membre / Member
Posts: 5299
Joined: 18 Feb 2005 13:21
Location: Rio de Janeiro, Brasil

Post by pc2 »

that's really an interesting idea. what you want is a database of hyphenated English words, right? we could try to generate that for you and make an .mdb Access database file. is that what you're looking for?
we don't know the hyphenation rules of English, but we do know those of Portuguese, which is our native language. so we could look for a hyphenated English wordlist and put it together in a database.

salutations,
Please correct our French if necessary.
Paulo Marcos -- & -- Claudio Marcos
Brasil/Brazil/Brésil
Latinus
Admin
Posts: 24965
Joined: 18 Mar 2002 01:00
Location: complètement à l'Ouest

Post by Latinus »

pc2 wrote: .mdb access database file
Such a strange idea...
With a Microsoft Access database, you would have to go through ODBC, for example.
I think it would be better to use a database system suited to the web, like MySQL or PostgreSQL.

Actually, that's not the point! Alan is looking for a list of words.
Les courses hippiques, lorsqu'elles s'y frottent.
Latinus
Admin
Posts: 24965
Joined: 18 Mar 2002 01:00
Location: complètement à l'Ouest

Post by Latinus »

It's not a "list of words", but I think you'll be interested in the passkool project (Python required).
Just get it and read the README.

Here's an extract:
MARKOV CHAINS
-------------

The data used by the Markov chains is stored in markov.dat. It's basically an
English text with spaces removed. You could replace this markov.dat with your own
text. The principle is quite simple: you sequentially move in the text and at
each iteration, you grab three letters.

Also take a look at multicians.org and answers.com
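
The three-letter idea in the README extract can be sketched roughly as below. This is not passkool's actual code, only one plausible reading of the description: each three-letter window records which letter follows a given pair, and a word is grown by repeatedly picking a recorded follower. The names `build_table` and `generate` are hypothetical, and the text is assumed to be at least three characters long.

```python
import secrets
from collections import defaultdict

def build_table(text):
    """Map each pair of letters to the letters seen right after it."""
    table = defaultdict(list)
    for i in range(len(text) - 2):
        a, b, c = text[i:i + 3]   # one three-letter window per step
        table[a + b].append(c)
    return table

def generate(table, length=8):
    """Grow a string by following letter pairs recorded in the table."""
    word = secrets.choice(sorted(table))
    while len(word) < length:
        followers = table.get(word[-2:])
        if not followers:
            # Dead end (pair only seen at the end of the text): add a fresh pair.
            word += secrets.choice(sorted(table))
            continue
        word += secrets.choice(followers)
    return word[:length]
```

With a table built from a long English text with the spaces removed, `generate(table)` tends to produce letter runs that look pronounceable because every step copies a transition that occurs in real words.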
Alan
Guest

No, not a database

Post by Alan »

Yes, it would be strange and cumbersome to use any sort of database for this. Although I sometimes find Access indispensable and I do use PostgreSQL, what I envision for this is a small list of syllables contained within the program itself or possibly read in from a small text file. I like to keep dependencies to a minimum, so I wouldn't want to require any database to be installed. I've been writing CGI in C lately, from scratch, without using any libraries. It makes for very small and efficient programs.

I'd like to store something like 1000 syllables inside the program itself and output those at random with a couple of digits. I've looked at other methods including the ones mentioned in the next post, but what's unique about my idea (I think) is to use commonly-found syllables that are already part of existing words. If they're easy to pronounce they might be easy to remember. I've done a little bit with generating random passwords but things that are truly random are hard to remember. I've got a password generator written in JavaScript at http://128.119.200.7/randpass.html which anyone's welcome to use but it produces passwords like SPQiOshH 3bMqayY3 3hBnsh07 which no one could ever remember.
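
As a rough check on the trade-off described above, the size of the password space can be estimated with a few lines of Python. The figures assume exactly 1000 syllables, two syllables and two digits per password, and a uniform random choice at every step; the comparison value assumes 8 random characters drawn from A-Z, a-z, and 0-9, which matches the look of the sample passwords but is not stated anywhere in the thread. `token_sequences` is a hypothetical helper name.

```python
import math

def token_sequences(n_syllables, n_digits=10):
    """Count distinct orderings of two syllables and two digits."""
    positions = math.comb(4, 2)   # which 2 of the 4 slots hold syllables
    return positions * n_syllables**2 * n_digits**2

count = token_sequences(1000)   # upper bound on distinct passwords
bits = math.log2(count)         # roughly 29 bits
random8 = math.log2(62**8)      # roughly 48 bits for 8 random alphanumerics
```

The count is an upper bound, because two different syllable sequences can occasionally concatenate to the same string. Under these assumptions the syllable scheme gives up a good deal of entropy in exchange for memorability, and a larger syllable list or a third syllable would narrow the gap.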

It might be practical to OCR some dictionary pages or type them in to get my syllable list, but I was hoping to find something already out there in electronic form. Another option might be to find rules for hyphenation and apply them to a word list like those found in /usr/share/dict (I'm an OpenBSD user). About the only concrete thing I know about these rules is that one can hyphenate between double consonants.
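
The double-consonant rule is simple enough to try with a regular expression. The sketch below is a crude heuristic, not a real hyphenation algorithm: it only breaks between a doubled consonant that is followed by a vowel, so word-final doubles such as "add" are left alone, and it treats "y" as both a consonant and a vowel. The function name is hypothetical.

```python
import re

# A doubled consonant followed by a vowel, e.g. the "tt" in "letter".
DOUBLE = re.compile(r"([b-df-hj-np-tv-z])\1(?=[aeiouy])")

def hyphenate_doubles(word):
    """Insert a hyphen between doubled consonants that precede a vowel."""
    return DOUBLE.sub(r"\1-\1", word.lower())

fragments = hyphenate_doubles("committee").split("-")
```

Run over a word list, this would give fragments such as "com", "mit", and "tee", though many of them will not be true syllables because words without doubled consonants are never split.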

Alan
Beaumont
Admin
Posts: 7384
Joined: 07 Jun 2002 02:00
Location: Thailande

Post by Beaumont »

Might be useful, I don't know:

Plain text list of English words (491 KB, approx. 120,000 words)
http://www.freelang.net/download/misc/english_words.zip
Time is an illusion. Lunchtime doubly so.