annotate embeddings/parameters.py @ 532:34ee3aff3e8f

Improved embedding word preprocessing.
author Joseph Turian <turian@gmail.com>
date Tue, 18 Nov 2008 02:57:50 -0500
parents 919125098a3b
children de974b4fc4ea
rev   line source
458
ed6b0b3be8d2 Polished embeddings module
Joseph Turian <turian@iro.umontreal.ca>
parents:
1 """
2 Locations of the embedding data files.
3 """
510
919125098a3b Fixed parameters.
Joseph Turian <turian@iro.umontreal.ca>
parents: 484
4 WEIGHTSFILE = "/home/fringant2/lisa/data/word_embeddings.collobert-and-weston/lm-weights.txt"
5 VOCABFILE = "/home/fringant2/lisa/data/word_embeddings.collobert-and-weston/words.asc"
6 #WEIGHTSFILE = "/home/joseph/data/word_embeddings.collobert-and-weston/lm-weights.txt"
7 #VOCABFILE = "/home/joseph/data/word_embeddings.collobert-and-weston/words.asc"
458
ed6b0b3be8d2 Polished embeddings module
Joseph Turian <turian@iro.umontreal.ca>
parents:
8 NUMBER_OF_WORDS = 30000
9 DIMENSIONS = 50
459
f400f62e7f9e Fixed embedding preprocessing
Joseph Turian <turian@iro.umontreal.ca>
parents: 458
10 UNKNOWN = "UNKNOWN"
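
A minimal sketch of how a caller might consume these parameters. It is not taken from the repository. The file formats are assumptions: one word per line in the vocabulary file, and one whitespace-separated row of floats per line in the weights file, in the same order. The helpers load_embeddings, lookup and demo are hypothetical names, and the sketch also assumes that the UNKNOWN token appears in the vocabulary so that it can serve as the fallback for unseen words.

```python
import os
import tempfile

UNKNOWN = "UNKNOWN"  # same token as in parameters.py


def load_embeddings(vocab_path, weights_path, number_of_words, dimensions):
    """Hypothetical loader. Assumes one word per line in vocab_path and one
    whitespace-separated row of floats per line in weights_path."""
    with open(vocab_path) as f:
        words = [line.rstrip("\n") for line in f]
    with open(weights_path) as f:
        rows = [[float(x) for x in line.split()] for line in f if line.strip()]

    # Check the data against the declared sizes (NUMBER_OF_WORDS, DIMENSIONS).
    if len(words) != number_of_words or len(rows) != number_of_words:
        raise ValueError("expected %d entries, got %d words and %d rows"
                         % (number_of_words, len(words), len(rows)))
    for row in rows:
        if len(row) != dimensions:
            raise ValueError("expected %d dimensions, got %d"
                             % (dimensions, len(row)))
    return dict(zip(words, rows))


def lookup(embeddings, word):
    """Return the vector for word, falling back to the UNKNOWN entry
    (assumed to be present in the vocabulary)."""
    if word in embeddings:
        return embeddings[word]
    return embeddings[UNKNOWN]


def demo():
    """Exercise the loader on a tiny made-up vocabulary (3 words, 2 dimensions)
    instead of the real 30000 x 50 data files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        vocab_path = os.path.join(tmpdir, "words.asc")
        weights_path = os.path.join(tmpdir, "lm-weights.txt")
        with open(vocab_path, "w") as f:
            f.write("UNKNOWN\nthe\ncat\n")
        with open(weights_path, "w") as f:
            f.write("0.0 0.0\n0.1 0.2\n0.3 0.4\n")
        emb = load_embeddings(vocab_path, weights_path,
                              number_of_words=3, dimensions=2)
        return lookup(emb, "the"), lookup(emb, "dog")
```

With the real files, the call would presumably be load_embeddings(VOCABFILE, WEIGHTSFILE, NUMBER_OF_WORDS, DIMENSIONS). Validating the sizes up front turns a mismatch between the parameters and the data files into an immediate error rather than a silent misalignment of words and vectors.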