annotate sandbox/embeddings/files.py @ 458:ed6b0b3be8d2

Polished embeddings module
author Joseph Turian <turian@iro.umontreal.ca>
date Tue, 07 Oct 2008 19:13:53 -0400
parents
children f400f62e7f9e
"""
Locations of the embedding data files.
"""

# Paths to the Collobert and Weston word embedding data.
WEIGHTSFILE = "/u/turian/data/word_embeddings.collobert-and-weston/lm-weights.txt"
VOCABFILE = "/u/turian/data/word_embeddings.collobert-and-weston/words.asc"

# Vocabulary size and dimensionality of each embedding vector.
NUMBER_OF_WORDS = 30000
DIMENSIONS = 50
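
The constants above only name the data files and their expected sizes. As a rough illustration of how a caller might consume them, the sketch below reads both files and checks the counts. It rests on assumptions this module does not confirm: that the vocabulary file holds one word per line, that the weights file holds one whitespace-separated row of floats per line in the same order, and the helper name `read_embeddings` is hypothetical.

```python
def read_embeddings(weights_path, vocab_path, number_of_words, dimensions):
    """Return a dict mapping each word to its embedding (a list of floats).

    Hypothetical helper. Assumes one word per line in vocab_path and one
    whitespace-separated row of floats per line in weights_path, with the
    two files listing entries in the same order.
    """
    with open(vocab_path, encoding="utf-8") as f:
        words = [line.strip() for line in f if line.strip()]

    with open(weights_path, encoding="utf-8") as f:
        vectors = [[float(x) for x in line.split()] for line in f if line.strip()]

    # Fail early if the files do not match the declared sizes.
    if len(words) != number_of_words:
        raise ValueError("expected %d words, found %d" % (number_of_words, len(words)))
    if len(vectors) != number_of_words:
        raise ValueError("expected %d vectors, found %d" % (number_of_words, len(vectors)))
    for word, vector in zip(words, vectors):
        if len(vector) != dimensions:
            raise ValueError("vector for %r has %d values, expected %d"
                             % (word, len(vector), dimensions))

    return dict(zip(words, vectors))
```

With the constants from this module, a call might look like `read_embeddings(WEIGHTSFILE, VOCABFILE, NUMBER_OF_WORDS, DIMENSIONS)`. Passing the sizes in as arguments keeps the helper usable on smaller test files.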