annotate pylearn/datasets/config.py @ 1492:e7c4d031d333

Fix for Windows paths
author Olivier Delalleau <delallea@iro>
date Tue, 16 Aug 2011 15:44:01 -0400
parents 25985fb3bb4f
children
rev   line source
504
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
1 """Configuration options for datasets
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
2
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
3
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
4 Especially, the locations of data files.
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
5 """
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
6
833
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
7 import os, sys, logging
1285
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
8 def _logger(): return logging.getLogger('pylearn.datasets.config')
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
9 def debug(*msg): _logger().debug(' '.join(str(m) for m in msg))
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
10 def info(*msg): _logger().info(' '.join(str(m) for m in msg))
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
11 def warn(*msg): _logger().warning(' '.join(str(m) for m in msg))
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
12 def warning(*msg): _logger().warning(' '.join(str(m) for m in msg))
976539956475 adding tinyimages
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 848
diff changeset
13 def error(*msg): _logger().error(' '.join(str(m) for m in msg))
833
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
14
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
15
655
14d22ca1c8b5 if PYLEARN_DATA_ROOT don't exist try DBPATH.
Frederic Bastien <bastienf@iro.umontreal.ca>
parents: 653
diff changeset
16 def env_get(key, default, key2 = None):
14d22ca1c8b5 if PYLEARN_DATA_ROOT don't exist try DBPATH.
Frederic Bastien <bastienf@iro.umontreal.ca>
parents: 653
diff changeset
17 if key2 and os.getenv(key) is None:
14d22ca1c8b5 if PYLEARN_DATA_ROOT don't exist try DBPATH.
Frederic Bastien <bastienf@iro.umontreal.ca>
parents: 653
diff changeset
18 key=key2
653
d3d8f5a17909 print warning on undefined PYLEARN_DATA_ROOT
bergstra@mlp4.ais.sandbox
parents: 537
diff changeset
19 if os.getenv(key) is None:
833
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
20 if env_get.first_warning:
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
21 warning("Environment variable", key, 'is not set. Using default of', default)
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
22 env_get.first_warning = False
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
23 return default
818
f4729745bb58 backporting to 2.4
dumitru@deepnets.mtv.corp.google.com
parents: 655
diff changeset
24 else:
848
e7d1dd6a9785 Fix missing "return"
Pascal Lamblin <lamblinp@iro.umontreal.ca>
parents: 833
diff changeset
25 return os.getenv(key)
833
039e93a95c20 dataset.config uses logging for warnings
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 818
diff changeset
26 env_get.first_warning = True
504
19ab9ce916e3 slightly more sophisticated system for finding the mnist data
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
27
505
74b3e65f5f24 added smallNorb dataset, switched to PYLEARN_DATA_ROOT
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 504
diff changeset
28 def data_root():
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
29 """Deprecated, use data_roots() or get_filepath_in_roots()
1425
25985fb3bb4f fix whitespace.
Frederic Bastien <nouiz@nouiz.org>
parents: 1424
diff changeset
30
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
31 It is deprecated because it does not allow more than one path.
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
32 """
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
33 roots = env_get('PYLEARN_DATA_ROOT', os.path.join(os.path.expanduser('~'), 'data'), 'DBPATH')
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
34 return roots.split(os.pathsep)[0]
505
74b3e65f5f24 added smallNorb dataset, switched to PYLEARN_DATA_ROOT
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 504
diff changeset
35
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
36 def data_roots():
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
37 """Return the list of paths given in the PYLEARN_DATA_ROOT env variable."""
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
38 if hasattr(data_roots, 'rval'):
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
39 return data_roots.rval
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
40 roots = os.getenv('PYLEARN_DATA_ROOT')
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
41 if roots is None:
1414
2b82c5a11512 small fix to new PYLEARN_DATA_ROOT
Frederic Bastien <nouiz@nouiz.org>
parents: 1413
diff changeset
42 roots = [data_root()]
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
43 else:
1492
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
44 # Note that under Windows, we cannot use ':' as a delimiter because
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
45 # paths may contain this character. Thus we use ';' instead (similar to
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
46 # the PATH environment variable in Windows).
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
47 if sys.platform == 'win32':
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
48 roots = roots.split(';')
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
49 else:
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
50 roots = roots.split(':')
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
51 # Remove paths that are not directories.
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
52 data_roots.rval = [r for r in roots if os.path.isdir(r)]
e7c4d031d333 Fix for Windows paths
Olivier Delalleau <delallea@iro>
parents: 1425
diff changeset
53 return data_roots.rval
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
54
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
55
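The delimiter choice made in data_roots() above can be shown in isolation. This is a minimal sketch, not part of pylearn: split_roots is a hypothetical helper that takes the platform as an argument so both branches can be exercised on any machine. Unlike data_roots(), it drops empty entries instead of filtering with os.path.isdir(). The standard library's os.pathsep holds the same delimiter for the running platform.

```python
def split_roots(value, platform):
    """Split a path-list string using the delimiter for `platform`.

    Hypothetical helper for illustration; not part of pylearn.
    """
    # Windows paths contain ':' (drive letters), so ';' is the delimiter there.
    sep = ';' if platform == 'win32' else ':'
    # Drop empty entries left by leading, trailing, or doubled delimiters.
    return [p for p in value.split(sep) if p]


if __name__ == '__main__':
    print(split_roots('/data/a:/data/b', 'linux'))
    print(split_roots(r'C:\data;D:\more', 'win32'))
```

Splitting a Windows value on ':' would cut each path at its drive letter, which is the failure the ';' branch avoids.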
1424
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
56 def get_filepath_in_roots(*names):
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
57 """Return the full path of the first name that exists under a directory
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
58 listed in the PYLEARN_DATA_ROOT env variable.
1425
25985fb3bb4f fix whitespace.
Frederic Bastien <nouiz@nouiz.org>
parents: 1424
diff changeset
59
1424
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
60 If there are multiple file names, we return the first that exists.
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
61 This allows getting whichever of the files is present.
1413
58dff11840f0 Allow PYLEARN_DATA_ROOT to be a list of directory. created pylearn.datasets.config.get_filepath_in_roots(name) fct to find a file in the list of directory.
Frederic Bastien <nouiz@nouiz.org>
parents: 1285
diff changeset
62 """
1424
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
63 for name in names:
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
64 for root in data_roots():
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
65 path = os.path.join(root,name)
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
66 if os.path.exists(path):
84cb96db5673 return the first file in a file that exists in the directory in the PYLEARN_DATA_ROOT.
Frederic Bastien <nouiz@nouiz.org>
parents: 1414
diff changeset
67 return path
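The search order used by get_filepath_in_roots() (names in the outer loop, roots in the inner loop) can be tried out with a self-contained sketch. The names below are assumptions for illustration: find_in_roots is a hypothetical stand-in that takes the roots explicitly rather than reading PYLEARN_DATA_ROOT, and the file names are made up. The explicit `return None` only spells out what happens when nothing matches.

```python
import os
import tempfile


def find_in_roots(roots, *names):
    """Return the first existing path, trying each name across all roots.

    Hypothetical stand-in for get_filepath_in_roots(); not part of pylearn.
    """
    for name in names:          # earlier names take priority over later ones
        for root in roots:      # within one name, earlier roots win
            path = os.path.join(root, name)
            if os.path.exists(path):
                return path
    return None                 # nothing found under any root


def demo():
    """Create two temporary roots and look up a file present only in the second."""
    with tempfile.TemporaryDirectory() as first, \
            tempfile.TemporaryDirectory() as second:
        target = os.path.join(second, 'mnist.pkl.gz')
        open(target, 'w').close()
        found = find_in_roots([first, second], 'mnist.pkl', 'mnist.pkl.gz')
        return found == target


if __name__ == '__main__':
    print(demo())
```

Because names are iterated first, a preferred file name found in a later root still wins over a fallback name found in the first root.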