comparison doc/v2_planning/learner.txt @ 1043:3f528656855b

v2planning learner.txt - updated API recommendation
author James Bergstra <bergstrj@iro.umontreal.ca>
date Wed, 08 Sep 2010 11:33:33 -0400
parents 38cc6e075d9b
children 3b1fd599bafd
[unchanged]
straightforward to write a meta-ExperimentGraph around it that implements
AdaBoost. A meta-meta-ExperimentGraph around that, adding early stopping,
would complete the picture and make a useful boosting implementation.

[inserted]
Using External Hyper-Parameter Optimization Software
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TODO: use-case - show how we could use the optimizer from
http://www.cs.ubc.ca/labs/beta/Projects/ParamILS/

[unchanged]
Implementation Details / API
----------------------------

[deleted]
TODO: PUT IN TERMINOLOGY OF LEARNER, HYPER-LEARNER.

TODO: SEPARATE DISCUSSION OF PERSISTENT STORAGE FROM LEARNER INTERFACE.

TODO: API describing hyperparameters (categorical, integer, bounds on values, etc.)

TODO: use-case - show how we could use the optimizer from
http://www.cs.ubc.ca/labs/beta/Projects/ParamILS/

ExperimentGraph
~~~~~~~~~~~~~~~

One API that needs to be defined for this perspective to be practical is the
ExperimentGraph. I'll present it in terms of global functions, but an
object-oriented design probably makes more sense in the code itself.

    def explored_nodes(graph):
        """Return an iterator over explored nodes (ints? objects?)"""

    def forget_nodes(graph, nodes):
        """Clear the nodes from memory (to save space)"""

    def all_edges_from(graph, node):
        """Return an iterator over all possible edges.

        Edges might be parametric - like "set learn_rate to (float)".

        Edges might contain a reference to their 'from' end... not sure.

        """

    def explored_edges_from(graph, node):
        """Return the edges that have been explored."""

    def add_node(graph, new_node):
        """Add a node. It may be serialized."""

    def add_edge(graph, edge):
        """Add an edge. It may be serialized."""

    def connect(graph, from_node, to_node, edge):
        """
        to_node = None for an un-explored edge
        """

It makes sense to have one ExperimentGraph implementation for each storage
mechanism - Memory, JobMan, sqlite, couchdb, mongodb, etc.

The nodes should be serializable objects (like the 'learner' objects in
Yoshua's text above), so that you can do node.learner.predict() if the edge
leading to `node` trained something new.

The nodes could also contain the various costs (train, valid, test) and other
experiment statistics that are node-specific.

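To make the container role concrete, here is a minimal sketch of what a
Memory-backed implementation might look like. The class name, the id scheme,
and the use of pickle are illustrative assumptions, not settled choices:

    import pickle

    class MemoryExperimentGraph(object):
        """Sketch of an in-memory backend for the container API above."""

        def __init__(self):
            self.nodes = {}       # node id -> serialized node
            self.edges = []       # (from_id, to_id or None, edge) triples
            self._next_id = 0

        def explored_nodes(self):
            return iter(self.nodes.keys())

        def forget_nodes(self, node_ids):
            for nid in node_ids:
                del self.nodes[nid]   # save space; edges keep only the ids

        def explored_edges_from(self, node_id):
            return (e for (f, t, e) in self.edges
                    if f == node_id and t is not None)

        def add_node(self, new_node):
            nid = self._next_id
            self._next_id += 1
            self.nodes[nid] = pickle.dumps(new_node)  # nodes may be serialized
            return nid

        def connect(self, from_id, to_id, edge):
            # to_id is None for an un-explored edge
            self.edges.append((from_id, to_id, edge))
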
Some implementations might also include functions for asynchronous updating of
the ExperimentGraph.

ExperimentGraphEdge
~~~~~~~~~~~~~~~~~~~

The ExperimentGraph is primarily a dictionary container for nodes and edges.
An ExperimentGraphEdge implementation is the model-dependent component that
actually interprets the edges as computations.

    def estimate_compute_time(graph, node, edge):
        """Return an estimated walltime expense for the computation"""

    def compute_edge(graph, node, edge, async=False, priority=1):
        """Run the computations associated with this graph edge, and store the
        resulting 'to_node' in the graph when complete.

        If async is False, the function doesn't return until the graph is
        updated with `to_node`.

        The priority is used by implementations that use cluster software or
        similar tools to manage a worker pool that computes highest-priority
        edges first.

        """

    def list_compute_queue(graph):
        """Return edges scheduled for exploration (and maybe a handle for
        where/when they started running and other backend details).
        """

Different implementations of ExperimentGraphExplorer will correspond to
different experiments. There can also be ExperimentGraphExplorer
implementations that are proxies, performing the computations in different
threads, across ssh, or through cluster software.

[unchanged]
Learner
~~~~~~~
[deleted]
A learner is a program that implements a policy for graph exploration by
exploiting the ExperimentGraph and ExperimentGraphEdge interfaces.

The convenience of the API hinges on the extent to which we can implement
policies that work on different experiment-graphs (where the labels on the
edges and their semantics differ). The use-cases above make me optimistic that
it will work sufficiently well to be worth doing in the absence of better
ideas.

[inserted]
An object that allows us to explore the graph discussed above. Specifically,
it represents an explored node in that graph.

    def active_instructions():
        """ Return a list/set of Instruction instances (see below) that the
        Learner is prepared to handle.
        """

    def copy(), deepcopy()
        """ Learners should be serializable """

To make the implementation easier, I found it helpful to introduce a
string-valued `fsa_state` member attribute and to associate methods with these
states. That made it syntactically easy to build relatively complex
finite-state transition graphs describing which instructions are active at
which points in the life-cycle of a learner.

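A minimal sketch of the `fsa_state` idea (the learner class, its states, and
its instruction names are all illustrative, and bound methods stand in for the
Instruction instances the real API would return):

    class ExampleLearner(object):
        """Sketch: a finite-state machine gates the active instructions."""

        # state name -> instruction names active in that state
        _transitions = {
            'allocated': ['set_dataset'],
            'ready':     ['train'],
            'trained':   ['predict', 'forget'],
        }

        def __init__(self):
            self.fsa_state = 'allocated'

        def active_instructions(self):
            return [getattr(self, name)
                    for name in self._transitions[self.fsa_state]]

        def set_dataset(self, dataset):
            self.dataset = dataset
            self.fsa_state = 'ready'

        def train(self):
            self.params = sum(self.dataset)  # stand-in for real fitting
            self.fsa_state = 'trained'

        def predict(self, x):
            return self.params * x           # stand-in

        def forget(self):
            del self.params                  # keep the dataset, drop the fit
            self.fsa_state = 'ready'

    # usage sketch: the active set changes as instructions are executed
    learner = ExampleLearner()
    learner.active_instructions()[0]([1.0, 2.0, 3.0])   # set_dataset
    learner.active_instructions()[0]()                  # train
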
Instruction
~~~~~~~~~~~
An object that represents a potential edge in the graph discussed above. It is
an operation that a learner can perform.

    arg_types
        """a list of Type objects (see below) indicating what args are
        required by execute"""

    def execute(learner, args, kwargs):
        """ Perform some operation on the learner (follow an edge in the graph
        discussed above) and modify the learner in-place. Calling execute
        'moves' the learner from one node in the graph along an edge. To keep
        the old learner as well, it must be copied prior to calling execute().
        """

    def expense(learner, args, kwargs, resource_type='CPUtime'):
        """ Return an estimated cost of performing this instruction (calling
        execute) in time, space, number of computers, disk requirement, etc.
        """

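As a concrete (hypothetical) instance, an instruction that sets a learning
rate might look like the following, using a `Float` Type of the kind described
in the next section; the class name and attribute are invented for
illustration:

    class SetLearnRate(object):
        """Hypothetical instruction: follow a 'set learn_rate' edge."""
        arg_types = [Float(min=0.0, max=1.0, default=0.01)]

        def execute(self, learner, args, kwargs):
            (new_rate,) = args
            learner.learn_rate = new_rate   # modifies the learner in-place

        def expense(self, learner, args, kwargs, resource_type='CPUtime'):
            return 0.0                      # setting a scalar is nearly free
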
Type
~~~~
An object that describes a parameter domain for a call to Instruction.execute.
It is not necessary that a Type specify exactly which arguments are legal, but
it should `include` all legal arguments and exclude as many illegal ones as
possible.

    def includes(value):
        """return True if value is a legal argument"""

To make things a bit more practical, there are some Type subclasses like Int,
Float, Str, ImageDataset, and SgdOptimizer that include additional attributes
(e.g. min, max, default) so that automatic graph exploration algorithms can
generate legal arguments with reasonable efficiency.

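For instance, a `Float` subclass might carry bounds plus a sampling helper, so
that exploration algorithms can draw legal values directly. This is a sketch
under the assumptions above (the `sample` helper is an invention, and it
assumes both bounds are set):

    import random

    class Float(object):
        """Hypothetical Type: a bounded floating-point parameter domain."""

        def __init__(self, min=None, max=None, default=None):
            self.min, self.max, self.default = min, max, default

        def includes(self, value):
            """Return True if value is a legal argument."""
            return (isinstance(value, float)
                    and (self.min is None or value >= self.min)
                    and (self.max is None or value <= self.max))

        def sample(self, rng=random):
            # lets automatic exploration generate legal arguments efficiently
            return rng.uniform(self.min, self.max)
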
The proxy pattern is a powerful way to combine learners, especially when proxy
Learner instances also introduce proxy Instruction classes.

For example, it is straightforward to implement a hyper-learner by
implementing a Learner with another learner (the sub-learner) as a member
attribute. The hyper-learner makes some modifications to the
instruction_set() return value of the sub-learner, typically to introduce more
powerful instructions and hide simpler ones.

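A sketch of that wrapping, with invented instruction names (and the details of
the more powerful instruction elided):

    class TrainWithEarlyStopping(object):
        """Hypothetical proxy Instruction built on the sub-learner's steps."""
        def __init__(self, sub_learner):
            self.sub = sub_learner

        def execute(self, learner, args, kwargs):
            # would repeatedly run the sub-learner's cheap training
            # instruction until validation cost stops improving
            pass

    class HyperLearner(object):
        """Hypothetical proxy learner wrapping a sub-learner."""
        def __init__(self, sub_learner):
            self.sub = sub_learner

        def instruction_set(self):
            # hide the sub-learner's plain 'Train' instruction...
            base = [i for i in self.sub.instruction_set()
                    if i.__class__.__name__ != 'Train']
            # ...and expose a more powerful replacement in its place
            return base + [TrainWithEarlyStopping(self.sub)]
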
It is less straightforward, but consistent with the design, to implement a
Learner that encompasses job management. Such a learner would retain the
semantics of the instruction_set of the sub-learner, but would replace the
Instruction objects themselves with Instructions that arrange for remote
procedure calls (e.g. jobman, multiprocessing, bqtools, etc.). Such a learner
would replace synchronous instructions (return on completion) with
asynchronous ones (return after scheduling), and the active instruction set
would also change asynchronously, but neither of these things is inconsistent
with the Learner API.

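One possible shape for that substitution, sketched with the standard
multiprocessing module (the class is invented, it relies on the learner being
serializable as required above, and a real version would still need to ship
the modified learner back to the parent process):

    import multiprocessing

    class AsyncInstruction(object):
        """Hypothetical proxy: returns after scheduling, not completion."""

        def __init__(self, inner):
            self.inner = inner
            self.arg_types = inner.arg_types

        def execute(self, learner, args, kwargs):
            worker = multiprocessing.Process(
                target=self.inner.execute, args=(learner, args, kwargs))
            worker.start()      # schedule and return immediately
            return worker       # a handle the caller can join() later
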

TODO
~~~~

I feel like something is missing from the API - and that is an interface to
the graph structure discussed above. The nodes in this graph are natural
places to store meta-information for visualization, statistics-gathering, etc.
But none of the APIs above corresponds to the graph itself. In other words,
there is no API through which to attach information to nodes. It is not good
to say that the Learner instance *is* the node, because (a) learner instances
change during graph exploration and (b) learner instances are big, and we
don't want to have to keep a whole saved model just to attach meta-info, e.g.
a validation score. Choosing this API spills over into other committees, so we
should get their feedback about how to resolve it.