comparison doc/v2_planning/plugin_RP.py @ 1154:f923dddf0bf7
a better version of the script
author | pascanur |
---|---|
date | Thu, 16 Sep 2010 23:42:26 -0400 |
parents | ae5ba6206fd3 |
children | 6993fef088d1 3c2d7c5f0cf7 |
1153:ae5ba6206fd3 | 1154:f923dddf0bf7 |
---|---|
26 | 26 |
27 | 27 |
28 .. code-block:: | 28 .. code-block:: |
29 ''' | 29 ''' |
30 sch = Schedular() | 30 sch = Schedular() |
31 p = ProducerFactory() | |
32 p = sched.schedule_plugin(event = every(p.outputStuffs()), p ) | |
33 p = sched.schedule_plugin(event = Event("begin"), p) | |
34 c = sched.schedule_plugin(event = every(p.outputStuffs()), ConsumerFactory ) | |
35 pc= sched.schedule_plugin(event = every(p.outputStuffs()), ProducerConsumerFactory ) | |
36 | 31 |
37 sched.run() | 32 @FnPlugin(sch) |
33 def producer(self,event): | |
34 self.fire('stuff', value = 'some text') | |
35 | |
36 @FnPlugin(sch) | |
37 def consumer(self,event): | |
38 print event.value | |
39 | |
40 @FnPlugin(sch) | |
41 def prod_consumer(self,event): | |
42 print event.value | |
43 self.fire('stuff2', value = 'stuff') | |
44 | |
45 producer.act( on = Event('begin'), when = once() ) | |
46 producer.act( on = Event('stuff'), when = always() ) | |
47 consumer.act( on = Event('stuff'), when = always() ) | |
48 prod_consumer.act( on = Event('stuff'), when = always() ) | |
49 | |
50 sch.run() | |
38 | 51 |
39 | 52 |
40 | 53 |
41 ''' | 54 ''' |
42 Example : Logistic regression | 55 Example : Logistic regression |
51 | 64 |
52 | 65 |
53 Possible script | 66 Possible script |
54 --------------- | 67 --------------- |
55 | 68 |
56 Sorry for long variable names, I wanted to make it clear what things are .. | 69 Note : This would look the same for any other architecture that does not |
70 require pre-training ( as deep networks do ), for example the MLP. | |
57 | 71 |
58 .. code-block:: | 72 .. code-block:: |
59 ''' | 73 ''' |
60 sched = Schedular() | |
61 # This is a shortcut .. I've been to the dataset committee and they have | |
62 # something else in mind, a bit more fancy; I totally agree with their | |
63 # ideas I just wrote it like this for brevity; | |
64 train_data, valid_data, test_data = load_mnist() | |
65 | 74 |
66 # This part was not actually discussed in detail ; I have my own | 75 sched = Schedular() |
67 # opinions of how this part should be done .. but for now I decomposed it | 76 |
68 # in two functions for convenience | 77 # Data / Model Building : |
69 logreg = generate_logreg_model() | 78 # I skipped over how to design this part |
79 # though I have some ideas | |
80 real_train_data, real_valid_data = load_mnist() | |
81 model = logreg() | |
82 | |
83 # Main Plugins ( already provided in the library ); | |
84 # These wrappers also register the plugin | |
85 train_data = create_data_plugin( sched, data = real_train_data) | |
86 valid_data = create_data_plugin( sched, data = real_valid_data) | |
87 train_model = create_train_model(sched, model = model) | |
88 validate_model = create_valid_model(sched, model = model, data = valid_data) | |
89 early_stopper = create_early_stopper(sched) | |
70 | 90 |
71 | 91 |
72 | 92 # On the fly plugins ( print random stuff); the main difference between my |
73 # Note that this is not meant to replace the string idea of Olivier. I | 93 # FnPlugin and Olivier's version is that mine also registers the plugin in sched |
74 # actually think that is a cool idea, when writing things down I realized | 94 @FnPlugin(sched) |
75 # it might be a bit more intuitive if you would get that object by calling | 95 def print_error(self, event): |
76 # a method of the instance of the plugin with a significant name | 96 if event.type == Event('begin'): |
77 # I added a wrapping function that sort of tells on which such events | 97 self.value = [] |
78 # you can have similar to what Olivier wrote { every, at .. } | 98 elif event.type == train_model.error(): |
79 doOneTrainingStepPlugin =ModelPluginFactory( model = logreg ) | 99 self.value += [event.value] |
80 trainDataPlugin = sched.schedule_plugin( | 100 elif event.type == train_data.eod(): |
81 event = every(doOneTrainingStepPlugin.new_train_error), | 101 print 'Error :', numpy.mean(self.value) |
82 DatasetsPluginFactory( data = train_data) ) | |
83 | 102 |
84 trainDataPlugin = sched.schedule_plugin( | 103 @FnPlugin(sched) |
85 event = Event('begin'), trainDataPlugin ) | 104 def save_model(self, event): |
86 | 105 if event.type == early_stopper.new_best_error(): |
87 clock = sched.schedule_plugin( event = all_events, ClockFactory()) | 106 cPickle.dump(model.parameters(), open('best_params.pkl','wb')) |
88 | |
89 doOneTrainingStepPlugin = sched.schedule_plugin( | |
90 event = every(trainDataPlugin.new_batch()), | |
91 ModelFactory( model = logreg)) | |
92 | 107 |
93 | 108 |
109 # Create the dependency graph describing what does what | |
110 train_model.act(on = train_data.batch(), when = always()) | |
111 validate_model.act(on = train_model.done(), when = every(n=10000)) | |
112 early_stopper.act(on = validate_model.error(), when = always()) | |
113 print_error.act( on = train_model.error(), when = always() ) | |
114 print_error.act( on = train_data.eod(), when = always() ) | |
115 save_model.act( on = early_stopper.new_best_error(), when = always() ) | |
94 | 116 |
117 # Run the entire thing | |
118 sched.run() | |
95 | 119 |
96 # Arguably we wouldn't need such a plugin. I added just to show how to | |
97 # deal with multiple events from same plugin; the plugin is supposed to | |
98 # reset the index of the dataset to 0, so that you start a new epoch | |
99 resetDataset = sched.schedule_plugin( | |
100 event = every(trainDataPlugin.end_of_dataset()), | |
101 ResetDatasetFactory( data = train_data) ) | |
102 | |
103 | |
104 checkValidationPlugin = sched.schedule_plugin( | |
105 event =every_nth(doOneTrainingStepPlugin.done(), n=1000), | |
106 ValidationFactory( model = logreg, data = valid_data)) | |
107 | |
108 # You have the options to also do : | |
109 # | |
110 # checkValidationPlugin = sched.schedule_plugin( | |
111 # event =every(trainDataPlugin.end_of_dataset()), | |
112 # ValidationFactory( model = logreg, data = valid_data)) | |
113 # checkValidationPlugin = sched.schedule_plugin( | |
114 # event =every(clock.hour()), | |
115 # ValidationFactory( model = logreg, data = valid_data)) | |
116 | |
117 # This plugin would be responsible to send the Event("terminate") when the | |
118 # patience expired. | |
119 earlyStopperPlugin = sched.schedule_plugin( | |
120 event = every(checkValidationPlugin.new_validation_error()), | |
121 earlyStopperFactory(initial_patience = 10) ) | |
122 | |
123 # Printing & Saving plugins | |
124 | |
125 printTrainingError = sched.schedule_plugin( | |
126 event = every(doOneTrainingStepPlugin.new_train_error()), | |
127 AggregateAndPrintFactory()) | |
128 | |
129 printTrainingError = sched.schedule_plugin( | |
130 event = every(trainDataPlugin.end_of_dataset()), | |
131 printTrainingError) | |
132 saveWeightsPlugin = sched.schedule_plugin( | |
133 event = every(earlyStopperPlugin.new_best_valid_error()), | |
134 saveWeightsFactory( model = logreg) ) | |
135 | |
136 sched.run() | |
137 | 120 |
138 ''' | 121 ''' |
139 Notes | 122 Notes |
140 ===== | 123 ===== |
141 | 124 |
142 In my code schedule_plugin returns the plugin that it registers. I think that | 125 * I think we should have a FnPlugin decorator ( exactly like Olivier's), except |
143 writing something like | 126 that it also attaches the newly created plugin to the schedular. This way you |
144 x = f( .. ) | 127 can create plugins on the fly ( as long as they are simple functions that |
145 y = f(x) | 128 print stuff, or compute simple statistics ). |
129 * I added a method act to Plugin. You use it to create the dependency | |
130 graph ( it could also be named listen, for a more plugin-like interface ) | |
131 * Plugins are obtained in 3 ways : | |
132 - by wrapping a dataset / model or something similar | |
133 - by a function that constructs it from nothing | |
134 - by decorating a function | |
135 In all cases I would suggest that when creating them you should provide | |
136 the schedular as well, so that the constructor also registers the plugin | |
146 | 137 |
147 is more readable than writing f( .., event_belongs_to = x), or even worse, | 138 * The plugin concept works well as long as the plugins lean towards |
148 you only see text, and you would have to go to the plugins to see what events | 139 heavy duty computation, disregarding printing plugins and such. If you have |
149 they actually produce. | 140 many small plugins this system might only introduce an overhead. I would |
141 argue that the use of theano is confined to each plugin. Therefore I would | |
142 strongly suggest that the architecture be built outside the schedular | |
143 with a different approach. | |
150 | 144 |
151 At this point I am more concerned with how the scripts will look ( the cognitive | 145 * I would suggest that the framework be used only for the training loop |
152 load to understand them) and how easy it is to hack into them. From this point | 146 (after you get the adapt function, compute error function), so it is more about |
153 of view I would have the following suggestions : | 147 the meta-learner, hyper-learner learner level. |
154 * dataset and model creation should happen outside the schedular with possibly | 148 |
155 other mechanisms | 149 * A general remark that I guess everyone will agree on. We should make |
156 * there are two types of plugins, those that do not affect the experiment, | 150 sure that implementing a new plugin is as easy/simple as possible. We |
157 they just compute statistics and print them, or save different data and those | 151 have to hide all the complexity in the schedular ( it is the part of the |
158 plugins that change the state of the model, like train, or influence the life | 152 code we will not need or we would rarely need to work on). |
159 of the experiment. There should be a minimum of plugins of the second category, | 153 |
160 to still have the code readable. ( When understanding a script, you only need | 154 * I have not gone into how to implement the different components, but |
161 to understand that part, the rest you assume is just printing stuff). | 155 following Olivier's code I think that part would be more or less |
162 The different categories should also be grouped. | 156 straightforward. |
157 | |
158 ''' | |
163 | 159 |
164 | 160 |
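The notes in revision 1154 describe a FnPlugin decorator that registers the new plugin with the schedular, and an act method that builds the dependency graph. A minimal sketch of how those two pieces could fit together follows. Everything in it is an assumption made for illustration, not the code of this changeset: the class names Scheduler and Plugin, the queue-based run loop, the max_events guard, and the representation of once / always / every as predicates over a per-rule match counter. It is written for Python 3, whereas the script in the diff uses Python 2 print statements.

```python
from collections import deque


class Event:
    """A named event with an optional payload (hypothetical minimal form)."""
    def __init__(self, etype, value=None):
        self.type = etype
        self.value = value


# Conditions are predicates over "how many matching events this rule has seen".
def always():
    return lambda count: True


def once():
    return lambda count: count == 1


def every(n):
    return lambda count: count % n == 0


class Scheduler:
    """Holds the registered plugins and a FIFO queue of pending events."""
    def __init__(self):
        self.plugins = []
        self.queue = deque()

    def register(self, plugin):
        self.plugins.append(plugin)

    def fire(self, etype, value=None):
        self.queue.append(Event(etype, value))

    def run(self, max_events=1000):
        # Assumption: the scheduler itself emits 'begin'; max_events guards
        # against plugins that keep re-triggering each other forever.
        self.fire('begin')
        handled = 0
        while self.queue and handled < max_events:
            event = self.queue.popleft()
            handled += 1
            for plugin in self.plugins:
                plugin.handle(event)


class Plugin:
    """Wraps a function; the constructor also registers it with the scheduler."""
    def __init__(self, sched, fn):
        self.sched = sched
        self.fn = fn
        self.rules = []
        sched.register(self)

    def act(self, on, when):
        # One edge of the dependency graph: react to events of type on.type.
        self.rules.append({'type': on.type, 'when': when, 'count': 0})

    def fire(self, etype, value=None):
        self.sched.fire(etype, value)

    def handle(self, event):
        for rule in self.rules:
            if rule['type'] == event.type:
                rule['count'] += 1
                if rule['when'](rule['count']):
                    self.fn(self, event)


def FnPlugin(sched):
    """Decorator factory: turns a function into a registered Plugin."""
    def decorate(fn):
        return Plugin(sched, fn)
    return decorate


# Usage, mirroring the producer / consumer example of the diff.
sched = Scheduler()
seen = []


@FnPlugin(sched)
def producer(self, event):
    self.fire('stuff', value='some text')


@FnPlugin(sched)
def consumer(self, event):
    seen.append(event.value)


producer.act(on=Event('begin'), when=once())
consumer.act(on=Event('stuff'), when=always())
sched.run()
print(seen)
```

Because the decorator returns the Plugin object, the names producer and consumer refer to plugins after decoration, which is what lets the script call act on them. In the diff, producer also acts on its own 'stuff' event; with this sketch that would only terminate through the max_events guard, so it is left out here.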
165 ''' | 161 ''' |