view lazywww/README @ 232:978a949602e5
Auto-update Scientists numbers for Academy.
Refined the rules for the safehouse: the safehouse must be the same level as or higher than the Town Hall.
Make people very happy when the townHall is less than 16.
Build museum first then tavern
THG: changed warfare.pl
author | "Rex Tsai <chihchun@kalug.linux.org.tw>" |
---|---|
date | Thu, 06 Nov 2008 20:31:05 +0800 |
parents | d26eea95c52d |
children |
""" [Note] the project is not available yet. A web page fetcing tool chain that has a JQuery-like selector and supports chain working. Here is an exmaple can show the the main idea, To restrive a content you want in a div box in a web page, and then post and restrive next wanted-content in the other web page with the param you just maked from the content in first restriving. finally, storage the production. def func(s): msg = s.html() return {'msg':msg} try: c("http://example.tw/").get().find("#id > div") \ .build_param( func ).post_to("http://example2.com") \ .save_as('hellow.html') except: pass more complex example try: c("http://example.tw/").retry(4, '5m').get() \ .find("#id > div"). \ .build_param( func ).post_to("http://example2.com") \ .save_as('hellow.html') \ .end().find("#id2 > img").download('pretty-%s.jpg'). \ tar_and_zip("pretty_girl.tar.gz") except NotFound: print "the web page is not found." except NoPermissionTosave: print "the files can not be save with incorrect permission." else: print "unknow error." """ 目前還在設計階段,驗證想法,目前卡關中… 卡在怎麼把workflow接在一起... orz 這邊的筆記滿亂的,請見諒。 本來是要寫bot的,但因為覺得python要控制網頁很不直覺?! 至少在取得html特定內容沒Jquery簡單, 又在IRC上看到thinker提到抓網頁架構想法,所以想嘗試在寫bot的過程中,看能不能時做出一個堪用的小工具 (誤, 又發散了 抓網頁的的動作與工廠生產線相似。 流程如下 取得網頁 找特定內容 儲存 加工 workflow -----------> workflow --> product -----> workflow semiproduct Lazy WWW Proposal 0.1 work flow 架構 Jquery-way to parse html easier. http://phpimpact.wordpress.com/2008/08/07/php-simple-html-dom-parser-jquery-style/ Simple Fetcher - get web page basic procces hook - process the content to build middleware object/ semiproduct 0.2 output serialize - c('http://www.example.com').build_dict(lambda x:x).to_xml() 0.3 Fetcher Exception hanldes ( Retry ) 0.4 Storager - save the production. tar / zip c('http://www.kimo.com.tw').get().tar_and_gzip('hello.tgz') 0.5 PipeLine Command operation supports. 
- ( the idea is from thinker ) lzw getpage http://www.kimo.com.tw/faq.html , find "#id > div" , save_as hello.html 0.6 proposal Dispacher - manage the missions Refrences: WorkFollow: http://en.wikipedia.org/wiki/Getting_Things_Done Thinkers code: http://master.branda.to/downloads/pywebtool/ c('http://www.kimo.com.tw').get() . find('#id div') . save_as('h.html') . tar('a.tar') semiproduct --------------> workflow --------------------> workflow ----------------> workflow-----------> product ----------> workflow semiproduct semiproduct
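
One possible way to connect the workflows is sketched below. This is only a minimal illustration, not the project's design: the names Chain, pipe, fetch, find_between and PAGES are all hypothetical, the "fetch" step reads from an in-memory dict instead of the network, and the "find" step is a crude marker search standing in for a real jQuery-like selector. Each link in the chain wraps one semiproduct and remembers its parent, so end() can step back the way jQuery's .end() does.

```python
class Chain(object):
    """Hypothetical chain link: wraps one semiproduct and remembers its parent."""

    def __init__(self, value, parent=None):
        self.value = value      # the current semiproduct
        self.parent = parent    # the previous link, used by end()

    def pipe(self, workflow, *args):
        # Run one workflow on the current semiproduct and wrap the result
        # in a new link, so calls can be chained.
        return Chain(workflow(self.value, *args), parent=self)

    def end(self):
        # Step back to the previous semiproduct (or stay put at the root).
        return self.parent if self.parent is not None else self


def fetch(url, pages):
    # Stand-in for a real HTTP GET: look the URL up in an in-memory dict.
    return pages[url]


def find_between(html, start, stop):
    # Crude stand-in for a selector: return the text between two markers.
    begin = html.index(start) + len(start)
    return html[begin:html.index(stop, begin)]


def build_param(text):
    # Turn the found content into parameters for a later step.
    return {'msg': text}


# Assumed sample page, used instead of a network request.
PAGES = {'http://example.tw/': '<div id="id"><div>hello</div></div>'}

result = (Chain('http://example.tw/')
          .pipe(fetch, PAGES)
          .pipe(find_between, '<div>', '</div>')
          .pipe(build_param))

print(result.value)        # {'msg': 'hello'}
print(result.end().value)  # hello
```

Because every step is an ordinary function that takes a semiproduct and returns the next one, steps such as saving or archiving could be added as further pipe() calls without changing the Chain class.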