Wiki to HTML pages script

Dependencies

  • python3

  • pip, the Python package installer

    • Install:
      • curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
      • python3 get-pip.py
  • mwclient Python library

    • Install:
      • pip3 install mwclient
  • jinja2 Python library

    • Install:
      • pip3 install jinja2
  • pandoc

    • Install:
      • Debian/Ubuntu: sudo apt install pandoc
      • Mac: brew install pandoc

login.txt

login.txt is a local, per-user file, ignored by git, where you place your itch wiki username and password on separate lines.

It lets mwclient access the wiki, which is closed for both reading and writing.

myusername
mypassword
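As a minimal sketch of how the two-line login.txt could be read and handed to mwclient: the function names `read_login` and `connect`, and the `host` and `path` arguments, are hypothetical placeholders and are not taken from the repository's scripts.

```python
from pathlib import Path


def read_login(path="login.txt"):
    """Return (username, password) from a two-line credentials file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Ignore blank lines so that a trailing newline does not matter.
    values = [line.strip() for line in lines if line.strip()]
    if len(values) < 2:
        raise ValueError(f"{path} must contain a username line and a password line")
    return values[0], values[1]


def connect(host, path="/", login_file="login.txt"):
    """Log in to the wiki. host and path are placeholders (assumption)."""
    # Imported here so that read_login() also works without mwclient installed.
    import mwclient

    username, password = read_login(login_file)
    site = mwclient.Site(host, path=path)
    site.login(username, password)
    return site
```

Keeping the file parsing separate from the login call makes it possible to check login.txt without contacting the wiki.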

Run

cd special-issue-11-wiki2html/

Run all scripts together with ./run.sh

Or run one script at a time:

python3 download_imgs.py

  • Downloads all images from the wiki to the images/ directory
  • and stores each image's metadata in images.json
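The download step could look roughly like the sketch below. It is not the code of download_imgs.py: the function name `save_images` and the layout of images.json (one entry per file name) are assumptions. It only relies on each image object offering `page_title`, `imageinfo` and `download()`, as mwclient image objects do.

```python
import json
from pathlib import Path


def save_images(images, img_dir="images", json_path="images.json"):
    """Download each image into img_dir and write all metadata to json_path."""
    img_dir = Path(img_dir)
    img_dir.mkdir(parents=True, exist_ok=True)
    metadata = {}
    for image in images:
        # page_title is the file name without the "File:" namespace prefix.
        filename = image.page_title.replace("/", "_")
        with open(img_dir / filename, "wb") as fd:
            image.download(fd)
        # Assumed layout: file name -> the image's imageinfo dictionary.
        metadata[filename] = image.imageinfo
    with open(json_path, "w", encoding="utf-8") as fd:
        json.dump(metadata, fd, indent=2, ensure_ascii=False)
    return metadata
```

With a logged-in mwclient site object, this could be called as `save_images(site.allimages())`.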

python3 query2html.py

  • Performs a query through the ask API:

    • help: python3 query2html.py --help
    • dry run: python3 query2html.py --dry (only prints the request, does not execute it)
    • build a custom query with the arguments --conditions --printouts --sort --order
    • default query: [[File:+]][[Title::+]][[Part::+]][[Date::+]]|?Title|?Date|?Part|?Partof|sort=Date,Title,Part|order=asc,asc,asc
    • custom query: python3 query2html.py -c '[[Date::>=1970/01/01]][[Date::<=1979/12/31]]' -p '?Title|?Date|?Part|?Partof' -s 'Date,Title,Part' -o 'asc,asc,asc'
  • Results that share the same Title are stored

    • in a single HTML file
    • sorted by Part
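The sketch below illustrates how the four arguments could be joined into the ask query string shown above, and how results might be grouped by Title and sorted by Part before rendering. It is not the code of query2html.py: the helper names and the shape of a result (a dict with "Title" and "Part" keys) are assumptions made for illustration.

```python
import argparse
from collections import defaultdict

# Default values taken from the default query documented above.
DEFAULTS = {
    "conditions": "[[File:+]][[Title::+]][[Part::+]][[Date::+]]",
    "printouts": "?Title|?Date|?Part|?Partof",
    "sort": "Date,Title,Part",
    "order": "asc,asc,asc",
}


def parse_args(argv=None):
    """Parse the command-line options that shape the query."""
    parser = argparse.ArgumentParser(description="Build an ask query")
    parser.add_argument("-c", "--conditions", default=DEFAULTS["conditions"])
    parser.add_argument("-p", "--printouts", default=DEFAULTS["printouts"])
    parser.add_argument("-s", "--sort", default=DEFAULTS["sort"])
    parser.add_argument("-o", "--order", default=DEFAULTS["order"])
    parser.add_argument("--dry", action="store_true",
                        help="print the query without executing it")
    return parser.parse_args(argv)


def build_query(conditions, printouts, sort, order):
    """Join conditions, printouts, sort and order into one ask query string."""
    return f"{conditions}|{printouts}|sort={sort}|order={order}"


def group_by_title(results):
    """Group result dicts by 'Title' and sort each group by 'Part'."""
    groups = defaultdict(list)
    for item in results:
        groups[item["Title"]].append(item)
    return {title: sorted(items, key=lambda item: item["Part"])
            for title, items in groups.items()}


if __name__ == "__main__":
    args = parse_args()
    print(build_query(args.conditions, args.printouts, args.sort, args.order))
```

Each key of the dictionary returned by `group_by_title` would then correspond to one HTML file, with its parts already in order.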

TODO

  • remove HTML files at each new query
  • revise def unpack_response() so that it returns the values of all properties printed out
  • revise the templates so that they include the values of all properties printed out
    and do not break on missing values