# Wiki to HTML pages script

![](https://pzwiki.wdka.nl/mw-mediadesign/images/8/82/Workflow-wiki2html.svg)

## Dependencies

* python3
* [pip](https://pip.pypa.io/en/stable/installing/), the Python package installer
  * Install:
    * `curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`
    * `python3 get-pip.py`
* [mwclient](https://mwclient.readthedocs.io/en/latest/index.html) Python library
  * Install:
    * `pip3 install mwclient`
* [jinja2](https://jinja.palletsprojects.com/en/2.11.x/) Python library
  * Install:
    * `pip3 install jinja2`
* [pandoc](https://pandoc.org/)
  * Install:
    * Debian/Ubuntu: `sudo apt install pandoc`
    * Mac: `brew install pandoc`

## login.txt

`login.txt` is a local, per-user file, ignored by git, where you place your itch wiki username and password on separate lines.

mwclient uses it to access the wiki, since the wiki is closed for reading and writing.

```
myusername
mypassword
```

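As a rough illustration of how a script might consume this file, the sketch below reads the two lines and hands them to mwclient. The function names `read_login` and `connect` are hypothetical, and the `host` and `path` arguments are placeholders to be replaced with the real wiki address; none of this is taken from the project's scripts.

```python
from pathlib import Path


def read_login(path="login.txt"):
    """Return (username, password) from a two-line credentials file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Ignore blank lines so a trailing newline or extra spacing does not matter.
    values = [line.strip() for line in lines if line.strip()]
    if len(values) < 2:
        raise ValueError(f"{path} must contain a username line and a password line")
    return values[0], values[1]


def connect(host, path="/", login_file="login.txt"):
    """Log in to the wiki. host and path are placeholders, not the real values."""
    # Imported here so read_login() also works where mwclient is not installed.
    import mwclient

    username, password = read_login(login_file)
    site = mwclient.Site(host, path=path)
    site.login(username, password)
    return site
```

Keeping the file parsing separate from the login call makes it easy to check the credentials file without contacting the wiki.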
## Run

`cd special-issue-11-wiki2html/`

Run all scripts together with `./run.sh`

Or run one script at a time:

`python3 download_imgs.py`

* Downloads all images from the wiki to the `images/` directory
* and stores each image's metadata in `images.json`

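The download step could look roughly like the sketch below. It assumes `site` is a logged-in `mwclient.Site`, and it guesses at the layout of `images.json` (one entry per file name, holding the image info returned by the wiki); the actual `download_imgs.py` may organise things differently. The function name `download_images` is hypothetical.

```python
import json
from pathlib import Path


def download_images(site, img_dir="images", json_path="images.json"):
    """Save every wiki image under img_dir and write its metadata to json_path."""
    img_dir = Path(img_dir)
    img_dir.mkdir(parents=True, exist_ok=True)
    metadata = {}
    # allimages() yields one image object per file uploaded to the wiki.
    for image in site.allimages():
        filename = image.page_title.replace(" ", "_")
        with open(img_dir / filename, "wb") as fd:
            image.download(fd)  # writes the file content into fd
        # imageinfo is the metadata dict the MediaWiki API reports for the file.
        metadata[filename] = dict(image.imageinfo)
    # Assumed layout: {"<file name>": {<image info>}, ...}
    Path(json_path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata
```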
`python3 query2html.py`

* Performs a query with the ask API:
  * help: `python3 query2html.py --help`
  * dry run: `python3 query2html.py --dry` only prints the request, without executing it
  * build a custom query with the arguments `--conditions --printouts --sort --order`
  * the default query is: `[[File:+]][[Title::+]][[Part::+]][[Date::+]]|?Title|?Date|?Part|?Partof|sort=Date,Title,Part|order=asc,asc,asc`
  * custom query: `python3 query2html.py -c '[[Date::>=1970/01/01]][[Date::<=1979/12/31]]' -p '?Title|?Date|?Part|?Partof' -s 'Date,Title,Part' -o 'asc,asc,asc'`
* Results that share the same Title are stored
  * in 1 single HTML file
  * sorted by Part

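To make the relation between the command-line arguments and the query string concrete, here is a minimal sketch. The option names and default values come from the list above; the way they are joined with `|`, and the helper names `parse_args`, `build_query` and `group_by_title`, are assumptions for illustration rather than the script's actual code. The grouping helper assumes each result is a plain dict with `Title` and `Part` keys.

```python
import argparse
from collections import defaultdict

DEFAULTS = {
    "conditions": "[[File:+]][[Title::+]][[Part::+]][[Date::+]]",
    "printouts": "?Title|?Date|?Part|?Partof",
    "sort": "Date,Title,Part",
    "order": "asc,asc,asc",
}


def parse_args(argv=None):
    """Parse the query options; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Build an ask API query")
    parser.add_argument("-c", "--conditions", default=DEFAULTS["conditions"])
    parser.add_argument("-p", "--printouts", default=DEFAULTS["printouts"])
    parser.add_argument("-s", "--sort", default=DEFAULTS["sort"])
    parser.add_argument("-o", "--order", default=DEFAULTS["order"])
    parser.add_argument("--dry", action="store_true",
                        help="only print the request, do not execute it")
    return parser.parse_args(argv)


def build_query(args):
    """Join conditions, printouts, sort and order into one ask query string."""
    return f"{args.conditions}|{args.printouts}|sort={args.sort}|order={args.order}"


def group_by_title(results):
    """Group result dicts by 'Title' and sort each group by 'Part'."""
    groups = defaultdict(list)
    for result in results:
        groups[result["Title"]].append(result)
    return {title: sorted(items, key=lambda r: r["Part"])
            for title, items in groups.items()}


if __name__ == "__main__":
    args = parse_args()
    query = build_query(args)
    if args.dry:
        print(query)
```

Each group returned by `group_by_title` would then correspond to one HTML file.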
## TODO

* remove HTML files at each new query
* revise `def unpack_response()` so that it returns the values of all properties printed out
* revise the templates so that they include the values of all properties printed out and do not break on missing values
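
One possible direction for the `unpack_response()` item is sketched below. It assumes each result is a dict with a `fulltext` page name and a `printouts` dict mapping property names to lists of values, which is how ask API results are commonly shaped; the signature and the choice to keep only the first value of each property are illustrative assumptions, not the current implementation.

```python
def unpack_response(result, printouts):
    """Return the page name plus one value per requested printout.

    Missing or empty printouts become None, so callers never hit a KeyError.
    """
    values = result.get("printouts", {})
    unpacked = {"page": result.get("fulltext")}
    for name in printouts:
        found = values.get(name) or []  # property absent or returned empty
        # Simplification: keep only the first value of a multi-valued property.
        unpacked[name] = found[0] if found else None
    return unpacked
```

Because every requested property is always present in the returned dict, a template can test for `None` instead of failing on a missing key.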