From bdb42043e04ebd96323a42f5089e66155b1b72be Mon Sep 17 00:00:00 2001
From: Damlanur
Date: Wed, 19 Feb 2020 16:40:38 +0100
Subject: [PATCH] added convert command to readme

---
 README.md | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 0eb9584..dc629e5 100644
--- a/README.md
+++ b/README.md
@@ -39,28 +39,28 @@ Run scripts together with `./run.sh`

 1 script at a time:

-`python3 download_imgs.py`
-* Downloads all images from wiki to `images/` directory
+`python3 download_imgs.py`
+* Downloads all images from wiki to `images/` directory
 * and stores each image's metadata to `images.json`

 `python3 query2html.py`
-* with ask API perform a query:
+* with ask API perform a query:
     * help `python3 query2html.py --help`
     * run dry `python3 query2html.py --dry` only printing request, not executing it
     * build custom query with arguments `--conditions --printouts --sort --order`
     * default query is: `[[File:+]][[Title::+]][[Part::+]][[Date::+]]|?Title|?Date|?Part|?Partof|sort=Date,Title,Part|order=asc,asc,asc`
-    * custom queries
+    * custom queries
         * `python3 query2html.py --conditions '[[Date::>=1970/01/01]][[Date::<=1979/12/31]]'`
         * `python3 query2html.py --conditions '[[Creator::~*task force*]]'`

-Note: to avoid confusion or problems is better to leave the `--printouts` `--sort` `--order` arguments as the default.
+Note: to avoid confusion or problems is better to leave the `--printouts` `--sort` `--order` arguments as the default.
 Otherwise document parts will start to get grouped not according to their Title, hence creating documents made from different original parts.

 ## How does query2html.py work?

 Based on the query made:

-MW API will send back a number of Page titles that match the query conditions,
+MW API will send back a number of Page titles that match the query conditions,
 together with its printouts (metadata proprety::value pairs).

 For each Page:
@@ -69,18 +69,20 @@ For each Page:
 * a fragment of html (`document_part_html`) is generated based on the `templates/document_part.html`

 All Pages that *share the same metadata's Title value*, will:
-* gather all their html fragments in `all_document_parts`
+* gather all their html fragments in `all_document_parts`
 * render `templates/document.html` with the content of `all_document_parts`
-* save the render template to `'static_html/DocumentTitle.html'`,
-
+* save the render template to `'static_html/DocumentTitle.html'`,
+
 Each of the saved documents:
 * render `templates/index.html` with the info on each document has been saved into `documentslist`
 * resulting in `static_html/index.html`
-
+


 # Bulk image upload upload_imgs_dir.py

 Get Help: `python3 upload_imgs_dir.py --help`

-**Edit and run via** `.helper-upload_imgs_dir.sh`
+**Edit and run via** `.helper-upload_imgs_dir.sh`
+to convert pdfs to jpgs:
+convert -quality 100 -density 300 [name-of-pdf] %02d.jpg