## Scripted PDF generation with reportlab

ReportLab is a Python library for generating PDFs. Its "canvas" object gives very precise control over drawing on the pages of a PDF. There are lots of drawing commands you can use; see [the docs in the RL userguide](https://www.reportlab.com/docs/reportlab-userguide.pdf#page=11).

### Draw a grid


In [1]:
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.pagesizes import A4, A0
from reportlab.lib.units import inch, cm, mm
import sys
from reportlab.pdfbase.ttfonts import TTFont, pdfmetrics

In [2]:
# Normally you can use a built in page size like A0, A4
# c = Canvas("grid.pdf", pagesize=A0, bottomup=0)

In [3]:
# But here we make a custom size with pagew & pageh.
# NB: bottomup=0 makes (0, 0) the top left of the page, as you might expect (it isn't the default).

In [4]:
pagew, pageh = 75*mm, 75*mm
c = Canvas("grid.pdf", pagesize=(pagew, pageh), bottomup=0)
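To make the effect of `bottomup=0` concrete, here is a minimal sketch in plain Python, with no canvas involved. The helper name `from_top` is made up for this illustration; it converts a distance measured down from the top edge into the y value a default (bottom-up) canvas would need. Values are in millimetres and would be multiplied by `mm` before being passed to the canvas.

```python
def from_top(y_from_top, page_height):
    """Convert a top-based y coordinate to a bottom-up one (hypothetical helper)."""
    return page_height - y_from_top

page_height_mm = 75
# A point 15 mm below the top edge sits 60 mm above the bottom edge.
print(from_top(15, page_height_mm))  # → 60
```

With `bottomup=0` this conversion is not needed, which is why the cells here can think in "distance from the top".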

And don't forget that in IPython/Jupyter a trailing question mark shows the docs:

In [5]:
c.line?

Signature: c.line(x1, y1, x2, y2)
Docstring:
draw a line segment from (x1,y1) to (x2,y2) (with color, thickness and
other attributes determined by the current graphics state).
File:      /usr/local/lib/python3.7/dist-packages/reportlab/pdfgen/canvas.py
Type:      method


In [6]:
c.setLineWidth?

Signature: c.setLineWidth(width)
Docstring: <no docstring>
File:      /usr/local/lib/python3.7/dist-packages/reportlab/pdfgen/canvas.py
Type:      method


In [7]:
colw = pagew/10
print(A0)
print(colw)

(2383.937007874016, 3370.393700787402)
21.259842519685044
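The values printed above are in PDF points (1/72 inch), which is the unit the canvas works in; `mm` is just a multiplier. As a rough check in plain Python, assuming 25.4 mm per inch and the ISO 216 A0 size of 841 × 1189 mm (the names `MM` and `a0` are local to this sketch, not reportlab's):

```python
POINTS_PER_INCH = 72.0
MM = POINTS_PER_INCH / 25.4   # points per millimetre

a0 = (841 * MM, 1189 * MM)    # A0 is 841 mm x 1189 mm
colw = 75 * MM / 10           # one tenth of a 75 mm wide page

print(a0)
print(colw)
```

The results should agree with the notebook output above to within floating-point rounding.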


In [8]:
for x in range(10):
    c.line(x*colw, 0, x*colw, pageh)

In [9]:
colh = pageh/10
print(colh)
for y in range(10):
    c.line(0, y*colh, pagew, y*colh)

21.259842519685044
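Note that `range(10)` yields positions 0 through 9 tenths of the page, so the right and bottom edges get no line of their own. If a closed grid is wanted, one option is to compute `divisions + 1` positions. A sketch in plain Python, where `grid_positions` is a hypothetical helper and the values are in millimetres:

```python
def grid_positions(extent, divisions):
    """Return divisions + 1 evenly spaced positions from 0 to extent inclusive."""
    step = extent / divisions
    return [i * step for i in range(divisions + 1)]

xs = grid_positions(75.0, 10)
print(len(xs), xs[0], xs[-1])  # → 11 0.0 75.0
```

Each position could then be passed to the canvas as `c.line(x*mm, 0, x*mm, pageh)`, and likewise for the horizontal lines.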


In [10]:
c.showPage() # always show a page to finish it...
c.save() # finally writes the PDF

### Make a PDF with blank pages

In [11]:
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.pagesizes import letter, A4
from reportlab.lib.units import mm, cm
# from reportlab.pdfbase.ttfonts import TTFont, pdfmetrics

size = (75*mm, 75*mm)
pages=10
c = Canvas("blank.pdf", pagesize=size)
for i in range(pages):
    # c.setPageSize(size)
    c.showPage()
c.save()

### Make simple text titles

ReportLab's canvas doesn't do word wrapping or line breaking, but if you are OK with placing type manually you can still use it.

In [12]:
from reportlab.pdfbase.ttfonts import TTFont

In [15]:
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.pagesizes import letter, A4
from reportlab.lib.units import mm, cm
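from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

# Placeholder titles, one per page
titles = ("One", "Two", "Three", "Four")
# Real headlines work the same way, for example:
# titles = ["EA cleared of $11 million loot box fine in The Netherlands"]

size = (75*mm, 75*mm)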
pages=10

pdfmetrics.registerFont(TTFont('SI17', 'SpecialIssue17-Regular.ttf'))

c = Canvas("titles.pdf", pagesize=size, bottomup=0)
for title in titles:
    c.setFont('SI17', 32)
    c.drawString(5*mm, 15*mm, title.upper()) # nb the 2nd number is the distance (from top) to the baseline of the text
    # c.setPageSize(size)
    c.showPage()
c.save()
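Since the canvas does not wrap text, a long headline has to be split into lines before drawing. One possible approach, sketched here with the standard library's `textwrap`, is to break on a character count and give each line its own baseline. The helper `layout_title` and its defaults (12 characters per line, 10 mm between baselines) are assumptions for illustration only; a character count is just a rough stand-in for real text width, which `pdfmetrics.stringWidth` can measure.

```python
import textwrap

def layout_title(title, max_chars=12, first_baseline=15.0, leading=10.0):
    """Split a title into short lines, each paired with a baseline offset in mm."""
    lines = textwrap.wrap(title.upper(), width=max_chars)
    return [(first_baseline + i * leading, line) for i, line in enumerate(lines)]

headline = "EA cleared of $11 million loot box fine in The Netherlands"
for y_mm, line in layout_title(headline):
    print(f"{y_mm:5.1f}  {line}")
```

Each pair could then be drawn with something like `c.drawString(5*mm, y_mm*mm, line)` on a canvas created with `bottomup=0`.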

### Convert a folder of images to PDF

Let's download some (public domain) game tiles from Kenney.nl

In [14]:
!wget https://www.kenney.nl/content/3-assets/14-monochrome-rpg/kenney_monochromerpg.zip

--2022-03-10 13:19:25--  https://www.kenney.nl/content/3-assets/14-monochrome-rpg/kenney_monochromerpg.zip
Resolving www.kenney.nl (www.kenney.nl)... 149.210.216.123
Connecting to www.kenney.nl (www.kenney.nl)|149.210.216.123|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 387381 (378K) [application/zip]
Saving to: ‘kenney_monochromerpg.zip’


2022-03-10 13:19:25 (14.1 MB/s) - ‘kenney_monochromerpg.zip’ saved [387381/387381]



In [15]:
!unzip kenney_monochromerpg.zip -d kenney

Archive:  kenney_monochromerpg.zip
   creating: kenney/Dot Matrix/
   creating: kenney/Dot Matrix/Sprites/
  inflating: kenney/Dot Matrix/Sprites/character0.png  
  inflating: kenney/Dot Matrix/Sprites/character1.png  
  inflating: kenney/Dot Matrix/Sprites/character2.png  
  inflating: kenney/Dot Matrix/Sprites/character3.png  
  inflating: kenney/Dot Matrix/Sprites/enemy0.png  
  inflating: kenney/Dot Matrix/Sprites/enemy1.png  
  inflating: kenney/Dot Matrix/Sprites/enemy2.png  
  inflating: kenney/Dot Matrix/Sprites/heart0.png  
  inflating: kenney/Dot Matrix/Sprites/heart1.png  
  inflating: kenney/Dot Matrix/Sprites/heart2.png  
  inflating: kenney/Dot Matrix/Sprites/item0.png  
   creating: kenney/Dot Matrix/Tilemap/
  inflating: kenney/Dot Matrix/Tilemap/tilemap.png  
  inflating: kenney/Dot Matrix/Tilemap/tilemap_packed.png  
   creating: kenney/Dot Matrix/Tiles/
  inflating: kenney/Dot Matrix/Tiles/tile_0000.png  
  inflating: kenney/Dot Matrix/Tiles/tile_0001.png  
  inflati

In [16]:
# We use PIL (the Python Imaging Library, now maintained as Pillow) to open images

In [17]:
from PIL import Image
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.pagesizes import letter, A4
from reportlab.lib.units import mm, cm
# from reportlab.pdfbase.ttfonts import TTFont, pdfmetrics
from pathlib import Path

# size=A6
size = (75*mm, 75*mm)

c = Canvas("tiles.pdf", pagesize=size)
# Convert all the png files in the Kenney download (folder named Default from the zip above)
images = Path("kenney/Default/Tiles")
for imagepath in sorted(images.glob("*.png")):  # sorted so page order follows the file names
    image = Image.open(imagepath)
    c.drawInlineImage(image, 0, 0, width=75*mm, height=75*mm)
    # showPage finishes the current page & starts a new one
    c.showPage()
c.save()
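The cell above stretches every image to the full 75 mm square, which suits square tiles. For images with other proportions, a sketch like the following computes a centered placement that keeps the aspect ratio. `fit_box` is a hypothetical helper, not part of reportlab or Pillow; the pixel dimensions would come from Pillow's `image.size`.

```python
def fit_box(img_w, img_h, box_w, box_h):
    """Scale (img_w, img_h) to fit inside the box, keeping aspect ratio and centering.

    Returns (x, y, width, height) in the same units as the box.
    """
    scale = min(box_w / img_w, box_h / img_h)
    width, height = img_w * scale, img_h * scale
    x = (box_w - width) / 2
    y = (box_h - height) / 2
    return x, y, width, height

print(fit_box(16, 16, 75, 75))    # → (0.0, 0.0, 75.0, 75.0)
print(fit_box(200, 100, 75, 75))  # → (0.0, 18.75, 75.0, 37.5)
```

With a box given in millimetres, the result could be used as `c.drawInlineImage(image, x*mm, y*mm, width=w*mm, height=h*mm)`.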

## Webpage to PDF with weasyprint

* https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#quickstart
* https://doc.courtbouillon.org/weasyprint/stable/first_steps.html?highlight=page%20size


In [18]:
from weasyprint import HTML, CSS

In [19]:
url = "https://www.ruetir.com/2022/03/09/bold-statement-loot-boxes-in-football-game-fifa-now-allowed/"
css = CSS(string='@page { size: 75mm 75mm; margin: 0mm }')
HTML(url).write_pdf('bold-statement-loot-boxes.pdf', stylesheets=[css])
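The `@page` rule is what sets the PDF page size here. If several sizes are needed, the CSS string can be built from numbers. A small sketch, where the helper `page_css` is an assumption of this example and not part of weasyprint:

```python
def page_css(width_mm, height_mm, margin_mm=0):
    """Build an @page rule like the one passed to CSS(string=...) above."""
    return (
        f"@page {{ size: {width_mm:g}mm {height_mm:g}mm; "
        f"margin: {margin_mm:g}mm }}"
    )

print(page_css(75, 75))  # → @page { size: 75mm 75mm; margin: 0mm }
```

The result would be passed along as `CSS(string=page_css(75, 75))`.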

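Finally, pdftk can combine the PDFs made above into one file. The first part of the command (`A=grid.pdf` ... `D=titles.pdf`) just gives each input PDF a short name. When you then say `D2` after the command **cat**, it adds page 2 of titles.pdf to the eventual *output*, while a bare name like `A` adds every page of that file. So **cat** starts the list of pages, and **output** ends it and is followed by the name of the file to save. It's a slightly weird syntax, but super useful!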

In [22]:
!pdftk A=grid.pdf B=bold-statement-loot-boxes.pdf C=tiles.pdf D=titles.pdf cat D1 A D2 B D3 C output all.pdf
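The same command can be assembled from Python, which helps when the list of files or pages is generated by a script. This is only a sketch: `pdftk_cat` is a hypothetical helper that builds the argument list and does not run anything.

```python
def pdftk_cat(inputs, pages, output):
    """Build a pdftk 'cat' command as an argument list.

    inputs: mapping of handle -> filename, e.g. {"A": "grid.pdf"}
    pages:  sequence of page selections, e.g. ["D1", "A", "D2"]
    output: name of the PDF to write
    """
    cmd = ["pdftk"]
    cmd += [f"{handle}={filename}" for handle, filename in inputs.items()]
    cmd += ["cat", *pages, "output", output]
    return cmd

cmd = pdftk_cat(
    {"A": "grid.pdf", "B": "bold-statement-loot-boxes.pdf",
     "C": "tiles.pdf", "D": "titles.pdf"},
    ["D1", "A", "D2", "B", "D3", "C"],
    "all.pdf",
)
print(" ".join(cmd))
```

Assuming pdftk is installed and the input files exist, the list could be run with `subprocess.run(cmd, check=True)`.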

In [23]:
man pdftk

PDFTK(1)		    General Commands Manual		      PDFTK(1)

NAME
       pdftk - A handy tool for manipulating PDF

SYNOPSIS
       pdftk <input PDF files | - | PROMPT>
	    [ input_pw <input PDF owner passwords | PROMPT> ]
	    [ <operation> <operation arguments> ]
	    [ output <output filename | - | PROMPT> ]
	    [ encrypt_40bit | encrypt_128bit ]
	    [ allow <permissions> ]
	    [ owner_pw <owner password | PROMPT> ]
	    [ user_pw <user password | PROMPT> ]
	    [ flatten ] [ need_appearances ]
	    [ compress | uncompress ]
	    [ keep_first_id | keep_final_id ] [ drop_xfa ] [ drop_xmp ]
	    [ verbose ] [ dont_ask | do_ask ]
       Where:
	    <operation> may be empty, or:
	    [ cat | shuffle | burst | rotate |
	      generate_fdf | fill_form |
	      background | multibackground |
	      stamp | multistamp |
	      dump_data | dump_data_utf8 |
	      dump_data_fields | dump_data_fields_utf8 |
	      dump_data_annots |
	      update_info | update_info_utf8 |
	      attach_files | u