<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Tasks of the Contingent Librarian</title>
<link rel="stylesheet" type="text/css" href="tasks.css">
<script src="tasks.js"></script>
</head>
<body>
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/640px-XPUB_bookscanner.jpeg"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/320px-XPUB_bookscanner.jpeg"></a> <div class="thumbcaption">An archivist bookscanner</div></div></div>
<p>Snippets:
</p>
<h2><span class="mw-headline" id="first_trials_with_the_bookscanner">first trials with the bookscanner</span></h2>
<p>I tried using the Bookscanner, built as part of <a href="OuNuPo.html" title="OuNuPo">Special Issue 5 OuNuPo</a>. The Bookscanner needed a few adjustments before it was ready to use. The documentation is limited, but the software is set up clearly enough that it is easy to work out how to use it.
</p><p>The scanner photographs the even and odd pages separately, using cameras mounted above the glass. First, you have to mount a drive on which the scanned images will be stored. Then you can adjust the zoom and shutter speed. I found it impossible to frame only the page, so I will need to crop out everything around it afterwards.
</p><p>I scanned a book and saved each page as a JPEG. The pages are oriented from the camera's perspective, like so:
</p><p><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/640px-Translations_scan_01.jpeg"><img alt="Translations scan 01.jpeg" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/320px-Translations_scan_01.jpeg"></a>
<a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/640px-Translations_scan_02.jpeg"><img alt="Translations scan 02.jpeg" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/320px-Translations_scan_02.jpeg"></a>
</p><p>Next, I ran the JPEGs through an OCR script that Pedro, Tancre and Bo made for their workshop <a href="Blurry_Boundaries.html" title="Blurry Boundaries">Blurry Boundaries</a> as part of Special Issue 9: The Library Is Open. This created two PDFs with OCR text. The next step will be to work out how to rotate the images 90 degrees to the correct orientation (clockwise for the odd pages, anti-clockwise for the even pages) and crop them.
</p><p>The workflow will be like so:
</p>
<pre>1. Scan
2. Rotate images
3. Crop
4. OCR
5. Compile
</pre>
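<p>Steps 2 to 5 could be scripted roughly as below. This is only a sketch under assumptions: it builds command lines for ImageMagick's <code>convert</code>, Tesseract and <code>pdfunite</code>, none of which is confirmed to be what the existing OCR script uses. The file names, output folders and crop geometry are placeholders, and pages are assumed to be numbered from 1 in scan order.
</p>
<pre>
```python
from pathlib import Path


def rotation_for_page(page_number):
    """Return degrees to rotate: 90 (clockwise) for odd pages, -90 for even pages."""
    return 90 if page_number % 2 == 1 else -90


def rotate_and_crop_command(source, page_number, crop_geometry, out_dir="processed"):
    """Build an ImageMagick command for steps 2 and 3 (rotate, then crop)."""
    target = Path(out_dir) / Path(source).name
    return [
        "convert", str(source),
        "-rotate", str(rotation_for_page(page_number)),
        "-crop", crop_geometry,  # WIDTHxHEIGHT+X+Y, measured on the rotated image
        "+repage",               # reset the canvas offset left over from the crop
        str(target),
    ]


def ocr_command(image, out_dir="ocr"):
    """Build a Tesseract command for step 4; the 'pdf' config writes a searchable PDF."""
    base = Path(out_dir) / Path(image).stem
    return ["tesseract", str(image), str(base), "pdf"]


def compile_command(pdfs, book="book.pdf"):
    """Build a pdfunite command for step 5, joining the pages in the order given."""
    return ["pdfunite", *[str(p) for p in pdfs], book]


if __name__ == "__main__":
    # Hypothetical file names and crop box; page numbers start at 1.
    scans = ["scan_0001.jpeg", "scan_0002.jpeg"]
    for number, scan in enumerate(scans, start=1):
        print(" ".join(rotate_and_crop_command(scan, number, "1200x1800+300+200")))
```
</pre>
<p>The functions only return argument lists, so the commands can be checked by eye before anything touches the scans. Each list could then be passed to <code>subprocess.run</code>. On ImageMagick 7 the command is usually <code>magick</code> rather than <code>convert</code>.
</p>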
</div></DOCUMENT_FRAGMENT></div>
</body>
</html>