<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Tasks of the Contingent Librarian</title>
<link rel="stylesheet" type="text/css" href="tasks.css">
<script src="tasks.js"></script>
</head>
<body>
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="File:XPUB_bookscanner.jpeg.html"><img alt="" class="thumbimage" decoding="async" height="106" src="/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/150px-XPUB_bookscanner.jpeg" srcset="/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/225px-XPUB_bookscanner.jpeg 1.5x, /mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/300px-XPUB_bookscanner.jpeg 2x" width="150"></a> <div class="thumbcaption">An archivist bookscanner</div></div></div>
<p>Snippets:
</p>
<h2><span class="mw-headline" id="first_trials_with_the_bookscanner">first trials with the bookscanner</span></h2>
<p>I tried out the Bookscanner, built as part of <a href="OuNuPo.html" title="OuNuPo">Special Issue 5 OuNuPo</a>. It needed a few adjustments before it was ready to use. The documentation is rather limited, but the software is set up so that it is easy to work out how to use it.
</p><p>The scanner photographs even and odd pages with cameras mounted above the glass. First, you have to mount a drive on which the scanned images will be stored. Then you can adjust the zoom and shutter speed. I found it impossible to capture only the page, so I will need to crop out everything around it later.
</p><p>I scanned a book and saved each page as a JPEG. The pages are oriented from the camera's perspective, like so:
</p><p><a class="image" href="File:Translations_scan_01.jpeg.html"><img alt="Translations scan 01.jpeg" decoding="async" height="300" src="/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/400px-Translations_scan_01.jpeg" srcset="/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/600px-Translations_scan_01.jpeg 1.5x, /mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/800px-Translations_scan_01.jpeg 2x" width="400"></a>
<a class="image" href="File:Translations_scan_02.jpeg.html"><img alt="Translations scan 02.jpeg" decoding="async" height="300" src="/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/400px-Translations_scan_02.jpeg" srcset="/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/600px-Translations_scan_02.jpeg 1.5x, /mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/800px-Translations_scan_02.jpeg 2x" width="400"></a>
</p><p>Next, I ran an OCR script for JPEGs that Pedro, Tancre and Bo wrote for their workshop <a href="Blurry_Boundaries.html" title="Blurry Boundaries">Blurry Boundaries</a> as part of Special Issue 9: The Library Is Open. This created two PDFs with OCR. The next step will be to work out how to rotate the images 90 degrees to the correct orientation (clockwise for the odd pages, anti-clockwise for the even pages) and crop them.
</p><p>The workflow will be as follows:
</p>
<pre>1. Scan
2. Rotate images
3. Crop
4. OCR
5. Compile
</pre>
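<p>The rotate and crop steps could be sketched roughly as below. This is only a sketch under assumptions, not how the Bookscanner software does it: it assumes ImageMagick's <code>convert</code> is installed, that the page number of each scan is known, and that a crop geometry has been measured by hand. The function names, file names and geometry are hypothetical. The code only builds the commands, so they can be checked before being run.</p>

```python
from typing import List, Optional


def rotation_degrees(page_number: int) -> int:
    """Clockwise rotation for a scan: +90 for odd pages, -90 for even pages."""
    return 90 if page_number % 2 == 1 else -90


def build_command(src: str, dst: str, page_number: int,
                  crop: Optional[str] = None) -> List[str]:
    """Build an ImageMagick command that rotates (and optionally crops) one scan.

    ImageMagick's -rotate treats positive angles as clockwise.
    `crop` is a geometry string such as "WIDTHxHEIGHT+X+Y" (an assumed,
    hand-measured value); +repage resets the canvas after cropping.
    """
    cmd = ["convert", src, "-rotate", str(rotation_degrees(page_number))]
    if crop is not None:
        cmd += ["-crop", crop, "+repage"]
    cmd.append(dst)
    return cmd


# Hypothetical file names; nothing is executed here, the command is only built.
example = build_command("scan_01.jpeg", "page_01.jpeg", page_number=1)
```

<p>Each command list could then be passed to <code>subprocess.run(cmd, check=True)</code>. Rotating before cropping follows the order of the workflow above, so the crop geometry refers to the upright page.</p>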
</div></DOCUMENT_FRAGMENT></div>
</body>
</html>