edits to rvrs

master
simon 4 years ago
parent 312f27a4c9
commit 50755f23fb

@ -11,7 +11,7 @@
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/a/a3/Hidden_characters.jpg/960px-Hidden_characters.jpg"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/a/a3/Hidden_characters.jpg/320px-Hidden_characters.jpg"></a> <div class="thumbcaption"><div class="magnify"><a class="internal" href="File:Hidden_characters.jpg.html" title="Enlarge"></a></div>Hidden characters (e.g. tabs, spaces, carriage and soft returns)</div></div></div>
<h2><span class="mw-headline" id="Extracting_text_from_a_PDF">Extracting text from a PDF</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Trim4/Extracting_text_from_PDF&amp;action=edit&amp;section=T-1" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p>In Al Sweigart's <i>Automate the Boring Stuff with Python</i>, there's a nice section on a Python library called PyPDF2 that allows you to work with the contents of PDFs. To begin with, I thought I'd try extracting text from a PDF of William S. Burrough's <i>The Electronic Revolution</i>. I chose this PDF as the only version I've found of it online is a 40pp document published by ubuclassics (which I suppose is the publishing house for ubuweb.com). There was no identifier other than this (no ISBN etc.), and it was impossible locating any other version online. What's more, the PDF had very small text, which was uncomfortable to read when I ran the <a href="Michaels_booklet_script_for_PDF_imposition.html" title="User:Simon/Trim4/Michaels booklet script for PDF imposition">booklet.sh</a> script on it.
<p>In Al Sweigart's <i>Automate the Boring Stuff with Python</i>, there's a nice section on a Python library called PyPDF2 that allows you to work with the contents of PDFs. To begin with, I thought I'd try extracting text from a PDF of William S. Burrough's <i>The Electronic Revolution</i>. I chose this PDF as the only version I've found of it online is a 40pp document published by ubuclassics (which I suppose is the publishing house for ubuweb.com). There was no identifier other than this (no ISBN etc.), and it was impossible locating any other version online. What's more, the PDF had very small text, which was uncomfortable to read when I ran the booklet.sh script on it.
</p><p>I thought it would be worthwhile laying out this book again for print reading purposes, and the first step is to get the text from the PDF. Pandoc is usually my go to for extracting text, but it doesn't work with PDFs, so I tried <a class="external text" href="https://pythonhosted.org/PyPDF2/index.html" rel="nofollow">PyPDF2</a>.
</p>
<h3><span class="mw-headline" id="28.09.19">28.09.19</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Trim4/Extracting_text_from_PDF&amp;action=edit&amp;section=T-2" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h3>

@ -29,7 +29,7 @@
</div>
<h2><span class="mw-headline" id="Text_Laundrette">Text Laundrette</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Trim4/Text_Laundrette&amp;action=edit&amp;section=T-1" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p><i>Text Laundrette</i> is a workshop in which we use a home-made, DIY book scanner, and open-source software to scan, process, and add digital features to printed texts brought by the participants to the workshop. These are included in the “bootleg library”, a shadow library accessible over a local network. The workshop was organised by Simon Browne and Pedro Sá Couto, for the 2020 <a href="Py.rate.chnic_sessions.html" title="Py.rate.chnic sessions">py.rate.chnic sessions</a> and first held at WdKA in the Publication Station, February 2020.
<p><i>Text Laundrette</i> is a workshop in which we use a home-made, DIY book scanner, and open-source software to scan, process, and add digital features to printed texts brought by the participants to the workshop. These are included in the “bootleg library”, a shadow library accessible over a local network. The workshop was organised by Simon Browne and Pedro Sá Couto, for the 2020 py.rate.chnic sessions and first held at WdKA in the Publication Station, February 2020.
</p>
<h3><span class="mw-headline" id="Description">Description</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Trim4/Text_Laundrette&amp;action=edit&amp;section=T-2" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h3>
<div class="thumb tright"><div class="thumbinner" style="width:302px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/2/27/Text_launderette_bookscanner.png"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/2/27/Text_launderette_bookscanner.png/320px-Text_launderette_bookscanner.png"></a> <div class="thumbcaption"><div class="magnify"><a class="internal" href="File:Text_launderette_bookscanner.png.html" title="Enlarge"></a></div>The bookscanner</div></div></div>

@ -82,7 +82,7 @@ At Leeszaal:
</p>
<ul><li>Transcription of the voice performance (we can be inspired by film transcriptions) - perhaps we could annotate this as well:<br></li></ul>
<p><a class="external free" href="https://static1.squarespace.com/static/583ae0a12994ca4dbbf813f6/t/58572e856a49634cd5602264/1531923111860" rel="nofollow">https://static1.squarespace.com/static/583ae0a12994ca4dbbf813f6/t/58572e856a49634cd5602264/1531923111860</a>
</p><p><a class="external text" href="https://pad.xpub.nl/p/sh_encoding_decoding%7C" rel="nofollow">Annotations on Stuart Hall's <i>Encoding, Decoding</i></a>
</p><p><a class="external text" href="https://pad.xpub.nl/p/sh_encoding_decoding" rel="nofollow">Annotations on Stuart Hall's <i>Encoding, Decoding</i></a>
</p>
<!--
NewPP limit report

@ -11,8 +11,7 @@
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/d/d2/Cesarco_index.jpeg/960px-Cesarco_index.jpeg"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/d/d2/Cesarco_index.jpeg/320px-Cesarco_index.jpeg"></a> <div class="thumbcaption"><div class="magnify"><a class="internal" href="File:Cesarco_index.jpeg.html" title="Enlarge"></a></div>Alejandro Cesarco, <i>Index</i></div></div></div>
<h2><span id="Reflections_on_classification_in_Hope_A._Olson,_Mapping_beyond_Dewey's_Boundaries"></span><span class="mw-headline" id="Reflections_on_classification_in_Hope_A._Olson.2C_Mapping_beyond_Dewey.27s_Boundaries">Reflections on classification in Hope A. Olson, <i>Mapping beyond Dewey's Boundaries</i></span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/spatial_classification&amp;action=edit&amp;section=T-1" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p><a href="File:Olson_mapping_beyond_deweys_boundaries.a4.pdf.html" title="File:Olson mapping beyond deweys boundaries.a4.pdf">File:Olson mapping beyond deweys boundaries.a4.pdf</a><br>
</p><p>Hope A Olson's text "Mapping Beyond Dewey's Boundaries" on spatial representations in classification systems, explored through a project that attempted to cross-reference two classification systems - <a class="external text" href="https://www.amazon.com/Womens-Thesaurus-Index-Language-Ellen/dp/0061811718" rel="nofollow"><i>A Woman's Thesaurus</i></a> and Dewey Decimal Classification by <a class="external text" href="http://capekconsulting.com/about-mary-ellen-capek/" rel="nofollow">Mary Ellen S Capek</a>. Stating that "classifications are locational systems" suggests that spatial representations can be used with various effect; describing, exposing, and when used as metaphors, shifting the discourse.
<p>Hope A Olson's text "Mapping Beyond Dewey's Boundaries" on spatial representations in classification systems, explored through a project that attempted to cross-reference two classification systems - <a class="external text" href="https://www.amazon.com/Womens-Thesaurus-Index-Language-Ellen/dp/0061811718" rel="nofollow"><i>A Woman's Thesaurus</i></a> and Dewey Decimal Classification by <a class="external text" href="http://capekconsulting.com/about-mary-ellen-capek/" rel="nofollow">Mary Ellen S Capek</a>. Stating that "classifications are locational systems" suggests that spatial representations can be used with various effect; describing, exposing, and when used as metaphors, shifting the discourse.
</p><p><b>1. Spatial representation of classification systems reveals the ideological conditions that form them.</b>
</p><p>Olson refers to spatial representations of classification systems in the form of diagrams.
</p><p>The first diagram is one that shows distribution of subjects, with the idea of a mainstream core that diffuses towards the margins. The second is a Venn diagram that illustrates how "mainstream" or "core" descriptors actually eventuate in very small "cores" due to limitations by Boolean "ands'. Venn diagrams operate on the basis of duality - something is or isn't part of a set.<br>

@ -11,12 +11,12 @@
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/960px-XPUB_bookscanner.jpeg"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f6/XPUB_bookscanner.jpeg/320px-XPUB_bookscanner.jpeg"></a> <div class="thumbcaption"><div class="magnify"><a class="internal" href="File:XPUB_bookscanner.jpeg.html" title="Enlarge"></a></div>An archivist bookscanner</div></div></div>
<h2><span class="mw-headline" id="first_trials_with_the_bookscanner">first trials with the bookscanner</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Trim4/Using_the_bookscanner&amp;action=edit&amp;section=T-1" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p>I tried using the Bookscanner, built as part of <a href="OuNuPo.html" title="OuNuPo">Special Issue 5 OuNuPo</a>. The Bookscanner needed a few adjustments before it was ready to use. The documentation is rather limited, but the software is set up so that it is quite easy to work out how to use it.
<p>I tried using the Bookscanner, built as part of Special Issue 5 OuNuPo. The Bookscanner needed a few adjustments before it was ready to use. The documentation is rather limited, but the software is set up so that it is quite easy to work out how to use it.
</p><p>The scanner takes photos of even and odd pages, from cameras mounted above the glass. First, you have to mount a drive in which the scanned images will be stored. Then, you can adjust the zoom, and shutter speed. I found it impossible to take an image of only the page, so I will need to crop out everything around it in the future.
</p><p>I scanned a book, and made jpegs of each page. The pages are oriented from the camera's perspective, like so:
</p><p><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/960px-Translations_scan_01.jpeg"><img alt="Translations scan 01.jpeg" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/6/64/Translations_scan_01.jpeg/320px-Translations_scan_01.jpeg"></a>
<a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/960px-Translations_scan_02.jpeg"><img alt="Translations scan 02.jpeg" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/f/f9/Translations_scan_02.jpeg/320px-Translations_scan_02.jpeg"></a>
</p><p>Next, I ran a script that does OCR on jpegs that Pedro, Tancre and Bo made for their workshop <a href="Blurry_Boundaries.html" title="Blurry Boundaries"> Blurry Boundaries</a> as part of Special Issue 9: The Library Is Open. This created two PDFs, with OCR. The next step will to be to work out how to rotate the images 90 degrees to the correct orientation, (clockwise for the odd pages, anti-clockwise for the even pages), and crop the images.
</p><p>Next, I ran a script that does OCR on jpegs that Pedro, Tancre and Bo made for their workshop Blurry Boundaries as part of Special Issue 9: The Library Is Open. This created two PDFs, with OCR. The next step will to be to work out how to rotate the images 90 degrees to the correct orientation, (clockwise for the odd pages, anti-clockwise for the even pages), and crop the images.
</p><p>The workflow will be like so:
</p>
<ol><li>Scan</li>

@ -11,7 +11,7 @@
<div class="cardback"><DOCUMENT_FRAGMENT><div class="mw-parser-output"><div class="thumb tright"><div class="thumbinner" style="width:152px;"><a class="image" href="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/5/59/Rosetta_stone.jpeg/960px-Rosetta_stone.jpeg"><img alt="" class="thumbimage" decoding="async" src="https://pzwiki.wdka.nl/mw-mediadesign/images/thumb/5/59/Rosetta_stone.jpeg/320px-Rosetta_stone.jpeg"></a> <div class="thumbcaption"><div class="magnify"><a class="internal" href="File:Rosetta_stone.jpeg.html" title="Enlarge"></a></div><i>The Rosetta Stone</i>, a tablet discovered in 1799 inscribed with three versions of a decree written in Ancient Egyptian and Ancient Greek</div></div></div>
<h2><span class="mw-headline" id="traces_of_book_use_in_from_the_books">traces of book use in <i>from the books</i></span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/Annotation_typologies&amp;action=edit&amp;section=T-1" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p>Typologies of traces of use identified from a previous project called <a href="From_the_Books:_SLV_RBRR_000-099.html" title="User:Simon/From the Books: SLV RBRR 000-099">From the Books</a>, which explored books from the 000-099 section of the Redmond Barry Reading Room in the State Library of Victoria.
<p>Typologies of traces of use identified from a previous project called <em>From the Books: SLV RBRR 000-099</em>, which explored books from the 000-099 section of the Redmond Barry Reading Room in the State Library of Victoria.
</p>
<ul><li>ACCIDENTAL DOG-EAR</li>
<li>ANNOTATION</li>

@ -91,12 +91,14 @@
</div>
</div></li>
</ul>
<h3><span class="mw-headline" id="pads">pads</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/bootleg_library_at_wijnhaven_61&amp;action=edit&amp;section=T-4" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h3>
<p>Description pad: <a class="external text" href="https://pad.xpub.nl/p/Bootleg_Library_Workshop_Sessions%7C" rel="nofollow">Bootleg Library Workshop Sessions</a><br>
<a href="bootleg_lib_wijnhaven_61_pad.html" title="User:Simon/bootleg lib wijnhaven 61 pad">User:Simon/bootleg lib wijnhaven 61 pad</a><br>
bootleg library sessions pad: <a class="external text" href="https://pad.xpub.nl/p/bootleg_library_sessions%7C" rel="nofollow">Documentation of Session</a><br>
<a href="bootleg_library_sessions_wdka_pad_dump.html" title="User:Simon/bootleg library sessions wdka pad dump">User:Simon/bootleg library sessions wdka pad dump</a>
<h3>
<span class="mw-headline" id="pads">pads</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/mw-mediadesign/index.php?title=User:Simon/bootleg_library_at_wijnhaven_61&amp;action=edit&amp;section=T-4" title="Edit section: ">edit</a><span class="mw-editsection-bracket">]</span></span></h3>
<p>Description pad: <a class="external text" href="https://pad.xpub.nl/p/Bootleg_Library_Workshop_Sessions" rel="nofollow">Bootleg Library Workshop Sessions</a><br>
<!-- <a href="bootleg_lib_wijnhaven_61_pad.html" title="User:Simon/bootleg lib wijnhaven 61 pad">User:Simon/bootleg lib wijnhaven 61 pad</a><br> -->
bootleg library sessions pad: <a class="external text" href="https://pad.xpub.nl/p/bootleg_library_sessions" rel="nofollow">Documentation of Session</a><br>
<!-- <a href="bootleg_library_sessions_wdka_pad_dump.html" title="User:Simon/bootleg library sessions wdka pad dump">User:Simon/bootleg library sessions wdka pad dump</a> -->
</p>
<!--
NewPP limit report
Cached time: 20200620142937
