Edited README and cleaned some old stuff from Makefile

Castro0o 6 years ago
parent ae7e184a26
commit 8451d23f81

.gitignore

@@ -4,3 +4,4 @@ src/index.json
 src/database.json
 .DS_Store
 src/**.wav
+ocr/list.txt

Makefile

@@ -41,11 +41,6 @@ dirs: ## create the dirs in working dir
 	@echo $(color_r)'Directories made': ocr/ hocr/ images/ images-tiff/ output/
-testif:
-ifeq ($(OS),Darwin)
-	@echo $(OS)
-endif
 # POST-PROCESSING RECIPES
 ocr/output.txt: ## ocr with tesseract
@@ -112,18 +107,6 @@ replace:tiffs hocrs ## Natasha: Analyzes pages in order, replace least common wo
 	rm $(input-hocr)
 	rm $(images-tiff)
-visualization: $(images) $(tmpfile) ##Creates data visualization from images/*.jpg. Dependencies: mplayer
-	@echo $(tmpfile)
-	for i in $(images); do \
-		cat $$i >> $(tmpfile); \
-	done;
-ifeq ($(OS),Darwin)
-	cat $(tmpfile) | mplayer -sws 4 -zoom -vf dsize=720:720 -demuxer rawvideo -rawvideo w=56:h=64:i420:fps=25 -;
-else
-	cat $(tmpfile) | mplayer -vo x11 -sws 4 -zoom -vf dsize=720:720 -demuxer rawvideo -rawvideo w=50:h=50:i420:fps=25 -;
-endif
 ttssr-human-only: ocr/output.txt ## Loop: text to speech-speech recognition. Dependencies: espeak, pocketsphinx
 	bash src/ttssr-loop-human-only.sh ocr/output.txt
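For context on the removed `visualization` recipe above: it byte-concatenates every page image into one temp file and pipes the result to mplayer as raw i420 video, so the JPEG bytes are deliberately misread as frames. The concatenation step of that loop could be sketched in Python as follows (the function name is hypothetical; only the `cat`-into-one-file behaviour comes from the recipe):

```python
from pathlib import Path

def concat_images(image_paths, out_path):
    """Concatenate image files byte-for-byte, like the Makefile's
    `for i in $(images); do cat $$i >> $(tmpfile); done` loop."""
    with open(out_path, "wb") as out:
        for p in image_paths:
            out.write(Path(p).read_bytes())
    return out_path
```

The resulting stream would then be piped to `mplayer -demuxer rawvideo`, exactly as the removed recipe did.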

README

@@ -1,12 +1,11 @@
 # OuNuPo Make
 Software experiments for the OuNuPo bookscanner, part of Special Issue 5
-https://issue.xpub.nl/05/
-https://xpub.nl/
-## Licenses
+<https://issue.xpub.nl/05/>
+<https://git.xpub.nl/OuNuPo-make/>
+<https://xpub.nl/>
 ## Authors
@@ -25,61 +24,55 @@ Natasha Berting, Angeliki Diakrousi, Joca van der Horst, Alexander Roidl, Alice
 # Make commands
-## N+7 (example) Author
-Description: Replaces every noun with the 7th next noun in a dictionary. Inspired by an Oulipo work of the same name.
-run: `make N+7`
-Specific Dependencies:
-* a
-* b
-* c
 ## Sitting inside a pocket(sphinx): Angeliki
-Description: Speech recognition feedback loops using the first sentence of a scanned text as input
+Speech recognition feedback loops using the first sentence of a scanned text as input
 run: `make ttssr-human-only`
 Specific Dependencies:
 * PocketSphinx package `sudo aptitude install pocketsphinx pocketsphinx-en-us`
-* PocketSphinx: `sudo pip3 install PocketSphinx`
-* Python Libaries:`sudo apt-get install gcc automake autoconf libtool bison swig python-dev libpulse-dev`
-* Speech Recognition: `sudo pip3 install SpeechRecognition`
-* TermColor: `sudo pip3 install termcolor`
-* PyAudio: `pip3 install pyaudio`
-Licenses:
+* PocketSphinx Python library: `sudo pip3 install PocketSphinx`
+* Other software packages: `sudo apt-get install gcc automake autoconf libtool bison swig python-dev libpulse-dev`
+* Speech Recognition Python library: `sudo pip3 install SpeechRecognition`
+* TermColor Python library: `sudo pip3 install termcolor`
+* PyAudio Python library: `sudo pip3 install pyaudio`
+### Licenses:
 © 2018 WTFPL Do What the Fuck You Want to Public License.
 © 2018 BSD 3-Clause Berkeley Software Distribution
 ## Reading the Structure: Joca
-Description: Uses OCR'ed text as an input, labels each word for Part-of-Speech, stopwords and sentiment. Then it generates a reading interface
+Uses OCR'ed text as an input, labels each word for Part-of-Speech, stopwords and sentiment. Then it generates a reading interface
 where words with a specific label are hidden. Output can be saved as poster, or exported as json featuring the full data set.
 Run: `make reading_structure`
 Specific Dependencies:
-* nltk (http://www.nltk.org/install.html)
-* tokenize.punkt, pos_tag, word_tokenize, sentiment.vader, vader_lexicon (python3, import NLTK, nltk.download() and select these models)
-* spaCy (https://spacy.io/usage/)
-* spacy en_core_web_sm model (python3 -m spacy download en_core_web_sm)
-* weasyprint (http://weasyprint.readthedocs.io/en/latest/install.html)
-* jinja2 (http://jinja.pocoo.org/docs/2.10/intro/#installation)
-* font: PT Sans (os font https://www.fontsquirrel.com/fonts/pt-serif)
-* font: Ubuntu Mono (os font https://www.fontsquirrel.com/fonts/ubuntu-mono)
-License: GNU AGPLv3
+* [NLTK](http://www.nltk.org/install.html) packages: tokenize.punkt, pos_tag, word_tokenize, sentiment.vader, vader_lexicon (python3; import nltk; nltk.download() and select these models)
+* [spaCy](https://spacy.io/usage/) Python library
+* spacy: en_core_web_sm model (python3 -m spacy download en_core_web_sm)
+* [weasyprint](http://weasyprint.readthedocs.io/en/latest/install.html)
+* [jinja2](http://jinja.pocoo.org/docs/2.10/intro/#installation)
+* font: [PT Sans](https://www.fontsquirrel.com/fonts/pt-serif)
+* font: [Ubuntu Mono](https://www.fontsquirrel.com/fonts/ubuntu-mono)
+### License: GNU AGPLv3
 Permissions of this license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license.
 Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.
 When a modified version is used to provide a service over a network, the complete source code of the modified version must be made available.
 See src/reading_structure/license.txt for the full license.
 ## Erase / Replace: Natasha
 Description: Receives your scanned pages in order, then analyzes each image and its vocabulary. Finds and crops the least common words, and either erases them, or replaces them with the most common words. Outputs a PDF of increasingly distorted scan images.
-for erase script run: `make erase`
-for replace script run: `make replace`
+For erase script run: `make erase`
+For replace script run: `make replace`
 Specific Dependencies:
 * NLTK English Corpus:
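The deleted N+7 section above described the classic Oulipo constraint: each noun is replaced by the noun seven entries later in a dictionary. A rough illustration of the idea, using a plain sorted word list instead of a real dictionary and skipping the part-of-speech tagging a full implementation would need (function and argument names are hypothetical):

```python
def n_plus_7(text, dictionary, n=7):
    """Replace each word found in `dictionary` (a sorted list of words)
    with the entry `n` positions later, wrapping around at the end.
    Case is ignored for the lookup; unknown words pass through."""
    index = {w: i for i, w in enumerate(dictionary)}
    out = []
    for word in text.split():
        i = index.get(word.lower())
        if i is None:
            out.append(word)  # not in the dictionary: keep as-is
        else:
            out.append(dictionary[(i + n) % len(dictionary)])
    return " ".join(out)
```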
@@ -89,62 +82,73 @@ Specific Dependencies:
 * "Download"
 * Python Image Library (PIL): `pip3 install Pillow`
 * PDF generation for Python (FPDF): `pip3 install fpdf`
-* HTML5lib: `pip3 install html5lib`
+* HTML5lib Python Library: `pip3 install html5lib`
-Notes & Bugs:
+### Notes & Bugs:
 This script is very picky about the input images it can work with. For best results, please use high resolution images in RGB colorspace. Errors can occur when image modes do not match or tesseract cannot successfully make HOCR files.
-Author: Alice Strete (RO)
+## carlandre & over/under: Alice Strete
 Person who aspires to call herself a software artist sometime next year.
-License:
+### License:
 Copyright © 2018 Alice Strete
 This work is free. You can redistribute it and/or modify it under the
 terms of the Do What The Fuck You Want To Public License, Version 2,
 as published by Sam Hocevar. See http://www.wtfpl.net/ for more details.
-Programs:
-## carlandre
+### Dependencies:
+* [pytest](https://docs.pytest.org/en/latest/getting-started.html)
+Programs:
+### carlandre
 Description: Generates concrete poetry from a text file. If you're connected to a printer located in /dev/usb/lp0 you can print the poem.
-run: make carlandre
-Dependencies:
-* pytest (Documentation: https://docs.pytest.org/en/latest/getting-started.html)
+run: `make carlandre`
-## over/under
+### over/under
 Description: Interpreted programming language written in Python3 which translates basic weaving instructions into code and applies them to text.
-run: make overunder
+run: `make overunder`
-Instructions:
+### Instructions:
-over/under works with specific commands which execute specific instructions.
-When running, an interpreter will open:
+* over/under works with specific commands which execute specific instructions.
+* When running, an interpreter will open:
->
+`> `
-To load your text, type 'load'. This is necessary before any other instructions. Every time you load the text, the previous instructions will be discarded.
-To see the line you are currently on, type 'show'.
-To start your pattern, type 'over' or 'under', each followed by an integer, separated by a comma.
+* To load your text, type 'load'. This is necessary before any other instructions. Every time you load the text, the previous instructions will be discarded.
+* To see the line you are currently on, type 'show'.
+* To start your pattern, type 'over' or 'under', each followed by an integer, separated by a comma.
 e.g. over 5, under 5, over 6, under 10
-To move on to the next line of text, press enter twice.
-To see your pattern, type 'pattern'.
-To save your pattern in a text file, type 'save'.
-To leave the program, type 'quit'.
+* To move on to the next line of text, press enter twice.
+* To see your pattern, type 'pattern'.
+* To save your pattern in a text file, type 'save'.
+* To leave the program, type 'quit'.
 ## oulibot: Alex
 Description: Chatbot that will help you to write a poem based on the text you inserted by giving you constraints.
-run: make oulibot
-Dependencies:
-irc.bot: pip3 install irc_client
-nltk: pip3 install nltk && python3 -m nltk.downloader
-rake_nltk: pip3 install rake_nltk
-nltk.tokenize
-nltk.corpus
-textblob: pip3 install textblob
-PIL: pip3 install Pillow
-numpy: pip3 install numpy
-tweepy: pip3 install tweepy
+run: `make oulibot`
+#### Dependencies:
+Python libraries
+* irc : `pip3 install irc`
+* rake_nltk Python library: `pip3 install rake_nltk`
+* textblob: `pip3 install textblob`
+* PIL: `pip3 install Pillow`
+* numpy: `pip3 install numpy`
+* tweepy: `pip3 install tweepy`
+* NLTK stopwords:
+  * run NLTK downloader `python -m nltk.downloader`
+  * select menu "Corpora"
+  * select "stopwords"
+  * "Download"
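The over/under instructions above describe a pattern like `over 5, under 5` applied line by line. The README does not spell out what each instruction does to the text, so the following is only an assumed interpretation for illustration: `over n` leaves the next n characters visible, `under n` hides them, and the pattern repeats until the line is exhausted (the function name and the underscore "hidden" marker are hypothetical):

```python
import re

def weave(line, pattern):
    """Apply weaving instructions such as "over 2, under 3" to one line.
    Assumed semantics (the real interpreter may differ): 'over n' keeps
    the next n characters, 'under n' replaces them with underscores, and
    the instruction list repeats until the line is used up."""
    steps = [(op, int(n))
             for op, n in re.findall(r"(over|under)\s+(\d+)", pattern)
             if int(n) > 0]
    if not steps:
        return line
    out, i = [], 0
    while i < len(line):
        for op, n in steps:
            chunk = line[i:i + n]
            out.append(chunk if op == "over" else "_" * len(chunk))
            i += n
            if i >= len(line):
                break
    return "".join(out)
```

Under this reading, `weave("abcdefghij", "over 2, under 3")` alternates two visible characters with three hidden ones across the line.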
