# OuNuPo Make

Software experiments for the OuNuPo bookscanner, part of Special Issue 5

https://issue.xpub.nl/05/

https://xpub.nl/

## License

## Authors

Natasha Berting, Angeliki Diakrousi, Joca van der Horst, Alexander Roidl, Alice Strete and Zalán Szakács.

## Clone Repository

`git clone https://git.xpub.nl/repos/OuNuPo-make.git`

## General dependencies

* Python3
* GNU make
* Python3 NLTK `pip3 install nltk`
* NLTK English Corpus:
    * run NLTK downloader `python -m nltk.downloader`
    * select menu "Corpora"
    * select "stopwords"
    * "Download"

# Make commands

## N+7 (example) Author

Description: Replaces every word with the 7th next word in a dictionary.

run: `make N+7`

Specific Dependencies:

* a
* b
* c

## Sitting inside a pocket(sphinx): Angeliki

Description: Speech recognition feedback loops using the first sentence of a scanned text as input

run: `make ttssr-human-only`

Specific Dependencies:

* PocketSphinx package `sudo aptitude install pocketsphinx pocketsphinx-en-us`
* Speech Recognition: `sudo pip3 install SpeechRecognition`
* TermColor: `sudo pip3 install termcolor`
* PyAudio: `pip3 install pyaudio`

## Reading the Structure: Joca

Description: Uses OCR'ed text as an input and labels each word for Part-of-Speech, stopwords and sentiment. Then it generates a reading interface where words with a specific label are hidden. Output can be saved as a poster, or exported as JSON featuring the full data set.

run: `make reading_structure`

Specific Dependencies:

* nltk (http://www.nltk.org/install.html)
* nltk.tokenize.punkt, ne_chunk, pos_tag, word_tokenize, sentiment.vader
* nltk.download('vader_lexicon') (https://www.nltk.org/data.html)
* weasyprint (http://weasyprint.readthedocs.io/en/latest/install.html)
* jinja2 (http://jinja.pocoo.org/docs/2.10/intro/#installation)
* font: PT Sans (os font https://www.fontsquirrel.com/fonts/pt-serif)
* font: Ubuntu Mono (os font https://www.fontsquirrel.com/fonts/ubuntu-mono)
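
The label-then-hide idea described for Reading the Structure can be sketched roughly as follows. This is not the code from `src`: it is a minimal illustration that assumes a tiny hard-coded stopword set in place of NLTK's stopwords corpus, handles only the stopword label (no Part-of-Speech or sentiment), and uses hypothetical helper names (`label_words`, `hide`).

```python
import json
import re

# Assumption: a tiny hard-coded stopword set stands in for NLTK's
# English "stopwords" corpus, so the sketch runs without any downloads.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def label_words(text):
    """Split text into words and attach a 'stopword' label to each one."""
    words = re.findall(r"[A-Za-z']+", text)
    return [{"word": w, "stopword": w.lower() in STOPWORDS} for w in words]


def hide(labelled, label):
    """Render the text with every word carrying `label` blanked out."""
    return " ".join(
        "_" * len(item["word"]) if item[label] else item["word"]
        for item in labelled
    )


labelled = label_words("The scanner reads the pages of a book")
print(hide(labelled, "stopword"))  # ___ scanner reads ___ pages __ _ book
print(json.dumps(labelled[:2]))    # the labelled words can be dumped as JSON
```

Keeping the labels in a plain list of dictionaries means the same data can feed both a reading view and a JSON export; adding further labels would be a matter of adding keys to each dictionary.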