@@ -48,11 +48,18 @@ run: `make ttssr-human-only`

Specific Dependencies:
* [pocketsphinx](https://github.com/bambocher/pocketsphinx-python) `sudo pip3 install pocketsphinx` ---> FOLLOW THIS EXAMPLE
* SpeechRecognition 3.8.1
* PyAudio

## Reading the Structure: Joca
Description: Takes OCR'ed text as input and labels each word with its part of speech, stopword status, and sentiment. It then generates a reading interface in which words with a specific label are hidden. The output can be saved as a poster or exported as JSON containing the full data set.

run: `make output/reading_structure/index.html`

Specific Dependencies:
* nltk: nltk.tokenize.punkt, ne_chunk, pos_tag, word_tokenize, sentiment.vader
* weasyprint
* jinja2
* font: PT Sans (os font https://www.fontsquirrel.com/fonts/pt-serif)
* font: Ubuntu Mono (os font https://www.fontsquirrel.com/fonts/ubuntu-mono)
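
The hiding step described for Reading the Structure can be illustrated with a minimal sketch. It is not the project's actual code: the function name `render_hidden`, the CSS class names, and the sample data are hypothetical, and plain string formatting stands in for the jinja2 template. The input is assumed to be a list of `(word, tag)` pairs, the shape that `nltk.pos_tag` returns.

```python
from html import escape

# Hypothetical sample input: (word, tag) pairs in the shape returned by
# nltk.pos_tag. The tags here are typed by hand, not computed.
TAGGED = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ")]


def render_hidden(tagged_words, hidden_tag):
    """Return HTML in which every word carrying hidden_tag is masked."""
    spans = []
    for word, tag in tagged_words:
        is_hidden = tag == hidden_tag
        css_class = "hidden" if is_hidden else "visible"
        # Mask hidden words with underscores so the line keeps its length.
        text = "_" * len(word) if is_hidden else word
        spans.append(
            '<span class="{}" data-tag="{}">{}</span>'.format(
                css_class, escape(tag), escape(text)
            )
        )
    return " ".join(spans)


print(render_hidden(TAGGED, "NN"))
```

Keeping the tag in a `data-tag` attribute is one possible design: it would let a stylesheet or script toggle labels without regenerating the page.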