NLTK - Part of Speech

In [2]:
import nltk
import random
In [3]:
# Read the text line by line and pick one line at random
with open('txt/language.txt') as f:
    lines = f.readlines()
sentence = random.choice(lines)
print(sentence)
To complicate things even further, computer science has its own understanding of “operational semantics” in programming languages, for example in the construction of a programming language interpreter or compiler.

Tokens

In [4]:
tokens = nltk.word_tokenize(sentence)
print(tokens)
['To', 'complicate', 'things', 'even', 'further', ',', 'computer', 'science', 'has', 'its', 'own', 'understanding', 'of', '“', 'operational', 'semantics', '”', 'in', 'programming', 'languages', ',', 'for', 'example', 'in', 'the', 'construction', 'of', 'a', 'programming', 'language', 'interpreter', 'or', 'compiler', '.']

Part of Speech "tags"

In [5]:
tagged = nltk.pos_tag(tokens)
print(tagged)
[('To', 'TO'), ('complicate', 'VB'), ('things', 'NNS'), ('even', 'RB'), ('further', 'RB'), (',', ','), ('computer', 'NN'), ('science', 'NN'), ('has', 'VBZ'), ('its', 'PRP$'), ('own', 'JJ'), ('understanding', 'NN'), ('of', 'IN'), ('“', 'NNP'), ('operational', 'JJ'), ('semantics', 'NNS'), ('”', 'VBP'), ('in', 'IN'), ('programming', 'NN'), ('languages', 'NNS'), (',', ','), ('for', 'IN'), ('example', 'NN'), ('in', 'IN'), ('the', 'DT'), ('construction', 'NN'), ('of', 'IN'), ('a', 'DT'), ('programming', 'JJ'), ('language', 'NN'), ('interpreter', 'NN'), ('or', 'CC'), ('compiler', 'NN'), ('.', '.')]
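
Note that in the output above the typographic quotes “ and ” are tagged as NNP and VBP, which is clearly not what they are. One possible workaround, sketched here as an assumption and not as a documented NLTK recipe, is to replace typographic quotes with their ASCII equivalents before tokenizing. The helper name normalize_quotes is made up for this example, and whether the tags actually improve depends on the NLTK version and tagger model in use.

```python
# Hypothetical helper: map typographic quotes to plain ASCII quotes.
QUOTE_MAP = str.maketrans({
    '“': '"',
    '”': '"',
    '‘': "'",
    '’': "'",
})

def normalize_quotes(text):
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(QUOTE_MAP)

sample = 'its own understanding of “operational semantics” in programming languages'
print(normalize_quotes(sample))
```

The normalized string can then be passed to nltk.word_tokenize in the same way as the original sentence.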

Now you could select, for example, all types of verbs:

In [6]:
selection = []

# Every Penn Treebank verb tag (VB, VBD, VBG, VBN, VBP, VBZ) starts with 'VB'
for word, tag in tagged:
    if tag.startswith('VB'):
        selection.append(word)

print(selection)
['complicate', 'has', '”']
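
The selection above also picked up the closing quote ”, because the tagger labelled it VBP. A minimal sketch of a stricter filter follows, assuming you only want tokens that contain at least one letter. The function name select_verbs is invented for this example, and the tagged list is a shortened copy of the output above.

```python
def select_verbs(tagged):
    """Keep words whose tag starts with 'VB' and that contain a letter."""
    return [
        word
        for word, tag in tagged
        if tag.startswith('VB') and any(ch.isalpha() for ch in word)
    ]

# Shortened copy of the tagged sentence shown earlier
tagged = [('To', 'TO'), ('complicate', 'VB'), ('things', 'NNS'),
          ('has', 'VBZ'), ('semantics', 'NNS'), ('”', 'VBP')]

print(select_verbs(tagged))  # ['complicate', 'has']
```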

Where do these tags come from?

An off-the-shelf tagger is available for English. It uses the Penn Treebank tagset.

From: http://www.nltk.org/api/nltk.tag.html#module-nltk.tag

NLTK provides documentation for each tag, which can be queried using the tag, e.g. nltk.help.upenn_tagset('RB').

From: http://www.nltk.org/book_1ed/ch05.html

In [7]:
nltk.help.upenn_tagset('PRP')
PRP: pronoun, personal
    hers herself him himself hisself it itself me myself one oneself ours
    ourselves ownself self she thee theirs them themselves they thou thy us

An alphabetical list of the part-of-speech tags used in the Penn Treebank Project:

Number  Tag   Description
1.      CC    Coordinating conjunction
2.      CD    Cardinal number
3.      DT    Determiner
4.      EX    Existential there
5.      FW    Foreign word
6.      IN    Preposition or subordinating conjunction
7.      JJ    Adjective
8.      JJR   Adjective, comparative
9.      JJS   Adjective, superlative
10.     LS    List item marker
11.     MD    Modal
12.     NN    Noun, singular or mass
13.     NNS   Noun, plural
14.     NNP   Proper noun, singular
15.     NNPS  Proper noun, plural
16.     PDT   Predeterminer
17.     POS   Possessive ending
18.     PRP   Personal pronoun
19.     PRP$  Possessive pronoun
20.     RB    Adverb
21.     RBR   Adverb, comparative
22.     RBS   Adverb, superlative
23.     RP    Particle
24.     SYM   Symbol
25.     TO    to
26.     UH    Interjection
27.     VB    Verb, base form
28.     VBD   Verb, past tense
29.     VBG   Verb, gerund or present participle
30.     VBN   Verb, past participle
31.     VBP   Verb, non-3rd person singular present
32.     VBZ   Verb, 3rd person singular present
33.     WDT   Wh-determiner
34.     WP    Wh-pronoun
35.     WP$   Possessive wh-pronoun
36.     WRB   Wh-adverb
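
In the table above, related tags share a two-letter prefix: noun tags begin with NN, verb tags with VB, adjective tags with JJ and adverb tags with RB. A small sketch of how this could be used to collapse the tags into coarser groups follows. The names COARSE and coarse_tag are assumptions made for this example and are not part of NLTK; the wh-tags (WDT, WP, WP$, WRB) and everything else simply fall into 'other'.

```python
# Prefixes shared by related tags in the Penn Treebank tagset
COARSE = {'NN': 'noun', 'VB': 'verb', 'JJ': 'adjective', 'RB': 'adverb'}

def coarse_tag(tag):
    """Map a Penn Treebank tag to a coarse group based on its first two letters."""
    return COARSE.get(tag[:2], 'other')

for tag in ['NNPS', 'VBZ', 'JJR', 'RBS', 'PRP$', 'RP']:
    print(tag, coarse_tag(tag))
```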

Applying to an entire text

In [8]:
# Read the whole text as a single string
with open('txt/language.txt') as f:
    language = f.read()
tokens = nltk.word_tokenize(language)
tagged = nltk.pos_tag(tokens)
In [9]:
tagged
Out[9]:
[('Language', 'NN'),
 ('Florian', 'JJ'),
 ('Cramer', 'NNP'),
 ('Software', 'NNP'),
 ('and', 'CC'),
 ('language', 'NN'),
 ('are', 'VBP'),
 ('intrinsically', 'RB'),
 ('related', 'VBN'),
 (',', ','),
 ('since', 'IN'),
 ('software', 'NN'),
 ('may', 'MD'),
 ('process', 'VB'),
 ('language', 'NN'),
 (',', ','),
 ('and', 'CC'),
 ('is', 'VBZ'),
 ('constructed', 'VBN'),
 ('in', 'IN'),
 ('language', 'NN'),
 ('.', '.'),
 ('Yet', 'CC'),
 ('language', 'NN'),
 ('means', 'VBZ'),
 ('different', 'JJ'),
 ('things', 'NNS'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('context', 'NN'),
 ('of', 'IN'),
 ('computing', 'VBG'),
 (':', ':'),
 ('formal', 'JJ'),
 ('languages', 'NNS'),
 ('in', 'IN'),
 ('which', 'WDT'),
 ('algorithms', 'EX'),
 ('are', 'VBP'),
 ('expressed', 'VBN'),
 ('and', 'CC'),
 ('software', 'NN'),
 ('is', 'VBZ'),
 ('implemented', 'VBN'),
 (',', ','),
 ('and', 'CC'),
 ('in', 'IN'),
 ('so-called', 'JJ'),
 ('“', 'NNP'),
 ('natural', 'JJ'),
 ('”', 'NNP'),
 ('spoken', 'NN'),
 ('languages', 'NNS'),
 ('.', '.'),
 ('There', 'EX'),
 ('are', 'VBP'),
 ('at', 'IN'),
 ('least', 'JJS'),
 ('two', 'CD'),
 ('layers', 'NNS'),
 ('of', 'IN'),
 ('formal', 'JJ'),
 ('language', 'NN'),
 ('in', 'IN'),
 ('software', 'NN'),
 (':', ':'),
 ('programming', 'NN'),
 ('language', 'NN'),
 ('in', 'IN'),
 ('which', 'WDT'),
 ('the', 'DT'),
 ('software', 'NN'),
 ('is', 'VBZ'),
 ('written', 'VBN'),
 (',', ','),
 ('and', 'CC'),
 ('the', 'DT'),
 ('language', 'NN'),
 ('implemented', 'VBD'),
 ('within', 'IN'),
 ('the', 'DT'),
 ('software', 'NN'),
 ('as', 'IN'),
 ('its', 'PRP$'),
 ('symbolic', 'JJ'),
 ('controls', 'NNS'),
 ('.', '.'),
 ('In', 'IN'),
 ('the', 'DT'),
 ('case', 'NN'),
 ('of', 'IN'),
 ('compilers', 'NNS'),
 (',', ','),
 ('shells', 'NNS'),
 (',', ','),
 ('and', 'CC'),
 ('macro', 'NN'),
 ('languages', 'NNS'),
 (',', ','),
 ('for', 'IN'),
 ('example', 'NN'),
 (',', ','),
 ('these', 'DT'),
 ('layers', 'NNS'),
 ('can', 'MD'),
 ('overlap', 'VB'),
 ('.', '.'),
 ('“', 'VB'),
 ('Natural', 'NNP'),
 ('”', 'NNP'),
 ('language', 'NN'),
 ('is', 'VBZ'),
 ('what', 'WP'),
 ('can', 'MD'),
 ('be', 'VB'),
 ('processed', 'VBN'),
 ('as', 'IN'),
 ('data', 'NNS'),
 ('by', 'IN'),
 ('software', 'NN'),
 (';', ':'),
 ('since', 'IN'),
 ('this', 'DT'),
 ('processing', 'NN'),
 ('is', 'VBZ'),
 ('formal', 'JJ'),
 (',', ','),
 ('however', 'RB'),
 (',', ','),
 ('it', 'PRP'),
 ('is', 'VBZ'),
 ('restricted', 'VBN'),
 ('to', 'TO'),
 ('syntactical', 'JJ'),
 ('operations', 'NNS'),
 ('.', '.'),
 ('While', 'IN'),
 ('differentiation', 'NN'),
 ('of', 'IN'),
 ('computer', 'NN'),
 ('programming', 'VBG'),
 ('languages', 'NNS'),
 ('as', 'IN'),
 ('“', 'JJ'),
 ('artificial', 'JJ'),
 ('languages', 'NNS'),
 ('”', 'VBP'),
 ('from', 'IN'),
 ('languages', 'NNS'),
 ('like', 'VBP'),
 ('English', 'NNP'),
 ('as', 'IN'),
 ('“', 'NNP'),
 ('natural', 'JJ'),
 ('languages', 'NNS'),
 ('”', 'VBP'),
 ('is', 'VBZ'),
 ('conceptually', 'RB'),
 ('important', 'JJ'),
 ('and', 'CC'),
 ('undisputed', 'JJ'),
 (',', ','),
 ('it', 'PRP'),
 ('remains', 'VBZ'),
 ('problematic', 'JJ'),
 ('in', 'IN'),
 ('its', 'PRP$'),
 ('pure', 'NN'),
 ('terminology', 'NN'),
 (':', ':'),
 ('There', 'EX'),
 ('is', 'VBZ'),
 ('nothing', 'NN'),
 ('“', 'JJ'),
 ('natural', 'JJ'),
 ('”', 'NN'),
 ('about', 'IN'),
 ('spoken', 'JJ'),
 ('language', 'NN'),
 (';', ':'),
 ('it', 'PRP'),
 ('is', 'VBZ'),
 ('a', 'DT'),
 ('cultural', 'JJ'),
 ('construct', 'NN'),
 ('and', 'CC'),
 ('thus', 'RB'),
 ('just', 'RB'),
 ('as', 'IN'),
 ('“', 'JJ'),
 ('artificial', 'JJ'),
 ('”', 'NN'),
 ('as', 'IN'),
 ('any', 'DT'),
 ('formal', 'JJ'),
 ('machine', 'NN'),
 ('control', 'NN'),
 ('language', 'NN'),
 ('.', '.'),
 ('To', 'TO'),
 ('call', 'VB'),
 ('programming', 'NN'),
 ('languages', 'NNS'),
 ('“', 'VBP'),
 ('machine', 'NN'),
 ('languages', 'NNS'),
 ('”', 'VBP'),
 ('doesn', 'JJ'),
 ('', 'NNP'),
 ('t', 'NN'),
 ('solve', 'VBP'),
 ('the', 'DT'),
 ('problem', 'NN'),
 ('either', 'RB'),
 (',', ','),
 ('as', 'IN'),
 ('it', 'PRP'),
 ('obscures', 'VBZ'),
 ('that', 'IN'),
 ('“', 'FW'),
 ('machine', 'NN'),
 ('languages', 'NNS'),
 ('”', 'VBP'),
 ('are', 'VBP'),
 ('human', 'JJ'),
 ('creations', 'NNS'),
 ('.', '.'),
 ('High-level', 'JJ'),
 ('machine-independent', 'JJ'),
 ('programming', 'NN'),
 ('languages', 'NNS'),
 ('such', 'JJ'),
 ('as', 'IN'),
 ('Fortran', 'NNP'),
 (',', ','),
 ('C', 'NNP'),
 (',', ','),
 ('Java', 'NNP'),
 (',', ','),
 ('and', 'CC'),
 ('Basic', 'NNP'),
 ('are', 'VBP'),
 ('not', 'RB'),
 ('even', 'RB'),
 ('direct', 'JJ'),
 ('mappings', 'NNS'),
 ('of', 'IN'),
 ('machine', 'NN'),
 ('logic', 'NN'),
 ('.', '.'),
 ('If', 'IN'),
 ('programming', 'JJ'),
 ('languages', 'NNS'),
 ('are', 'VBP'),
 ('human', 'JJ'),
 ('languages', 'NNS'),
 ('for', 'IN'),
 ('machine', 'NN'),
 ('control', 'NN'),
 (',', ','),
 ('they', 'PRP'),
 ('could', 'MD'),
 ('be', 'VB'),
 ('called', 'VBN'),
 ('cybernetic', 'JJ'),
 ('languages', 'NNS'),
 ('.', '.'),
 ('But', 'CC'),
 ('these', 'DT'),
 ('languages', 'NNS'),
 ('can', 'MD'),
 ('also', 'RB'),
 ('be', 'VB'),
 ('used', 'VBN'),
 ('outside', 'JJ'),
 ('machines—in', 'NN'),
 ('programming', 'VBG'),
 ('handbooks', 'NNS'),
 (',', ','),
 ('for', 'IN'),
 ('example', 'NN'),
 (',', ','),
 ('in', 'IN'),
 ('programmer', 'NN'),
 ('', 'NNP'),
 ('s', 'NN'),
 ('dinner', 'NN'),
 ('table', 'JJ'),
 ('jokes', 'NNS'),
 (',', ','),
 ('or', 'CC'),
 ('as', 'IN'),
 ('abstract', 'JJ'),
 ('formal', 'JJ'),
 ('languages', 'NNS'),
 ('for', 'IN'),
 ('expressing', 'VBG'),
 ('logical', 'JJ'),
 ('constructs', 'NNS'),
 (',', ','),
 ('such', 'JJ'),
 ('as', 'IN'),
 ('in', 'IN'),
 ('Hugh', 'NNP'),
 ('Kenner', 'NNP'),
 ('', 'NNP'),
 ('s', 'NN'),
 ('use', 'NN'),
 ('of', 'IN'),
 ('the', 'DT'),
 ('Pascal', 'NNP'),
 ('programming', 'NN'),
 ('language', 'NN'),
 ('to', 'TO'),
 ('explain', 'VB'),
 ('aspects', 'NNS'),
 ('of', 'IN'),
 ('the', 'DT'),
 ('structure', 'NN'),
 ('of', 'IN'),
 ('Samuel', 'NNP'),
 ('Beckett', 'NNP'),
 ('', 'NNP'),
 ('s', 'VBD'),
 ('writing.1', 'NN'),
 ('In', 'IN'),
 ('this', 'DT'),
 ('sense', 'NN'),
 (',', ','),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('languages', 'NNS'),
 ('could', 'MD'),
 ('be', 'VB'),
 ('more', 'RBR'),
 ('broadly', 'RB'),
 ('defined', 'VBN'),
 ('as', 'IN'),
 ('syntactical', 'JJ'),
 ('languages', 'NNS'),
 ('as', 'IN'),
 ('opposed', 'VBN'),
 ('to', 'TO'),
 ('semantic', 'JJ'),
 ('languages', 'NNS'),
 ('.', '.'),
 ('But', 'CC'),
 ('this', 'DT'),
 ('terminology', 'NN'),
 ('is', 'VBZ'),
 ('not', 'RB'),
 ('without', 'IN'),
 ('its', 'PRP$'),
 ('problems', 'NNS'),
 ('either', 'DT'),
 ('.', '.'),
 ('Common', 'JJ'),
 ('languages', 'NNS'),
 ('like', 'IN'),
 ('English', 'NNP'),
 ('are', 'VBP'),
 ('both', 'DT'),
 ('formal', 'JJ'),
 ('and', 'CC'),
 ('semantic', 'JJ'),
 (';', ':'),
 ('although', 'IN'),
 ('their', 'PRP$'),
 ('scope', 'NN'),
 ('extends', 'VBZ'),
 ('beyond', 'IN'),
 ('the', 'DT'),
 ('formal', 'JJ'),
 (',', ','),
 ('anything', 'NN'),
 ('that', 'WDT'),
 ('can', 'MD'),
 ('be', 'VB'),
 ('expressed', 'VBN'),
 ('in', 'IN'),
 ('a', 'DT'),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('language', 'NN'),
 ('can', 'MD'),
 ('also', 'RB'),
 ('be', 'VB'),
 ('expressed', 'VBN'),
 ('in', 'IN'),
 ('common', 'JJ'),
 ('language', 'NN'),
 ('.', '.'),
 ('It', 'PRP'),
 ('follows', 'VBZ'),
 ('that', 'IN'),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('languages', 'NNS'),
 ('are', 'VBP'),
 ('a', 'DT'),
 ('formal', 'JJ'),
 ('(', '('),
 ('and', 'CC'),
 ('as', 'IN'),
 ('such', 'JJ'),
 ('rather', 'RB'),
 ('primitive', 'JJ'),
 (')', ')'),
 ('subset', 'NN'),
 ('of', 'IN'),
 ('common', 'JJ'),
 ('human', 'JJ'),
 ('languages', 'NNS'),
 ('.', '.'),
 ('To', 'TO'),
 ('complicate', 'VB'),
 ('things', 'NNS'),
 ('even', 'RB'),
 ('further', 'RB'),
 (',', ','),
 ('computer', 'NN'),
 ('science', 'NN'),
 ('has', 'VBZ'),
 ('its', 'PRP$'),
 ('own', 'JJ'),
 ('understanding', 'NN'),
 ('of', 'IN'),
 ('“', 'NNP'),
 ('operational', 'JJ'),
 ('semantics', 'NNS'),
 ('”', 'VBP'),
 ('in', 'IN'),
 ('programming', 'NN'),
 ('languages', 'NNS'),
 (',', ','),
 ('for', 'IN'),
 ('example', 'NN'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('construction', 'NN'),
 ('of', 'IN'),
 ('a', 'DT'),
 ('programming', 'JJ'),
 ('language', 'NN'),
 ('interpreter', 'NN'),
 ('or', 'CC'),
 ('compiler', 'NN'),
 ('.', '.'),
 ('Just', 'RB'),
 ('as', 'IN'),
 ('this', 'DT'),
 ('interpreter', 'NN'),
 ('doesn', 'NN'),
 ('', 'NNP'),
 ('t', 'NN'),
 ('perform', 'NN'),
 ('“', 'NNP'),
 ('interpretations', 'NNS'),
 ('”', 'VBP'),
 ('in', 'IN'),
 ('a', 'DT'),
 ('hermeneutic', 'JJ'),
 ('sense', 'NN'),
 ('of', 'IN'),
 ('semantic', 'JJ'),
 ('text', 'NN'),
 ('explication', 'NN'),
 (',', ','),
 ('the', 'DT'),
 ('computer', 'NN'),
 ('science', 'NN'),
 ('notion', 'NN'),
 ('of', 'IN'),
 ('“', 'JJ'),
 ('semantics', 'NNS'),
 ('”', 'JJ'),
 ('defies', 'NNS'),
 ('linguistic', 'JJ'),
 ('and', 'CC'),
 ('common', 'JJ'),
 ('sense', 'NN'),
 ('understanding', 'NN'),
 ('of', 'IN'),
 ('the', 'DT'),
 ('word', 'NN'),
 (',', ','),
 ('since', 'IN'),
 ('compiler', 'NN'),
 ('construction', 'NN'),
 ('is', 'VBZ'),
 ('purely', 'RB'),
 ('syntactical', 'JJ'),
 (',', ','),
 ('and', 'CC'),
 ('programming', 'VBG'),
 ('languages', 'NNS'),
 ('denote', 'VBP'),
 ('nothing', 'NN'),
 ('but', 'CC'),
 ('syntactical', 'JJ'),
 ('manipulations', 'NNS'),
 ('of', 'IN'),
 ('symbols', 'NNS'),
 ('.', '.'),
 ('What', 'WP'),
 ('might', 'MD'),
 ('more', 'JJR'),
 ('suitably', 'RB'),
 ('be', 'VB'),
 ('called', 'VBN'),
 ('the', 'DT'),
 ('semantics', 'NNS'),
 ('of', 'IN'),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('languages', 'VBZ'),
 ('resides', 'NNS'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('symbols', 'NNS'),
 ('with', 'IN'),
 ('which', 'WDT'),
 ('those', 'DT'),
 ('operations', 'NNS'),
 ('are', 'VBP'),
 ('denoted', 'VBN'),
 ('in', 'IN'),
 ('most', 'JJS'),
 ('programming', 'JJ'),
 ('languages', 'NNS'),
 (':', ':'),
 ('English', 'JJ'),
 ('words', 'NNS'),
 ('like', 'IN'),
 ('“', 'NN'),
 ('if', 'IN'),
 (',', ','),
 ('”', 'FW'),
 ('“', 'FW'),
 ('then', 'RB'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('else', 'RB'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('for', 'IN'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('while', 'IN'),
 (',', ','),
 ('”', 'FW'),
 ('“', 'NNP'),
 ('goto', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('and', 'CC'),
 ('“', 'NNP'),
 ('print', 'NN'),
 (',', ','),
 ('”', 'NN'),
 ('in', 'IN'),
 ('conjunction', 'NN'),
 ('with', 'IN'),
 ('arithmetical', 'JJ'),
 ('and', 'CC'),
 ('punctuation', 'NN'),
 ('symbols', 'NNS'),
 (';', ':'),
 ('in', 'IN'),
 ('alphabetic', 'JJ'),
 ('software', 'NN'),
 ('controls', 'NNS'),
 (',', ','),
 ('words', 'NNS'),
 ('like', 'IN'),
 ('“', 'NNP'),
 ('list', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('move', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('copy', 'NN'),
 (',', ','),
 ('”', 'NN'),
 ('and', 'CC'),
 ('“', 'NNP'),
 ('paste', 'NN'),
 ('”', 'NN'),
 (';', ':'),
 ('in', 'IN'),
 ('graphical', 'JJ'),
 ('software', 'NN'),
 ('controls', 'NNS'),
 (',', ','),
 ('such', 'JJ'),
 ('as', 'IN'),
 ('symbols', 'NNS'),
 ('like', 'IN'),
 ('the', 'DT'),
 ('trash', 'NN'),
 ('can', 'MD'),
 ('.', '.'),
 ('Ferdinand', 'NNP'),
 ('de', 'IN'),
 ('Saussure', 'NNP'),
 ('states', 'VBZ'),
 ('that', 'IN'),
 ('the', 'DT'),
 ('signs', 'NNS'),
 ('of', 'IN'),
 ('common', 'JJ'),
 ('human', 'JJ'),
 ('language', 'NN'),
 ('are', 'VBP'),
 ('arbitrary2', 'RB'),
 ('because', 'IN'),
 ('it', 'PRP'),
 ('', 'VBZ'),
 ('s', 'JJ'),
 ('purely', 'RB'),
 ('a', 'DT'),
 ('cultural-social', 'JJ'),
 ('convention', 'NN'),
 ('that', 'IN'),
 ('assigns', 'VBZ'),
 ('phonemes', 'NNS'),
 ('to', 'TO'),
 ('concepts', 'NNS'),
 ('.', '.'),
 ('Likewise', 'NNP'),
 (',', ','),
 ('it', 'PRP'),
 ('', 'VBZ'),
 ('s', 'JJ'),
 ('purely', 'RB'),
 ('a', 'DT'),
 ('cultural', 'JJ'),
 ('convention', 'NN'),
 ('to', 'TO'),
 ('assign', 'VB'),
 ('symbols', 'NNS'),
 ('to', 'TO'),
 ('machine', 'NN'),
 ('operations', 'NNS'),
 ('.', '.'),
 ('But', 'CC'),
 ('just', 'RB'),
 ('as', 'IN'),
 ('the', 'DT'),
 ('cultural', 'JJ'),
 ('choice', 'NN'),
 ('of', 'IN'),
 ('phonemes', 'NNS'),
 ('in', 'IN'),
 ('spoken', 'JJ'),
 ('language', 'NN'),
 ('is', 'VBZ'),
 ('restrained', 'VBN'),
 ('by', 'IN'),
 ('what', 'WP'),
 ('the', 'DT'),
 ('human', 'JJ'),
 ('voice', 'NN'),
 ('can', 'MD'),
 ('pronounce', 'VB'),
 (',', ','),
 ('the', 'DT'),
 ('assignment', 'NN'),
 ('of', 'IN'),
 ('symbols', 'NNS'),
 ('to', 'TO'),
 ('machine', 'NN'),
 ('operations', 'NNS'),
 ('is', 'VBZ'),
 ('limited', 'VBN'),
 ('to', 'TO'),
 ('what', 'WP'),
 ('can', 'MD'),
 ('be', 'VB'),
 ('efficiently', 'RB'),
 ('processed', 'VBN'),
 ('by', 'IN'),
 ('the', 'DT'),
 ('machine', 'NN'),
 ('and', 'CC'),
 ('of', 'IN'),
 ('good', 'JJ'),
 ('use', 'NN'),
 ('to', 'TO'),
 ('humans.3', 'VB'),
 ('This', 'DT'),
 ('compromise', 'NN'),
 ('between', 'IN'),
 ('operability', 'NN'),
 ('and', 'CC'),
 ('usability', 'NN'),
 ('is', 'VBZ'),
 ('obvious', 'JJ'),
 ('in', 'IN'),
 (',', ','),
 ('for', 'IN'),
 ('example', 'NN'),
 (',', ','),
 ('Unix', 'NNP'),
 ('commands', 'VBZ'),
 ('.', '.'),
 ('Originally', 'RB'),
 ('used', 'VBN'),
 ('on', 'IN'),
 ('teletype', 'NN'),
 ('terminals', 'NNS'),
 (',', ','),
 ('the', 'DT'),
 ('operation', 'NN'),
 ('“', 'NNP'),
 ('copy', 'NN'),
 ('”', 'NN'),
 ('was', 'VBD'),
 ('abbreviated', 'VBN'),
 ('to', 'TO'),
 ('the', 'DT'),
 ('command', 'NN'),
 ('“', 'NNP'),
 ('cp', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('move', 'NN'),
 ('”', 'NN'),
 ('to', 'TO'),
 ('“', 'VB'),
 ('mv', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('“', 'NNP'),
 ('list', 'NN'),
 ('”', 'NN'),
 ('to', 'TO'),
 ('“', 'VB'),
 ('ls', 'NN'),
 (',', ','),
 ('”', 'NNP'),
 ('etc.', 'NN'),
 (',', ','),
 ('in', 'IN'),
 ('order', 'NN'),
 ('to', 'TO'),
 ('cut', 'VB'),
 ('down', 'RP'),
 ('machine', 'NN'),
 ('memory', 'NN'),
 ('use', 'NN'),
 (',', ','),
 ('teletype', 'JJ'),
 ('paper', 'NN'),
 ('consumption', 'NN'),
 (',', ','),
 ('and', 'CC'),
 ('human', 'JJ'),
 ('typing', 'VBG'),
 ('effort', 'NN'),
 ('at', 'IN'),
 ('the', 'DT'),
 ('same', 'JJ'),
 ('time', 'NN'),
 ('.', '.'),
 ('Any', 'DT'),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('language', 'NN'),
 ('is', 'VBZ'),
 ('thus', 'RB'),
 ('a', 'DT'),
 ('cultural', 'JJ'),
 ('compromise', 'NN'),
 ('between', 'IN'),
 ('the', 'DT'),
 ('constraints', 'NNS'),
 ('of', 'IN'),
 ('machine', 'NN'),
 ('design—which', 'NN'),
 ('is', 'VBZ'),
 ('far', 'RB'),
 ('from', 'IN'),
 ('objective', 'JJ'),
 (',', ','),
 ('but', 'CC'),
 ('based', 'VBN'),
 ('on', 'IN'),
 ('human', 'JJ'),
 ('choices', 'NNS'),
 (',', ','),
 ('culture', 'NN'),
 (',', ','),
 ('and', 'CC'),
 ('thinking', 'VBG'),
 ('style', 'NN'),
 ('itself', 'PRP'),
 ('4—and', 'CD'),
 ('the', 'DT'),
 ('equally', 'RB'),
 ('subjective', 'JJ'),
 ('user', 'NN'),
 ('preferences', 'NNS'),
 (',', ','),
 ('involving', 'VBG'),
 ('fuzzy', 'JJ'),
 ('factors', 'NNS'),
 ('like', 'IN'),
 ('readability', 'NN'),
 (',', ','),
 ('elegance', 'NN'),
 (',', ','),
 ('and', 'CC'),
 ('usage', 'JJ'),
 ('efficiency', 'NN'),
 ('.', '.'),
 ('The', 'DT'),
 ('symbols', 'NNS'),
 ('of', 'IN'),
 ('computer', 'NN'),
 ('control', 'NN'),
 ('languages', 'VBZ'),
 ('inevitably', 'RB'),
 ('do', 'VBP'),
 ('have', 'VB'),
 ('semantic', 'JJ'),
 ('connotations', 'NNS'),
 ('simply', 'RB'),
 ('because', 'IN'),
 ('there', 'EX'),
 ('exist', 'VBP'),
 ('no', 'DT'),
 ('symbols', 'NNS'),
 ('with', 'IN'),
 ('which', 'WDT'),
 ('humans', 'NNS'),
 ('would', 'MD'),
 ('not', 'RB'),
 ('associate', 'VB'),
 ('some', 'DT'),
 ('meaning', 'NN'),
 ('.', '.'),
 ('But', 'CC'),
 ('symbols', 'NNS'),
 ('can', 'MD'),
 ('', 'VB'),
 ('t', 'JJ'),
 ('denote', 'NN'),
 ('any', 'DT'),
 ('semantic', 'JJ'),
 ('statements', 'NNS'),
 (',', ','),
 ('that', 'DT'),
 ('is', 'VBZ'),
 (',', ','),
 ('they', 'PRP'),
 ('do', 'VBP'),
 ('not', 'RB'),
 ('express', 'VB'),
 ('meaning', 'VBG'),
 ('in', 'IN'),
 ('their', 'PRP$'),
 ('own', 'JJ'),
 ('terms', 'NNS'),
 (';', ':'),
 ('humans', 'NNS'),
 ('metaphorically', 'RB'),
 ('read', 'VB'),
 ('meaning', 'VBG'),
 ('into', 'IN'),
 ('them', 'PRP'),
 ('through', 'IN'),
 ('associations', 'NNS'),
 ('they', 'PRP'),
 ('make', 'VBP'),
 ('.', '.'),
 ('Languages', 'NNS'),
 ('without', 'IN'),
 ('semantic', 'JJ'),
 ('denotation', 'NN'),
 ('are', 'VBP'),
 ('not', 'RB'),
 ('historically', 'RB'),
 ('new', 'JJ'),
 ('phenomena', 'NNS'),
 (';', ':'),
 ('mathematical', 'JJ'),
 ('formulas', 'NNS'),
 ('are', 'VBP'),
 ('their', 'PRP$'),
 ('oldest', 'JJS'),
 ('example', 'NN'),
 ('.', '.'),
 ('In', 'IN'),
 ('comparison', 'NN'),
 ('to', 'TO'),
 ('common', 'JJ'),
 ('human', 'JJ'),
 ('languages', 'NNS'),
 (',', ','),
 ('the', 'DT'),
 ('multitude', 'NN'),
 ('of', 'IN'),
 ('programming', 'VBG'),
 ('languages', 'NNS'),
 ('is', 'VBZ'),
 ('of', 'IN'),
 ('lesser', 'JJR'),
 ('significance', 'NN'),
 ('.', '.'),
 ('The', 'DT'),
 ('criterion', 'NN'),
 ('of', 'IN'),
 ('Turing', 'NNP'),
 ('completeness', 'NN'),
 ('of', 'IN'),
 ('a', 'DT'),
 ('programming', 'NN'),
 ('language', 'NN'),
 (',', ','),
 ('that', 'WDT'),
 ('is', 'VBZ'),
 (',', ','),
 ('that', 'IN'),
 ('any', 'DT'),
 ('computation', 'NN'),
 ('can', 'MD'),
 ('be', 'VB'),
 ('expressed', 'VBN'),
 ('in', 'IN'),
 ('it', 'PRP'),
 (',', ','),
 ('means', 'VBZ'),
 ('that', 'IN'),
 ('every', 'DT'),
 ('programming', 'NN'),
 ('language', 'NN'),
 ('is', 'VBZ'),
 (',', ','),
 ('formally', 'RB'),
 ('speaking', 'VBG'),
 (',', ','),
 ('just', 'RB'),
 ('a', 'DT'),
 ('riff', 'NN'),
 ('on', 'IN'),
 ('every', 'DT'),
 ('other', 'JJ'),
 ('programming', 'NN'),
 ('language', 'NN'),
 ('.', '.'),
 ('Nothing', 'NN'),
 ('can', 'MD'),
 ('be', 'VB'),
 ('expressed', 'VBN'),
 ('in', 'IN'),
 ('a', 'DT'),
 ('Turingcomplete', 'JJ'),
 ('language', 'NN'),
 ('such', 'JJ'),
 ('as', 'IN'),
 ('C', 'NNP'),
 ('that', 'IN'),
 ('couldn', 'NN'),
 ('', 'NNP'),
 ('t', 'NN'),
 ('also', 'RB'),
 ('be', 'VB'),
 ('expressed', 'VBN'),
 ('in', 'IN'),
 ('another', 'DT'),
 ('Turingcomplete', 'NNP'),
 ('language', 'NN'),
 ('such', 'JJ'),
 ('as', 'IN'),
 ('Lisp', 'NNP'),
 ('(', '('),
 ('or', 'CC'),
 ('Fortran', 'NNP'),
 (',', ','),
 ('Smalltalk', 'NNP'),
 (',', ','),
 ('Java', 'NNP'),
 ('...', ':'),
 (')', ')'),
 ('and', 'CC'),
 ('vice', 'NN'),
 ('versa', 'NN'),
 ('.', '.'),
 ('This', 'DT'),
 ('ultimately', 'JJ'),
 ('proves', 'VBZ'),
 ('the', 'DT'),
 ...]
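
A long list of (word, tag) pairs like the one above is hard to read directly. One way to summarize it, sketched here with collections.Counter from the Python standard library, is to count how often each tag occurs and which nouns appear most. The tagged list below is a short excerpt copied from the output above so that the example runs on its own; in the notebook you would use the full tagged variable instead.

```python
from collections import Counter

# Short excerpt of the tagged text shown above
tagged = [('Software', 'NNP'), ('and', 'CC'), ('language', 'NN'),
          ('are', 'VBP'), ('intrinsically', 'RB'), ('related', 'VBN'),
          (',', ','), ('since', 'IN'), ('software', 'NN'), ('may', 'MD'),
          ('process', 'VB'), ('language', 'NN')]

# How often does each tag occur?
tag_counts = Counter(tag for word, tag in tagged)
print(tag_counts.most_common(1))  # [('NN', 3)]

# Which nouns occur most often (case-insensitive)?
noun_counts = Counter(word.lower() for word, tag in tagged if tag.startswith('NN'))
print(noun_counts)
```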