# Patterns (part 2)

## Searching for patterns

For this notebook, we will use snippets from *Language and Software Studies*, a text by Florian Cramer written in 2005: https://monoskop.org/images/f/f9/Cramer_Florian_Anti-Media_Ephemera_on_Speculative_Arts_2013.pdf#142

It was published in the book *Anti Media* (2013): https://monoskop.org/log/?p=20259

In [14]:
lines = [
    'Software and language are intrinsically related, since software may process language, and is constructed in language.',
    'Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages.',
    'There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls.',
    'In the case of compilers, shells, and macro languages, for example, these layers can overlap.',
    '“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.'
]

In [15]:
for line in lines:
    if 'software' in line:
        print('-----')
        print(line)

-----
Software and language are intrinsically related, since software may process language, and is constructed in language.
-----
Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages.
-----
There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls.
-----
“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.


In [16]:
lines = [
    'Software and language are intrinsically related.',
    'Yet language means different things in the context of computing.',
    'There are at least two layers of formal language in software.',
    'These layers can overlap.',
    '“Natural” language is what can be processed as data by software..'
]

In [17]:
for line in lines:
    if 'software' in line:
        print('-----')
        print(line)

-----
There are at least two layers of formal language in software.
-----
“Natural” language is what can be processed as data by software..


In [20]:
# Software != software
for line in lines:
    if 'software' in line.lower():
        print('-----')
        print(line)

-----
Software and language are intrinsically related.
-----
There are at least two layers of formal language in software.
-----
“Natural” language is what can be processed as data by software..


## Software != software

In [27]:
a = 'software'
b = 'Software'
if a == b:
    print('They are the same!')
else:
    print('Nope, not the same...')

Nope, not the same...


In [28]:
a.upper()

'SOFTWARE'

In [70]:
b.lower()

'software'

In [29]:
a.capitalize()

'Software'

In [31]:
lines[0].title()

'Software And Language Are Intrinsically Related.'

In [33]:
b.swapcase()

'sOFTWARE'
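
Each of these methods returns a new string and leaves the original untouched. That is why lowering both sides is a common way to compare text without caring about capitals. A minimal sketch, assuming a hypothetical helper called `contains_ignoring_case` and a short sample list (neither is part of the notebook above):

```python
# Hypothetical helper for illustration; the name is an assumption, not a built-in.
def contains_ignoring_case(line, pattern):
    """Return True if pattern occurs in line, ignoring upper/lower case."""
    return pattern.lower() in line.lower()

sample_lines = [
    'Software and language are intrinsically related.',
    'These layers can overlap.',
]

# Keep only the lines that mention the pattern, whatever its capitalisation
matches = [line for line in sample_lines if contains_ignoring_case(line, 'SOFTWARE')]
print(matches)

# The original strings are unchanged by .lower()
print(sample_lines[0])
```

For text with characters beyond plain English letters, `str.casefold()` is a stricter alternative to `lower()`.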

## From string to lines: split()

In [119]:
txt = 'Software and language are intrinsically related, since software may process language, and is constructed in language. Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages. There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls. In the case of compilers, shells, and macro languages, for example, these layers can overlap. “Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.'
print(txt)

Software and language are intrinsically related, since software may process language, and is constructed in language. Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages. There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls. In the case of compilers, shells, and macro languages, for example, these layers can overlap. “Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.


In [126]:
lines = txt.split('. ')
for line in lines:
    print('---')
    print(line)

---
Software and language are intrinsically related, since software may process language, and is constructed in language
---
Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages
---
There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls
---
In the case of compilers, shells, and macro languages, for example, these layers can overlap
---
“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.
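
Note that `split('. ')` removes the separator itself, so every line except the last one loses its full stop. If you want to keep the punctuation, one possible approach is a regular expression that splits on the whitespace *after* a full stop. This is a rough sketch with a made-up sample string; it will still split wrongly on abbreviations such as "e.g. ":

```python
import re

sample = 'Software and language are related. These layers can overlap. Natural language is data.'

# (?<=\.) is a lookbehind: it requires a '.' before the whitespace
# but does not consume it, so the full stop stays with its sentence.
sentences = re.split(r'(?<=\.)\s+', sample)

for sentence in sentences:
    print('---')
    print(sentence)
```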


## From lines to string: ' '.join()

In [None]:
lines = [
    'Software and language are intrinsically related.',
    'Yet language means different things in the context of computing.',
    'There are at least two layers of formal language in software.',
    'These layers can overlap.',
    '“Natural” language is what can be processed as data by software..'
]

In [130]:
txt = ' ----- '.join(lines)
print(txt)

Software and language are intrinsically related. ----- Yet language means different things in the context of computing. ----- There are at least two layers of formal language in software. ----- These layers can overlap. ----- “Natural” language is what can be processed as data by software..
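
`join()` is called on the separator string, not on the list, and it is the mirror image of `split()`. A small sketch with sample data (the variable names are only for illustration) that joins lines with a newline and splits them apart again:

```python
sample_lines = [
    'These layers can overlap.',
    'Software and language are intrinsically related.',
]

# Glue the items together with a newline between them
joined = '\n'.join(sample_lines)
print(joined)

# Splitting on the same separator gives the original list back
restored = joined.split('\n')
print(restored == sample_lines)
```

This round trip only works when the separator does not already occur inside one of the items.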


---------------------------------------

## From string to words: split(), strip()

In [55]:
lines = [
    'Software and language are intrinsically related, since software may process language, and is constructed in language.',
    'Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages.',
    'There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls.',
    'In the case of compilers, shells, and macro languages, for example, these layers can overlap.',
    '“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.'
]
storage = []
for line in lines:
    words = line.split()
    for word in words:
        #word = word.strip('.,:;')
        print(word)
        storage.append(word)

Software
and
language
are
intrinsically
related,
since
software
may
process
language,
and
is
constructed
in
language.
Yet
language
means
different
things
in
the
context
of
computing:
formal
languages
in
which
algorithms
are
expressed
and
software
is
implemented,
and
in
so-called
“natural”
spoken
languages.
There
are
at
least
two
layers
of
formal
language
in
software:
programming
language
in
which
the
software
is
written,
and
the
language
implemented
within
the
software
as
its
symbolic
controls.
In
the
case
of
compilers,
shells,
and
macro
languages,
for
example,
these
layers
can
overlap.
“Natural”
language
is
what
can
be
processed
as
data
by
software;
since
this
processing
is
formal,
however,
it
is
restricted
to
syntactical
operations.


In [69]:
# Bag of words
print(storage)

['Software', 'and', 'language', 'are', 'intrinsically', 'related,', 'since', 'software', 'may', 'process', 'language,', 'and', 'is', 'constructed', 'in', 'language.', 'Yet', 'language', 'means', 'different', 'things', 'in', 'the', 'context', 'of', 'computing:', 'formal', 'languages', 'in', 'which', 'algorithms', 'are', 'expressed', 'and', 'software', 'is', 'implemented,', 'and', 'in', 'so-called', '“natural”', 'spoken', 'languages.', 'There', 'are', 'at', 'least', 'two', 'layers', 'of', 'formal', 'language', 'in', 'software:', 'programming', 'language', 'in', 'which', 'the', 'software', 'is', 'written,', 'and', 'the', 'language', 'implemented', 'within', 'the', 'software', 'as', 'its', 'symbolic', 'controls.', 'In', 'the', 'case', 'of', 'compilers,', 'shells,', 'and', 'macro', 'languages,', 'for', 'example,', 'these', 'layers', 'can', 'overlap.', '“Natural”', 'language', 'is', 'what', 'can', 'be', 'processed', 'as', 'data', 'by', 'software;', 'since', 'this', 'processing', 'is'

In [57]:
storage.count('language')

6
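
`count()` only counts exact matches, so items like 'language,' and 'language.' are not included in the 6 above. One way to get a fuller count is to strip the punctuation and lowercase each word first, and then let `collections.Counter` do the counting. A minimal sketch on a single sample sentence (not the full `storage` list):

```python
from collections import Counter

sample = 'Software and language are related, since software may process language, and is constructed in language.'

# Remove surrounding punctuation and ignore capitals before counting
cleaned = [word.strip('.,:;').lower() for word in sample.split()]
counts = Counter(cleaned)

print(counts['language'])
print(counts['software'])
print(counts.most_common(3))
```

`strip('.,:;')` only removes these characters from the start and end of a word, so a hyphenated word like 'so-called' stays intact.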

In [59]:
for word in storage:
    if 'language' in word:
        print(word)

language
language,
language.
language
languages
languages.
language
language
language
languages,
language


In [64]:
for word in storage:
    if word.endswith('ing'):
        print(word)

programming
processing


In [67]:
for word in storage:
    if len(word) < 3:
        print(word)

is
in
in
of
in
is
in
at
of
in
in
is
as
In
of
is
be
as
by
is
it
is
to


In [66]:
# Extra: print previous or next words in the storage list
# enumerate() gives the position of every word, so index - 1 is the word before it
# (use index + 1 for the word after it)
for index, word in enumerate(storage):
    if 'language' in word and index > 0:
        type_of_language = storage[index - 1]
        print(type_of_language, word)

and language
process language,
in language.
Yet language
formal languages
spoken languages.
formal language
programming language
the language
macro languages,
“Natural” language
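
To look at the word that comes *after* a match instead, one option is to walk through the list in pairs with `zip()`. A small sketch on a made-up sample sentence (the variable names are assumptions for illustration):

```python
sample = 'formal language in software: programming language is written, and macro languages overlap'
words = sample.split()

# zip(words, words[1:]) pairs every word with the word that follows it
following_words = []
for current, following in zip(words, words[1:]):
    if 'language' in current:
        print(current, following)
        following_words.append(following)
```

Because the second list starts one position later, the last word is never used as `current`, so there is no risk of reading past the end of the list.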


-----------------------------------------

## open(), read(), readlines()

Let's work with some more lines.

For this we will use the plain-text version of the same text, saved as language.txt.

In [72]:
# read()
filename = 'language.txt'
txt = open(filename, 'r').read()
print(txt)

Language

Florian Cramer 

Software and language are intrinsically related, since software may process language, and is constructed in language.
Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages.
There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls.
In the case of compilers, shells, and macro languages, for example, these layers can overlap.
“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.
While differentiation of computer programming languages as “artificial languages” from languages like English as “natural languages” is conceptually important and undisputed, it remains problematic in its pure te

In [131]:
# readlines()
filename = 'language.txt'
lines = open(filename, 'r').readlines()
print(lines)

['Language\n', '\n', 'Florian Cramer \n', '\n', 'Software and language are intrinsically related, since software may process language, and is constructed in language.\n', 'Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages.\n', 'There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls.\n', 'In the case of compilers, shells, and macro languages, for example, these layers can overlap.\n', '“Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations.\n', 'While differentiation of computer programming languages as “artificial languages” from languages like English as “natural languages” is conceptually important and un
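
As the list shows, `readlines()` keeps the `\n` at the end of every line and also returns the empty lines. A possible way to tidy this up is sketched below. It first writes a small stand-in file so that it runs on its own; the file name and its content are assumptions, not the real language.txt:

```python
import os
import tempfile

# Create a small stand-in file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Language\n\nFlorian Cramer \n\nThese layers can overlap.\n')

# The with-statement closes the file automatically when the block ends.
# strip() removes the newline and other whitespace around each line;
# the "if" skips lines that are empty after stripping.
with open(path, 'r', encoding='utf-8') as f:
    clean_lines = [line.strip() for line in f if line.strip()]

print(clean_lines)
```

Passing `encoding='utf-8'` explicitly helps to read curly quotes such as “ and ” the same way on every computer.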

------------------------------------------

## read() & readlines() from an etherpad

Let's also use another input source: an etherpad.

In [101]:
# You can read the content of an etherpad, by using the /export/txt function of etherpad
# See the end of the url below:

from urllib.request import urlopen

url = 'https://pad.xpub.nl/p/archivefever/export/txt'

response = urlopen(url)
#print(response)

txt = response.read()
#txt = response.read().decode('UTF-8')
print(txt)

b'\n\n\n\n\n\nXPUB 1\nPost digital itch\n\n\n\n\n\n\n\n\nSWARM\n01\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nAn annotated version of:\n\nArchive Fever: A Freudian Impression\nAuthor(s): Jacques Derrida and Eric Prenowitz\nSource: Diacritics, Vol. 25, No. 2 (Summer, 1995), pp. 9-63\n\n\n\n\n\n\n\n\n\n\n\nGLOSSARY OF TERMS\n\n1 Commencement\nan act or instance of commencing; beginning.\n\n2 Commandment\na command or mandate.\n\n3 Ontological\nof or relating to ontology, the branch of metaphysics that studies the nature of existence or being as such; metaphysical.\n\n4 Nomology\nthe science of law or laws.\n\n5 Arkh\xc3\xa9\nFrom \xe1\xbc\x84\xcf\x81\xcf\x87\xcf\x89 (\xc3\xa1rkh\xc5\x8d, \xe2\x80\x9cto begin\xe2\x80\x9d) +\xe2\x80\x8e -\xce\xb7 (-\xc4\x93, verbal noun suffix)\n\t1. beginning, origin\n\t2. sovereignty, dominion, authority (in plural: \xe1\xbc\x80\xcf\x81\xcf\x87\xce\xb1\xce\xaf)\n\n6 Jussive\nform, mood, case, construction, or word expressing a command.\n\n7 Cleav
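
The `b'...'` at the start of the output means that `response.read()` returned bytes, not a string, which is why accented letters and curly quotes show up as codes like `\xc3\xa9`. Decoding the bytes as UTF-8 turns them into readable text. A minimal offline sketch, using a few bytes copied from the output above as sample data:

```python
# A short bytes value, taken from the pad output above
raw = b'5 Arkh\xc3\xa9\n\xe2\x80\x9cto begin\xe2\x80\x9d'

# decode() converts bytes into a str, using the given encoding
text = raw.decode('utf-8')

print(type(raw).__name__, type(text).__name__)
print(text)
```

String methods that take a str argument, such as `'é' in text`, work on the decoded text but raise a TypeError on the raw bytes.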




In [117]:
# To come back to searching for patterns: you can use if/else statements to search for annotation symbols in one of the pads you're annotating at the moment!
from urllib.request import urlopen

url = 'https://pad.xpub.nl/p/!%E2%80%93Nina_Power/export/txt'
response = urlopen(url)

lines = response.readlines()

for line in lines:
    line = line.decode('UTF-8')
        
    if 'üò°' in line:
        print(line)

In the era of emojis[üò°üò°], we have forgotten about the politics of punctuation. Which mark or sign holds sway[0] over us in the age of Twitter, Facebook, YouTube comments, emails, and text messages? If we take the tweets of Donald Trump as some kind of symptomatic indicator[9], we can see quite well that it is the exclamation mark*** [6] ‚Äì ! ‚Äì that dominates. A quick look at his tweets from the last 48 hour period shows that almost all of them end with a single declarative sentence or word followed by a ‚Äò!‚Äô: ‚ÄòBig trade imbalance!‚Äô, ‚ÄòNo more!‚Äô, ‚ÄòThey‚Äôve gone CRAZY!‚Äô, ‚ÄòHappy National Anthem Day!‚Äô, ‚ÄòREST IN PEACE BILLY GRAHAM!‚Äô, [1]‚ÄòIF YOU DON‚ÄôT HAVE STEEL, YOU DON‚ÄôT HAVE A COUNTRY!‚Äô, (we shall leave the matter of all caps for another time), ‚Äò$800 Billion Trade Deficit-have no choice!, ‚ÄòJobless claims at a 49 year low!‚Äô and so on ‚Ä¶ you get the picture. Trump‚Äôs exclamation mark is the equivalent of a boss slamming [8] his fist down on th

In [112]:
# Try to search for more patterns in a pad!
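
As a starting point, here is one possible sketch that searches for several annotation symbols at once. It uses a short list of sample lines instead of a live pad so that it runs offline; the lines and the chosen patterns are assumptions for illustration:

```python
# Sample lines standing in for the content of a pad
pad_lines = [
    'Which mark or sign holds sway[0] over us?',
    'It is the exclamation mark*** [6] that dominates.',
    'No annotations on this line.',
]

# The annotation symbols to look for (hypothetical choice)
patterns = ['***', '[6]', '[0]']

results = []
for line in pad_lines:
    # Collect every pattern that occurs in this line
    found = [pattern for pattern in patterns if pattern in line]
    if found:
        print(found, line)
        results.append((found, line))
```

To try this on a real pad, replace `pad_lines` with the decoded lines from `response.readlines()`.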