master
Rita Graça 5 years ago
commit 7769924d64

File diff suppressed because it is too large Load Diff

@ -0,0 +1,55 @@
import collections
# this script was adapted from:
# https://towardsdatascience.com/very-simple-python-script-for-extracting-most-common-words-from-a-story-1e3570d0b9d0

# open and read the file; the with-block closes it automatically
with open(input("\nFile you want to categorize is: \n"), encoding="utf8") as file:
    a = file.read()

# my stopwords are common words I don't want to count, like "a", "an", "the".
with open('stopwords.txt', encoding="utf8") as f:
    stopwords = set(line.strip() for line in f)

# dictionary
wordcount = {}
# splitting words from punctuation so "book" and "book!" count as the same word
for word in a.lower().split():
    for mark in '.,:"!“‘*':
        word = word.replace(mark, "")
    # counting (skip stopwords and tokens that were only punctuation)
    if word and word not in stopwords:
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1

# print x most common words
# n_print = int(input("How many most common words to print: "))
n_print = 3
print("\nMost common words are:")
word_counter = collections.Counter(wordcount)
for word, count in word_counter.most_common(n_print):
    print(word, "", count)

# categories
# words that are inside the category Library Studies
with open('library_studies.txt', encoding="utf8") as f:
    library_studies = set(line.strip() for line in f)
# for/else: the else branch runs only if the loop ends without a break
for word, count in word_counter.most_common(n_print):
    if word in library_studies:
        print("\nWe suggest the following categorization for this file:\nLibrary Studies\n")
        break
else:
    print("\nWe don't have any suggestion of categorization for this file.\n")
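
Note on the counting step: the script removes a fixed handful of characters, so tokens such as "book?" or "library”" would still be counted separately from "book" and "library". A minimal sketch of one alternative is below. It is not part of this commit; the function name count_words and the EDGE_PUNCTUATION constant are hypothetical, and the extra curly quotes and dashes are an assumption about what the input text contains. It trims punctuation only from the ends of each token, so "e-book" stays intact.

```python
import collections
import string

# Characters trimmed from both ends of a token. The escapes are curly quotes,
# dashes and the ellipsis; including them is an assumption about the input.
EDGE_PUNCTUATION = string.punctuation + "\u201c\u201d\u2018\u2019\u2014\u2013\u2026"


def count_words(text, stopwords):
    """Count lowercased tokens, skipping stopwords and bare punctuation."""
    counter = collections.Counter()
    for token in text.lower().split():
        word = token.strip(EDGE_PUNCTUATION)
        if word and word not in stopwords:
            counter[word] += 1
    return counter
```

With the variables from the script, count_words(a, stopwords).most_common(n_print) would give the same kind of (word, count) pairs that the printing loop iterates over.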

Binary file not shown.



@ -0,0 +1,19 @@
archives
author
bibliographic
bibliotheca
book
bookcase
books
bookshelf
bookstore
catalogue
e-book
librarian
librarianship
library
literature
manuscripts
papyrus
read
reading
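
Note on the category list: the script checks the most common words against this single list. If more word lists were added later, the lookup could be generalized roughly as sketched below. This is an illustration under assumptions, not code from the commit: suggest_categories is a hypothetical helper, the "Library Studies" set is a small excerpt of the list above, and the "Music" category is invented for the example.

```python
def suggest_categories(top_words, categories):
    """Return the names of categories that share a word with top_words."""
    top = set(top_words)
    return [name for name, vocabulary in categories.items() if top & vocabulary]


categories = {
    # excerpt of library_studies.txt
    "Library Studies": {"archives", "book", "books", "library", "reading"},
    # hypothetical second category, for illustration only
    "Music": {"album", "song", "vinyl"},
}

print(suggest_categories(["books", "library", "digital"], categories))
```

An empty result corresponds to the script's "no suggestion" branch.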


@ -0,0 +1,46 @@
Own Nothing
Balázs Bodó
Flow My Tears
My tears cut deep grooves into the dust on my face. Drip, drip, drop, they hit the floor and disappear among the torn pages scattered on the floor.
This year it dawned on us that we cannot postpone it any longer: our personal library has to go. Our family moved countries more than half a decade ago, we switched cultures, languages, and chose another future. But the past, in the form of a few thousand books in our personal library, was still neatly stacked in our old apartment, patiently waiting, books that we bought and enjoyed — and forgot; books that we bought and never opened; books that we inherited from long-dead parents and half-forgotten friends. Some of them were important. Others were relevant at one point but no longer, yet they still reminded us who we once were.
When we moved, we took no more than two suitcases of personal belongings. The books were left behind. The library was like a sick child or an ailing parent, it hung over our heads like an unspoken threat, a curse. It was clear that sooner or later something had to be done about it, but none of the options available offered any consolation. It made no sense to move three thousand books to the other side of this continent. We decided to emigrate, and not to take our past with us, abandon the contexts we were fleeing from. We made a choice to leave behind the history, the discourses, the problems and the pain that accumulated in the books of our library. I knew exactly what it was I didn't want to teach to my children once we moved. So we did not move the books. We pretended that we would never have to think about what this decision really meant. Up until today. This year we needed to empty the study with the shelves. So I'm standing in our library now, the dust covering my face, my hands, my clothes. In the middle of the floor there are three big crates and one small box. The small box swallows what we'll ultimately take with us, the books I want to show to my son when he gets older, in case he still wants to read. One of the big crates will be taken away by the antiquarian. The other will be given to the school library next door. The third is the wastebasket, where everything else will ultimately go.
Drip, drip, drip, my tears flow as I throw the books into this last crate, drip, drip, drop. Sometimes I look at my partner, working next to me, and I can see on her face that she is going through the same emotions. I sometimes catch the sight of her trembling hand, hesitating for a split second where a book should ultimately go, whether we could, whether we should save that particular one, because... But we either save them all or we are as ruthless as all those millions of people throughout history, who had an hour to pack their two suitcases before they needed to leave. Do we truly need this book? Is this a book we'll want to read? Is this book an inseparable part of our identity? Did we miss this book at all in the last five years? Is this a text I want to preserve for the future, for potential grandchildren who may not speak my mother tongue at all? What is the function of the book? What is the function of this particular book in my life? Why am I hesitating throwing it out? Why should I hesitate at all? Drop, drop, drop, a decision has been made. Drop, drop, drop, books are falling to the bottom of the crates.
We are killers, gutting our library. We are like the half-drowned sailor, who got entangled in the ropes, and went down with the ship, and who now frantically tries to cut himself free from the detritus that prevents him from reaching the freedom of the surface, the sunlight and the air.
Own Nothing, Have Everything
Do you remember Napster's slogan after it went legit, trying to transform itself into a legal music service around 2005? "Own nothing, have everything": that was the headline that was supposed to sell legal streaming music. How stupid, I thought. How could you possibly think that lack of ownership would be a good selling point? What does it even mean to have everything without ownership? And why on earth would not everyone want to own the most important constituents of their own self, their own identity? The things I read, the things I sing, make me who I am. Why wouldn't I want to own these things?
How revolutionary this idea had been, I reflected as I watched the local homeless folks filling up their sacks with the remains of my library. How happy I would be if I could have all this stuff I had just thrown away without actually having to own any of it. The proliferation of digital texts led me to believe that we won't be needing dead wood libraries at all, at least no more than we need vinyl to listen to, or collect music. There might be geeks, collectors, specialists, who for one reason or another still prefer the physical form to the digital, but for the rest of us convenience, price, searchability, and all the other digital goodies give enough reason not to collect stuff that collects dust.
I was wrong to think that. I now realize that the future is not fully digital, it is more a physical-digital hybrid, in which the printed book is not simply an endangered species protected by a few devoted eccentrics who refuse to embrace the obvious advantages of a fully digital book future. What I see now is the emergence of a strange and shapeshifting-hybrid of diverse physical and electronic objects and practices, where the relative strengths and weaknesses of these different formats nicely complement each other.
This dawned on me after we had moved into an apartment without a bookshelf. I grew up in a flat that housed my parents' extensive book collection. I knew the books by their cover and from time to time something made me want to take it from the shelf, open it and read it. This is how I discovered many of my favorite books and writers. With the e-reader, and some of the best shadow libraries at hand, I felt the same at first. I felt liberated. I could experiment without cost or risk, I could start, or stop, a book, I didn't have to consider the cost of buying and storing a book that was ultimately not meant for me. I could enjoy the books without having to carry the burden and responsibility of ownership.
Did you notice how deleting an epub file gives you a different feeling than throwing out a book? You don't have to feel guilty, you don't have to feel anything at all.
So I was reading, reading, reading like never before. But at that time my son was too young to read, so I didn't have to think about him, or anyone else besides myself. But as he was growing, it slowly dawned on me: without these physical books how will I be able to give him the same chance of serendipity, and of discovery, enchantment, and immersion that I got in my father's library? And even later, what will I give him as his heritage? Son, look into this folder of PDFs: this is my legacy, your heritage, explore, enjoy, take pride in it?
Collections of anything, whether they are art, books, objects, people, are inseparable from the person who assembled that collection, and when that person is gone, the collection dies, as does the most important inroad to it: the will that created this particular order of things has passed away. But the heavy and unavoidable physicality of a book collection forces all those left behind to make an effort to approach, to force their way into, and try to navigate that garden of forking paths that is someone else's library. Even if you ultimately get rid of everything, you have to introduce yourself to every book, and let every book introduce itself to you, so you know what you're throwing out. Even if you'll ultimately kill, you will need to look into the eyes of all your victims.
With a digital collection that's, of course, not the case.
The e-book is ephemeral. It has little past and even less chance to preserve the fingerprints of its owners over time. It is impersonal, efficient, fast, abundant, like fast food or plastic, it flows through the hand like sand. It lacks the embodiment, the materiality which would give it a life in a temporal dimension. If you want to network the dead and the unborn, as is the ambition of every book, then you need to print and bind, and create heavy objects that are expensive, inefficient and a burden. This burden subsiding in the object is the bridge that creates the intergenerational dimension, that forces you to think of the value of a book.
Own nothing, have nothing. Own everything, and your children will hate you when you die.
I have to say, I'm struggling to find a new balance here. I started to buy books again, usually books that I'd already read from a stolen copy on-screen. I know what I want to buy, I know what is worth preserving. I know what I want to show to my son, what I want to pass on, what I would like to take care of over time. Before, book buying for me was an investment into a stranger. Now that thrill is gone forever. I measure up the merchandise well beforehand, I build an intimate relationship, we make love again and again, before moving in together.
It is certainly a new kind of relationship with the books I bought since I got my e-reader. I still have to come to terms with the fact that the books I bought this way are rarely opened, as I already know them, and their role is not to be read, but to be together. What do I buy, and what do I get? Temporal, existential security? The chance of serendipity, if not for me, then for the people around me? The reassuring materiality of the intimacy I built with these texts through another medium?
All of these and maybe more. But in any case, I sense that this library, the physical embodiment of a physical-electronic hybrid collection with its unopened books and overflowing e-reader memory cards, is very different from the library I had, and the library I'm getting rid of at this very moment. The library that I inherited, the library that grew organically from the detritus of the everyday, the library that accumulated books similar to how the books accumulated dust, as is the natural way of things, this library was full of unknowns, it was a library of potentiality, of opportunities, of trips waiting to happen. This new, hybrid library is a collection of things that I'm familiar with. I intimately know every piece, they hold little surprise, they offer few discoveries, at least for me. The exploration, the discovery, the serendipity, the pre-screening takes place on the e-reader, among the ephemeral, disposable PDFs and epubs.
We Won
This new hybrid model is based on the cheap availability of digital books. In my case, the free availability of pirated copies available through shadow libraries. These libraries don't have everything on offer, but they have books in an order of magnitude larger than I'll ever have the time and chance to read, so they offer enough, enough for me to fill up hard drives with books I want to read, or at least skim, to try, to taste. As if I moved into an infinite bookstore or library, where I can be as promiscuous, explorative, nomadic as I always wanted to be. I can flirt with books, I can have a quickie, or I can leave them behind without shedding a single tear.
I don't know how this hybrid library, and this analogue-digital hybrid practice of reading and collecting would work without the shadow libraries which make everything freely accessible. I rely on their supply to test texts, and feed and grow my print library. E-books are cheaper than their print versions, but they still cost money, carry a risk, a cost of experimentation. Book-streaming, the flat-rate, the all-you-can-eat format of accessing books is at the moment only available for audiobooks, but rarely for e-books. I wonder why.
Did you notice that there are no major book piracy lawsuits?
Of course there is the lawsuit against Sci-Hub and Library Genesis in New York, and there is another one in Canada against aaaaarg, causing major nuisance to those who have been named in these cases. But this is almost negligible compared to the high profile wars the music and audiovisual industries waged against Napster, Grokster, Kazaa, megaupload and their likes. It is as if book publishers have completely given up on trying to fight piracy in the courts, and have launched a few lawsuits only to maintain the appearance that they still care about their digital copyrights. I wonder why.
I know the academic publishing industry slightly better than the mainstream popular fiction market, and I have the feeling that in the former copyright-based business models are slowly being replaced by something else. We see no major anti-piracy efforts from publishers, not because piracy is non-existent — on the contrary, it is global, and it is big — but because the publishers most probably realized that in the long run the copyright-based exclusivity model is unsustainable. The copyright wars of the last two decades taught them that law cannot put an end to piracy. As the Sci-Hub case demonstrates, you can win all you want in a New York court, but this has little real-world effect as long as the conditions that attract the users to the shadow libraries remain.
Exclusivity-based publishing business models are under assault from other sides as well. Mandated open access in the US and in the EU means that there is a quickly growing body of new research for the access of which publishers cannot charge money anymore. LibGen and Sci-Hub make it harder to charge for the back catalogue. Their sheer existence teaches millions what uncurtailed open access really is, and makes it easier for university libraries to negotiate with publishers, as they don't have to worry about their patrons being left without any access at all.
The good news is that radical open access may well be happening. It is a less and less radical idea to have things freely accessible. One has to be less and less radical to achieve the openness that has been long overdue. Maybe it is not yet obvious today and the victory is not yet universal, maybe it'll take some extra years, maybe it won't ever be evenly distributed, but it is obvious that this genie, these millions of books on everything from malaria treatments to critical theory, cannot be erased, and open access will not be undone, and the future will be free of access barriers.
Who is downloading books and articles? Everyone. Radical open access? We won, if you like.
Drip, drip, drop, it's only nostalgia. My heart is light, as I don't have to worry about gutting the library. Soon it won't matter at all.
We Are Not Winning at All
But did we really win? If publishers are happy to let go of access control and copyright, it means that they've found something that is even more profitable than selling back to us academics the content that we have produced. And this more profitable something is of course data. Did you notice where all the investment in academic publishing went in the last decade? Did you notice SSRN, Mendeley, Academia.edu, ScienceDirect, research platforms, citation software, manuscript repositories, library systems being bought up by the academic publishing industry? All these platforms and technologies operate on and support open access content, while they generate data on the creation, distribution, and use of knowledge; on individuals, researchers, students, and faculty; on institutions, departments, and programs. They produce data on the performance, on the success and the failure of the whole domain of research and education. This is the data that is being privatized, enclosed, packaged, and sold back to us.
Taylorism reached academia. In the name of efficiency, austerity, and transparency, our daily activities are measured, profiled, packaged, and sold to the highest bidder. But in this process of quantification, knowledge on ourselves is lost for us, unless we pay. We still have some patchy datasets on what we do, on who we are, we still have this blurred reflection in the data-mirrors that we still do control. But this path of self-enlightenment is quickly waning as fewer and fewer data sources about us are freely available to us.
I strongly believe that information on the self is the foundation of self-determination. We need to have data on how we operate, on what we do in order to know who we are. This is what is being privatized away from the academic community, this is being taken away from us.
Radical open access. Not of content, but of the data about ourselves. This is the next challenge. We will digitize every page, by hand if we must, that process cannot be stopped anymore. No outside power can stop it and take that from us. Drip, drip, drop, this is what I console myself with, as another handful of books land among the waste.
But the data we lose now will not be so easy to reclaim.


@ -0,0 +1,71 @@
-
a
about
all
an
and
are
as
at
be
but
by
can
do
down
for
from
get
had
has
have
he
I
i
if
in
into
is
it
its
me
more
my
not
of
on
one
or
other
out
should
so
some
such
than
that
the
their
them
then
there
these
they
this
those
to
up
was
were
what
when
which
who
whom
will
with
would
you
your
|