Training a model to recognize annotations in text.


rita 892d285569 Update 'README.md'		4 years ago
dataset	first attempt	5 years ago
README.md	Update 'README.md'	4 years ago
model.h5	first attempt	5 years ago
model.yaml	first attempt	5 years ago
predict.py	first attempt	5 years ago
requirements.txt	first attempt	5 years ago
test.jpg	first attempt	5 years ago
test.jpg.predicted.png	first attempt	5 years ago
test2.jpg	first attempt	5 years ago
test2.jpg.predicted.png	first attempt	5 years ago
test3.jpg	first attempt	5 years ago
test3.jpg.predicted.png	first attempt	5 years ago
test4.jpg	first attempt	5 years ago
test4.jpg.predicted.png	first attempt	5 years ago
test5.jpg	first attempt	5 years ago
test5.jpg.predicted.png	first attempt	5 years ago
test6.jpg	first attempt	5 years ago
test6.jpg.predicted.png	first attempt	5 years ago
test7.jpg	first attempt	5 years ago
test7.jpg.predicted.png	first attempt	5 years ago
test8.jpg	first attempt	5 years ago
test8.jpg.predicted.png	first attempt	5 years ago
test9.jpg	first attempt	5 years ago
test9.jpg.predicted.png	first attempt	5 years ago
test9.png	first attempt	5 years ago
test10.jpg	first attempt	5 years ago
test10.jpg.predicted.png	first attempt	5 years ago
train.py	first attempt	5 years ago

README.md


Image Classifier for Annotations

During the research for The Library is Open, annotations were a point of interest for everyone involved.

The Library is Open was a research project focused on knowledge production and its systems. The work questioned the shadows and biases cast by knowledge taxonomy; examined proprietary digital tools as impediments to the free access and circulation of knowledge; and developed annotation tools to foster collective interpretation of existing knowledge.

As a research group, we were reading and annotating texts together and debating the possibilities of sharing these notes. One particular discussion was about what could or should be considered an annotation: folding corners of pages, linking to other content, highlighting, scribbling, drawing. I was curious whether we could train a computer to see all of these traces, so I started prototyping some examples.

Aim: make the computer distinguish "clean" book pages from "annotated" book pages.

Each set (test and training) had 50 examples of "clean" pages and "annotated" pages; it makes sense to add more in the future. The results were not very accurate. Pages with hand-written text gave better results, while highlighting and computer notes were often misclassified. Useful next steps would be to look at what the computer is actually picking up on, to check whether the script is breaking the image into parts, and to try other scripts.
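One way to make the observation about hand-written notes versus highlighting more concrete is to tally correct predictions per kind of annotation. The sketch below is only an illustration: `accuracy_by_kind`, the kind names, and the rows in `example` are hypothetical and made up for this note. They are not part of `train.py` or `predict.py`, and they are not results from this project.

```python
from collections import defaultdict


def accuracy_by_kind(results):
    """Return the share of correct predictions for each kind of annotation.

    `results` is an iterable of (kind, true_label, predicted_label) tuples.
    This helper is hypothetical and not taken from the repository's scripts.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for kind, true_label, predicted_label in results:
        totals[kind] += 1
        if true_label == predicted_label:
            correct[kind] += 1
    return {kind: correct[kind] / totals[kind] for kind in totals}


# Made-up placeholder rows, only to show the expected input shape.
example = [
    ("handwriting", "annotated", "annotated"),
    ("handwriting", "annotated", "annotated"),
    ("highlighting", "annotated", "clean"),
    ("highlighting", "annotated", "annotated"),
    ("computer note", "annotated", "clean"),
    ("none", "clean", "clean"),
]

for kind, score in accuracy_by_kind(example).items():
    print(f"{kind}: {score:.2f}")
```

Filling such a list by hand from the predicted test images would show which kinds of traces the model misses most often, and so which examples to add to the training set first.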
