{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Markdown → HTML → print"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pypandoc\n",
"from weasyprint import HTML, CSS\n",
"from weasyprint.fonts import FontConfiguration"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Markdown → HTML"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pandoc: \"If you need to convert files from one markup format into another, **pandoc is your swiss-army knife**.\"\n",
"\n",
"https://pandoc.org/\n",
"\n",
"The Python library for Pandoc:\n",
"\n",
"https://github.com/bebraw/pypandoc \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Convert a Markdown file to HTML ...\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<p>Language</p>\n",
"<p>Florian Cramer</p>\n",
"<p>Software and language are intrinsically related, since software may process language, and is constructed in language. Yet language means different things in the context of computing: formal languages in which algorithms are expressed and software is implemented, and in so-called “natural” spoken languages. There are at least two layers of formal language in software: programming language in which the software is written, and the language implemented within the software as its symbolic controls. In the case of compilers, shells, and macro languages, for example, these layers can overlap. “Natural” language is what can be processed as data by software; since this processing is formal, however, it is restricted to syntactical operations. While differentiation of computer programming languages as “artificial languages” from languages like English as “natural languages” is conceptually important and undisputed, it remains problematic in its pure terminology: There is nothing “natural” about spoken language; it is a cultural construct and thus just as “artificial” as any formal machine control language. To call programming languages “machine languages” doesnt solve the problem either, as it obscures that “machine languages” are human creations. High-level machine-independent programming languages such as Fortran, C, Java, and Basic are not even direct mappings of machine logic. If programming languages are human languages for machine control, they could be called cybernetic languages. But these languages can also be used outside machines—in programming handbooks, for example, in programmers dinner table jokes, or as abstract formal languages for expressing logical constructs, such as in Hugh Kenners use of the Pascal programming language to explain aspects of the structure of Samuel Becketts writing.1 In this sense, computer control languages could be more broadly defined as syntactical languages as opposed to semantic languages. \n",
"But this terminology is not without its problems either. Common languages like English are both formal and semantic; although their scope extends beyond the formal, anything that can be expressed in a computer control language can also be expressed in common language. It follows that computer control languages are a formal (and as such rather primitive) subset of common human languages. To complicate things even further, computer science has its own understanding of “operational semantics” in programming languages, for example in the construction of a programming language interpreter or compiler. Just as this interpreter doesnt perform “interpretations” in a hermeneutic sense of semantic text explication, the computer science notion of “semantics” defies linguistic and common sense understanding of the word, since compiler construction is purely syntactical, and programming languages denote nothing but syntactical manipulations of symbols. What might more suitably be called the semantics of computer control languages resides in the symbols with which those operations are denoted in most programming languages: English words like “if,” “then,” “else,” “for,” “while,” “goto,” and “print,” in conjunction with arithmetical and punctuation symbols; in alphabetic software controls, words like “list,” “move,” “copy,” and “paste”; in graphical software controls, such as symbols like the trash can. Ferdinand de Saussure states that the signs of common human language are arbitrary2 because its purely a cultural-social convention that assigns phonemes to concepts. Likewise, its purely a cultural convention to assign symbols to machine operations. \n",
"But just as the cultural choice of phonemes in spoken language is restrained by what the human voice can pronounce, the assignment of symbols to machine operations is limited to what can be efficiently processed by the machine and of good use to humans.3 This compromise between operability and usability is obvious in, for example, Unix commands. Origi",
"<p>Notes</p>\n",
"<ol type=\"1\">\n",
"<li>Hugh Kenner, “Beckett Thinking,” in Hugh Kenner, The Mechanic Muse, 83107.</li>\n",
"<li>Ferdinand de Saussure, Course in General Linguistics, ”Chapter I: Nature of the Linguistic Sign.”</li>\n",
"<li>See the section, “Saussurean Signs and Material Matters,” in N. Katherine Hayles, My Mother Was a Computer, 4245.</li>\n",
"<li>For example, Steve Wozniaks design of the Apple I mainboard was consijdered “a beautiful work of art” in its time according to Steven Levy, Insanely Great: The Life and Times of Macintosh, 81.</li>\n",
"<li>Joseph Weizenbaum, “ELIZA—A Computer Program for the Study of Natural Language Communication between Man and Machine.”</li>\n",
"<li>Marsha Pascual, “Black Monday, Causes and Effects.”</li>\n",
"<li>Among them concrete poetry writers, French Oulipo poets, the German poet Hans Magnus Enzensberger, and the Austrian poets Ferdinand Schmatz and Franz Josef Czernin.</li>\n",
"<li>Jef Raskin, The Humane Interface: New Directions for Designing Interactive Systems.</li>\n",
"<li>According to Nelson Goodmans definition of writing in The Languages of Art, 143.</li>\n",
"<li>Alan Kay, an inventor of the graphical user interface, conceded in 1990 that “it would not be surprising if the visual system were less able in this area than the mechanism that solve noun phrases for natural language. Although it is not fair to say that iconic languages cant work just because no one has been able to design a good one, it is likely that the above explanation is close to truth.” This status quo hasnt changed since. Alan Kay, “User Interface: A Personal View,” in, Brenda Laurel ed. The Art of Human-Computer Interface Design, Reading: Addison Wesley, 1989, 203.</li>\n",
"<li>Swift, Jonathan, Gullivers Travels, Project Gutenberg Ebook, available at http:// www.gutenberg.org / dirs / extext197 / gltrv10.txt / .</li>\n",
"<li>See Wolfgang Hagen, “The Style of Source Codes.”</li>\n",
"</ol>\n",
"\n"
]
}
],
"source": [
"# ... directly from a file\n",
"html = pypandoc.convert_file('language.md', 'html')\n",
"print(html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## HTML → PDF"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For this we can use WeasyPrint again."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<weasyprint.HTML object at 0xaf851e30>\n"
]
}
],
"source": [
"html = HTML(string=html)\n",
"print(html)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"css = CSS(string='''\n",
"@page {\n",
"    size: A4;\n",
"    margin: 15mm;\n",
"}\n",
"body {\n",
"    font-family: serif;\n",
"    font-size: 12pt;\n",
"    line-height: 1.4;\n",
"    color: magenta;\n",
"}\n",
"''')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is worth taking a close look at the paged media properties in CSS, such as the `size` descriptor of the `@page` rule used above: \n",
"\n",
"https://developer.mozilla.org/en-US/docs/Web/CSS/%40page/size"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"html.write_pdf('language.pdf', stylesheets=[css])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}