diff --git a/mediawiki-api-part-2.ipynb b/mediawiki-api-part-2.ipynb new file mode 100644 index 0000000..a338f36 --- /dev/null +++ b/mediawiki-api-part-2.ipynb @@ -0,0 +1,401 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# MediaWiki API (part 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This notebook:\n", + "\n", + "* continues with exploring the connections between `Hypertext` & `Dérive`\n", + "* uses the `query` & `parse` actions of the `MediaWiki API`, which we can use to work with wiki pages as (versioned and hypertextual) technotexts\n", + "\n", + "## Epicpedia\n", + "\n", + "Reference: Epicpedia (2008), Annemieke van der Hoek \\\n", + "(from: https://diversions.constantvzw.org/wiki/index.php?title=Eventual_Consistency#Towards_diffractive_technotexts)\n", + "\n", + "> In Epicpedia (2008), Annemieke van der Hoek creates a work that makes use of the underlying history that lies beneath the surface of each Wikipedia article.[20] Inspired by the work of Berthold Brecht and the notion of Epic Theater, Epicpedia presents Wikipedia articles as screenplays, where each edit becomes an utterance performed by a cast of characters (both major and minor) that takes place over a span of time, typically many years. The work uses the API of wikipedia to retrieve for a given article the sequence of revisions, their corresponding user handles, the summary message (that allows editors to describe the nature of their edit), and the timestamp to then produce a differential reading. 
\n", + "\n", + "![](https://diversions.constantvzw.org/wiki/images/b/b0/Epicpedia_EpicTheater02.png)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import urllib.request # import the submodule explicitly; a bare `import urllib` does not guarantee urllib.request is loaded\n", + "import json\n", + "from IPython.display import JSON # IPython JSON renderer" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Query & Parse" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will work again with the `Dérive` page on the wiki: https://pzwiki.wdka.nl/mediadesign/D%C3%A9rive (I moved it here, to make the URL a bit simpler)\n", + "\n", + "We will use the `API help page` on the PZI wiki as our main reference: https://pzwiki.wdka.nl/mw-mediadesign/api.php" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# query the wiki page Dérive\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# parse the wiki page Dérive\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=parse&page=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Links, contributors, edit history" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "We can ask the API for different kinds of information about the page.\n", + "\n", + "Such as:\n", + "\n", + "* a list of wiki links\n", + "* a list of external links\n", + "* a list of images\n", + "* a list of edits\n", + "* a list of contributors\n", + "* page information\n", + "* reverse links (What links here?)\n", + "* ...\n", + "\n", + "We can use the `query` action again to ask for these things:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# wiki links: prop=links\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=links&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# external links: prop=extlinks\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=extlinks&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# images: prop=images\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=images&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# edit history: prop=revisions\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=revisions&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + 
"metadata": {}, + "outputs": [], + "source": [ + "# contributors: prop=contributors\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=contributors&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# page information: prop=info\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=info&titles=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# reverse links (What links here?): prop=linkshere + lhlimit=100 (max. number of results)\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=query&prop=linkshere&lhlimit=100&titles=Prototyping&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use the `data` responses in Python (and save data in variables)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# For example with the action=parse request\n", + "request = 'https://pzwiki.wdka.nl/mw-mediadesign/api.php?action=parse&page=D%C3%A9rive&format=json'\n", + "response = urllib.request.urlopen(request).read()\n", + "data = json.loads(response)\n", + "JSON(data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "text = data['parse']['text']['*']\n", + "print(text)" + ] + }, + { + "cell_type": "code", 
+ "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "title = data['parse']['title']\n", + "print(title)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "images = data['parse']['images']\n", + "print(images)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use these variables to generate HTML pages" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": {}, + "outputs": [], + "source": [ + "# open an HTML file to write to\n", + "output = open('myfilename.html', 'w')" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "2813" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# write to the HTML file you just opened\n", + "output.write(text)" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": {}, + "outputs": [], + "source": [ + "# close the file again (closing flushes the buffer, so the content is actually written to disk)\n", + "output.close()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use these variables to generate HTML pages (using the template language Jinja)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Jinja (template language): https://jinja.palletsprojects.com/" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + 
"metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}