diff --git a/mediawiki-api-dérive.ipynb b/mediawiki-api-dérive.ipynb
new file mode 100644
index 0000000..5aab04f
--- /dev/null
+++ b/mediawiki-api-dérive.ipynb
@@ -0,0 +1,464 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# MediaWiki API (part 2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This notebook:\n",
+ "\n",
+ "* continues with exploring the connections between `Hypertext` & `Dérive`\n",
+ "* saves (parts of) wiki pages as html files"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import urllib\n",
+ "import json\n",
+ "from IPython.display import JSON # iPython JSON renderer\n",
+ "import sys"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Parse"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's use another wiki this time: the English Wikipedia.\n",
+ "\n",
+ "You can pick any page, i took the Hypertext page for this notebook as an example: https://en.wikipedia.org/wiki/Hypertext"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# parse the wiki page Hypertext\n",
+ "request = 'https://en.wikipedia.org/w/api.php?action=parse&page=Hypertext&format=json'\n",
+ "response = urllib.request.urlopen(request).read()\n",
+ "data = json.loads(response)\n",
+ "JSON(data)"
+ ]
+ },
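+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The `action=parse` response nests everything under the `parse` key. A quick way to get an overview (the exact keys depend on the page and on the API version) is to list them; the `title` key, for example, holds the pagename:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# inspect the top-level structure of the response\n",
+ "print(data['parse'].keys())\n",
+ "\n",
+ "# for example: the title of the page we parsed\n",
+ "print(data['parse']['title'])"
+ ]
+ },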
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Wiki links dérive\n",
+ "\n",
+ "Select the wiki links from the `data` response:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "links = data['parse']['links']\n",
+ "JSON(links)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's save the links as a list of pagenames, to make it look like this:\n",
+ "\n",
+ "`['hyperdocuments', 'hyperwords', 'hyperworld']`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "# How is \"links\" structured now?\n",
+ "print(links)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "It helps to copy paste a small part of the output first:\n",
+ "\n",
+ "`[{'ns': 0, 'exists': '', '*': 'Metatext'}, {'ns': 0, '*': 'De man met de hoed'}]`\n",
+ "\n",
+ "and to write it differently with indentation:\n",
+ "\n",
+ "```\n",
+ "links = [\n",
+ " { \n",
+ " 'ns' : 0,\n",
+ " 'exists' : '',\n",
+ " '*', 'Metatext'\n",
+ " }, \n",
+ " {\n",
+ " 'ns' : 0,\n",
+ " 'exists' : '',\n",
+ " '*' : 'De man met de hoed'\n",
+ " } \n",
+ "]\n",
+ "```\n",
+ "\n",
+ "We can now loop through \"links\" and add all the pagenames to a new list called \"wikilinks\"."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "wikilinks = []\n",
+ "\n",
+ "for link in links:\n",
+ " \n",
+ " print('link:', link)\n",
+ " \n",
+ " for key, value in link.items():\n",
+ " print('----- key:', key)\n",
+ " print('----- value:', value)\n",
+ " print('-----')\n",
+ " \n",
+ " pagename = link['*']\n",
+ " print('===== pagename:', pagename)\n",
+ " \n",
+ " wikilinks.append(pagename)"
+ ]
+ },
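+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Since every link dictionary stores its pagename under the `'*'` key, the same list can also be built in one line with a list comprehension; this is just a compact equivalent of the loop above:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# the same list of pagenames, built with a list comprehension\n",
+ "wikilinks = [link['*'] for link in links]\n",
+ "wikilinks"
+ ]
+ },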
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "wikilinks"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Saving the links in a HTML page\n",
+ "\n",
+ "Let's convert the list of pagenames into HTML link elements (``):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "html = ''\n",
+ "\n",
+ "for wikilink in wikilinks:\n",
+ " print(wikilink)\n",
+ " \n",
+ " # let's use the \"safe\" pagenames for the filenames \n",
+ " # by replacing the ' ' with '_'\n",
+ " filename = wikilink.replace(' ', '_')\n",
+ " \n",
+ " a = f'{ wikilink }'\n",
+ " html += a\n",
+ " html += '\\n'"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "print(html)"
+ ]
+ },
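+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A note on the \"safe\" filenames: replacing spaces is not the only thing that can go wrong. A pagename such as `OS/2` contains a `/`, which the filesystem would read as a folder separator. A minimal sketch of a stricter helper (the `safe_filename` function is just an illustration, not something the API provides):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def safe_filename(pagename):\n",
+ "    # replace spaces, and replace the '/' that would be read as a folder separator\n",
+ "    return pagename.replace(' ', '_').replace('/', '-')\n",
+ "\n",
+ "print(safe_filename('De man met de hoed'))\n",
+ "print(safe_filename('OS/2'))"
+ ]
+ },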
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Let's save this page in a separate folder, i called it \"mediawiki-api-dérive\"\n",
+ "# We can make this folder here using a terminal command, but you can also do it in the interface on the left\n",
+ "! mkdir mediawiki-api-dérive"
+ ]
+ },
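+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If you prefer to stay in Python (and to avoid the error that `mkdir` gives when the folder already exists), `os.makedirs` with `exist_ok=True` does the same thing:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "# create the folder, and do nothing if it already exists\n",
+ "os.makedirs('mediawiki-api-dérive', exist_ok=True)"
+ ]
+ },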
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "output = open('mediawiki-api-dérive/Hypertext.html', 'w')\n",
+ "output.write(html)\n",
+ "output.close()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Recursive parsing\n",
+ "\n",
+ "We can now repeat the steps for each wikilink that we collected!\n",
+ "\n",
+ "We can make an API request for each wikilink, \\\n",
+ "ask for all the links on the page \\\n",
+ "and save it as an HTML page."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# First we save the Hypertext page again:\n",
+ "\n",
+ "startpage = 'Hypertext'\n",
+ "\n",
+ "# parse the first wiki page\n",
+ "request = f'https://en.wikipedia.org/w/api.php?action=parse&page={ startpage }&format=json'\n",
+ "response = urllib.request.urlopen(request).read()\n",
+ "data = json.loads(response)\n",
+ "JSON(data)\n",
+ "\n",
+ "# select the links\n",
+ "links = data['parse']['links']\n",
+ "\n",
+ "# turn it into a list of pagenames\n",
+ "wikilinks = []\n",
+ "for link in links:\n",
+ " pagename = link['*']\n",
+ " wikilinks.append(pagename)\n",
+ "\n",
+ "# turn the wikilinks into a set of links\n",
+ "html = ''\n",
+ "for wikilink in wikilinks:\n",
+ " filename = wikilink.replace(' ', '_')\n",
+ " a = f'{ wikilink }'\n",
+ " html += a\n",
+ " html += '\\n'\n",
+ "\n",
+ "# save it as a HTML page\n",
+ "startpage = startpage.replace(' ', '_') # let's again stay safe on the filename side\n",
+ "output = open(f'mediawiki-api-dérive/{ startpage }.html', 'w')\n",
+ "output.write(html)\n",
+ "output.close()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Then we loop through the list of wikilinks\n",
+ "# and repeat the steps for each page\n",
+ " \n",
+ "for wikilink in wikilinks:\n",
+ " \n",
+ " # let's copy the current wikilink pagename, to avoid confusion later\n",
+ " currentwikilink = wikilink \n",
+ " print('Now requesting:', currentwikilink)\n",
+ " \n",
+ " # parse this wiki page\n",
+ " wikilink = wikilink.replace(' ', '_')\n",
+ " request = f'https://en.wikipedia.org/w/api.php?action=parse&page={ wikilink }&format=json'\n",
+ " \n",
+ " # --> we insert a \"try and error\" condition, \n",
+ " # to catch errors in case a page does not exist \n",
+ " try: \n",
+ " \n",
+ " # continue the parse request\n",
+ " response = urllib.request.urlopen(request).read()\n",
+ " data = json.loads(response)\n",
+ " JSON(data)\n",
+ "\n",
+ " # select the links\n",
+ " links = data['parse']['links']\n",
+ "\n",
+ " # turn it into a list of pagenames\n",
+ " wikilinks = []\n",
+ " for link in links:\n",
+ " pagename = link['*']\n",
+ " wikilinks.append(pagename)\n",
+ "\n",
+ " # turn the wikilinks into a set of links\n",
+ " html = ''\n",
+ " for wikilink in wikilinks:\n",
+ " filename = wikilink.replace(' ', '_')\n",
+ " a = f'{ wikilink }'\n",
+ " html += a\n",
+ " html += '\\n'\n",
+ "\n",
+ " # save it as a HTML page\n",
+ " currentwikilink = currentwikilink.replace(' ', '_') # let's again stay safe on the filename side\n",
+ " output = open(f'mediawiki-api-dérive/{ currentwikilink }.html', 'w')\n",
+ " output.write(html)\n",
+ " output.close()\n",
+ " \n",
+ " except:\n",
+ " error = sys.exc_info()[0]\n",
+ " print('Skipped:', wikilink)\n",
+ " print('With the error:', error)"
+ ]
+ },
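+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "One thing to keep in mind: the loop above fires one request per wikilink, and a single page can easily link to hundreds of others. A small sketch of how you could cap the number of requests and pause between them (the `max_pages` value of 10 is an arbitrary example, and this loop only prints instead of requesting):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import time\n",
+ "\n",
+ "# only take the first few pages, and wait between requests,\n",
+ "# to be gentle on the Wikipedia servers\n",
+ "max_pages = 10\n",
+ "\n",
+ "for wikilink in wikilinks[:max_pages]:\n",
+ "    print('Would now request:', wikilink)\n",
+ "    time.sleep(1) # pause for one second"
+ ]
+ },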
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## What's next?\n",
+ "\n",
+ "?\n",
+ "\n",
+ "You could add more loops to the recursive parsing, adding more layers ...\n",
+ "\n",
+ "You could request all images of a page (instead of links) ...\n",
+ "\n",
+ "or something else the API offers ... (contributors, text, etc)\n",
+ "\n",
+ "or ..."
+ ]
+ },
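+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a starting point for the images idea: the `action=parse` response already contains an `images` key, listing the filenames of the images used on the page. A minimal sketch, assuming the `data` variable from a parse request above is still in memory:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# the image filenames that the parse response reports for this page\n",
+ "images = data['parse']['images']\n",
+ "print(images)"
+ ]
+ },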
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}