target blank added

master
Pedro Sá Couto 6 years ago
parent 76fbb463ef
commit 31a77d7282

BIN
.DS_Store vendored

Binary file not shown.

@ -888,24 +888,24 @@
--><div class="popup" id="text05">
<p class='link'><button onclick="code0()">Gathering Mastodon Toot Answers</button></p>
<p class='code' id="code00">Using the Mastodon API, this code iterates through the instances stored in a dictionary, matches each with its access key and gathers the answers to a predetermined toot identified by its ID. It then writes the date on which the information was collected to a text file and transcribes the text to HTML, making it easier to use in the online publication.
<br><br><span class="git"><a class="grey" href=https://git.xpub.nl/pedrosaclout/answers_mastodon_api>https://git.xpub.nl/pedrosaclout/answers_mastodon_api</a></span></p>
<br><br><span class="git"><a class="grey" target="_blank" href=https://git.xpub.nl/pedrosaclout/answers_mastodon_api>https://git.xpub.nl/pedrosaclout/answers_mastodon_api</a></span></p>
<p class='link'><button onclick="code1()">Gathering Mastodon DMS</button></p>
<p class='code' id="code01">This code iterates through the instances stored in a dictionary and matches each with its access key. Then, still relying on the Mastodon API Python library, it finds the private descendants of the toot, which means it can separate and identify the answers that were sent as private messages. Afterwards, it writes the date on which the data was collected to a text file and transcribes the gathered text to HTML, making it easier to use within the online publication.
<br><br><span class="git"><a class="grey" href=https://git.xpub.nl/pedrosaclout/dms_mastodon_api>https://git.xpub.nl/pedrosaclout/dms_mastodon_api</a></span></p>
<br><br><span class="git"><a class="grey" target="_blank" href=https://git.xpub.nl/pedrosaclout/dms_mastodon_api>https://git.xpub.nl/pedrosaclout/dms_mastodon_api</a></span></p>
<p class='link'><button onclick="code3()">Scraping instance peers</button></p>
<p class='code' id="code03">The code works in two processes that run one after the other, not in parallel. First, relying on the Mastodon API Python library, we gather the peers of the instance where our account is registered and store them in a dictionary. Then we iterate through them. The idea of this script is to scrape the <em>/about</em> page of each instance, which holds its description and, in some cases, its code of conduct. For this, Selenium opens a Firefox window for each iteration, saves the image and description of the instance and closes the window right after. All the images are stored in a folder and the scraped information is stored in a txt file.
<br><br><span class="git"><a class="grey" href=https://git.xpub.nl/pedrosaclout/scrape_peers_mastodon_ai>https://git.xpub.nl/pedrosaclout/scrape_peers_mastodon_ai</a></span></p>
<br><br><span class="git"><a class="grey" target="_blank" href=https://git.xpub.nl/pedrosaclout/scrape_peers_mastodon_ai>https://git.xpub.nl/pedrosaclout/scrape_peers_mastodon_ai</a></span></p>
<p class='link'><button onclick="code4()">Scraping instance peers' "about more"</button></p>
<p class='code' id="code04">Using the same method as the script that scrapes the <em>/about</em> page of an instance, this script scrapes the <em>/about/more</em> page. Most of the time, this is where admins write more extensive rules, codes of conduct and descriptions of the requirements for belonging to the instance.
<br><br><span class="git"><a class="grey" href=https://git.xpub.nl/pedrosaclout/scrape_about_more_peers_mastodon>https://git.xpub.nl/pedrosaclout/scrape_about_more_peers_mastodon</a></span></p>
<br><br><span class="git"><a class="grey" target="_blank" href=https://git.xpub.nl/pedrosaclout/scrape_about_more_peers_mastodon>https://git.xpub.nl/pedrosaclout/scrape_about_more_peers_mastodon</a></span></p>
<p class='link'><button onclick="code2()">Scraping "instances.social"</button></p>
<p class='code' id="code02">This was the first script I used to scrape instances. Before being introduced to the Mastodon Python library, I built a script with Selenium that would go to https://instances.social and iterate through the "h2" elements of a table that listed instances. Selenium would open a new Firefox tab for each iteration, save the image and description of the instance from the new tab and close it right after. All the images are stored in a folder and the scraped information is stored in a txt file. This was not a perfect method: an instance has to register to be part of this list, which biases the results.
<br><br><span class="git"><a class="grey" href=https://git.xpub.nl/pedrosaclout/instance_scrape>https://git.xpub.nl/pedrosaclout/instance_scrape</a></span></p>
<br><br><span class="git"><a class="grey" target="_blank" href=https://git.xpub.nl/pedrosaclout/instance_scrape>https://git.xpub.nl/pedrosaclout/instance_scrape</a></span></p>
</div>
</main>
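
The first two descriptions in this hunk (gathering answers to a toot, then separating the ones sent as private messages) can be sketched roughly as below. This is not the code from the linked repositories, only an illustration under stated assumptions. It assumes the replies arrive in the shape Mastodon.py returns from `status_context(toot_id)`: a dict with a `descendants` list of status dicts, each with `visibility`, `content` and `account`. The function names `split_replies` and `replies_to_html` and the CSS class names are hypothetical.

```python
from datetime import date
from html import escape


def split_replies(context):
    """Split a status context into (public replies, direct messages).

    `context` is assumed to look like the dict returned by Mastodon.py's
    `status_context(toot_id)`: {"ancestors": [...], "descendants": [...]}.
    """
    public, direct = [], []
    for status in context.get("descendants", []):
        # Mastodon marks private messages with visibility "direct".
        if status.get("visibility") == "direct":
            direct.append(status)
        else:
            public.append(status)
    return public, direct


def replies_to_html(statuses, collected_on):
    """Render replies as an HTML fragment headed by the collection date.

    The class names used here are placeholders, not the publication's own.
    """
    lines = [f"<p class='date'>Collected on {collected_on.isoformat()}</p>"]
    for status in statuses:
        author = escape(status["account"]["acct"])
        # The API already delivers `content` as HTML, so it is left unescaped.
        lines.append(
            f"<div class='answer'><span class='author'>{author}</span>"
            f"{status['content']}</div>"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample standing in for a real API response.
    sample = {
        "ancestors": [],
        "descendants": [
            {"visibility": "public", "content": "<p>Hello</p>",
             "account": {"acct": "alice@example.social"}},
            {"visibility": "direct", "content": "<p>In private</p>",
             "account": {"acct": "bob"}},
        ],
    }
    public, direct = split_replies(sample)
    print(replies_to_html(public, date.today()))
    print(replies_to_html(direct, date.today()))
```

With a real account, `sample` would be replaced by the result of one `status_context` call per instance in the dictionary, each made with that instance's access key.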
