<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>pattern</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link type="text/css" rel="stylesheet" href="../clips.css" />
<style>
/* Small fixes because we omit the online layout.css. */
h3 { line-height: 1.3em; }
#page { margin-left: auto; margin-right: auto; }
#header, #header-inner { height: 175px; }
#header { border-bottom: 1px solid #C6D4DD; }
table { border-collapse: collapse; }
#checksum { display: none; }
</style>
<link href="../js/shCore.css" rel="stylesheet" type="text/css" />
<link href="../js/shThemeDefault.css" rel="stylesheet" type="text/css" />
<script language="javascript" src="../js/shCore.js"></script>
<script language="javascript" src="../js/shBrushXml.js"></script>
<script language="javascript" src="../js/shBrushJScript.js"></script>
<script language="javascript" src="../js/shBrushPython.js"></script>
</head>
<body class="node-type-page one-sidebar sidebar-right section-pages">
<div id="page">
<div id="page-inner">
<div id="header"><div id="header-inner"></div></div>
<div id="content">
<div id="content-inner">
<div class="node node-type-page">
<div class="node-inner">
<div class="breadcrumb">View online at: <a href="http://www.clips.ua.ac.be/pages/pattern" class="noexternal" target="_blank">http://www.clips.ua.ac.be/pages/pattern</a></div>
<h1>pattern</h1>
<!-- Parsed from the online documentation. -->
<div id="node-1350" class="node node-type-page"><div class="node-inner">
<div class="content">
<p><span class="big">Pattern is a web mining module for the Python programming language.</span></p>
<p><span class="big">It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, a HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and &lt;canvas&gt; visualization.</span></p>
<p>The module is free, well-documented and bundled with 50+ examples and 350+ unit tests.</p>
<p><img src="../g/pattern_schema.gif" alt="" width="620" height="180" /></p>
<hr />
<h2>Download</h2>
<table>
<tbody>
<tr>
<td><a onclick="javascript:_gaq.push(['_trackPageview', '/downloads/pattern']);" href="http://www.clips.ua.ac.be/media/pattern-2.6.zip" target="_self"><img src="../g/download.gif" alt="download" align="left" /></a></td>
<td><strong>Pattern 2.6</strong>&nbsp;| <a onclick="javascript:_gaq.push(['_trackPageview', '/downloads/pattern']);" href="http://www.clips.ua.ac.be/media/pattern-2.6.zip" target="_self">download</a> (.zip, 23MB)<br />
<ul>
<li>Requires: Python 2.5+ on Windows | Mac | Linux</li>
<li>Licensed under <a href="http://www.linfo.org/bsdlicense.html" target="_blank">BSD</a></li>
<li>Latest releases: <a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.6.zip">2.6</a> |&nbsp;<a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.5.zip">2.5</a> |&nbsp;<a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.4.zip">2.4</a> | <a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.3.zip">2.3</a>&nbsp;|&nbsp;<a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.2.zip">2.2</a> |&nbsp;<a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.1.zip">2.1</a> |&nbsp;<a class="noexternal" href="http://www.clips.ua.ac.be/media/pattern-2.0.zip">2.0</a></li>
<li>Authors:<br />&nbsp;Tom De Smedt (<em>tom at organisms.be</em>)<br />&nbsp;Walter Daelemans&nbsp;</li>
</ul>
<p><span class="small"><span style="text-decoration: underline;">Reference</span>: De Smedt, T. &amp; Daelemans, W. (2012)</span>.<br /><span class="small">Pattern for Python. <em>Journal of Machine Learning Research</em>, 13: 2031–2035.</span></p>
<p id="checksum" class="grey"><span class="small"><span style="text-decoration: underline;">SHA256</span> checksum&nbsp;of the .zip:<br />28213f05d94a86d2de1d8a03525d456a9e68dc3b563dc2481ad08fe3db180d02</span></p>
</td>
<td>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<hr />
<table border="0">
<tbody>
<tr>
<td style="width: 200px;">
<h2>Modules</h2>
<ul>
<li><a href="pattern-web.html">pattern.web</a></li>
<li><a href="pattern-db.html">pattern.db</a></li>
<li><a href="pattern-en.html">pattern.en</a>&nbsp;|&nbsp;<a href="pattern-es.html">es</a>&nbsp;| <a href="pattern-de.html">de</a> | <a href="pattern-fr.html">fr</a> | <a href="pattern-it.html">it</a> |&nbsp;<a href="pattern-nl.html">nl</a></li>
<li><a href="pattern-search.html">pattern.search</a></li>
<li><a href="pattern-vector.html">pattern.vector</a></li>
<li><a href="pattern-graph.html">pattern.graph</a>&nbsp;</li>
</ul>
<p><span class="smallcaps">Helper modules</span></p>
<ul style="margin-top: 0;">
<li><a href="pattern-metrics.html">pattern.metrics</a></li>
<li><a href="pattern-canvas.html">canvas.js</a></li>
</ul>
<p><span class="smallcaps">Command-line</span></p>
<ul style="margin-top: 0;">
<li><a href="pattern-shell.html">Command-line interface</a></li>
</ul>
</td>
<td>
<h2><a name="contribute"></a>Contribute</h2>
<ul>
<li><a href="pattern-dev.html">Developer documentation</a></li>
<li><a href="https://github.com/clips/pattern" target="_blank">GitHub repository</a></li>
<li><a href="http://groups.google.com/group/pattern-for-python" target="_blank">Google group</a></li>
</ul>
<form action="https://www.paypal.com/cgi-bin/webscr" method="post"><input type="hidden" name="cmd" value="_s-xclick" /> <input type="hidden" name="hosted_button_id" value="HW2GU5PNWYQV8" /> <input type="image" name="submit" src="../g/paypal-donate.jpg" alt="PayPal - The safer, easier way to pay online!" /> <img src="https://www.paypalobjects.com/en_US/i/scr/pixel.gif" alt="" width="1" height="1" border="0" /></form>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<hr />
<h2>Installation</h2>
<p>Pattern is written for Python 2.5+ (also supports Python 3.6+). The module has no external dependencies, except <span class="inline_code">LSA</span> in the pattern.vector module, which requires <a href="http://numpy.scipy.org/" target="_blank">NumPy</a> (installed by default on Mac OS X).&nbsp;</p>
<p>To install Pattern so that the module is available in all Python scripts, from the command line do:</p>
<div class="install">
<pre class="gutter:false; light:true;">&gt; cd pattern-3.6
&gt; python setup.py install&nbsp;</pre></div>
<p>If you have pip, you can automatically download and install from the PyPI repository:</p>
<div class="install">
<pre class="gutter:false; light:true;">&gt; pip install pattern</pre></div>
<p>If none of the above works, you can make Python aware of the module in three ways:</p>
<ul>
<li>Put the <span class="inline_code">pattern</span>&nbsp;subfolder in the .zip archive in the same folder as your script.</li>
<li>Put the <span class="inline_code">pattern</span>&nbsp;subfolder in the standard location for modules so it is available to all scripts:<br /><span class="inline_code">c:\python27\Lib\site-packages\</span>&nbsp;(Windows),<br /><span class="inline_code"> /Library/Python/2.7/site-packages/</span>&nbsp;(Mac),<br /><span class="inline_code">/usr/lib/python2.7/site-packages/</span>&nbsp;(Unix).<span style="font-family: Courier, monospace; font-size: small;"><span style="font-size: 12px;"><br /></span></span></li>
<li>Add the location of the module to&nbsp;<span class="inline_code">sys.path</span>&nbsp;in your Python script, before importing it:</li>
</ul>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; import sys; sys.path.append('/users/tom/desktop/pattern')
&gt;&gt;&gt; from pattern.web import Twitter </pre></div>
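<p>The <span class="inline_code">sys.path</span> option can be wrapped in a small helper so that the folder is added only once per session. This is a minimal sketch: the helper name <span class="inline_code">add_module_dir</span> and the folder location are hypothetical placeholders, not part of Pattern.</p>

```python
import os
import sys

def add_module_dir(path):
    """Append path to sys.path once; return True if it was added."""
    path = os.path.abspath(path)
    if path in sys.path:
        return False
    sys.path.append(path)
    return True

# Hypothetical folder that contains the "pattern" subfolder;
# replace it with the real location on your machine.
MODULE_DIR = os.path.join(os.path.expanduser("~"), "desktop", "pattern")

print(add_module_dir(MODULE_DIR))  # True the first time in a session
print(add_module_dir(MODULE_DIR))  # False: already on sys.path
```

<p>After this, <span class="inline_code">from pattern.web import Twitter</span> should work, provided the folder really holds the module.</p>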
<p>&nbsp;</p>
<hr />
<h2>Quick overview</h2>
<h3>pattern.web</h3>
<p>The&nbsp;<a href="pattern-web.html">pattern.web</a>&nbsp;module is a web toolkit that contains API's (Google, Gmail, Bing, Twitter, Facebook, Wikipedia, Wiktionary, DBPedia, Flickr, ...), a robust HTML DOM parser and a web crawler.</p>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; from pattern.web import Twitter, plaintext
&gt;&gt;&gt;
&gt;&gt;&gt; twitter = Twitter(language='en')
&gt;&gt;&gt; for tweet in twitter.search('"more important than"', cached=False):
&gt;&gt;&gt;     print plaintext(tweet.text)
'The mobile web is more important than mobile apps.'
'Start slowly, direction is more important than speed.'
'Imagination is more important than knowledge. - Albert Einstein'
... </pre></div>
<h3>pattern.en</h3>
<p>The&nbsp;<a href="pattern-en.html">pattern.en</a>&nbsp;module is a natural language processing (NLP) toolkit for English. Because language is ambiguous (e.g., <em>I can</em>&nbsp;<em>a can</em>) it uses statistical approaches + regular expressions. This means that it is fast, quite accurate and occasionally incorrect. It has a part-of-speech tagger that identifies word types (e.g., noun, verb, adjective), word inflection (conjugation, singularization) and a WordNet API.</p>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; from pattern.en import parse
&gt;&gt;&gt;
&gt;&gt;&gt; s = 'The mobile web is more important than mobile apps.'
&gt;&gt;&gt; s = parse(s, relations=True, lemmata=True)
&gt;&gt;&gt; print s
'The/DT/B-NP/O/NP-SBJ-1/the mobile/JJ/I-NP/O/NP-SBJ-1/mobile' ...
</pre></div>
<table class="border">
<tbody>
<tr>
<td class="smallcaps" style="text-align: right;">word</td>
<td class="smallcaps" style="text-align: center;">tag</td>
<td class="smallcaps" style="text-align: center;">chunk</td>
<td class="smallcaps" style="text-align: center;">role</td>
<td class="smallcaps" style="text-align: center;">id</td>
<td class="smallcaps" style="text-align: center;">pnp</td>
<td class="smallcaps">lemma</td>
</tr>
<tr>
<td style="text-align: right;">The</td>
<td class="inline_code" style="text-align: center;">DT</td>
<td class="inline_code" style="text-align: center;">NP&nbsp;</td>
<td class="inline_code" style="text-align: center;">SBJ</td>
<td class="inline_code" style="text-align: center;">1</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>the</em></td>
</tr>
<tr>
<td style="text-align: right;">mobile</td>
<td class="inline_code" style="text-align: center;">JJ</td>
<td class="inline_code" style="text-align: center;">NP^</td>
<td class="inline_code" style="text-align: center;">SBJ</td>
<td class="inline_code" style="text-align: center;">1</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>mobile</em></td>
</tr>
<tr>
<td style="text-align: right;">web</td>
<td class="inline_code" style="text-align: center;">NN</td>
<td class="inline_code" style="text-align: center;">NP^</td>
<td class="inline_code" style="text-align: center;">SBJ</td>
<td class="inline_code" style="text-align: center;">1</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>web</em></td>
</tr>
<tr>
<td style="text-align: right;">is</td>
<td class="inline_code" style="text-align: center;">VBZ</td>
<td class="inline_code" style="text-align: center;">VP&nbsp;</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">1</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>be</em></td>
</tr>
<tr>
<td style="text-align: right;">more</td>
<td class="inline_code" style="text-align: center;">RBR</td>
<td class="inline_code" style="text-align: center;">ADJP&nbsp;</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>more</em></td>
</tr>
<tr>
<td style="text-align: right;">important</td>
<td class="inline_code" style="text-align: center;">JJ</td>
<td class="inline_code" style="text-align: center;">ADJP^</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td><em>important</em></td>
</tr>
<tr>
<td style="text-align: right;">than</td>
<td class="inline_code" style="text-align: center;">IN</td>
<td class="inline_code" style="text-align: center;">PP&nbsp;</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">PNP</td>
<td><em>than</em></td>
</tr>
<tr>
<td style="text-align: right;">mobile</td>
<td class="inline_code" style="text-align: center;">JJ</td>
<td class="inline_code" style="text-align: center;">NP&nbsp;</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">PNP</td>
<td><em>mobile</em></td>
</tr>
<tr>
<td style="text-align: right;">apps</td>
<td class="inline_code" style="text-align: center;">NNS</td>
<td class="inline_code" style="text-align: center;">NP^</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">PNP</td>
<td><em>app</em></td>
</tr>
<tr>
<td style="text-align: right;">.</td>
<td class="inline_code" style="text-align: center;">.</td>
<td class="inline_code" style="text-align: center;">-&nbsp;</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td class="inline_code" style="text-align: center;">-</td>
<td>.</td>
</tr>
</tbody>
</table>
<p>The text has been annotated with word types,&nbsp;for example nouns (<span class="postag">NN</span>), verbs (<span class="postag">VB</span>),&nbsp;adjectives (<span class="postag">JJ</span>) and determiners (<span class="postag">DT</span>), word roles (e.g.,&nbsp;sentence subject&nbsp;<span class="postag">SBJ</span>) and prepositional noun phrases (<span class="postag">PNP</span>). To iterate over the parts in the tagged text we can construct a <em>parse tree</em>.</p>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; from pattern.en import parsetree
&gt;&gt;&gt;
&gt;&gt;&gt; s = 'The mobile web is more important than mobile apps.'
&gt;&gt;&gt; s = parsetree(s)
&gt;&gt;&gt; for sentence in s:
&gt;&gt;&gt;     for chunk in sentence.chunks:
&gt;&gt;&gt;         for word in chunk.words:
&gt;&gt;&gt;             print word,
&gt;&gt;&gt;         print
Word(u'The/DT') Word(u'mobile/JJ') Word(u'web/NN')
Word(u'is/VBZ')
Word(u'more/RBR') Word(u'important/JJ')
Word(u'than/IN')
Word(u'mobile/JJ') Word(u'apps/NNS')
</pre></div>
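<p>The slash-separated string printed by <span class="inline_code">parse()</span> can also be read with the standard library alone. The sketch below is not part of Pattern: the <span class="inline_code">Token</span> field names are our own labels for the column order visible in the output above, and it assumes that no token contains a slash of its own.</p>

```python
from collections import namedtuple

# Field names are this sketch's own labels, following the column order
# visible in the tagged output: word/tag/chunk/pnp/role/lemma.
Token = namedtuple("Token", "word tag chunk pnp role lemma")

def read_tagged(s):
    """Split a slash-annotated string into one Token per word."""
    return [Token(*item.split("/")) for item in s.split(" ")]

# The first two words of the tagged sentence shown earlier.
tagged = "The/DT/B-NP/O/NP-SBJ-1/the mobile/JJ/I-NP/O/NP-SBJ-1/mobile"

for token in read_tagged(tagged):
    print("%s %s %s" % (token.word, token.tag, token.lemma))
```

<p>For anything beyond a quick inspection, the parse tree is the more convenient route, since it groups words into chunks and sentences.</p>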
<p>Parsers for Spanish, French, Italian, German and Dutch are also available: <br /><a href="pattern-es.html">pattern.es</a>&nbsp;| <a href="pattern-fr.html">pattern.fr</a> | <a href="pattern-it.html">pattern.it</a> |&nbsp;<a href="pattern-de.html">pattern.de</a>&nbsp;|&nbsp;<a href="pattern-nl.html">pattern.nl</a></p>
<h3>pattern.search</h3>
<p>The&nbsp;<a href="pattern-search.html">pattern.search</a>&nbsp;module contains a search algorithm to retrieve sequences of words (called <em>n-grams</em>) from tagged text.</p>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; from pattern.en import parsetree
&gt;&gt;&gt; from pattern.search import search
&gt;&gt;&gt;
&gt;&gt;&gt; s = 'The mobile web is more important than mobile apps.'
&gt;&gt;&gt; s = parsetree(s, relations=True, lemmata=True)
&gt;&gt;&gt;
&gt;&gt;&gt; for match in search('NP be RB?+ important than NP', s):
&gt;&gt;&gt;     print match.constituents()[-1], '=&gt;', \
&gt;&gt;&gt;           match.constituents()[0]
Chunk('mobile apps/NP') =&gt; Chunk('The mobile web/NP-SBJ-1')
</pre></div>
<p>The search pattern&nbsp;<span class="inline_code">NP</span> <span class="inline_code">be</span> <span class="inline_code">RB?+</span> <span class="inline_code">important</span> <span class="inline_code">than</span> <span class="inline_code">NP</span> means any noun phrase (<span class="postag">NP</span>) followed by the verb <em>to be</em>, followed by zero or more adverbs (<span class="postag">RB</span>, e.g.,&nbsp;<em>much</em>, <em>more</em>), followed by the words <em>important than</em>, followed by any noun phrase. It will also match "<em>The mobile web <span style="text-decoration: underline;">will</span> <span style="text-decoration: underline;">be</span> <span style="text-decoration: underline;">much</span> <span style="text-decoration: underline;">less</span> important than mobile apps</em>" and other grammatical variations.</p>
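<p>To make the "zero or more adverbs" part of the pattern concrete, here is a much simplified, standard-library sketch over (word, tag, lemma) tuples taken from the table above. It is not the algorithm used by pattern.search: it ignores noun phrase chunks entirely, and the function name <span class="inline_code">find_important_than</span> is hypothetical.</p>

```python
def find_important_than(tokens):
    """Return (i, j) pairs: i indexes the verb "be", j the word after "than"."""
    matches = []
    for i, (word, tag, lemma) in enumerate(tokens):
        if lemma != "be":
            continue
        j = i + 1
        # Skip zero or more adverbs (tags RB, RBR, RBS).
        while tokens[j:j + 1] and tokens[j][1].startswith("RB"):
            j += 1
        # The next two words must be exactly "important than".
        if [w for w, t, l in tokens[j:j + 2]] == ["important", "than"]:
            matches.append((i, j + 2))
    return matches

# (word, tag, lemma) tuples copied from the table above.
tokens = [
    ("The", "DT", "the"), ("mobile", "JJ", "mobile"), ("web", "NN", "web"),
    ("is", "VBZ", "be"), ("more", "RBR", "more"),
    ("important", "JJ", "important"), ("than", "IN", "than"),
    ("mobile", "JJ", "mobile"), ("apps", "NNS", "app"), (".", ".", "."),
]

for i, j in find_important_than(tokens):
    print(" ".join(w for w, t, l in tokens[i:j]))
```

<p>Matching on the lemma rather than the word is what lets a single rule cover <em>is</em>, <em>was</em> and <em>be</em>.</p>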
<h3>pattern.vector</h3>
<p>The&nbsp;<a href="pattern-vector.html">pattern.vector</a>&nbsp;module is a toolkit for machine learning, based on a vector space model&nbsp;of bag-of-words documents with weighted features (e.g., tf-idf) and distance metrics (e.g., cosine similarity, infogain).&nbsp;Models can be used for clustering (<em>k</em>-means, hierarchical), classification (Naive Bayes, Perceptron,&nbsp;<em>k-</em>NN, SVM) and latent semantic analysis (LSA).</p>
<div>
<div class="example">
<pre class="brush: python;gutter: false; fontsize: 100; first-line: 1; ">&gt;&gt;&gt; from pattern.web import Twitter
&gt;&gt;&gt; from pattern.en import tag
&gt;&gt;&gt; from pattern.vector import KNN, count
&gt;&gt;&gt;
&gt;&gt;&gt; twitter, knn = Twitter(), KNN()
&gt;&gt;&gt;
&gt;&gt;&gt; for i in range(1, 10):
&gt;&gt;&gt;     for tweet in twitter.search('#win OR #fail', start=i, count=100):
&gt;&gt;&gt;         s = tweet.text.lower()
&gt;&gt;&gt;         p = '#win' in s and 'WIN' or 'FAIL'
&gt;&gt;&gt;         v = tag(s)
&gt;&gt;&gt;         v = [word for word, pos in v if pos == 'JJ'] # JJ = adjective
&gt;&gt;&gt;         v = count(v)
&gt;&gt;&gt;         if v:
&gt;&gt;&gt;             knn.train(v, type=p)
&gt;&gt;&gt;
&gt;&gt;&gt; print knn.classify('sweet potato burger')
&gt;&gt;&gt; print knn.classify('stupid autocorrect')
'WIN'
'FAIL' </pre></div>
</div>
<p>This example trains a classifier on adjectives mined from Twitter. First, tweets with hashtag #win or #fail are mined. For example: <em>"$20 tip off a <span style="text-decoration: underline;">sweet</span> <span style="text-decoration: underline;">little</span> <span style="text-decoration: underline;">old</span> lady today #win"</em>. The word part-of-speech tags are parsed, keeping only adjectives. Each tweet is transformed to a vector, a dictionary of adjective → count items, labeled <span class="inline_code">WIN</span> or <span class="inline_code">FAIL</span>. The classifier uses the vectors to learn which other, unknown tweets look more like&nbsp;<span class="inline_code">WIN</span>&nbsp;(e.g., <em>sweet potato burger</em>) or more like <span class="inline_code">FAIL</span> (e.g., <em>stupid autocorrect</em>).</p>
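<p>The idea of comparing adjective count vectors can be illustrated without Pattern or Twitter. The sketch below uses hand-made training data and a plain 1-nearest-neighbour rule with cosine similarity; the function names are our own, and the <span class="inline_code">KNN</span> class in pattern.vector may use different defaults.</p>

```python
from collections import Counter
from math import sqrt

def cosine(v1, v2):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(v1[k] * v2.get(k, 0) for k in v1)
    n1 = sqrt(sum(x * x for x in v1.values()))
    n2 = sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def classify(vector, examples):
    """Return the label of the most similar training vector (1-NN)."""
    best = max(examples, key=lambda example: cosine(vector, example[0]))
    return best[1]

# Hand-made training data standing in for adjectives mined from tweets.
examples = [
    (Counter(["sweet", "little", "old"]), "WIN"),
    (Counter(["stupid", "broken"]), "FAIL"),
]

print(classify(Counter(["sweet"]), examples))
print(classify(Counter(["stupid"]), examples))
```

<p>With only two training vectors the result is trivial, but the same comparison scales to the thousands of tweets mined in the example above.</p>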
<h3>pattern.graph</h3>
<p>The&nbsp;<a href="pattern-graph.html">pattern.graph</a>&nbsp;module provides a graph data structure that represents relations between nodes (e.g., terms, concepts). Graphs can be exported as HTML <span class="inline_code">&lt;canvas&gt;</span> animations (<span class="link-maintenance"><a href="http://www.clips.ua.ac.be/media/pattern-graph" target="_blank">demo</a></span>). In the example below, more <em>central</em> nodes (= more incoming traffic) are colored in blue.</p>
<p><img class="border" src="../g/pattern_graph5.jpg" alt="" width="610" height="198" /></p>
<div class="example">
<pre class="brush:python; gutter:false; light:true;">&gt;&gt;&gt; from pattern.web import Bing, plaintext
&gt;&gt;&gt; from pattern.en import parsetree
&gt;&gt;&gt; from pattern.search import search
&gt;&gt;&gt; from pattern.graph import Graph
&gt;&gt;&gt;
&gt;&gt;&gt; g = Graph()
&gt;&gt;&gt; for i in range(10):
&gt;&gt;&gt;     for result in Bing().search('"more important than"', start=i+1, count=50):
&gt;&gt;&gt;         s = result.text.lower()
&gt;&gt;&gt;         s = plaintext(s)
&gt;&gt;&gt;         s = parsetree(s)
&gt;&gt;&gt;         p = '{NP} (VP) more important than {NP}'
&gt;&gt;&gt;         for m in search(p, s):
&gt;&gt;&gt;             x = m.group(1).string # NP left
&gt;&gt;&gt;             y = m.group(2).string # NP right
&gt;&gt;&gt;             if x not in g:
&gt;&gt;&gt;                 g.add_node(x)
&gt;&gt;&gt;             if y not in g:
&gt;&gt;&gt;                 g.add_node(y)
&gt;&gt;&gt;             g.add_edge(g[x], g[y], stroke=(0,0,0,0.75)) # R,G,B,A
&gt;&gt;&gt;
&gt;&gt;&gt; g = g.split()[0] # Largest subgraph.
&gt;&gt;&gt;
&gt;&gt;&gt; for n in g.sorted()[:40]: # Sort by Node.weight.
&gt;&gt;&gt;     n.fill = (0, 0.5, 1, 0.75 * n.weight)
&gt;&gt;&gt;
&gt;&gt;&gt; g.export('test', directed=True, weighted=0.6) </pre></div>
<p>Some relations (= edges) could use some extra post-processing, e.g., in <em>nothing is more important than life</em>, <em>nothing</em> is <span style="text-decoration: underline;">not</span> more important than <em>life</em>.</p>
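<p>The notion of "more incoming traffic" can be approximated by counting incoming edges. The sketch below is a simple stand-in using in-degree on made-up (x, y) pairs; pattern.graph computes <span class="inline_code">Node.weight</span> in its own way, and the function name <span class="inline_code">indegree_weights</span> is hypothetical.</p>

```python
from collections import defaultdict

def indegree_weights(edges):
    """Map each node to its incoming edge count, scaled to 0.0-1.0."""
    incoming = defaultdict(int)
    nodes = set()
    for x, y in edges:
        nodes.update((x, y))
        incoming[y] += 1
    top = max(incoming.values()) if incoming else 1
    return dict((n, incoming[n] / float(top)) for n in nodes)

# Made-up (x, y) pairs meaning "x is more important than y".
edges = [("health", "money"), ("family", "money"), ("money", "fame")]
weights = indegree_weights(edges)

# Most "central" nodes first; ties are broken alphabetically.
for node in sorted(weights, key=lambda n: (-weights[n], n)):
    print("%s %.2f" % (node, weights[node]))
```

<p>A scaled value like this could drive the alpha channel of <span class="inline_code">n.fill</span> in the same way the example above uses <span class="inline_code">n.weight</span>.</p>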
<p>&nbsp;</p>
<hr />
<h2>Case studies&nbsp;</h2>
<p>Case studies with hands-on source code examples.</p>
<table border="0">
<tbody>
<tr>
<td>
<p><a href="http://www.clips.ua.ac.be/pages/modeling-creativity-with-a-semantic-network-of-common-sense"><img src="../g/pattern_example_semantic_network.jpg" alt="" width="70" height="70" /><br /></a></p>
</td>
<td>&nbsp;</td>
<td><span class="smallcaps">modeling creativity with a semantic network of common sense </span><span class="small">(2013)</span>&nbsp;<br />This case study offers a computational model of creativity, by representing the mind as a semantic network of common sense, using <a class="link-maintenance" href="pattern-graph.html">pattern.graph</a>&nbsp;&amp; <a class="link-maintenance" href="pattern-web.html">web</a>.<br /><a href="http://www.clips.ua.ac.be/pages/modeling-creativity-with-a-semantic-network-of-common-sense">read more »</a></td>
</tr>
<tr>
<td>
<p><a class="noexternal" href="http://www.clips.ua.ac.be/pages/using-wiktionary-to-build-an-italian-part-of-speech-tagger"><img src="../g/pattern_example_italian.jpg" alt="" width="70" height="70" /><br /></a></p>
</td>
<td>&nbsp;</td>
<td><span class="smallcaps">using wiktionary to build an italian part-of-speech tagger </span><span class="small">(2013)</span> <br />This case study demonstrates how a part-of-speech tagger for Italian (see <a class="link-maintenance" href="pattern-it.html">pattern.it</a>) can be built by mining Wiktionary and Wikipedia. &nbsp;<br /><a href="http://www.clips.ua.ac.be/pages/using-wiktionary-to-build-an-italian-part-of-speech-tagger">read more »</a></td>
</tr>
<tr>
<td>
<p><a class="noexternal" href="http://www.clips.ua.ac.be/pages/using-wikicorpus-nltk-to-build-a-spanish-part-of-speech-tagger"><img src="../g/pattern_example_spanish.jpg" alt="" width="70" height="70" /><br /></a></p>
</td>
<td>&nbsp;</td>
<td><span class="smallcaps">using wikicorpus and nltk to build a spanish part-of-speech tagger </span><span class="small">(2012)</span><br />This case study demonstrates how a part-of-speech tagger for Spanish (see <a class="link-maintenance" href="pattern-es.html">pattern.es</a>) can be built by using NLTK and the freely available Wikicorpus. <br /><a href="http://www.clips.ua.ac.be/pages/using-wikicorpus-nltk-to-build-a-spanish-part-of-speech-tagger">read more »</a></td>
</tr>
<tr>
<td>
<p><a class="noexternal" href="http://www.clips.ua.ac.be/pages/pattern-examples-elections"><img src="../g/pattern_example_elections.jpg" alt="" width="70" height="70" /><br /></a></p>
</td>
<td>&nbsp;</td>
<td><span class="smallcaps">belgian elections</span><span class="smallcaps">, twitter sentiment analysis&nbsp;</span><span class="small">(2010)</span><br />This case study uses sentiment analysis (e.g., positive or negative tone) on 7,500 Dutch and French tweets (see <a class="link-maintenance" href="pattern-web.html">pattern.web</a> |&nbsp;<a class="link-maintenance" href="pattern-nl.html">nl</a>&nbsp;|&nbsp;<a class="link-maintenance" href="pattern-fr.html">fr</a>) in the weeks before the Belgian 2010 elections. <br /><a href="http://www.clips.ua.ac.be/pages/pattern-examples-elections">read more »</a></td>
</tr>
<tr>
<td>
<p><a class="noexternal" href="http://www.clips.ua.ac.be/pages/pattern-examples-100days"><img src="../g/pattern_example_100days.jpg" alt="" width="70" height="70" /><br /></a></p>
</td>
<td>&nbsp;</td>
<td><span class="smallcaps">web mining and visualization </span><span class="small">(2010)</span><br />This case study uses a number of different approaches to mine, correlate and visualize about 6,000 Google News items and 70,000 tweets.&nbsp;<br /><a href="http://www.clips.ua.ac.be/pages/pattern-examples-100days">read more »</a></td>
</tr>
</tbody>
</table>
</div>
</div></div>
</div>
</div>
</div>
</div>
</div>
</div>
<script>
SyntaxHighlighter.all();
</script>
</body>
</html>