The Talk of Europe project curates Linked Open Data about the European Parliament (EP). The dataset covers all plenary debates held in the EP between July 1999 and January 2014, and biographical information about the members of parliament. The dataset includes: information on the monthly sessions of the EP, the agenda of debates, the spoken words and translations thereof in 23 languages; the speakers, their role and the country they represent; membership of national parties, European parties and commissions.
LinkedEP contains links to GeoNames, DBpedia and the official RDF database of the Italian parliament. The European Union Data Portal provides links between Member of Parliament instances in LinkedEP and the JRC-Names named-entity resource; these links are available through the portal's SPARQL endpoint.
In the second Talk of Europe Creative Camp, Adam Funk and Wim Peters (University of Sheffield) used their in-house text engineering infrastructure GATE to annotate the speeches with the concepts in them and their degree of occurrence across the proceedings. They also interconnected these concepts based on their semantic relationship. The resulting RDF (n-triples) is available for download here.
To obtain data about the plenary debates, we generated RDF from the HTML pages published on the official website of the EP. We collaborated with the Political Mashup project by Maarten Marx at the University of Amsterdam, who provided scripts to scrape the HTML pages.
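The HTML-to-RDF step can be pictured roughly as follows. This is a minimal sketch, not the project's actual scraper or converter: the `Speech` record and the `speech_to_ntriples` helper are hypothetical, the speech IRI pattern is an assumption, and so are the integer datatype for `lpv:number` and the language tag on `lpv:text`. Only the agenda item IRI pattern and the three properties used are taken from the example queries on this page.

```python
from dataclasses import dataclass

DCTERMS = "http://purl.org/dc/terms/"
LPV = "http://purl.org/linkedpolitics/vocabulary/"
XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass
class Speech:
    """Hypothetical record for one speech scraped from an HTML page."""
    agenda_item_iri: str
    number: int
    text: str
    lang: str


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def speech_to_ntriples(speech: Speech) -> list[str]:
    """Serialise one speech as three N-Triples lines."""
    # Assumed IRI pattern for speeches; the real dataset may differ.
    speech_iri = f"{speech.agenda_item_iri}_Speech_{speech.number}"
    return [
        f"<{speech.agenda_item_iri}> <{DCTERMS}hasPart> <{speech_iri}> .",
        f'<{speech_iri}> <{LPV}number> "{speech.number}"^^<{XSD}integer> .',
        f'<{speech_iri}> <{LPV}text> "{escape_literal(speech.text)}"@{speech.lang} .',
    ]
```

Emitting one triple per line keeps the output easy to concatenate across pages, which is one reason N-Triples is a convenient target for scraped data.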
The biographical data about members of parliament come from the Automated Database of the European Parliament of the University of Oslo [Høyland et al., 2009]. We translated this database to RDF, linked it to the debate data, and made it available as Linked Data as part of the LinkedEP dataset.
28 January 2016: We had a film made about the Talk of Europe project! Available on YouTube (5 min.)
26 January 2016: The example SPARQL queries below are now clickable. Click to see the results in the YASGUI SPARQL editor.
23 June 2015: We have had to reset the server but all is up and running again.
2 March 2015: Problems with incorrect language tags fixed. On http://europarl.europa.eu/, speeches are sometimes displayed in other languages than the user-selected language. This happens when translations are not available. Until now, this problem persisted in LinkedEP. In the current version, we have fixed the majority of the incorrect language tags of speeches, although some remain.
18 February 2015: The dataset now covers the complete fifth, sixth, and seventh terms (1999-2014) of the European Parliament. Note that the declared prefixes have changed; see the updated model depiction and example queries below.
The schema of classes and properties used in the LinkedEP dataset is displayed in the figure below. For a description see here.
SELECT * WHERE {
  ?s ?p ?o.
} LIMIT 10
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX lp: <http://purl.org/linkedpolitics/>
PREFIX lpv: <http://purl.org/linkedpolitics/vocabulary/>
PREFIX xml: <http://www.w3.org/XML/1998/namespace>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?date ?agendaitemnr ?speechnr ?text
WHERE {
  ?sessionday dcterms:hasPart ?agendaitem.
  ?sessionday dc:date ?date.
  ?agendaitem dcterms:hasPart ?speech.
  ?agendaitem lpv:number ?agendaitemnr.
  ?speech lpv:number ?speechnr.
  ?speech lpv:text ?text.
  FILTER ( ?date >= "2009-05-06"^^xsd:date && ?date <= "2010-05-06"^^xsd:date )
} ORDER BY ?date ?agendaitemnr ?speechnr LIMIT 10
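The date-range query above can be parameterised and sent to a SPARQL endpoint from a script. The sketch below only builds the query text and the request URL, so it needs no network access. The helper names `speeches_between` and `request_url` are hypothetical, the projected variables in the SELECT clause are an assumption, and `https://example.org/sparql` in the usage note is a placeholder rather than the project's endpoint. Passing the query in a `query` parameter follows the SPARQL 1.1 Protocol.

```python
from datetime import date
from urllib.parse import urlencode

PREFIXES = """\
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX lpv: <http://purl.org/linkedpolitics/vocabulary/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
"""


def speeches_between(start: date, end: date, limit: int = 10) -> str:
    """Build a query for speech texts held between two dates (inclusive)."""
    if start > end:
        raise ValueError("start must not be after end")
    # The SELECT clause is an assumption; adjust it to the variables you need.
    return PREFIXES + f"""\
SELECT ?date ?agendaitemnr ?speechnr ?text
WHERE {{
  ?sessionday dcterms:hasPart ?agendaitem .
  ?sessionday dc:date ?date .
  ?agendaitem dcterms:hasPart ?speech .
  ?agendaitem lpv:number ?agendaitemnr .
  ?speech lpv:number ?speechnr .
  ?speech lpv:text ?text .
  FILTER ( ?date >= "{start.isoformat()}"^^xsd:date && ?date <= "{end.isoformat()}"^^xsd:date )
}} ORDER BY ?date ?agendaitemnr ?speechnr LIMIT {int(limit)}
"""


def request_url(endpoint: str, query: str) -> str:
    """URL for a SPARQL 1.1 Protocol GET request (query in the `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})
```

For example, `request_url("https://example.org/sparql", speeches_between(date(2009, 5, 6), date(2010, 5, 6)))` yields a URL that any HTTP client can fetch once the placeholder is replaced with a real endpoint address.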
SELECT ?partyname (COUNT(DISTINCT ?speech) AS ?speechno)
WHERE {
  <http://purl.org/linkedpolitics/eu/plenary/2010-12-16_AgendaItem_4> dcterms:hasPart ?speech.
  ?speech lpv:spokenAs ?function.
  ?function lpv:institution ?party.
  ?party rdf:type lpv:EUParty.
  ?party lpv:acronym ?partyname.
} GROUP BY ?partyname
SELECT (COUNT(DISTINCT ?ai) AS ?count)
WHERE {
  ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem>.
  ?ai dcterms:hasPart ?speech.
  ?speech lpv:speaker ?speaker.
  ?speaker lpv:countryOfRepresentation ?country.
  ?country rdfs:label ?label.
}
Get the transcript of (an arbitrary selection of 10) speeches that contain the word "agriculture". Note that the match is case-sensitive.
SELECT DISTINCT ?text
WHERE {
  ?speech lpv:text ?text.
  FILTER regex(str(?text), "agriculture").
} LIMIT 10
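When the search word comes from user input, it needs escaping twice: once as a regular expression and once as a SPARQL string literal. The sketch below shows one way to do that and to add the standard `"i"` flag of the SPARQL `regex` function for case-insensitive matching. The helpers `regex_quote`, `sparql_string` and `keyword_filter` are hypothetical names, and the keyword is assumed to be a single line of text.

```python
# Characters with a special meaning in the regular expressions used by SPARQL.
REGEX_META = set(r"\.^$*+?()[]{}|")


def regex_quote(keyword: str) -> str:
    """Backslash-escape regex metacharacters so the keyword matches literally."""
    return "".join("\\" + ch if ch in REGEX_META else ch for ch in keyword)


def sparql_string(value: str) -> str:
    """Wrap a single-line value in double quotes as a SPARQL string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def keyword_filter(var: str, keyword: str, ignore_case: bool = True) -> str:
    """Build a FILTER clause that tests variable `var` against the keyword."""
    pattern = sparql_string(regex_quote(keyword))
    flags = ', "i"' if ignore_case else ""
    return f"FILTER regex(str(?{var}), {pattern}{flags})"
```

With the default settings, `keyword_filter("text", "agriculture")` produces a filter that also matches "Agriculture" at the start of a sentence, which the case-sensitive query above would miss.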
Get the number of speeches held in each language.
SELECT DISTINCT ?language (COUNT(DISTINCT ?speech) AS ?speechno)
WHERE {
  ?speech dc:language ?language.
} GROUP BY ?language
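Endpoints that support the standard SPARQL 1.1 Query Results JSON Format return each result row as a set of bindings under `results.bindings`. Assuming the per-language query above was answered in that format, a minimal sketch for turning the response into a dictionary could look like this (`counts_by_language` is a hypothetical helper, and the variable names `language` and `speechno` are those of the query above):

```python
import json


def counts_by_language(results_json: str) -> dict[str, int]:
    """Map each ?language value to its ?speechno count."""
    data = json.loads(results_json)
    counts: dict[str, int] = {}
    for row in data["results"]["bindings"]:
        # Each binding is an object with at least "type" and "value" keys.
        counts[row["language"]["value"]] = int(row["speechno"]["value"])
    return counts
```

The function reads only the `value` field, so it works whether the endpoint returns the language as a literal or as an IRI.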
Bjørn Høyland, Indraneel Sircar, Simon Hix. An Automated Database of the European Parliament. European Union Politics, 2009, Vol. 10, Issue 1, pp. 143–152.