
With the introduction of Demscore v3, you can merge datasets in a dyadic format with datasets in a country-year format.

Let’s say you want to run an analysis using variables from the UCDP Dyadic and the V-Dem Country-Year datasets. These variables are difficult to combine because the datasets collect data at different levels of analysis: the former at the dyad-year level, and the latter at the country-year level. See the example below:

Table 1 shows an example of the UCDP Dyadic dataset, which is organized by dyad and year. The table has three columns, 'dyad_id', 'year', and 'location', with corresponding values in four rows.

Table 2 shows an example of the V-Dem Country-Year dataset. The table has two columns, 'country' and 'year', with corresponding values in six rows.

The Demscore v3 update makes it possible to combine data at these two levels. With the new dyad-location-year Output Format, we create an Output Unit that can accommodate both datasets, as it includes one observation per location, dyad, and year.

Table 3

To create this unit, we stretch the UCDP Dyadic dataset using the comma-separated observations in the location column; that is, we create one row per location, dyad, and year.

Table 4
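The stretching step can be sketched in a few lines of pandas. This is only an illustration of the idea, not Demscore's own code: the column names 'dyad_id', 'year', and 'location' follow the example tables, while the rows themselves are made up.

```python
import pandas as pd

# Hypothetical dyad-year rows; the values are invented for illustration.
dyad_year = pd.DataFrame({
    "dyad_id": [1, 1, 2],
    "year": [2000, 2001, 2000],
    "location": ["Afghanistan", "Afghanistan, Pakistan", "India, Pakistan"],
})

# Split the comma-separated locations into lists, emit one row per list
# element, and trim the whitespace left over after each comma.
dyad_location_year = (
    dyad_year
    .assign(location=dyad_year["location"].str.split(","))
    .explode("location")
    .assign(location=lambda df: df["location"].str.strip())
    .reset_index(drop=True)
)

print(dyad_location_year)
```

Each dyad-year with two locations becomes two rows, so the three input rows turn into five dyad-location-year rows.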

For variables coming from the V-Dem Country-Year dataset, we can now match years to years and countries to locations.
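As a rough sketch of that matching, assuming both tables are available as pandas DataFrames: the 'country' column is renamed to 'location' so that the two keys line up. The variable name 'example_index' and all values are hypothetical, and Demscore's actual merge procedure may differ.

```python
import pandas as pd

# Hypothetical dyad-location-year rows (already one row per location).
dyad_location_year = pd.DataFrame({
    "dyad_id": [1, 2, 2],
    "year": [2001, 2001, 2000],
    "location": ["Pakistan", "Pakistan", "India"],
})

# Hypothetical country-year rows with one made-up variable.
country_year = pd.DataFrame({
    "country": ["Pakistan", "India"],
    "year": [2001, 2000],
    "example_index": [0.4, 0.7],
})

# Match years to years and countries to locations. Several dyads can share
# a location-year, so each country-year row may be matched more than once;
# validate="many_to_one" checks that the country-year keys are unique.
merged = dyad_location_year.merge(
    country_year.rename(columns={"country": "location"}),
    how="left",
    on=["location", "year"],
    validate="many_to_one",
)

print(merged)
```

Here both dyads located in Pakistan in 2001 receive the same country-year value.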

If certain observations from a variable do not have a match in the final Output Unit, they receive the value -11111 (“missing from merge”).
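The effect of this missing code can be imitated in pandas, assuming a left merge onto the Output Unit. The data and the variable name 'example_index' are hypothetical; only the value -11111 comes from the text above.

```python
import pandas as pd

MISSING_FROM_MERGE = -11111  # code described above as "missing from merge"

# Hypothetical Output Unit rows and a country-year variable covering only one of them.
output_unit = pd.DataFrame({
    "location": ["Afghanistan", "Pakistan"],
    "year": [2001, 2001],
})
country_year = pd.DataFrame({
    "location": ["Afghanistan"],
    "year": [2001],
    "example_index": [0.1],
})

# A left merge keeps every Output Unit row; rows without a match get NaN,
# which is then replaced by the missing code.
merged = output_unit.merge(country_year, how="left", on=["location", "year"])
merged["example_index"] = merged["example_index"].fillna(MISSING_FROM_MERGE)

print(merged)
```

When analysing such data, rows carrying the code should be filtered out or recoded as missing first, so that -11111 is not treated as a real value.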

Disclaimer: Please note that we merge V-Dem countries to UCDP locations! A country in V-Dem is a political unit enjoying at least some degree of functional and/or formal sovereignty, while a location in UCDP is either the location in which an event takes place, or the country of the incompatibility/actor.
