Andreas Beger

Data science, computational social science, geopolitical risk

I’m a data scientist working on forecasting, rare-event prediction, and data infrastructure to support them. Over the past decade I’ve worked on client-driven projects in defense, security, academic, and non-governmental sectors. PhD in political science (2012), former US military intelligence officer (Army National Guard). American in Tallinn, Estonia.

On the side, I co-founded and organize PyData Tallinn, which has grown to 600+ members and collaborates with multiple global tech companies in Tallinn and Tartu. I am also data & ML program director for Digit.dev this year (2026).

I’ve learned over the years that I have a particular flavor as a data scientist:

Where you can find me

What else is here

For a list of academic publications, see my research page.

I blogged for a while.

POLECAT event data: Some resources for the POLECAT event data, which is available at https://dataverse.harvard.edu/dataverse/POLECAT.

CAMEO Event Type Ontology: A web version of the CAMEO event types ontology, from the official PDF codebook on Phil Schrodt’s website.

et1000: the 1,000 most common Estonian words: Estonian is a somewhat boutique language. At the time I did this, you couldn’t find a list of the most commonly used Estonian words online, so I made this.

R packages

I wrote and maintain several open source R packages:

icews: The ICEWS event data consists of more than 270 million event data records extracted from global news stories. The raw data is delivered via Dataverse. The icews R package automates the process of keeping an up-to-date local copy, using either a file- or SQLite-based storage backend.

states: I used to frequently work with global data for independent states. This package has some utility functions for making it easier to work with the two major lists of state system membership, Gleditsch & Ward and COW.

spduration: Implements a time-varying covariate split-population duration regression model for survival data where an unknown portion of the cases are immune from failure. These are sometimes also called cure models.
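
To make the split-population idea concrete, here is a minimal Python sketch. It is not the spduration code or API. The function name, the parameter names, and the choice of a constant (exponential) hazard for the at-risk cases are all assumptions made for illustration. The point is that the population-level survival curve is a mixture: immune cases never fail, so the curve levels off at the immune share instead of dropping to zero.

```python
import math


def population_survival(t, at_risk_prob, rate):
    """Survival probability at time t in a split-population (cure) model.

    Hypothetical illustration: at-risk cases are assumed to fail with a
    constant hazard `rate` (exponential durations); immune cases never fail.
    """
    if not 0.0 <= at_risk_prob <= 1.0:
        raise ValueError("at_risk_prob must be between 0 and 1")
    if t < 0 or rate <= 0:
        raise ValueError("t must be non-negative and rate must be positive")
    # Survival among the at-risk cases only.
    at_risk_survival = math.exp(-rate * t)
    # Mixture: the immune share survives with certainty, the rest follow
    # the at-risk survival curve.
    return (1.0 - at_risk_prob) + at_risk_prob * at_risk_survival


# With 30% of cases at risk, survival starts at 1 and flattens out near 0.7.
for t in (0, 1, 5, 50):
    print(t, round(population_survival(t, at_risk_prob=0.3, rate=0.5), 4))
```

In an actual regression model, both the at-risk probability and the hazard would depend on covariates. Here they are fixed numbers to keep the sketch short.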

Bonus trivia

The 5th most interesting thing about me is that at the 2017 MyFitness Madness City Race at the Tallinn Song Festival Grounds, I was, due to a clerical error, part of the best all-female team. (I had gone as a team with my wife and two other women. We have the trophy at home.)