New paper: Machine Assisted Dossiers

I'm excited to announce the publication of Machine Assisted Dossiers, a new paper on knowledge systems and data management, written in collaboration with Forest Gregg and Tim McGovern for the Computation + Journalism 2017 Symposium, held this past October at Northwestern University.

Here's the abstract:

One of the great disappointments of data journalism is that so much available data is simply bad. It is unreliable, ambiguous, and contradictory. Developing an accurate image of the world still requires discernment, sorting, and judgment.

We are still only beginning to build technologies that complement these human capacities—but allow them to scale. In this paper, we present the capabilities we believe an adequate knowledge system must have, drawing heavily from the field of genealogy and our own work modeling international security forces.

We'll discuss the overall requirements for such a system and try to envision its user experience and its data architecture; we'll also survey where currently available technologies can fill in the gaps between the two.

We had a lot of fun cataloguing the many layers of evidence that go into the collection of claims about the world, as well as sketching out some interfaces for managing that evidence. We may have reinvented a few epistemological wheels along the way, but the Stanford Encyclopedia of Philosophy was an enormous help, as were our patient colleagues Bob Lannon, Kathryn Lindeman, and Michael Castelle, who provided comments early on and helped connect us with the intellectual resources we needed.

Read the full paper online, or head over to the repo to download the PDF.