Digital Data Mining 101
Want to help crack this case, or another one? Paul Haynes, an Internet sleuth whom Michelle McNamara connected with online while tracking the Golden State Killer’s crimes, shares his top five resources for DIY detectives.
1. Ancestry.com
If you’re a sleuth on a budget and want to limit your number of paid subscriptions to one, Ancestry.com is the essential one-stop resource for most of your data mining needs.
Ancestry.com consolidates an expansive variety of record types, from city directories and address histories to birth, marriage, divorce, death, and census records. While the address data aggregated here and elsewhere only dates back so far, items such as marriage and divorce records can supplement that material and also help shape a more precise time line of a subject’s geographic history.
Ancestry is not without limitations of its own. Not every U.S. state has made the same sets of records publicly available. So, for example, while you’ll find 20th-century birth records for some states (like California, luckily for me), you won’t find them for many others.
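One way to keep track of what these records say is to merge them into a single dated list and look for the stretches that are still unaccounted for. The sketch below is only an illustration: the `Record` type, its field names, the `max_gap` threshold, and the sample entries are all invented for the example and do not reflect any real export format from Ancestry.com or anywhere else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One dated sighting of a subject (hypothetical structure)."""
    year: int
    place: str
    source: str  # e.g. "city directory", "marriage record"


def build_timeline(records):
    """Drop exact duplicates and sort by year, then by source."""
    return sorted(set(records), key=lambda r: (r.year, r.source))


def gaps(timeline, max_gap=5):
    """Return (earlier_year, later_year) pairs more than max_gap years apart."""
    years = [r.year for r in timeline]
    return [(a, b) for a, b in zip(years, years[1:]) if b - a > max_gap]


# Made-up sample data, not taken from any real record set.
records = [
    Record(1978, "Sacramento, CA", "marriage record"),
    Record(1965, "Visalia, CA", "city directory"),
    Record(1986, "Modesto, CA", "address history"),
    Record(1978, "Sacramento, CA", "marriage record"),  # exact duplicate
]

timeline = build_timeline(records)
for r in timeline:
    print(r.year, r.place, f"({r.source})")
print("Gaps to fill:", gaps(timeline))
```

The gaps are the years where a marriage, divorce, or property record would add the most to the time line.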
2. People search engines
A public records aggregator, or “people search” engine, typically prompts you to enter a name and a location and then returns a list of matching individuals, along with their current age and a list of other cities they’ve called home.
The best among the several I use is PeopleSmart. A free membership not only allows you to search for individuals—with an impressively versatile Advanced Search feature—but also provides unlimited reverse address searches. More detailed reports are available for competitively priced à la carte fees.
The overarching problem with these aggregators is that residential address data reliably dates back only to about 1985; the staler the data, the less likely it is to appear in your search results. No aggregator offers any guarantee that its data will not continue to be altered or outright purged, although PeopleFinders, another very useful aggregator, seems to have fairly stable data. A monthly PeopleFinders membership buys unlimited individual address histories, and some of the data is exclusive to this site.
3. Yearbooks
Essential to data mining is having a starting point. The names you’ll run through Ancestry.com or PeopleSmart have to come from somewhere. Yearbooks are especially useful starting points because they offer an immediate visual impression to compare against established criteria. Since the Golden State Killer was likely active in Visalia in 1974 and 1975, it’s reasonable to suppose that he was from there or attended high school there. So any white male in any Visalia yearbook published between 1962 and 1975 is fair game for triage; people search engines and Ancestry.com can be used to sniff out the ones who’ve also lived in other relevant places.
Classmates.com hosts a digital archive of more than 200,000 yearbooks. Aside from being useful as a data reservoir ripe for mining, it features an OCR-based search tool that can help determine where someone attended high school and what he or she looked like.
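The place-and-year part of that triage can be sketched in a few lines. Everything here is hypothetical: the `YearbookEntry` type, the `triage` function, the `RELEVANT_CITIES` set, and the placeholder names are invented for illustration, and the address histories would in practice have to be gathered by hand from a people search engine or Ancestry.com. The visual comparison against a yearbook photo is not something this sketch attempts.

```python
from dataclasses import dataclass


@dataclass
class YearbookEntry:
    """A name found in a yearbook (hypothetical structure)."""
    name: str
    city: str
    year: int


# Assumption: an example set of "other relevant places", chosen for illustration.
RELEVANT_CITIES = {"Sacramento", "Modesto"}


def triage(entries, address_history, city="Visalia",
           first_year=1962, last_year=1975):
    """Return (name, matched_cities) for in-range names, most matches first."""
    results = []
    seen = set()
    for entry in entries:
        in_range = first_year <= entry.year <= last_year
        if entry.city != city or not in_range or entry.name in seen:
            continue
        seen.add(entry.name)
        matched = address_history.get(entry.name, set()) & RELEVANT_CITIES
        if matched:
            results.append((entry.name, sorted(matched)))
    # Most overlapping cities first; ties broken alphabetically by name.
    results.sort(key=lambda item: (-len(item[1]), item[0]))
    return results


# Placeholder data only; no real people or records.
entries = [
    YearbookEntry("Person A", "Visalia", 1968),
    YearbookEntry("Person B", "Visalia", 1980),  # outside the year range
    YearbookEntry("Person C", "Visalia", 1970),
    YearbookEntry("Person D", "Fresno", 1969),   # wrong city
    YearbookEntry("Person A", "Visalia", 1969),  # same name, second yearbook
]
address_history = {
    "Person A": {"Visalia", "Sacramento", "Modesto"},
    "Person B": {"Sacramento"},
    "Person C": {"Visalia", "Modesto"},
    "Person D": {"Sacramento"},
}

shortlist = triage(entries, address_history)
for name, cities in shortlist:
    print(name, cities)
```

Sorting by the number of overlapping cities is just one possible ranking; a name that matches a single place can still deserve a closer look.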
4. Grantor/grantee indexes and court records
The recorder divisions of most U.S. counties maintain free online transaction archives. Because this data is unchanging and specific, it is a highly dependable type of resource that will tell you if and when someone has purchased or sold property in that county. Of course, the scope of what’s offered and the ease of use vary widely from county to county, and while some indexes date back over a century, others date back only a decade or two.
Less comprehensive are online court records, which can not only illuminate a subject’s criminal history but may also further augment his or her geographic time line. Online court records don’t date back as far as recorder data and are vulnerable to periodic purging. In rare instances, a court case index, like that for Orange County, CA, will tell you someone’s height, weight, hair color, and eye color, all of which are useful for comparison.
5. Newspaper archives
NewspaperArchive.com, Ancestry.com, and the Google News Archive each function as a sort of virtual microform library. A well-worn cinematic cliché is that of the amateur sleuth poring over microfilm, looking for that veritable needle in the haystack of old newsprint. Online repositories eliminate this time-devouring tedium with OCR-based text searches. A search combining phrases like “obscene phone calls” and “Sacramento,” or “Peeping Tom,” “window screen,” and “Modesto” could well uncover the elusive clue that explodes the Golden State Killer’s case.
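For anyone who has saved OCR text from archive pages, the same kind of combined-phrase search can be run locally. This is a minimal sketch and not how any of the archives above implement their search: the `matches_all` and `search` functions and the sample page text are invented for illustration. It also assumes clean text, whereas real OCR output often contains misread characters that an exact phrase match will miss.

```python
def matches_all(text, phrases):
    """True if every phrase occurs in text, ignoring case and extra whitespace."""
    normalized = " ".join(text.lower().split())
    return all(" ".join(p.lower().split()) in normalized for p in phrases)


def search(pages, phrases):
    """Return the ids of pages whose text contains all of the phrases."""
    return [page_id for page_id, text in pages.items()
            if matches_all(text, phrases)]


# Invented sample text standing in for saved OCR output.
pages = {
    "page-1": "Police in Modesto say a Peeping Tom\nremoved a window   screen.",
    "page-2": "Sacramento residents report obscene phone calls.",
    "page-3": "Modesto council debates a new window screen ordinance.",
}

print(search(pages, ["Peeping Tom", "window screen", "Modesto"]))
print(search(pages, ["obscene phone calls", "Sacramento"]))
```

Collapsing whitespace before matching matters because OCR text tends to break phrases across lines and columns.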
Just as important is the news archive’s potential for finding ancillary information on a subject—be it a classified ad that places him in the right city at the right time or some criminal offense that occurred too