Historical primary sources online: Historical newspapers & news broadcasts
About text mining
The availability of news content for text mining and other computational analysis is extremely complicated, and it varies by the type of source.
- The library's vendors of historical archives generally permit text mining of those collections we have purchased. Talk to us before you contact the vendor, and allow time for negotiation before your research clock starts. Costs and contractual obligations vary. Assume that brute-force extraction from our vendors violates university policy, our contracts, and assorted laws.
- Open-access collections of publications that are out of copyright may be amenable to text mining -- Virginia Tech researchers have a long track record with Chronicling America, for example -- though the organizations that host them may not offer you much technical support.
- Providers of current news (e.g., Factiva) usually do not permit text mining or similar computational use. They rarely own the copyrights to the news they aggregate.
- News publishers set their own policies about computational access to their content. Check each paper's website for information about API access. The New York Times is very generous with access. The Washington Post, on the other hand, is very restrictive.
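For an open-access collection such as Chronicling America, a first step is often just composing a search request that returns machine-readable results. The sketch below only builds such a URL and does not fetch anything. It assumes that the collection is served through the Library of Congress loc.gov JSON API at the base URL shown, and that the `q`, `fo`, `c`, and `sp` parameters behave as commented. The function name `build_search_url` is our own invention for illustration. Check the Library of Congress's current API documentation, including its rate limits, before relying on any of it.

```python
from urllib.parse import urlencode

# Assumption: Chronicling America is reachable as a loc.gov collection at this address.
BASE_URL = "https://www.loc.gov/collections/chronicling-america/"


def build_search_url(query, per_page=25, page=1):
    """Return a search URL that asks for JSON results for a keyword query.

    The parameter names are assumptions based on the loc.gov JSON API;
    verify them against the current documentation.
    """
    params = {
        "q": query,      # keyword(s) to search for
        "fo": "json",    # request JSON rather than an HTML page
        "c": per_page,   # number of results per page
        "sp": page,      # results page number, starting at 1
    }
    return BASE_URL + "?" + urlencode(params)


# Hypothetical usage: compose (but do not send) a request for 10 results.
url = build_search_url("influenza epidemic", per_page=10)
print(url)
```

If you go on to request such URLs in a script, pause between requests and keep volumes modest. This approach applies only to open collections. It is not a substitute for the negotiated access described above for vendor archives.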
Directories of online news archives, mostly free, from around the world
Recent news sources with some historical coverage
Most collections of recent news start coverage in the late 1980s or later. Typically, they include neither the photos, graphics, and other illustrations nor the advertising that appeared in the original print edition, on microfilm, or in archival digital facsimiles.