Clicking on an article gives you a set of options to choose from:
- zoom in to the selected article
- view the computer-generated text of the article
- clip the article and view separately
- download a PDF or PNG of the entire page
You can exit the zoomable viewer at any time by clicking on "Regular view" just above the viewer box.
Optical Character Recognition (OCR)
Optical Character Recognition is a process by which software reads a page image and translates it into a text file by recognising the shapes of the letters. All the newspaper text in this collection has been automatically generated using OCR software. It has not been manually reviewed or corrected. To look at the OCR text for an article, click on the "View computer-generated text" link on the Article page.
OCR makes it possible to search large quantities of full-text data, but it is never 100% accurate. For the Papers Past project, the level of accuracy depends on the print quality of the original newspaper, its condition at the time of microfilming, and the level of detail captured by the microfilm scanner. Newspapers with poor-quality paper, small print, mixed fonts, multiple-column layouts, or damaged pages may have poor OCR accuracy. This means that most pages will have some errors in the computer-generated text, and some will have many.
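For readers who want to put a number on OCR accuracy, one common approach is to compare the computer-generated text with a hand-corrected transcription and count the character edits needed to turn one into the other. The sketch below is illustrative only: it is not how Papers Past or Veridian measures accuracy, and the function names and sample strings are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ocr_text: str, reference: str) -> float:
    """Share of the reference text that the OCR got right, from 0.0 to 1.0.
    The reference is assumed to be a manually corrected transcription."""
    if not reference:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(ocr_text, reference)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical sample: the OCR misread the final "n" as "u".
print(character_accuracy("Wellingtou", "Wellington"))
```

A search for "Wellington" would miss the misread word in this sample, which is why a page with even a small share of character errors can still hide relevant articles from a full-text search.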
Veridian Digital Library Software
Veridian is computer software for making digital collections available in full-text searchable form over the Internet. It is designed specifically to support collections of digitised printed materials (e.g. newspapers, books, and journals), and to take advantage of the latest technologies used in large digitisation projects.
Veridian was developed in New Zealand by DL Consulting Ltd, using the Greenstone digital library software. Greenstone was developed by the New Zealand Digital Library Project at the University of Waikato, and is distributed in cooperation with UNESCO and the Human Info NGO.