Expanding discovery and access to Canada’s topographic maps at Scholars GeoPortal: Using AI workflows for metadata enhancement

Last modified by Sabina Pagotto on 2025-11-19, 09:54

Map of Vancouver showing detail around the Burrard Inlet

Detail from the first edition 1:50k map of the North Vancouver, British Columbia region (Sheet No. 092G06), published in 1949.

Canada’s National Topographic System (NTS) maps depict government-surveyed topographic features and span approximately 1905 to 2012. Building on library metadata workflows that support discovery, access, and reuse of the maps, this recent initiative introduces optical character recognition (OCR) image-to-text extraction and large language models (LLMs) to support metadata enhancement and access in the future Scholars GeoPortal.

Enhanced metadata for historical maps

For decades, library collaborations in Canada have worked to address the shared challenges of large-scale map series digitization, inventories, metadata, georeferencing, and long-term preservation for the NTS series maps. Together, the Scholars GeoPortal and library partners are taking another major step toward improving metadata for the NTS series maps and data collections.

Timeline of historical topographic map projects. The OCUL Historical Topographic Map project runs from 2015 to 2019, with the McGill and UofT-led 1:50k project running from 2019 to present and the GeoPortal redevelopment kicking off in 2023.

Figure 1. Project timeline (2015-2025) for the National Topographic System (NTS) Canadian Historical Topographic Maps Projects at Scholars GeoPortal. 

As part of the ongoing Scholars GeoPortal Redevelopment Project, we are developing new automated metadata workflows for over 15,000 historical georeferenced maps and data collections from the NTS 1:50,000 series, which will be available in the new Beta Scholars GeoPortal next year. 

Map showing areas currently available on the GeoPortal (most of Ontario, Alberta, BC, southern Quebec, and the Maritimes, plus several additional urban areas) and areas that will be available on next year's beta GeoPortal (most of Northern Canada).

Figure 2. Geographic coverage for the collection. This map represents the areas (NTS grid cells) covered by the map sheets in the Canadian historical topographic maps collection. Green represents map areas for which map sheets are already available on Scholars GeoPortal; brown represents new regions that will be available on Scholars GeoPortal beta in 2026.

The workflow combines existing metadata sources with new OCR image-to-text outputs, then uses LLMs to enhance and validate the metadata, with the aim of improving accuracy and efficiency. First, Tesseract OCR extracts raw text from the scanned map images. The raw output is passed to an LLM together with the existing metadata, which gives the model context and a structured example to benchmark against. A detailed prompt then directs the LLM to parse specific information out of the raw text and return it as tabular data for quality control review, with each column representing a targeted metadata field such as title, subtitle, year, edition number, or coverage.
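The steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the column names, the prompt wording, and the choice of a single CSV row as the tabular output are all hypothetical. The OCR and LLM steps are passed in as plain callables so the sketch runs without any external service; in practice the OCR callable could wrap something like pytesseract's `image_to_string`, and the LLM callable would wrap whichever model the team uses.

```python
import csv
import io
from typing import Callable, Dict

# Target fields named in the post; the exact column names are assumptions.
FIELDS = ["title", "subtitle", "year", "edition", "coverage"]


def build_prompt(raw_text: str, existing: Dict[str, str]) -> str:
    """Combine raw OCR text with the existing record as context for the LLM."""
    context = ", ".join(f"{k}={existing.get(k, '')}" for k in FIELDS)
    return (
        "Extract these fields from the OCR text of a scanned map and answer "
        f"with exactly one CSV row in this column order: {','.join(FIELDS)}\n"
        f"Existing record (context only, may contain errors): {context}\n"
        f"OCR text:\n{raw_text}\n"
    )


def parse_row(llm_output: str) -> Dict[str, str]:
    """Parse the single CSV row returned by the LLM into a field dictionary."""
    rows = [row for row in csv.reader(io.StringIO(llm_output.strip())) if row]
    if len(rows) != 1 or len(rows[0]) != len(FIELDS):
        raise ValueError(f"expected one row with {len(FIELDS)} columns")
    return dict(zip(FIELDS, (cell.strip() for cell in rows[0])))


def extract_metadata(
    image_path: str,
    existing: Dict[str, str],
    ocr: Callable[[str], str],
    llm: Callable[[str], str],
) -> Dict[str, str]:
    """Run OCR, prompt the LLM with context, and return the parsed fields."""
    raw_text = ocr(image_path)
    return parse_row(llm(build_prompt(raw_text, existing)))


# Stand-ins for the real OCR and LLM calls, with illustrative values only.
def fake_ocr(path: str) -> str:
    return "GRAND FORKS\nBRITISH COLUMBIA\nEDITION 1\nPRINTED 1966"


def fake_llm(prompt: str) -> str:
    return "Grand Forks,British Columbia,1966,1,082E1 east half"


record = extract_metadata("sheet.tif", {"title": "Grand Forks"}, fake_ocr, fake_llm)
print(record)
```

Rejecting any response that is not exactly one row with the expected number of columns keeps malformed LLM output from silently entering the review table.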

Scanned map with the edition and NTS number highlighted.

Scanned map with the community name and year of printing highlighted.

Figures 3a and 3b. In these images, red boxes highlight text sections extracted using Tesseract OCR and an LLM. Top: the top-right edge of a map (“Grand Forks, British Columbia”, NTS 082E1 east half, 1st edition published in 1966), containing edition information and the printed NTS number. Bottom: the bottom-right edge of the same map, showing the map’s title and the year printed.

Metadata Curation and Quality Control

Extracted and enriched metadata is incorporated into a map-enabled ArcGIS Pro dashboard that allows side-by-side comparison of extracted and original metadata sources in a structured mosaic dataset table. This supports map image validation and streamlined QA/QC by metadata reviewers, aiming to reduce manual effort by automating repetitive tasks and to improve the accuracy of map metadata in the repository.

ArcGIS Pro dashboard showing the digitized map and extracted metadata for validation.

Figure 4. Workflow environment on ArcGIS Pro. This figure shows the project's workflow environment, which integrates different metadata sources with the collection's spatial metadata on ArcGIS Pro in a mosaic dataset with associated tables. At the center, sheets can be viewed as georeferenced objects on a map, alongside the NTS grid files produced by NRCan. At the bottom, the data table can be sorted, filtered and queried, and metadata for the highlighted item can be viewed and edited on the right side panel. This setup makes cross-checking between the content of map sheets, metadata fields, and geographic information a much more efficient task, optimizing for data accuracy and richness as well as for reduced friction in metadata curation and quality control.

In a single window, GeoTIFFs can be viewed in an ArcGIS Pro map alongside the data table, where validation results can be reviewed and manual corrections made. Wherever the extracted text or the geospatial information for a given sheet does not match the original metadata, the record can be flagged for human review. Reviewers will be able to focus most of their effort on the sheets flagged by the automated validation process, minimizing the need for random spot-checks, which can be a haphazard way of identifying issues (the data equivalent of wandering in the dark).
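The flagging step can be illustrated with a small comparison routine. This is a sketch only: the field names, the normalization rule (ignoring case and extra whitespace), and the function names are assumptions made for illustration, and the project's actual validation logic inside ArcGIS Pro may differ.

```python
from typing import Dict, Iterable, List


def normalize(value: str) -> str:
    """Ignore case and whitespace differences when comparing field values."""
    return " ".join(value.casefold().split())


def flag_mismatches(
    original: Dict[str, str],
    extracted: Dict[str, str],
    fields: Iterable[str],
) -> List[str]:
    """Return the fields whose original and extracted values disagree."""
    return [
        field
        for field in fields
        if normalize(original.get(field, "")) != normalize(extracted.get(field, ""))
    ]


# Illustrative records: the year differs, so this sheet would go to a reviewer.
original = {"title": "Grand Forks", "year": "1966", "edition": "1"}
extracted = {"title": "GRAND  FORKS", "year": "1968", "edition": "1"}

flags = flag_mismatches(original, extracted, ["title", "year", "edition"])
needs_review = bool(flags)
print(flags, needs_review)
```

Normalizing before comparing avoids flagging records that differ only in capitalization or spacing, which OCR output often introduces, so reviewers see substantive disagreements first.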

Moving forward, we hope this OCR image-to-text extraction and LLM processing can be further refined and adapted to streamline the processing of over 15,000 updated and new maps and accompanying ISO 19115 XML standard metadata for the NTS 1:50,000 series and other historical maps in the future GeoPortal infrastructure. 

Learn More

Digitized NTS maps and data collections are available in the current Scholars GeoPortal repository for the 1:25,000 series (Ontario), the 1:63,360 series (Ontario), and the 1:50,000 series (Canada-wide). You can learn more with our guide to accessing historical maps on the GeoPortal and the website for OCUL’s 2014-2017 Historical Map Digitization Project, which covers early NTS maps. More historical map collections, including over 22,000 digitized maps from the McGill Library’s NTS 1:50k series, are available via Canadiana.

Learn more about the Scholars GeoPortal Redevelopment Project and AI metadata enhancement workflows at the upcoming Scholars Portal Fall Updates Webinar on November 27 at 1 p.m. ET. 

Stay tuned for more updates and news about the Scholars GeoPortal Redevelopment Project and the NTS series maps, or contact us at datagis@scholarsportal.info or topomaps@scholarsportal.info with any questions.