BTAA-GIN Technology Orientation¶

2025

🎯 Goals for Today¶

  • Identify resources for the BTAA Geoportal
  • Learn how to submit metadata
  • Practice fixing common metadata issues

Format: Short explanations + hands-on exercises

The BTAA Geoportal¶

[Image: BTAA Geoportal]

  • A discovery tool for geospatial resources
  • Built with GeoBlacklight, an open-source software application
  • A metadata catalog (not the data itself!)
  • Most resources are in the public domain or are otherwise free and open data

Geoportal Special Features¶

  • Searching with the map (requires coordinates in the metadata)
  • Resource previews (if available from provider):
    • Maps: IIIF (example)
    • Datasets: web services (example)

1. Geoportal Content¶

Formats¶

Always in scope

  • Shapefile, GeoJSON, other spatial formats
  • Geodatabase, geopackage
  • TIFF, JPEG, JPEG2000, PNG
  • GeoTIFF

Added by request

  • Static websites, Interactive databases, Storymaps
  • Tabular data, PDFs, Text files

Sources¶

Always in scope

  • Local government: state, county, city, regional
  • Nongovernment orgs: nonprofits, historical societies
  • Academic: libraries, departments, research institutes

Added by request

  • Federal data
  • Licensed data
  • Maps in copyright

Geography¶

Always in scope

  • Data covering an area within the Big Ten
  • Data covering any area, but created by a Big Ten researcher
  • Maps of any area held at a Big Ten university

Added by request

  • Data from another state or nation
  • Maps held at a non-Big Ten university

Your Primary Role: Keep looking for new resources!

🛠️ Activity¶

Find GIS Data Sources¶

🔍 Task: Look up a county in your state & check if it has a GIS data portal

🤔 Ask Yourself:

  • What data is available?
  • Would it be a good fit for the Geoportal?

💬 Discussion: What did you find?

2. How to Submit Resources¶

This is an informal process. Every collection has a slightly different workflow. Our goal is to obtain the metadata. Here are some things to look for.

GIS Data¶

  • Is it hosted on a standard platform, such as ArcGIS Hub or a CKAN portal? Try adding /data.json to the end of the base URL. If the portal responds, you have found a machine-readable catalog of its datasets, which can be harvested programmatically.

  • If not, are there individual metadata documents anywhere, such as a catalog of XMLs?
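
As a rough illustration of the /data.json check above, the sketch below builds the candidate URL and pulls a few fields out of the response. It assumes the portal returns a DCAT-style catalog with a top-level `dataset` list, which is common but not guaranteed. The function names are hypothetical and are not part of any BTAA-GIN tooling.

```python
import json
from urllib.request import urlopen


def data_json_url(base_url):
    """Append /data.json to a portal's base URL, avoiding a double slash."""
    return base_url.rstrip("/") + "/data.json"


def list_datasets(catalog):
    """Return (identifier, title) pairs from a parsed catalog.

    Assumes a DCAT-style layout:
    {"dataset": [{"identifier": ..., "title": ...}, ...]}
    """
    return [
        (item.get("identifier", ""), item.get("title", ""))
        for item in catalog.get("dataset", [])
    ]


def fetch_catalog(base_url, timeout=30):
    """Download and parse the catalog (requires network access)."""
    with urlopen(data_json_url(base_url), timeout=timeout) as response:
        return json.load(response)
```

With a real portal address in place of the base URL, `list_datasets(fetch_catalog(base_url))` gives a quick inventory to review before deciding whether the source is a good fit.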

Scanned Library Maps¶

  • Ask your IT department for a full export of metadata
  • Make sure the export includes IDs and links

A website¶

  • Get the URL and publisher
  • Create a title and description

💬 Discussion¶

How will we get the metadata from scanned maps held at your library?

3. Metadata Profile¶

GeoBTAA Metadata Profile¶

Includes:

  1. OpenGeoMetadata Aardvark
  2. Custom elements

OpenGeoMetadata (OGM) Aardvark¶

  • designed for GeoBlacklight
  • intended for discoverability
  • often generated from more complete geospatial metadata, such as ISO 19139 or FGDC
  • mainly a mix of Dublin Core and GeoBlacklight application-specific fields

Custom GeoBTAA Elements¶

  • augments OGM Aardvark
  • intended to serve as standalone metadata
  • includes geospatial technical fields, like projection & scale
  • includes multiple fields for lifecycle tracking

🔍 Template¶

Let's take a look at the Primary Metadata Template z.umn.edu/b1g-template

Key things to know about the template

  • Make a copy or request a customized template
  • Separate multiple values with a pipe (|)
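
For anyone processing the template with a script, a minimal sketch of the pipe convention follows. The helper names are hypothetical; the only thing taken from the template is the `|` separator.

```python
def split_values(cell):
    """Split a pipe-delimited cell into a list, dropping empty entries."""
    return [value.strip() for value in cell.split("|") if value.strip()]


def join_values(values):
    """Join a list of values back into a single pipe-delimited cell."""
    return "|".join(value.strip() for value in values if value.strip())
```

Stripping whitespace on both sides keeps stray spaces around the pipe from creating near-duplicate values.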

4. Tricky Fields¶

Bounding Boxes¶

  • Format as decimal degrees (instead of degrees-minutes-seconds)
  • Use the order West,South,East,North
  • If coordinates are missing, consider adding them in batches for records with identical coverage, or assign the task to student workers

Klokan Bounding Box Demo¶

  • Klokan Bounding Box is a tool for generating extents in various formats
  • Select the "CSV" output option
  • More detailed instructions: https://gin.btaa.org/metadata/recipes/add-bbox

🛠️ Activity¶

Troubleshoot a Bounding Box format¶

Find three things to fix in the following bounding box for Chicago:

W87°56', -87°31', 42°01', 41°38'

Answer¶

  1. It needs to be in decimal degrees
  2. The western coordinate has a "W" instead of a negative sign
  3. The coordinates are in the wrong order (These are W,E,N,S; we need W,S,E,N)

Format for the BTAA Geoportal:

-87.9,41.6,-87.5,42.0
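
The three fixes can also be scripted. The sketch below is one possible approach, not the project's cleaning script: `dms_to_decimal` and `format_bbox` are hypothetical helpers that convert degrees-minutes(-seconds) to decimal degrees, turn W/S hemisphere letters into negative signs, and write the values in West,South,East,North order.

```python
import re


def dms_to_decimal(text):
    """Convert a value such as "W87°56'" or "41°38'" to decimal degrees."""
    text = text.strip()
    hemisphere = re.search(r"[NSEWnsew]", text)
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers or len(numbers) > 3:
        raise ValueError(f"Cannot parse coordinate: {text!r}")
    # Pad with zeros so degrees-only and degrees-minutes inputs also work.
    degrees, minutes, seconds = (numbers + [0.0, 0.0])[:3]
    value = degrees + minutes / 60 + seconds / 3600
    negative = text.startswith("-") or (
        hemisphere is not None and hemisphere.group().upper() in "SW"
    )
    return -value if negative else value


def format_bbox(west, south, east, north, places=4):
    """Return a W,S,E,N string after basic range checks."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("Longitude out of range")
    if not (-90 <= south <= north <= 90):
        raise ValueError("Latitude out of range, or south is greater than north")
    return ",".join(f"{v:.{places}f}" for v in (west, south, east, north))


# The activity's input is in W,E,N,S order, so unpack it that way.
raw = "W87°56', -87°31', 42°01', 41°38'"
west, east, north, south = (dms_to_decimal(part) for part in raw.split(","))
print(format_bbox(west, south, east, north, places=1))  # -87.9,41.6,-87.5,42.0
```

One decimal place is used here only to match the slide; more places give a tighter extent.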

Place Names¶

  • Format all place names as FAST subject headings
  • For local US data, the format looks like:
    • state--county 
    • state--city
    • state

Example:

Illinois--Chicago

🛠️ Activity¶

Reformat place names¶

What is the FAST format for the following place names?

  • Portland (Or.)
  • Seattle (Wash.)

Answer¶

  • For "Portland (Or.)" → Oregon--Portland

  • For "Seattle (Wash.)" → Washington (State)--Seattle

To improve user searches, we add the state as a separate entry (separated by a pipe), like this:

Oregon--Portland|Oregon
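
A small sketch of this reformatting is shown below. The lookup table is hypothetical and covers only a few abbreviations for illustration; a real one would need every state and the FAST qualifiers that apply (such as "Washington (State)").

```python
import re

# Hypothetical lookup: abbreviation in the source name -> FAST state heading.
STATE_HEADINGS = {
    "Or.": "Oregon",
    "Wash.": "Washington (State)",
    "Ill.": "Illinois",
}


def to_fast_place(name):
    """Turn "City (Abbr.)" into "State--City|State"."""
    match = re.fullmatch(r"\s*(.+?)\s*\((.+?)\)\s*", name)
    if match is None:
        raise ValueError(f"Expected 'City (Abbr.)', got {name!r}")
    city, abbreviation = match.groups()
    if abbreviation not in STATE_HEADINGS:
        raise ValueError(f"No FAST heading on file for {abbreviation!r}")
    state = STATE_HEADINGS[abbreviation]
    # The state is repeated as its own entry, separated by a pipe.
    return f"{state}--{city}|{state}"


print(to_fast_place("Portland (Or.)"))  # Oregon--Portland|Oregon
```

Results are still worth spot-checking against the FAST authority file, since a lookup like this cannot catch places whose headings do not follow the simple pattern.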

5. Distributions¶

The BTAA Geoportal does not host data or maps, so we need links with the metadata.

The Geoportal supports more than two dozen types of links. Common types include:

  • Landing page
  • Download (can be multiple)
  • IIIF Image API or Presentation (manifest) API
  • OpenIndexMap (GeoJSON)
  • Geospatial web services from ArcGIS or GeoServer
  • Supplemental metadata file

🔍 Template¶

Let's take a look at the Distributions Metadata Template z.umn.edu/b1g-template

Key things to know about the Distributions template

  • One link per line
  • Use the ID from the Primary template
  • The same record may have multiple rows of links
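
To make the one-link-per-line rule concrete, the sketch below expands a single record with several links into separate rows. The column names `ID`, `Type`, and `URL` and the sample values are assumptions for illustration; check the template for the actual headers.

```python
import csv
import io

# Assumed column names; the real template may label these differently.
FIELDNAMES = ["ID", "Type", "URL"]


def distribution_rows(record_id, links):
    """Build one row per (link type, URL) pair for a single record."""
    return [
        {"ID": record_id, "Type": link_type, "URL": url}
        for link_type, url in links
    ]


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = distribution_rows(
    "example-id-001",  # hypothetical ID copied from the Primary template
    [
        ("Landing page", "https://example.org/item/1"),
        ("Download", "https://example.org/item/1.zip"),
    ],
)
print(rows_to_csv(rows))
```

Both rows carry the same ID, which is how multiple links are tied back to one record.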

6. Full Resource Lifecycle¶

[Diagram: resource lifecycle]

1. Identify¶

Team members seek out new content for the geoportal.

2. Obtain and Process Metadata¶

We harvest the metadata, convert it to the GeoBTAA Schema, edit, and validate it.

a. Harvest¶

Here are the most common ways that we harvest the metadata:

  1. A BTAA-GIN team member sends us the metadata as files or a CSV
  2. We query an API
  3. We scrape an HTML page with Python
  4. We manually copy and paste the metadata into a spreadsheet
  5. We combine two or more of the above

b. Crosswalk¶

We "crosswalk" or convert the metadata into the schema needed for the Geoportal. Our goal is to end up with a spreadsheet containing columns matching our metadata template.
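
In code, a crosswalk often amounts to a mapping from source keys to template columns. The sketch below assumes a harvested record arrives as a dictionary with DCAT-style keys; the mapping and the column names on the right are illustrative guesses, not the actual GeoBTAA crosswalk.

```python
# Hypothetical mapping: source key -> template column.
CROSSWALK = {
    "identifier": "ID",
    "title": "Title",
    "description": "Description",
    "keyword": "Keyword",
    "spatial": "Bounding Box",
}


def crosswalk_record(source, mapping=CROSSWALK):
    """Return a row keyed by template column, with blanks for missing values."""
    row = {}
    for source_key, column in mapping.items():
        value = source.get(source_key)
        if value is None:
            row[column] = ""
        elif isinstance(value, (list, tuple)):
            # Multiple values share one cell, separated by a pipe.
            row[column] = "|".join(str(v) for v in value)
        else:
            row[column] = str(value)
    return row
```

Applying this to every harvested record and writing the rows with `csv.DictWriter` gives a spreadsheet that can then be edited by hand.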

c. Edit¶

Manually fix, improve, and augment the metadata as needed.

d. Validate¶

Run a validation and cleaning script to ensure the records conform to the required elements of our schema.
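
The actual script is not shown here, but the kind of check it performs might look like the sketch below. The required-field list and column names are assumptions; the bounding box test follows the West,South,East,North rule from the Tricky Fields section.

```python
# Assumed required columns; the real schema defines its own list.
REQUIRED_FIELDS = ("ID", "Title")


def validate_row(row):
    """Return a list of problems found in one spreadsheet row."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not row.get(field, "").strip():
            problems.append(f"missing {field}")

    bbox = row.get("Bounding Box", "").strip()
    if bbox:
        try:
            values = [float(part) for part in bbox.split(",")]
        except ValueError:
            problems.append("bounding box is not numeric")
        else:
            if len(values) != 4:
                problems.append("bounding box needs four values")
            else:
                west, south, east, north = values
                if not (-180 <= west <= 180 and -180 <= east <= 180):
                    problems.append("longitude out of range")
                if not (-90 <= south <= north <= 90):
                    problems.append("latitude out of range or south > north")
    return problems
```

An empty list means the row passed these particular checks; it does not guarantee that GBL Admin will accept the upload.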

3. Index Metadata¶

a. Ingest to GBL Admin¶

We upload the completed spreadsheet to GBL Admin, which serves as the administrative interface for the Geoportal. If GBL Admin detects any formatting errors, it will issue a warning and may reject the upload.

b. Publish new records to the Geoportal¶

Once the metadata is successfully uploaded to GBL Admin, we can publish the records to the Geoportal. The technology that actually stores the records and enables searching is called Solr.

c. Unpublish¶

Periodically, we need to remove records from the Geoportal. To do this, we use GBL Admin to either delete them or change their status to "unpublished."

4. Maintenance¶

a. Monitor sources¶

We monitor our sources to check for new and retired content.

b. Monitor Geoportal¶

We regularly assess how current the content in the Geoportal is and check for broken links.
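
As one possible way to automate the link check, the sketch below sends a HEAD request to each URL and reports the rows that fail. It is an assumption-laden illustration rather than the team's monitoring tool: the `URL` column name is hypothetical, and some servers reject HEAD requests, so failures should be confirmed by hand.

```python
from urllib.request import Request, urlopen


def link_is_ok(url, timeout=15):
    """Return True if the URL answers a HEAD request without an error."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # OSError covers HTTP errors, connection failures, and timeouts;
        # ValueError covers malformed URLs.
        return False


def find_broken_links(rows, check=link_is_ok):
    """Return the rows whose URL fails the supplied check."""
    return [row for row in rows if not check(row["URL"])]
```

The `check` parameter makes it possible to swap in a different test, for example a GET request for servers that do not support HEAD.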

c. Schedule re-harvests¶

We schedule re-harvests from sources based on how frequently they update their content. See the Collections Dashboard for this schedule.

❓Questions¶

💬 Discussion¶