BTAA-GIN Technology Orientation¶
2025
🎯 Goals for Today¶
- Identify resources for the BTAA Geoportal
- Learn how to submit metadata
- Practice fixing common metadata issues
Format: Short explanations + hands-on exercises
The BTAA Geoportal¶
- A discovery tool for geospatial resources
- Built with GeoBlacklight, an open-source software application
- A metadata catalog (not the data itself!)
- (Most) resources are public domain; free & open data
1. Geoportal Content¶
Formats¶
Always in scope
- Shapefile, GeoJSON, other spatial formats
- Geodatabase, geopackage
- TIFF, JPEG, JPEG2000, PNG
- GeoTIFF
Added by request
- Static websites, Interactive databases, Storymaps
- Tabular data, PDFs, Text files
Sources¶
Always in scope
- Local government: state, county, city, regional
- Nongovernment orgs: nonprofits, historical societies
- Academic: libraries, departments, research institutes
Added by request
- Federal data
- Licensed data
- Maps in copyright
Geography¶
Always in scope
- Data covering an area within the Big Ten
- Data covering any area, but created by a Big Ten researcher
- Maps of any area held at a Big Ten university
Added by request
- Data from another state or nation
- Maps held at a non-Big Ten university
Your Primary Role: Keep looking for new resources!
2. How to submit resources¶
This is an informal process. Every collection has a slightly different workflow. Our goal is to obtain the metadata. Here are some things to look for.
GIS Data¶
Is it hosted on a standard platform, such as ArcGIS Hub or a CKAN portal? Try adding /data.json to the end of the base URL. This is the API, and it can be harvested programmatically.
If not, are there individual metadata documents anywhere, such as a catalog of XML files?
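A minimal Python sketch of the /data.json check, assuming the portal returns a DCAT-style catalog with a top-level "dataset" list. The function names and the sample catalog are illustrative and are not part of the BTAA-GIN tooling.

```python
import json
from urllib.request import urlopen


def data_json_url(base_url: str) -> str:
    """Append /data.json to a portal's base URL, avoiding a double slash."""
    return base_url.rstrip("/") + "/data.json"


def list_datasets(catalog: dict) -> list[dict]:
    """Pull a few fields from each entry of a DCAT-style catalog."""
    rows = []
    for item in catalog.get("dataset", []):
        rows.append({
            "id": item.get("identifier", ""),
            "title": item.get("title", ""),
            "landing_page": item.get("landingPage", ""),
        })
    return rows


def fetch_catalog(base_url: str) -> dict:
    """Download and parse the catalog (needs network access)."""
    with urlopen(data_json_url(base_url), timeout=30) as response:
        return json.load(response)


# Offline example with a made-up catalog
sample = {
    "dataset": [
        {
            "identifier": "abc123",
            "title": "Parcels",
            "landingPage": "https://example.org/parcels",
        }
    ]
}
datasets = list_datasets(sample)
```

For a live portal, `list_datasets(fetch_catalog(base_url))` would give the same kind of list, provided the portal actually exposes the endpoint.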
Scanned Library Maps¶
- Ask your IT department for a full export of metadata
- Make sure the export includes IDs and links
A website¶
- Get the URL and publisher
- Create a title and description
💬 Discussion¶
How will we get the metadata from scanned maps held at your library?
3. Metadata Profile¶
OpenGeoMetadata (OGM) Aardvark¶
- designed for GeoBlacklight
- intended for discoverability
- often generated from more complete geospatial metadata, such as ISO 19139 or FGDC
- mainly a mix of Dublin Core and GeoBlacklight application-specific fields
Custom GeoBTAA Elements¶
- augments OGM Aardvark
- intended to serve as standalone metadata
- includes geospatial technical fields, like projection & scale
- multiple fields for life cycle tracking
🔍 Template¶
Let's take a look at the Primary Metadata Template: z.umn.edu/b1g-template
Key things to know about the template
- Make a copy or request a customized template
- Separate multiple values with a pipe (|)
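As a small illustration of the pipe convention, the helpers below split a multi-valued cell into a list and join it back. The helper names are hypothetical.

```python
def split_values(cell: str) -> list[str]:
    """Split a pipe-delimited cell into trimmed, non-empty values."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def join_values(values: list[str]) -> str:
    """Join values back into a single pipe-delimited cell."""
    return "|".join(value.strip() for value in values)


subjects = split_values("Transportation | Roads|Boundaries")
cell = join_values(subjects)
```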
4. Tricky Fields¶
Bounding Boxes¶
- Format as decimal degrees (instead of degrees-minutes-seconds)
- Use the order West,South,East,North
- If coordinates are missing, consider adding them in batches for records with identical coverage, or assign the task to student workers
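The rules above can be sketched as a basic check. This is only an assumption about what a validator might look like, not the Geoportal's own script, and it cannot catch every ordering mistake (some swapped values still fall inside valid ranges).

```python
def parse_bbox(text: str) -> tuple[float, float, float, float]:
    """Parse 'West,South,East,North' in decimal degrees, or raise ValueError."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated values")
    # float() rejects values such as '87.9W' or degrees-minutes-seconds text
    west, south, east, north = (float(part) for part in parts)
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude must be between -180 and 180")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitude must be between -90 and 90")
    if south > north:
        raise ValueError("south must not be greater than north")
    # west > east is not rejected: boxes crossing the antimeridian look like that
    return west, south, east, north


bbox = parse_bbox("-87.9,41.6,-87.5,42.0")
```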
Klokan Bounding Box Demo¶
- Klokan Bounding Box is a tool for generating extents in various formats
- Select the "CSV" output option
- More detailed instructions: https://gin.btaa.org/metadata/recipes/add-bbox
Answer¶
- It needs to be in decimal degrees
- The western coordinate has a "W" instead of a negative sign
- The coordinates are in the wrong order (these are W,E,N,S; we need W,S,E,N)
Format for the BTAA Geoportal:
-87.9,41.6,-87.5,42.0
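The three fixes can be combined in a short conversion sketch. The input values below are made up to illustrate the pattern; they are not the original exercise text, and the function names are hypothetical.

```python
import re


def dms_to_decimal(value: str) -> float:
    """Convert text like 87°54'W to signed decimal degrees."""
    hemisphere = value.strip()[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"missing hemisphere letter in {value!r}")
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", value)]
    if not 1 <= len(numbers) <= 3:
        raise ValueError(f"cannot read degrees/minutes/seconds from {value!r}")
    degrees, minutes, seconds = (numbers + [0.0, 0.0])[:3]
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West become negative numbers
    if hemisphere in ("S", "W"):
        decimal = -decimal
    return round(decimal, 6)


def to_wsen(west: str, east: str, north: str, south: str) -> str:
    """Take values given in W,E,N,S order and return a W,S,E,N string."""
    values = [dms_to_decimal(v) for v in (west, south, east, north)]
    return ",".join(str(v) for v in values)


bbox = to_wsen("87°54'W", "87°30'W", "42°00'N", "41°36'N")
```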
Place Names¶
- Format all place names as FAST subject headings
- For local US data, the format looks like:
state--county
state--city
state
Example:
Illinois--Chicago
Answer¶
For "Portland (Or.)" →
Oregon--Portland
For "Seattle (Wash.)" →
Washington (State)--Seattle
To improve user searches, we add the state as a separate entry (separated by a pipe), like this:
Oregon--Portland|Oregon
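A small sketch of adding the state entry automatically, assuming place names are already in the state--place form. The function name is hypothetical.

```python
def add_state_entry(place: str) -> str:
    """Append the state as a separate pipe-delimited entry."""
    state = place.split("--")[0]
    if state == place:
        # Already a state-level entry, nothing to add
        return place
    return f"{place}|{state}"


portland = add_state_entry("Oregon--Portland")
seattle = add_state_entry("Washington (State)--Seattle")
```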
5. Distributions¶
The BTAA Geoportal does not host data or maps, so we need links with the metadata.
The Geoportal supports over two dozen types of links. Common types:
- Landing page
- Download (can be multiple)
- IIIF Image API or Presentation (manifest) API
- OpenIndexMap (GeoJSON)
- Geospatial web services from ArcGIS or GeoServer
- Supplemental metadata file
🔍 Template¶
Let's take a look at the Distributions Metadata Template: z.umn.edu/b1g-template
Key things to know about the Distributions template
- One link per line
- Use the ID from the Primary template
- The same record may have multiple rows of links
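To show the one-link-per-line layout, here is a sketch that expands one record's links into rows. The column names "id", "type", and "url" are placeholders and may not match the actual template headers.

```python
import csv
import io


def distribution_rows(record_id: str, links: list[tuple[str, str]]) -> list[dict]:
    """Create one row per link, repeating the ID from the Primary template."""
    return [{"id": record_id, "type": kind, "url": url} for kind, url in links]


def rows_to_csv(rows: list[dict]) -> str:
    """Write the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "type", "url"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up record with two links
rows = distribution_rows(
    "example-001",
    [
        ("Landing page", "https://example.org/item/1"),
        ("Download", "https://example.org/item/1.zip"),
    ],
)
csv_text = rows_to_csv(rows)
```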
6. Full Resource Lifecycle¶
1. Identify¶
Team members seek out new content for the geoportal.
2. Obtain and Process Metadata¶
We harvest the metadata, convert it to the GeoBTAA Schema, edit, and validate it.
a. Harvest¶
Here are the most common ways that we harvest the metadata:
- A BTAA-GIN team member sends us the metadata as files or a CSV
- We query an API
- We scrape an HTML page with Python
- We manually copy and paste the metadata into a spreadsheet
- A combination of the above
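For the HTML scraping case, a minimal sketch using only Python's standard library is shown below. It collects link text and URLs from a page; the sample HTML is made up, and a real harvest would need rules specific to each site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the text and href of every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"title": title, "url": self._href})
            self._href = None


sample_html = """
<ul>
  <li><a href="/maps/1">City Zoning Map</a></li>
  <li><a href="/maps/2">County Parcels</a></li>
  <li><a>No link here</a></li>
</ul>
"""
parser = LinkCollector()
parser.feed(sample_html)
links = parser.links
```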
b. Crosswalk¶
We "crosswalk" or convert the metadata into the schema needed for the Geoportal. Our goal is to end up with a spreadsheet containing columns matching our metadata template.
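A crosswalk can be sketched as a field mapping. Both the source keys and the target column names below are hypothetical; the real column names come from the metadata template.

```python
# Hypothetical mapping: source field name -> template column name
FIELD_MAP = {
    "id": "ID",
    "name": "Title",
    "notes": "Description",
    "organization": "Publisher",
    "tags": "Keyword",
}


def crosswalk(source_record: dict, field_map: dict = FIELD_MAP) -> dict:
    """Convert one harvested record into a row keyed by template columns."""
    row = {target: "" for target in field_map.values()}
    for source_key, target in field_map.items():
        value = source_record.get(source_key)
        if isinstance(value, list):
            # Multiple values are joined with a pipe
            value = "|".join(str(v) for v in value)
        if value is not None:
            row[target] = str(value).strip()
    return row


row = crosswalk({"id": "abc123", "name": " Parcels ", "tags": ["land", "property"]})
```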
c. Edit¶
Manually fix, improve, and augment the metadata as needed.
d. Validate¶
Run a validation and cleaning script to ensure the records conform to the required elements of our schema.
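The sketch below illustrates the idea of a cleaning and validation pass. It is not the team's actual script, and the list of required columns is an assumption made for the example.

```python
# Assumed required columns, for illustration only
REQUIRED = ("ID", "Title")


def clean_and_validate(rows: list[dict]) -> tuple[list[dict], list[tuple[int, list[str]]]]:
    """Trim whitespace in every cell and report rows missing required values."""
    cleaned = []
    problems = []
    for number, row in enumerate(rows, start=1):
        tidy = {key: (value or "").strip() for key, value in row.items()}
        missing = [field for field in REQUIRED if not tidy.get(field)]
        if missing:
            problems.append((number, missing))
        cleaned.append(tidy)
    return cleaned, problems


cleaned, problems = clean_and_validate([
    {"ID": "abc123", "Title": " Parcels "},
    {"ID": "", "Title": "Roads"},
])
```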
3. Index Metadata¶
a. Ingest to GBL Admin¶
We upload the completed spreadsheet to GBL Admin, which serves as the administrative interface for the Geoportal. If GBL Admin detects any formatting errors, it will issue a warning and may reject the upload.
c. Unpublish¶
Periodically, we need to remove records from the Geoportal. To do this, we use GBL Admin to either delete them or change their status to "unpublished."
4. Maintenance¶
a. Monitor sources¶
We monitor our sources to check for new and retired content.
b. Monitor Geoportal¶
We regularly assess the currentness of the content in the Geoportal and check for broken links.
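A broken-link check could look something like the sketch below, which uses only the standard library. It is an illustration rather than the monitoring tool the team uses. Some servers reject HEAD requests, so a flagged link may still deserve a manual look.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 15):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, but with an error code such as 404
        return error.code
    except (OSError, ValueError):
        # Connection problems, timeouts, or malformed URLs
        return None


def find_broken_links(urls, get_status=http_status):
    """List (url, status) pairs for links that fail or return 400 and above."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken


# Offline example with canned statuses in place of real requests
canned = {"https://example.org/ok": 200, "https://example.org/gone": 404}
broken = find_broken_links(
    ["https://example.org/ok", "https://example.org/gone", "https://example.org/down"],
    get_status=canned.get,
)
```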
c. Schedule re-harvests¶
We schedule re-harvests from sources based on how frequently they update their content. See the Collections Dashboard for this schedule.