In upgrading legacy metadata, I spend a lot of time wrestling with controlled or not-so-very controlled terms: LCSH terms, local terms, red terms, blue terms, FAST terms, slow terms.
Here's a quick, not-overly-detailed sample workflow for 1) merging multiple columns containing subject terms, 2) reconciling subject terms to LC's vocabularies, and 3) splitting matched (LC) and unmatched (non-LC --> local) subject terms into two separate columns.
This workflow assumes that you have set up reconciliation services. If you have not, Freeyourmetadata.org has easy-to-follow instructions. I strongly recommend that you locally host a data dump of whichever vocabulary you want to reconcile against so that 1) the process will run faster, and 2) you aren't taxing someone else's server.
Friday, June 30, 2017
Recipe - Removing the first or last terms from a delimited string
Given the following sample data:
If I wanted to remove the first delimited term from each row (e.g. "Santa Cruz Beach Boardwalk" from row 1 and "Surfing" from all the other rows), I would use the following recipe:
Edit cells->Transform
and the following GREL expression:
value.split("<delimiter>").slice(1).join("<delimiter>")
where <delimiter> is whatever delimiter is being used for your data string.
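If it helps to see the same logic outside Open Refine, here is a minimal Python sketch of what the expression does: split the string, skip the first piece, and join the rest back together. The function name, the sample string, and the "|" delimiter are all assumptions made up for illustration; they are not from the sample data above.

```python
def remove_first_term(value, delimiter):
    """Drop the first delimited term: split, skip index 0, re-join."""
    terms = value.split(delimiter)       # like value.split("<delimiter>")
    return delimiter.join(terms[1:])     # like .slice(1).join("<delimiter>")


# Hypothetical row, assuming "|" is the delimiter
sample = "Surfing|Beaches|Santa Cruz"
print(remove_first_term(sample, "|"))    # prints Beaches|Santa Cruz
```

A row with only one term comes back as an empty string, since nothing is left after the first term is skipped.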
Alternatively, if you wanted to remove the last delimited term (in this case, "Santa Cruz"),
you would do the transform as above, but you would use this GREL expression:
value.split("<delimiter>").slice(0,length(value.split("<delimiter>"))-1).join("<delimiter>")
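The -1 is needed because of the way Open Refine indexes an array: positions start at 0, and slice stops just before its end index, so ending at length - 1 keeps everything except the last term. (I'll cover arrays and indexing in a future post.)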
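To make the indexing concrete, here is a rough Python equivalent of the remove-the-last-term expression. Python slices also stop just before the end index, so the arithmetic is the same. As before, the function name, sample string, and "|" delimiter are assumptions for illustration only.

```python
def remove_last_term(value, delimiter):
    """Drop the last delimited term by slicing up to length - 1."""
    terms = value.split(delimiter)
    # The end index is exclusive, so 0 .. len(terms) - 1 keeps every
    # term except the final one.
    return delimiter.join(terms[0:len(terms) - 1])


# Hypothetical row, assuming "|" is the delimiter
sample = "Surfing|Beaches|Santa Cruz"
print(remove_last_term(sample, "|"))     # prints Surfing|Beaches
```

Slicing all the way to the full length would return every term unchanged, which is why the expression subtracts 1.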
Labels: recipes, removing first term, removing last term
Introduction to the all new blog
Most of the Open Refine documentation as it exists is aimed at people with a programming background. Rachel and I wanted to have a central repository for the documentation I developed for the Metadata Services department at UC Santa Cruz.
We also wanted to have a place for other librarians to archive their own Open Refine recipes/documentation. If you want yours posted or if you have a URL to a useful recipe from someone else, please email tyleet@ucsc.edu or jaffer@ucsc.edu.