Wednesday, February 19, 2020

Recipe: Splitting an OCLC number (or any number with a label prefix) out of a delimited list

I've been working a lot with Alma Analytics, and before the February release, our 035 field was repeated. Unfortunately, this meant that if I used the "network number" field in Analytics, I would get all of the 035 values strung together, like this:

(CaPaEBR)ebr10975975; (OCoLC)654843296; (CU-SC)b44543517-01cdl_scr_inst

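You could use a formula in Analytics to give you just the OCLC number, but you could also do it in OpenRefine. The recipe splits the value on the semicolons, keeps only the piece that contains the prefix you want, and joins what is left back into a single string: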

forEach(value.split(";"), v, if (contains(v, "<prefix label for the number you want to split out>"), v, "")).join("")

So, for example, if I want to split out the OCLC number, I would use:
forEach(value.split(";"), v, if (contains(v, "(OCoLC)"), v, "")).join("")

If I wanted the CaPaEBR number, I would use:
forEach(value.split(";"), v, if (contains(v, "(CaPaEBR)"), v, "")).join("")

If I wanted our old bib number (which is the last number in this string), I would use:
forEach(value.split(";"), v, if (contains(v, "(CU-SC)"), v, "")).join("")
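If you want to see the same logic outside of OpenRefine, here is a rough Python sketch of what the recipe does. It is not GREL, and the function name `pick_by_prefix` and its `delimiter` parameter are made up for this illustration. One difference to be aware of: the sketch also strips the space that follows each semicolon, which the recipe as written above appears to leave at the front of the result.

```python
def pick_by_prefix(value, prefix, delimiter=";"):
    """Return the piece(s) of a delimited string that contain the prefix label.

    Hypothetical helper for illustration; mirrors the forEach/if/join recipe.
    """
    # Split on the delimiter, like value.split(";") in the recipe.
    pieces = value.split(delimiter)
    # Keep only the pieces that contain the prefix, and strip the
    # space left behind after each ";" (the recipe does not do this).
    matches = [piece.strip() for piece in pieces if prefix in piece]
    # Join the matches into one string, like .join("") in the recipe.
    return "".join(matches)


# The example value from the "network number" field above.
network_number = "(CaPaEBR)ebr10975975; (OCoLC)654843296; (CU-SC)b44543517-01cdl_scr_inst"

print(pick_by_prefix(network_number, "(OCoLC)"))    # (OCoLC)654843296
print(pick_by_prefix(network_number, "(CaPaEBR)"))  # (CaPaEBR)ebr10975975
print(pick_by_prefix(network_number, "(CU-SC)"))    # (CU-SC)b44543517-01cdl_scr_inst
```

If a prefix does not appear in the value at all, the function returns an empty string, which matches what the recipe's `""` fallback does.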