Wednesday, August 23, 2017

Troubleshooting problems with numerical strings

One problem I came across while working with imported MarcEdit files was that the first two characters of any field without indicators, like the 008 and the 005, were winding up in the indicators column.

I figured out how to concatenate the two columns back together, but then I was getting strange results.



So, if I had:
Indicators            Contents
26                        20311
20                        151219051217

Here's what happened when I concatenated them:








The problem was that my older version of Refine had imported some of my numerical strings as numbers instead of as strings of characters. The weird result in the screenshot came from Refine adding the indicator to the contents arithmetically instead of joining the two.
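To make the difference concrete, here is a minimal sketch in Python rather than in Refine's own expression language. The function name concat_cells is made up for illustration, and Python's + only stands in for the concatenate step: it adds when both cells are numbers and joins when both are strings.

```python
def concat_cells(indicators, contents):
    """Hypothetical stand-in for the concatenate step: '+' adds numbers, joins strings."""
    return indicators + contents


# Both cells imported as numbers: the values are summed.
as_numbers = concat_cells(26, 20311)

# Both cells kept as strings of characters: the values are joined.
as_strings = concat_cells("26", "20311")

print(as_numbers)  # 20337
print(as_strings)  # 2620311
```

The second result is the one I actually wanted for the first sample row above.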

It turns out that if OpenRefine considers a series of digits to be a number, it will highlight them in green. Like so:




That seemed like an easy fix, since OpenRefine has a toString() function (or you can use Edit cells -> Common transforms -> To text).

Unfortunately, this is what happened:








I poked around the web and found out that this was an old bug. To get around it, wrap the value in floor() before converting it, as in toString(floor(value)):
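As a rough analogy only, here is the same idea sketched in Python. It assumes the unwanted result was a trailing ".0" from the number being stored as a floating-point value, which is my guess and not something this post confirms. The helper name to_text is hypothetical.

```python
import math


def to_text(value):
    """Hypothetical helper: drop the fractional part, then convert to text."""
    whole = math.floor(value)  # math.floor returns an int in Python 3
    return str(whole)


cell = 20311.0  # assumption: the digits were stored as a floating-point number

naive = str(cell)      # converting directly keeps the decimal part
fixed = to_text(cell)  # flooring first leaves only the digits

print(naive)         # 20311.0
print(fixed)         # 20311
print("26" + fixed)  # 2620311
```

Flooring is safe here because these values are really digit strings, so there is never a meaningful fractional part to lose.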










And now the text is black instead of green:




The transformation now works as expected:
