Thursday, April 12, 2018

Recipe: Computing the Millennium checkdigit

 More detail will be posted later. I just wanted to note this down.

We needed to extract our course reserves in Millennium into .csv format and the check digit for the item records associated with the instructors was cut off. To further complicate matters, one of our classes had about 106 associated items.

In order to calculate the check digit, you had to do the following calculation:

for item record i1234567, you had to multiply the 7 by 2, the 6 by 3, the 5 by 4, the 4 by 5, the 3 by 6, the 2 by 7, and the 1 by 8. Then, you had to add all those values together, divide by 11, and the remainder was the check digit. Anything that was remainder 10 became check digit x. So, for this item record, the summation is 112. Divide that by 11, and the remainder is 2 - so the full item record number is i12345672.

1. First, I changed the , delimiter right before the item record numbers to } in Notepad++. That way, when I imported to Open Refine, item records i3070797,i3054061,i3054062,i3054064,i3070931,,,,,,,,,,,,,

would all be in the same cell.

2. I split the column of item records into multi-valued cells using the , as a delimiter. My column was named "Column2" for this example.
3. I used add column based on this column and created column "remainder" with this GREL: value.replace ("i","")
This is so I could do the calculation for the check digit in a separate column to make it easier to concatenate later.
3. To do the multiplication and summation calculation, I did a transform on the remainder column and input this GREL:
sum(forEachIndex(reverse(splitByLengths(value,1,1,1,1,1,1,1)),i,v, (i+2)*toNumber(v)))
4. Unfortunately, this inserted a lot of zeros into the fields I wanted to leave blank, so I did another transform on remainder and did this to blank out the zeros: if(value == 0, "", value)
(You could probably just facet on the 0 and blank it out, too). This step has to be done, otherwise the calculation of the remainder will fill in the cells with the zeros with unwanted values.
5. Do another transform on column remainder to calculate the remainder:
mod(value,11)
6. Now concatenate Column2 and remainder together (can be either a transform or a new column):
if(isBlank(value), "",if (cells.remainder.value < 10, value + cells.remainder.value, value + "x"))
(Leaving out the check for isBlank gave me funky values again.)
7.  Now remove any extra columns and join multi-valued cells on the concatenated column.