Isolating unique labels
To see which labels were unique to one set I wrote a Python script which compared the labels in the two input CSVs and outputted two new CSVs, one for each image set with only the labels unique to that set. I also had the script return a count of unique labels. The full-resolution base image set contained 101 unique labels. The reduced-resolution dataset contained 120 unique labels.
The Python script I wrote for this stage along with the two output CSVs from the script are available at the link below this section’s commentary and critique. Also included is a PDF of the consolidated comparison document with images that had unique labels on both the original and lower-res versions highlighted.
Context and critique
This stage consisted of a transformation through a script. The opportunity for error or unintended consequences here is again in the coding, but I spot checked the results by searching for labels it deemed unique in the other image sets output files from earlier stages, and everything seems to be functioning as intended.
Stage 4 data consists of 1 script and 2 output CSVs (the same script was used for each). Also available is a PDF of the consolidated data with unique labels highlighted. Click the links below to download from GitHub.