Our results on queries from correct predictions

Our results on queries from wrong predictions

Random results on queries from correct predictions

Random results on queries from wrong predictions

Note: the "Random results" are intended to serve as a baseline. For each query, the baseline randomly selects 5 training datapoints with the same label as the query label.
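The random baseline described above can be sketched as follows. This is a minimal illustration, not the code used to produce the results: the function name `random_baseline`, the `train_labels` list, and the `seed` parameter are hypothetical, and sampling without replacement is an assumption.

```python
import random


def random_baseline(train_labels, query_label, k=5, seed=None):
    """Pick up to k random training indices whose label equals query_label.

    Hypothetical helper: train_labels is assumed to be a sequence holding
    one label per training datapoint.
    """
    rng = random.Random(seed)
    # Keep only the training datapoints that share the query label.
    candidates = [i for i, label in enumerate(train_labels) if label == query_label]
    # Sample without replacement (an assumption); return fewer than k
    # indices if the class has fewer than k training datapoints.
    return rng.sample(candidates, min(k, len(candidates)))


if __name__ == "__main__":
    train_labels = ["cat", "dog", "cat", "dog", "cat", "cat", "dog", "cat", "cat"]
    print(random_baseline(train_labels, "cat", k=5, seed=0))
```

Because the selection depends only on the label, any visual similarity between these images and the query is incidental, which is what makes this a useful point of comparison.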

The results show that most of the selected images share similar semantics, color, texture, etc. with the query.

For example, for queries that the main model predicts correctly:

For queries with incorrect predictions:

This information is useful for diagnosing the model. For example, we can deduce that the misprediction of #31 is caused by the query being grayscale, which is rare in the training dataset. Similarly, for #0, #1 and #15, the misprediction is likely due to blur and overexposure. For #39, the misprediction is caused by a dog image with a similar environment, posture and color.