We release the cross-validated predicted probabilities for the QuickDraw dataset. These probabilities were computed using 4-fold cross-validation over all 50,426,266 examples and 345 classes. The resulting predicted probabilities (pyx numpy matrix) have shape 50426266 x 345. The resulting file is 33GB in np.float16 format.
Note, pyx is short for prob(y = label | data example x).
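As a quick sanity check on the stated size, the arithmetic can be sketched as follows. This assumes 2 bytes per np.float16 value and ignores the small .npy header; the variable names are illustrative only.

```python
# Back-of-the-envelope size of the pyx matrix described above.
n_examples = 50_426_266  # QuickDraw examples (from the text above)
n_classes = 345          # QuickDraw classes (from the text above)
bytes_per_value = 2      # assumption: np.float16 stores each value in 2 bytes

total_bytes = n_examples * n_classes * bytes_per_value

print(f"{total_bytes:,} bytes")           # 34,794,123,540 bytes
print(f"{total_bytes / 2**30:.1f} GiB")   # 32.4 GiB
```

The result, about 34.8 GB in decimal units or 32.4 GiB, is consistent with the roughly 33GB quoted above.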
Download the QuickDraw Cross-validated Predicted Probabilities as a numpy matrix.
Make sure pigz and wget are installed:
# on Mac OS
brew install wget pigz
# on Ubuntu
sudo apt-get install pigz
Download the pyx files
base_url="https://github.com/cgnorthcutt/label-errors/releases/download/"
base_filename="quickdraw-pyx-v1/quickdraw_pyx.tar.gz-parta"
for part in $(eval echo "{a..k}"); do
  wget --continue "$base_url$base_filename$part"
done
Decompress the tar.gz file parts into the final pyx numpy matrix:
cat quickdraw_pyx.tar.gz-part?? | unpigz | tar -xvC .
Ancillary extra details
To compress the pyx probabilities file prior to uploading, we used the following command:
tar -I pigz -cvf - quickdraw_pyx.npy | split --bytes=1800M - "quickdraw_pyx.tar.gz-part"
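Once the parts have been decompressed, the matrix can be opened without reading all 33GB into RAM. The following is a minimal sketch, assuming the extracted file is the quickdraw_pyx.npy named in the command above and that it is a standard .npy file. The helper open_pyx is hypothetical, and the small random matrix is stand-in data so the sketch runs without the real download.

```python
import os
import tempfile

import numpy as np


def open_pyx(path):
    """Memory-map a .npy matrix so rows are read from disk on demand."""
    return np.load(path, mmap_mode="r")


# Stand-in data: a tiny float16 matrix with the same 345 columns.
# For the real data, pass the path to the extracted quickdraw_pyx.npy.
rng = np.random.default_rng(0)
toy = rng.random((8, 345))
toy = (toy / toy.sum(axis=1, keepdims=True)).astype(np.float16)

toy_path = os.path.join(tempfile.mkdtemp(), "toy_pyx.npy")
np.save(toy_path, toy)

pyx = open_pyx(toy_path)
print(pyx.shape, pyx.dtype)  # (8, 345) float16

# Only the indexed rows are read; upcast the batch before doing arithmetic.
batch = np.asarray(pyx[:4], dtype=np.float32)
predicted_labels = batch.argmax(axis=1)  # most probable class per example
```

Memory mapping is a design choice here rather than a requirement: on a machine with enough RAM, a plain np.load(path) would also work.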