Hi All,

There is a nice Python package, google-images-download, that downloads images for one or more search terms.

In a Python code recipe I have:

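import dataiku

from google_images_download import google_images_download

# Instantiate the downloader
response = google_images_download.googleimagesdownload()

# Resolve the managed folder's path on the local filesystem
handle = dataiku.Folder("peugeots")
path = handle.get_path()

# Download arguments: comma-separated search terms and the target directory
arguments = {
    "keywords": "peugeot 206,peugeot 306",
    "output_directory": path
}

# Run the download with these arguments
paths = response.download(arguments)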

This works, but the images end up in subfolders of the managed folder, one per search term. If I then want to apply the deep learning plugin for image classification, it does not work with an input folder that contains subfolders.




1 Answer

This is not currently supported. After the download step in your recipe, you could add Python code that flattens everything to the top level of the managed folder.
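A minimal sketch of such a flattening step, using only the standard library. The function name `flatten_folder` and the choice to prefix each file with its subfolder name are assumptions for illustration, not part of the Dataiku API or of google-images-download.

```python
import os
import shutil


def flatten_folder(root):
    """Move every file found below `root` into `root` itself.

    The subfolder name is prefixed to the file name (an illustrative choice)
    so that files with the same name in different subfolders do not
    overwrite each other. Returns the list of new file names.
    """
    moved = []
    # topdown=False visits the deepest folders first, so each folder is
    # already empty of subfolders by the time it is removed.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # files already at the top level stay where they are
        prefix = os.path.relpath(dirpath, root).replace(os.sep, "_")
        for name in filenames:
            new_name = prefix + "_" + name
            shutil.move(os.path.join(dirpath, name),
                        os.path.join(root, new_name))
            moved.append(new_name)
        os.rmdir(dirpath)  # the folder is empty after the move
    return moved
```

In the recipe above this would be called as `flatten_folder(path)` once the download has finished. Prefixing keeps the search term visible in each file name, for example `peugeot 206_1.jpg`.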

OK, thanks for the tip.

The reason to keep them separate, though, is that these subfolders already form the two (or more) categories on which to retrain a pretrained network.
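One way to keep that category information is to record a label for every image from its subfolder name while the subfolders still exist. The sketch below is only an illustration under that assumption; `labels_from_subfolders` is a hypothetical helper, not a Dataiku or plugin function.

```python
import os


def labels_from_subfolders(root):
    """Return sorted (relative_path, label) pairs for the files under `root`.

    The label is the name of the top-level subfolder that contains the file.
    Files sitting directly in `root` have no subfolder and are skipped.
    """
    rows = []
    for label in os.listdir(root):
        folder = os.path.join(root, label)
        if not os.path.isdir(folder):
            continue
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                rows.append((os.path.relpath(full_path, root), label))
    rows.sort()
    return rows
```

Calling `labels_from_subfolders(path)` right after the download gives one row per image, which could then be written to a dataset with a path column and a label column.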