Using the deeplearning classification on images in a subfolder of managed folder?

longhowlam

Hi All,



There is a nice Python package, google-images-download, that downloads images for one or more search terms.



In a Python code recipe I have:




import dataiku

from google_images_download import google_images_download

# Downloader object from the google-images-download package
response = google_images_download.googleimagesdownload()

# Managed folder that receives the downloaded images
handle = dataiku.Folder("peugeots")
path = handle.get_path()

# Dictionary of download arguments
arguments = {
    "keywords": "peugeot 206,peugeot 306",
    "limit": 20,
    "print_urls": True,
    "output_directory": path,
}

response.download(arguments)


This works, but the images are saved into subfolders of the managed folder. When I then apply the deep learning plugin for image classification to an input folder that contains subfolders, it does not work.





Regards,



Longhow

Clément_Stenac
Hi,

This is not currently supported. After the download step in your recipe, you could add Python code that moves every file up to the top level of the managed folder.
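
A minimal sketch of such a flattening step, using only the standard library. The function name `flatten_folder` and the "<subfolder>_<filename>" renaming scheme are assumptions for illustration, not something the plugin requires; the prefix is there only so that files with the same name in different subfolders do not overwrite each other. A file already sitting at the top level under the same new name could still be overwritten.

```python
import os
import shutil


def flatten_folder(root):
    """Move every file found in subfolders of `root` up to `root` itself.

    Hypothetical helper: each moved file is renamed to
    "<subfolder>_<filename>" to avoid name collisions between subfolders.
    Returns the list of new file names.
    """
    moved = []
    # topdown=False visits the deepest folders first, so each folder is
    # already emptied of its children by the time we try to remove it.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # files already at the top level stay where they are
        # Turn the relative subfolder path into a file-name prefix
        prefix = os.path.relpath(dirpath, root).replace(os.sep, "_")
        for name in filenames:
            new_name = prefix + "_" + name
            shutil.move(os.path.join(dirpath, name), os.path.join(root, new_name))
            moved.append(new_name)
        # Remove the subfolder once nothing is left in it
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
    return moved
```

In the recipe above this would be called as `flatten_folder(path)` right after `response.download(arguments)`, where `path` is the value returned by `handle.get_path()`.
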
longhowlam
OK, thanks for the tip.

The reason to keep them separated, though, is that these subfolders already form the two (or more) categories on which to retrain a pretrained network.
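
One way to keep that category information is to record it while the images are still in their subfolders. The sketch below is only an illustration: the helper names and the `path`/`label` column names are hypothetical, and how the plugin expects to receive labels is not stated in this thread.

```python
import csv
import os


def list_labeled_images(root):
    """Return (relative_path, label) pairs for every image under `root`.

    The label is the name of the top-level subfolder the file sits in.
    Files directly at the top level have no subfolder and are skipped.
    """
    rows = []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue
        for dirpath, _dirnames, filenames in os.walk(class_dir):
            for name in sorted(filenames):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                rows.append((rel, label))
    return rows


def write_labels_csv(rows, csv_path):
    """Write the pairs to a CSV file with hypothetical columns path,label."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])
        writer.writerows(rows)
```

Calling `list_labeled_images(path)` right after the download gives one row per image, with the subfolder name as its category.
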