Building an Image Classification Machine - VOD and Recap

MichaelG · ‎08-21-2020

Thanks to all of you who joined us for the first event of the London User Group! Josh Cooper (Data Scientist at Dataiku) presented an image classification project. Users also had the opportunity to connect with each other and to quiz Josh on the intricacies of the project.

With today’s abundance of recyclable materials, it’s more important than ever to make sure that we’re getting rid of our waste responsibly. We took a look at a Dataiku DSS project that tries to help with that.

Josh showed how to retrain a pre-trained image classification model on your own data, and how to get this model to identify different recyclable materials on the fly, through a laptop camera, using a custom webapp.

Here is the recording for those of you who couldn't make the event!

Be sure to join the London User Group to be informed of upcoming events and chat with fellow DSS users based in London!

As you may know, Dataiku User Groups are led by volunteer users who contribute their time and communication skills to enable fellow users to learn from each other. If you’d like to run this group, please fill out this quick form and we’ll get back to you. Thanks for your interest!


tgb417 · ‎09-18-2020

Congratulations on a wonderful regional user group meeting.

I’d love to have an opportunity to play around with this project. Are you able to export this project and share the project in a way that I could reproduce this in my own environment?

How long does retraining take? I see you are running the flow presentation from a Macintosh. Many transfer learning projects work best with an Nvidia GPU, and Macintosh computers don’t play nicely with Nvidia. I’m wondering where and how you are retraining.

--Tom

JoshC · ‎09-24-2020

Hi @tgb417 - it's great to hear you found it useful!

I've attached the project below, but unfortunately I wasn't able to bundle either the models or training images due to copyright restrictions. So to reproduce, you'd need to do the following things:

- Populate the training set with images you find of the relevant materials.

- Download the image recognition model of your choice using the plugin macro.

- Deploy the API endpoint (written in Python in the project's API Designer) to an API node, and edit the webapp to point to this endpoint.
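For the last step, here is a rough sketch of what a client call to the deployed endpoint could look like from Python, using only the standard library. This is not the code from the project: the `image_b64` field name, the example URL, and the assumption that the endpoint accepts a JSON body with a base64-encoded image and returns JSON are all hypothetical, so adapt them to whatever your endpoint function actually expects.

```python
import base64
import json
import urllib.request


def build_request(endpoint_url, image_bytes):
    """Build a POST request carrying one image as base64 inside a JSON body.

    The "image_b64" key is a made-up name for illustration; use the
    parameter name that your own endpoint function expects.
    """
    payload = {"image_b64": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(endpoint_url, image_bytes, timeout=10):
    """Send the image to the endpoint and return the decoded JSON reply."""
    request = build_request(endpoint_url, image_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (hypothetical URL and file name, needs a running endpoint):
# with open("bottle.jpg", "rb") as f:
#     print(classify("http://localhost:12000/your-endpoint-url", f.read()))
```

Keeping the request building separate from the network call makes it easy to check the payload before pointing the webapp at a live endpoint.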

I retrained the model for a few epochs on my MacBook, and that took about 20 minutes, but there is also a GPU version of the plugin that would be able to leverage the powerful parallel computation you'd get with a GPU.
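To give a feel for why a few epochs on a laptop can be enough, here is a toy, dependency-free sketch of one common transfer learning setup: the pre-trained layers are frozen and only a new softmax layer is trained on the features they produce. Whether the project retrains only the last layer is an assumption (the thread does not say), the feature vectors below are made up, and the plugin itself uses a real deep learning framework rather than anything like this.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear score for each class given one feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def train_head(features, labels, n_classes, epochs=5, lr=0.5, seed=0):
    """Train a new softmax layer on fixed features with plain SGD.

    `features` stands in for the output of a frozen pre-trained network;
    only the weights and biases of the new layer are updated.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            probs = softmax(class_scores(weights, biases, x))
            for c in range(n_classes):
                # Gradient of cross-entropy with respect to the class score.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up 2-D "features" for two classes (say 0 = plastic, 1 = glass).
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
head_weights, head_biases = train_head(toy_features, toy_labels, n_classes=2)
```

Because only the small new layer is updated, each epoch is cheap compared with training a full network, which is part of why CPU-only retraining can be practical for small datasets.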