Computer Vision for Everyone

1) The innovation

2) Why do it? (The real pain)

3) Who will use it? How do they benefit after using your product?

4) How to do it?

CV applications broadly fall into these categories: classification, segmentation, saliency, and object detection.

If data is present:

Convert the dataset to TFRecord (a serialized file format for the dataset), then upload it to our data server using gsutil's multithreaded mode. Most people doing this for the first time will likely make the mistake of using rsync or FTP without proper serialization, which is too slow.

If the data is already in the cloud, sync with it.

Choose one of the categories.

Choose the best algorithm. The backend handles this: models are mostly pretrained to reduce training time, and image preprocessing and data augmentation are hidden from the user.

Use Kubeflow to reduce training time through distributed training on Kubernetes.

Expose the trained model through an API endpoint.
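Here is a minimal sketch of how the first few steps of this flow could fit together, under some loud assumptions. The function names (`upload_command`, `plan_job`), the `TrainingJob` record, the example bucket path, and the category-to-model table are all hypothetical placeholders I made up for illustration, not a finished backend. The only real tool referenced is gsutil, whose `-m` option runs transfers in parallel. The sketch only builds the command and the job description; it does not run an upload or start any training.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical mapping from task category to a pretrained model family.
# These names are illustrative examples only, not the platform's real choices.
DEFAULT_MODELS = {
    "classification": "resnet50",
    "segmentation": "deeplabv3",
    "saliency": "u2net",
    "object_detection": "faster_rcnn",
}


def upload_command(source: str, bucket_uri: str, already_in_cloud: bool = False) -> List[str]:
    """Build (but do not run) a gsutil command that moves TFRecord shards."""
    if not bucket_uri.startswith("gs://"):
        raise ValueError("bucket_uri must start with gs://")
    if already_in_cloud:
        # The data already lives in a bucket: mirror it instead of re-uploading.
        return ["gsutil", "-m", "rsync", "-r", source, bucket_uri]
    # -m tells gsutil to perform the copy with parallel transfers.
    return ["gsutil", "-m", "cp", "-r", source, bucket_uri]


@dataclass
class TrainingJob:
    """Hypothetical description of a job handed to the training backend."""
    category: str
    model: str
    data_uri: str
    workers: int


def plan_job(category: str, data_uri: str, workers: int = 4) -> TrainingJob:
    """Pick a pretrained model for the chosen category and describe the job."""
    key = category.strip().lower().replace(" ", "_")
    if key not in DEFAULT_MODELS:
        raise ValueError(f"unknown category: {category!r}")
    return TrainingJob(category=key, model=DEFAULT_MODELS[key],
                       data_uri=data_uri, workers=workers)


if __name__ == "__main__":
    # Example values only; the bucket name is a placeholder.
    print(" ".join(upload_command("./tfrecords", "gs://example-bucket/dataset")))
    print(plan_job("Object Detection", "gs://example-bucket/dataset"))
```

The point of the design is that the user only supplies a data location and a category; everything else (model choice, number of workers) comes from defaults the backend owns, which is what keeps the experience simple.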

If no data is present:

Provide an interface to search for data relevant to the specific problem.

Integrate with data labeling APIs to get the data labeled.

Then proceed with the normal flow.

5) Who will kick your ass?

Google Cloud Platform's AutoML framework plus the Vision API



6) What is so great about this?

Labelbox looks really close to what I have in mind, though it focuses only on the data part.

The platform I have in mind is intuitive and easy to use, and it gets you to production in record time with almost zero knowledge of or experience in computer vision.

If you aren't aware of how hard this is at a large scale, watch this talk by Andrej Karpathy on building computer vision algorithms at Tesla.


