Computer Vision for Everyone

1) The innovation

Empower data analysts, software engineers, and data scientists to build state-of-the-art computer vision solutions.

2) Why do it? (The real pain)

Current deep learning methods require a huge amount of data and compute. Most companies could benefit from some kind of in-house CV algorithm, yet the expertise and money required to build a production-grade solution are out of reach for most of them. Even for an experienced data scientist, a quick prototype (assuming the data is available) can take 3–4 days. Most people also do not care which algorithm is used; they only care about the end results. The available APIs are too generic and/or not optimized for the use case.

3) Who will use it? How do they benefit after using your product?

Anyone with a computer will be able to build a computer vision solution. They will only need to interact with the software during data collection. All the technical parts (data augmentation, training, and API building) will be taken care of. It will help create a new breed of data analysts and software engineers who can also build CV solutions.

4) How to do it?

An Electron app or a web-based tool. The software will benefit from being able to use more of the computer's resources.

CV applications fall broadly into these categories: classification, segmentation, saliency, and object detection.

If data is present:

Convert the dataset to TFRecord (a serialized file format for the dataset), then upload it to our data server with gsutil's multithreaded upload. Most people doing this for the first time will likely make the mistake of using rsync or FTP without proper serialization, which is too slow.

If the data is already in the cloud, sync with it.

Choose one of the categories.

Choose the best algorithm. This is handled by the backend, which mostly uses pretrained models to reduce training time. Image preprocessing and data augmentation will be hidden from the user.

Use Kubeflow to reduce training time through distributed training on Kubernetes.

Present an API endpoint
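As a rough illustration of the upload step above, here is a minimal sketch that builds a parallel gsutil copy command for a directory of already-serialized TFRecord files. The function name `build_upload_command`, the local path, and the bucket URI are all hypothetical; only `gsutil -m cp -r` itself is the real CLI form. The sketch only builds the command and does not run it.

```python
import shlex
from typing import List


def build_upload_command(tfrecord_dir: str, bucket_uri: str) -> List[str]:
    """Build a parallel gsutil copy command for a directory of TFRecord files.

    Both arguments are illustrative; the platform's real paths are not
    specified in this post.
    """
    if not bucket_uri.startswith("gs://"):
        raise ValueError("bucket_uri must start with gs://")
    # -m turns on gsutil's parallel (multi-threaded / multi-process) mode;
    # cp -r copies the directory recursively.
    return ["gsutil", "-m", "cp", "-r", tfrecord_dir, bucket_uri]


if __name__ == "__main__":
    cmd = build_upload_command("data/tfrecords", "gs://example-bucket/datasets")
    # Print the command instead of running it; to execute it for real,
    # pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping the command as a list (rather than a shell string) avoids quoting problems when paths contain spaces.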

If no data is present:

Provide an interface to search for data related to the specific problem.

Integrate with data labeling APIs to get the data labeled.

Proceed with the normal flow.
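The two flows above can be sketched as a small planning function. This is only an illustration of the decision logic described in this section, not an actual backend: `Category`, `plan_steps`, and the step descriptions are hypothetical names made up for the example.

```python
from enum import Enum
from typing import List


class Category(Enum):
    """The four broad kinds of CV application listed in this section."""
    CLASSIFICATION = "classification"
    SEGMENTATION = "segmentation"
    SALIENCY = "saliency"
    OBJECT_DETECTION = "object detection"


def plan_steps(category: Category, has_data: bool,
               data_in_cloud: bool = False) -> List[str]:
    """Return the ordered steps the tool would walk a user through."""
    steps: List[str] = []
    if not has_data:
        # No-data flow: find data and get it labeled, then continue as normal.
        steps.append("search for data for the problem")
        steps.append("send data to a labeling API")
    if has_data and data_in_cloud:
        steps.append("sync with existing cloud storage")
    else:
        steps.append("convert to TFRecord and upload with gsutil")
    # From here on everything is handled by the backend.
    steps.append(f"select pretrained {category.value} model")
    steps.append("run distributed training with Kubeflow")
    steps.append("expose API endpoint")
    return steps


if __name__ == "__main__":
    for step in plan_steps(Category.OBJECT_DETECTION, has_data=False):
        print("-", step)
```

The point of the sketch is that the user only makes two choices (where the data is and which category applies); every later step is fixed.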

5) Who will kick your ass?

TensorFlow API

Google Cloud Platform AutoML framework + Vision API



6) What is so great about this?

The TensorFlow API, Google Cloud Platform's AutoML framework and Vision API, and AWS all have a steep learning curve. They work for experienced software engineers and data scientists, but not for everyone. They are amazing if you want to create something totally new, such as a new algorithm, or if you work with massive datasets (50 TB+). Most companies don't need this.

Labelbox looks really close to what I have in mind, though it focuses only on the data part.

The platform I have in mind is intuitive and easy to use, and gets you to production in record time with almost zero knowledge of or experience in computer vision.

If you aren't aware of how hard this is at a large scale, watch the talk by Andrej Karpathy on building computer vision algorithms at Tesla.




Connecting the Dots.

Aashay Sachdeva
