Artificial Intelligence (AI) is booming. From customer support chatbots to workflow automation, AI is infiltrating and disrupting nearly every part of our everyday lives, and it’s unlikely to stop. In fact, the global market intelligence firm IDC forecasts that the AI market will keep growing strongly: by 2024, it will ‘break the $500 billion mark with a five-year compound annual growth rate (CAGR) of 17.5%’. There are many different branches of AI, so it’s no surprise that its ‘takeover’ is proving so successful, but one subdivision with an especially fascinating ‘come-up’ story is computer vision.
To find out more about the technical side of computer vision and how it functions in the enterprise, we spoke to Dr Appu Shaji, Founder and Chief Scientist of Mobius Labs. Appu has over 18 years of experience in AI and, more specifically, computer vision. He spent a number of years studying it at university level and worked as a postdoc at the Image and Visual Representation Group, EPFL and Computer Vision Lab. His most recent company, Mobius Labs, strives to make computer vision adaptable to every application.
Thanks for joining us, Appu! It’s great to have you here at EM360. Now, I can’t ignore your cool job title, so to kick off, can you give us an insight into what it’s like to be a Chief Scientist?
Mobius is a special breed of company in the sense that it’s a deep tech company. A lot of our products are based on scientific work, and what we do is slightly different from other AI companies. We spend time researching and building new algorithms, architectures, models and ways of handling data. We have a team of PhDs and postgraduates who have been researching computer vision for a long time, and I'm privileged to lead that team. Our main goal is to make machines better and smarter than anything that has been possible with computer vision before. We work with algorithms that are popular in deep learning, and fundamentally it is about getting the maximum value out of them, so we think about three aspects in our day-to-day work. First, what data should the model learn from to make it fit our clients’ needs? Second, what are the best algorithms and architectures, and what are these networks actually learning: can they only fit a narrow subset of data, or something larger? The latter is more desirable, so it is important to get this right. And last but not least, we think about how to make the algorithms faster and more efficient so that we can deploy them. A lot of the fundamental scientific work we do sits somewhere between what happens at a university and what happens in a company. We have a lot of fun with that and get the best of both worlds!
What is computer vision and how does it fit in with the business world?
Human sight, or vision, is one of the core perceptive senses we have. 50% of our brain is dedicated to processing visual signals. We use vision to make decisions, e.g. what to buy or where to put something, and when you are doing a mechanical task you are always using input from your visual system. We can take that skill and impart it to machines so that some applications can be deployed in a smarter way. Take visual search, for example: if you have 10,000 images and want to find one particular thing, you can use our system to identify it for you. Previously you had to look through all these images to find what you were looking for, but now it is being done by a machine with superhuman vision. Superhuman vision can do this much faster and more efficiently, giving us time to focus on the tasks that matter. We are working with space and satellite companies who are taking detailed imagery of the Earth, usually to see whether there is an object or activity of interest, like a wildfire. Processing these visuals is daunting, but machines can take on this task. Our software has been made very easy for non-technical users, such as a product manager, to install and manage at the click of a button. There is a component that we call the non-code area, which allows non-technical users not only to use the software but also to train it with new capabilities. Our motto is: how can we empower users, especially in the business world, to deploy computer vision and scale it up?
Your company, Mobius Labs, works to take computer vision to the next level: ‘It’s not just computer vision, it’s superhuman vision’. Can you tell us more about this?
Typically, AI companies build their product with a technical crowd in mind. In contrast, we are building our software to address even non-technical users. This changes everything. For example, a press agency or a media company that is sitting on millions of images and wants to build a next-generation visual search automation tool has to make a number of decisions in order to successfully adopt AI systems. These decisions are best made by the people who understand their users and know their business well: product owners and business owners, for example. So making computer vision software that is accessible to them will allow better products and applications to be created. By removing this technical boundary, we are accelerating the adoption of the technology.
In a recent article, you argued that ‘hosting computer vision software on your own system, rather than the cloud, offers greater control while opening the door to new commercial opportunities.’ Why are you so adamant about this? What is it that fuels this strong belief?
Generally, we work with organisations that have accumulated a large set of images over the years or have devices that can capture them. Essentially, our clients' primary business asset is their visual assets. When you look at a cloud-based service, what they are essentially asking customers to do is upload their primary assets to a third party and trust them to take care of their data. We have a different philosophy: we don’t want to make our users part of the product. We have simple software that people can buy and use to build their own product, and that’s where the empowerment comes in. Otherwise, concerns such as data privacy become a huge roadblock to adopting AI at scale. Enabling that full control for our clients is a prominent value that we hold here. This is one of the main things that makes us stand out from the competition, but not the only thing. There are three things that make us stand out. One, we are the simplest solution on the market to use, which allows us to target technical but, more importantly, non-technical users. Two, our performance is much better than the competition's, and we can deploy the technology on a wide range of devices, including satellites. Three, in all of these cases we are helping our clients create their own IP, and that allows us to make big plans and use our technology to enable these big products.
Computer vision is one of the more niche AI sub-sectors. What sparked your interest in it?
It’s been a lifelong interest. Even when I was a teenager, I was building systems. At first it was an academic interest: a nice combination of mathematics and computer programming with an end output which you could show to people. In 2012 I switched my career from academia to starting my first company (Mobius is my third). My father is a filmmaker and he’s always been interested in making pictures and telling a story through images. I found that filmmaking was hard but computer science was easier, so I decided to pursue that passion. In my earlier days, I made a small project which took an image, understood its 3D aspect and translated that into a Braille pattern. These are all factors that sparked my initial interest in computer vision.
Finally, where do you see computer vision, or rather ‘superhuman vision’, heading in the future? Is there still a lot more room for the industry to grow?
We are barely scratching the surface. If you look at what humanity has always imagined, for example using robots, it is something that will become more prevalent, and in a much more profound way. We are breaking into sectors such as satellites, and currently all the satellites we are sending into space contain a limited form of computer vision. Soon you will find that computer vision can make instant judgements and will be accessible in all cameras and devices in the world. It is really the start of a journey, and in 5-6 years there will be multiple killer applications in this space. We hope to be the prominent tech lab or provider for a lot of the companies that start exploring computer vision over the next few years.