
Experiments in Fast Image Recognition on Mobile Devices

Our journey in experimenting with machine vision and image recognition accelerated when we were developing an application, BooksPlus, to change a reader’s experience. BooksPlus uses image recognition to bring printed pages to life. A user can get immersed in rich and interactive content by scanning images in the book using the BooksPlus app. 

For example, you can scan an article about a poet and instantly listen to the poet’s audio. Similarly, you can scan images of historical artwork and watch a documentary clip.

When we started development, we used commercially available SDKs, which worked very well for recognizing images locally. However, they failed once our image library grew beyond a few hundred images. A few services offered cloud-based recognition, but their pricing structure didn’t match our needs. 

So we decided to experiment with developing our own image recognition solution.

What were the Objectives of our Experiments?

We focused on building a solution that would scale to the thousands of images that we needed to recognize. Our aim was to achieve high performance while staying flexible enough to do both on-device and in-cloud image matching. 

As we scaled the BooksPlus app, the target was a cost-effective solution. It also had to be as accurate as the commercial SDKs (in terms of false-positive and false-negative matches) and integrate with native iOS and Android projects.

Choosing an Image Recognition Toolkit

The first step of our journey was to zero in on an image recognition toolkit. We decided to use OpenCV based on the following factors:

  • A rich collection of image-related algorithms: OpenCV has more than 2500 optimized algorithms, with many contributions from academia and industry, making it the most significant open-source machine vision library.
  • Popularity: OpenCV has an estimated 18 million-plus downloads and a community of 47 thousand users, so abundant technical support is available.
  • BSD-licensed product: Because OpenCV is BSD-licensed, we can modify and redistribute it according to our needs. As we wanted to white-label this technology, this licensing suited us.
  • C interface: OpenCV has C interfaces and support, which was very important for us because both native iOS and Android support C. This would allow us to have a single codebase for both platforms.

The Challenges in Our Journey

We faced numerous challenges while developing an efficient solution for our use case. But first, let’s understand how image recognition works.

What is Feature Detection and Matching in Image Recognition?

Feature detection and matching is an essential component of many computer vision applications. It underpins object detection, image retrieval, robot navigation, and more. 

Consider two images of a single object taken at slightly different angles. How would you make your mobile device recognize that both pictures contain the same object? Feature detection and matching comes into play here.

A feature is a piece of information that indicates whether an image contains a specific pattern. Points and edges can be used as features. The image above shows the feature points on an image. Feature points must be selected so that they remain invariant under changes in illumination, translation, scaling, and in-plane rotation. Using invariant feature points is critical to successfully recognizing similar images in different positions.

The First Challenge: Slow Performance

When we first started experimenting with image recognition using OpenCV, we used the recommended ORB feature descriptors and FLANN feature matching with 2 nearest neighbours. This gave us accurate results, but it was extremely slow. 
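To make the "2 nearest neighbours" idea concrete, here is a minimal sketch in plain Python. It is not the OpenCV pipeline described above: it uses brute-force search instead of FLANN, stores each binary descriptor as an integer, and adds a ratio test with a threshold of 0.75, which is a common practice assumed here rather than something stated in this article. All function names are hypothetical.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def two_nearest(query: int, train: List[int]) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return (distance, index) for the closest and second-closest train descriptors.

    Assumes train holds at least two descriptors.
    """
    ranked = sorted((hamming(query, t), i) for i, t in enumerate(train))
    return ranked[0], ranked[1]


def good_matches(query_desc: List[int], train_desc: List[int],
                 ratio: float = 0.75) -> List[Tuple[int, int]]:
    """Keep a match only when the best neighbour is clearly closer than the second.

    The ratio threshold is an assumed value, not one taken from the article.
    """
    matches = []
    for qi, q in enumerate(query_desc):
        (d1, i1), (d2, _) = two_nearest(q, train_desc)
        if d1 < ratio * d2:
            matches.append((qi, i1))
    return matches
```

Each query descriptor is compared with every training descriptor, so the cost grows with the product of the two feature counts. That is the cost the rest of this article works to reduce.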

The on-device recognition worked well for a few hundred images; the commercial SDK would crash after 150 images, but we were able to increase that to around 350. However, that was insufficient for a large-scale application.

To give an idea of the speed of this mechanism, consider a database of 300 images. It would take up to 2 seconds to match an image. With this speed, a database with thousands of images would take a few minutes to match an image. For the best UX, the matching must be real-time, in a blink of an eye. 

To improve performance, we needed to minimize the number of matches made at each point of the pipeline. We had two choices:

  1. Reduce the number of nearest neighbors, but we were already using only 2, the least possible number.
  2. Reduce the number of features detected in each image, but a lower count would hurt accuracy. 

We settled upon using 200 features per image, but the time consumption was still not satisfactory. 

The Second Challenge: Low Accuracy

Another challenge was reduced accuracy when matching images in books that contained text. These books would sometimes have words around the photos, and the words would add many highly clustered feature points. This increased the noise and reduced the accuracy.

In general, the book’s printing caused more interference than anything else: the text on a page creates many useless features, highly clustered on the sharp edges of the letters, causing the ORB algorithm to ignore the basic image features.

The Third Challenge: Native SDK

After the performance and precision challenges were resolved, the ultimate challenge was to wrap the solution in a library that supports multi-threading and is compatible with Android and iOS mobile devices.

Our Experiments That Led to the Solution:

Experiment 1: Solving the Performance Problem

The objective of the first experiment was to improve performance. Our system could be presented with any random image, out of billions of possibilities, and we had to determine whether that image matched one in our database. Therefore, instead of doing a direct match, our engineers devised a two-part approach: simple matching and in-depth matching.

Part 1: Simple Matching

To begin, the system eliminates obvious non-matches. These are the database images, potentially thousands or even tens of thousands of them, that can easily be identified as not matching the scanned image. This is accomplished through a very coarse scan that considers only 20 features and uses an on-device database to determine whether the image being scanned belongs to our interesting set. 

Part 2: In-Depth Matching 

After Part 1, we were left with very few images with similar features from a large dataset: the interesting set. The in-depth match is performed only on these few images, and all 200 features are matched here to find the matching image. As a result, we reduced the number of feature-matching loops performed on each image.

In the original approach, every feature of the scanned image was matched against every feature of the training image. The coarse pass brought the matching loops per image down from 40,000 (200×200) to 400 (20×20). It produced a list of the best possible matching images, whose full 200 features were then compared.
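The two-part approach can be sketched as follows. This is an illustration under stated assumptions, not our production code: descriptors are integers compared by Hamming distance, the first `coarse_n` descriptors of each image are taken as its coarse fingerprint, and the thresholds (`max_dist`, `coarse_min`, `fine_min`) are hypothetical values chosen for the example.

```python
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_score(query: List[int], train: List[int], max_dist: int = 2) -> int:
    """Count query descriptors with a train descriptor within max_dist bits.

    Brute force: performs len(query) * len(train) comparisons.
    """
    return sum(
        1 for q in query
        if min((hamming(q, t) for t in train), default=max_dist + 1) <= max_dist
    )


def two_stage_match(query: List[int], database: Dict[str, List[int]],
                    coarse_n: int = 20, coarse_min: float = 0.5,
                    fine_min: float = 0.5) -> Tuple[Optional[str], List[str]]:
    """Return (best matching image name or None, shortlist from the coarse pass)."""
    fingerprint = query[:coarse_n]
    # Part 1: cheap scan over every image, using only the fingerprints.
    shortlist = [
        name for name, desc in database.items()
        if match_score(fingerprint, desc[:coarse_n]) >= coarse_min * len(fingerprint)
    ]
    # Part 2: full comparison, restricted to the shortlisted images.
    best_name, best_score = None, 0
    for name in shortlist:
        score = match_score(query, database[name])
        if score >= fine_min * len(query) and score > best_score:
            best_name, best_score = name, score
    return best_name, shortlist
```

With 200 descriptors per image and `coarse_n=20`, the coarse pass costs 20×20 = 400 comparisons per database image instead of 200×200 = 40,000, and the expensive comparison runs only on the shortlist.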

We were more than satisfied with the result. Matching an image against the dataset of 300 images, which previously took 2 seconds, now took only 200 milliseconds. This improved mechanism was 10x faster than the original, and the delay was barely noticeable to the human eye.

Experiment 2: Solving the Scale Problem

To scale up the system, Part 1 of the matching was done on the device and Part 2 could be done in the cloud. This way, only images that were a potential match were sent to the cloud. We would send the 20-feature fingerprint match information to the cloud, along with the additional detected image features. The cloud could then scale to a large database of interesting images.

This method allowed us to keep a large database (with fewer features per image) on-device in order to eliminate obvious non-matches. The memory requirements were reduced, and we eliminated crashes caused by system resource constraints, which had been a problem with the commercial SDK. Because the real matching was done in the cloud, we could scale while keeping cloud computing costs down, since no cloud CPU cycles were spent on obvious non-matches.
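The device/cloud split amounts to gating the cloud request behind the on-device coarse scan. The sketch below illustrates that control flow only. `cloud_match` is a hypothetical callable standing in for the network request, `device_index` is an assumed in-memory map of image names to 20-feature fingerprints, and the thresholds are illustrative.

```python
from typing import Callable, Dict, List, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def coarse_candidates(fingerprint: List[int], device_index: Dict[str, List[int]],
                      max_dist: int = 2, min_ratio: float = 0.5) -> List[str]:
    """Names of on-device fingerprints that plausibly match the scanned image."""
    names = []
    for name, stored in device_index.items():
        hits = sum(
            1 for q in fingerprint
            if any(hamming(q, s) <= max_dist for s in stored)
        )
        if fingerprint and hits >= min_ratio * len(fingerprint):
            names.append(name)
    return names


def recognize(query: List[int], device_index: Dict[str, List[int]],
              cloud_match: Callable[[List[str], List[int]], Optional[str]],
              coarse_n: int = 20) -> Optional[str]:
    """Run the coarse scan locally; call the cloud only for potential matches."""
    candidates = coarse_candidates(query[:coarse_n], device_index)
    if not candidates:
        return None  # obvious non-match: no network call, no cloud CPU spent
    # Send the candidate names plus all detected features for in-depth matching.
    return cloud_match(candidates, query)
```

Because the function returns before `cloud_match` is ever called for an obvious non-match, the number of cloud requests tracks the number of plausible scans rather than the total number of scans.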

Experiment 3: Improving the Accuracy

With performance improved, the practical accuracy of the matching process needed enhancement. As mentioned earlier, a picture scanned in the real world contains an enormous amount of noise.

Our first approach was to use the Canny edge detection algorithm to find the square or rectangular edges of the image and clip out the rest of the data, but the results were not reliable. Two issues remained. The first was that the images would sometimes contain captions that were part of the overall image rectangle. The second was that the images would sometimes be placed in different shapes, like circles or ovals, for aesthetic effect. We needed to come up with a simple solution.

Finally, we analyzed the images in 16 shades of grayscale and tried to find areas skewed towards only 2 to 3 shades of grey. This method accurately found areas of text on the outer regions of an image. Blurring these areas kept them from interfering with the recognition mechanism. 
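One way to picture the grayscale analysis is the sketch below. It is an approximation under assumptions, not the exact method we shipped: the image is a list of rows of 8-bit grey values, it is scanned in fixed-size tiles, a tile counts as text-like when at most 3 of the 16 shades cover 95% of its pixels, and the "blur" is the crudest possible one (replacing the tile with its mean). The tile size, the 95% share, and all function names are hypothetical, and the sketch scans the whole image rather than only its outer regions.

```python
from collections import Counter
from typing import List, Tuple

Image = List[List[int]]  # rows of 8-bit grayscale pixels (0-255)


def is_text_like(pixels: List[int], top_k: int = 3, min_share: float = 0.95) -> bool:
    """True when at most top_k of the 16 shades cover min_share of the pixels."""
    counts = Counter(p // 16 for p in pixels)  # 256 grey levels -> 16 shades
    if len(counts) < 2:
        return False  # a single flat shade (blank margin) is not text
    top = sum(count for _, count in counts.most_common(top_k))
    return top >= min_share * len(pixels)


def find_text_tiles(img: Image, tile: int = 8) -> List[Tuple[int, int]]:
    """Top-left (row, col) of every tile whose shade histogram looks like text."""
    hits = []
    for r0 in range(0, len(img), tile):
        for c0 in range(0, len(img[0]), tile):
            pixels = [p for row in img[r0:r0 + tile] for p in row[c0:c0 + tile]]
            if is_text_like(pixels):
                hits.append((r0, c0))
    return hits


def suppress_text(img: Image, tile: int = 8) -> Tuple[Image, List[Tuple[int, int]]]:
    """Return a copy with text-like tiles flattened to their mean, plus the tile list."""
    out = [row[:] for row in img]
    tiles = find_text_tiles(img, tile)
    for r0, c0 in tiles:
        rows = range(r0, min(r0 + tile, len(img)))
        cols = range(c0, min(c0 + tile, len(img[0])))
        mean = sum(img[r][c] for r in rows for c in cols) // (len(rows) * len(cols))
        for r in rows:
            for c in cols:
                out[r][c] = mean  # sharp letter edges disappear
    return out, tiles
```

Printed text is mostly ink and paper, so its histogram collapses onto a couple of shades, while a photograph spreads across many. Flattening the text-like tiles removes the sharp letter edges that would otherwise attract feature points.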

Experiment 4: Implementing a Native SDK for Mobile

We had now improved the accuracy and efficiency of the feature detection and matching system. The final step was implementing an SDK that would work on both iOS and Android devices as well as a natively implemented SDK on each platform would. To our advantage, both Android and iOS support the use of C libraries in their native SDKs. Therefore, the image recognition library was written in C, and two SDKs were produced from the same codebase. 

Each mobile device has different resources available. The higher-end mobile devices have multiple cores to perform multiple tasks simultaneously. We created a multi-threaded library with a configurable number of threads. The library would automatically configure the number of threads at runtime as per the mobile device’s optimum number.
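The thread-count policy can be illustrated with a short sketch. Our library is written in C, so this Python version only shows the configuration logic: use the caller's setting when one is given, otherwise derive the count from the number of cores reported at runtime. The cap of 8 and the function names are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def pick_thread_count(requested: Optional[int] = None, cap: int = 8) -> int:
    """Use the caller's setting if given, else the device's core count (capped)."""
    if requested is not None:
        return max(1, requested)
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(cores, cap))


def match_in_parallel(match_one: Callable[[T], R], images: Iterable[T],
                      threads: Optional[int] = None) -> List[R]:
    """Apply match_one to every image using a pool sized for this device."""
    workers = pick_thread_count(threads)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(match_one, images))  # results keep input order
```

A single-core device simply gets one worker, while a multi-core device spreads the per-image matching across its cores without the app developer changing any setting.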

Conclusion

To summarize, we developed a large-scale image recognition application (used in multiple fields including Augmented Reality) by improving the accuracy and efficiency of the machine vision behind it: feature detection and matching. Existing solutions were slow, and our use case produced noise that drastically reduced accuracy. We wanted accurate match results in the blink of an eye.

So we ran a few experiments to improve the mechanism’s performance and accuracy. The two-part matching cut the coarse-pass feature-matching loops from 40,000 to 400 per image, resulting in a 10x faster match. Once we had the performance we wanted, we improved the accuracy by reducing the noise from the text around the images. We accomplished this by blurring out the text after analyzing the image in 16 shades of grayscale. Finally, everything was compiled into a C library that can be used on iOS and Android.


Startup Tips: 6 Costly Product Development Mistakes & How to Avoid Them

Successful businesses, especially startups, are often the result of successful decision-making. So, before we dive into these startup tips, it’s important to understand the factors that contribute to your startup’s success.

Business decision-making is not as simple as ‘Let’s just take this path and see where it goes.’  It is a process of solving problems by weighing evidence, examining alternatives, and choosing a path from there. Thus, you could say that the likelihood of your startup succeeding or failing is proportional to the number of smart decisions you make throughout the process. 

But what do we mean by smart decision-making? I would define it as a process of not only choosing the right path but also knowing when and how to avoid costly mistakes, especially in the early stage. In our experience, we have seen a recurring pattern of decisions and actions in startups that lead to self-imposed limits on business development. A few of these startups even reached a higher rung on the startup ladder, yet they failed.

What was the reason? Let’s find out.
If you lead an early-stage startup, you may fall into some pits. But knowing some obvious mistakes will help you avoid them. 

Why is it important to avoid these mistakes?

Yes, you do learn from your mistakes, but you should not tolerate costly errors. Identifying issues early and knowing the difficulties ahead of time helps you minimize costs and penalties when things do not go as expected. It is easier said than done.

Ideas are easy, implementation is hard.
Thus, we have carefully curated these product development tips for you.

1. Solution Looking for a Problem

According to CB Insights, 42% of all startups fail due to the lack of market need. There are many solvable problems around. But you don’t just want to solve any problem as a startup; you want to solve the best problem. 

A problem perceived by the people engaged in a startup might not be a problem for a broader base of people. If you cannot scale your idea, your growth is already restricted. An excellent example of this is Google Glass. Google tried to be ahead of its time and expected that a larger group of people would prefer a “hands-free” smartphone. Although the idea was astounding, the market was not ready for it, and the result was failure. 

In interviews conducted by Failory with more than 80 startups, 34% spoke about the effects of a lack of product-market fit: they had produced something that nobody wanted. By far the most popular lesson listed is that you have to check whether the market really wants what you offer. 


Tips for starting up:

Market research. You may have to ask a few fundamental questions. Start by defining the problem. Stop worrying about the idea getting stolen and instead get it validated early. Talk to potential customers. A quick survey in your network is an effective way to gauge customer interest.

2. Bloating with Features

People are usually biased towards their own ideas, which causes the number of niche features to grow as the concept expands. It is like building a Swiss Army Knife with only one or two tools that users might really find useful.

This results in needless software sophistication that increases the time, effort, and money needed for the product and makes it harder to change direction at a later stage.

Tips for starting up:

You need to do something simple. KISS. Confused? This isn’t the kiss that you’re thinking about. KISS simply means ‘Keep It Simple, Stupid’. All you need to do is stay on track with the value you are trying to deliver and keep it simple for users. As complexity increases for users, the retention rate drops. 

Here are a few mitigation strategies:

  1. Prioritize the user stories. Create a detailed list of user stories and categorize them.
  2. Create an outline of key experiments that would allow you to validate or invalidate your hypothesis. This will help you determine what the market needs and what it cares about.
  3. Build an MVP. This is one crucial thing you should do. Find out the minimum set of attributes needed to prove or refute the hypothesis. Take a close look at what you can do with minimal to zero coding.
  4. Observe and collect data by taking the MVP to the market, and figure out whether your potential users are actually using it. This is where analytics comes into the picture. Track your users’ usage and analyze what they’re using and how often.
  5. Build more on the MVP or change directions. These additional experiments will help you validate your hypothesis better. Iterate often.

NOTE: Don’t worry about failures. The purpose is to minimize the cost of failures, not the rate of failure.

3. Not putting the UX front and centre

This builds on the previous point about user retention. If users do not enjoy using your product or find it complicated to learn, they will not use it for long. Most startups struggle during the development process because they focus on the development and coding phases while neglecting the design and the user’s experience.

According to CB Insights, 17% of the startups failed because their product was not user-friendly.

You must know and understand these two essential aspects of design as the non-technical founder of a company: User Experience (UX) and User Interface (UI).

UX is beyond the UI; it covers everything that touches the user and allows the user to make forward progress, whether in transactions, or status updates, or information needs. UI contains all elements allowing a user to interact with a product’s software.

Tips for starting up:

  • Ensure that there is a strong UX designer on your team.
  • Review your product from a user-centric perspective.
  • Sensitize the technology team that UX needs to drive the experience decisions.
  • Do not assign design tasks to technical teams (UI or process), as they might solve it from an engineering point-of-view rather than a user-centric point-of-view.

4. Not having an A-Team

It isn’t easy to find high-caliber talent, and it is costly to employ. Bad hires can lead to disasters in your team, and reversing these decisions can take a lot of time and money. Getting a team working for you is usually a time-consuming process that takes a few months, and A-teams can take over a year to solidify in some cases. You can utilize this time to improve your product.

Beepi was a prime example of poor management and leadership. Experts say they tried to go big too fast. Bad management caused Beepi to run out of cash, which became the reason for its failure.

Tips for starting up:

In the early stage of your startup, you need some leadership and some experience. Here are some ways to achieve that:

  • Establish product leadership: Find a CTO (or a part-time virtual CTO), a UX lead, and a product manager who can provide that leadership.
  • Engage experts: A great way to reduce effort and save time is to find a product-focused outsourcing company whose teams know early-stage product development and have people who have done it before.
  • Build capabilities: Slowly start building internal capabilities or continue with the outsourced partners. 

5. Short-sighted or overly grandiose technology decisions

Let’s talk tech.

After all the fundamental research, validating your hypothesis, and getting an A-team ready, it is time to take the leap into technology and start product development. Here’s another hurdle in your product’s lifecycle: deciding the tech stack and architecture. If not done right, this can leave you with a massive barrier to growth and success. You need to strike a balance. Think too short term and your product will be left with a lot of technical debt. Think too grand and you will incur engineering costs that may not be necessary.

Tips for starting up:

  • Choose a technology stack that meets the product’s long-term needs. Read more about our preferred tech stacks here.  
  • Create an Extensible Foundational Architecture: Determine a fundamental architecture that will allow additional features to be introduced in the future without incurring cost in the present.
  • Ensure you have considered security, privacy, scaling needs, performance, and extensibility for immediate as well as future developments.
  • Optimize for immediate needs at the higher layers. While the tech stack is being decided, you can build pieces of the UI that are solving the MVP problem.

6. Underestimating the time and complexity

“Everyone has a plan until they get punched in the face,” says Mike Tyson.

In the excitement of a new startup or product, we tend to overlook how important it is to lay out a roadmap. Building software is a challenging and time-consuming process. Make sure that quality work is being done without overspending. There is no free or easy lunch. It’s a trap to think that products can be built overnight. Many startups fail because they do not have a roadmap prepared.

Tips for starting up:

  • Estimate your time accurately for your first launch.
  • Use Agile Methodology: Set a timeframe and try and get as much done as possible. Then launch and learn. Iterate your product with what you’ve learned.
  • Contain your desire to make changes mid-stream. This is a common cause for overruns. 

Key takeaway

These startup tips answer the widely asked question, “What are the top mistakes that kill a startup?” To summarize, start with market research. Resist the temptation to bloat your product with features, and make sure the product has a good user experience. Have a good managerial team and be reasonable with technology. Finally, plan out your product development journey beforehand.

The examples above demonstrate very clearly that you can struggle even if you find investors and obtain all the funds you need. However, the failure of these startups does not undermine the chances of other startups succeeding. It’s all about playing the game cleverly. 

Startups face an overwhelming number of challenges, and there is no magic recipe for success. However, by avoiding these obvious mistakes, you can increase your chances. Every failure has a lesson to teach, and it is better to learn from others’ mistakes than from your own. 
