Artificial Intelligence Virtual Keyboard using CV2 and CVZONE

In this blog post, I explain why cv2 and cvzone are useful, using a project I built called ‘AI Virtual Keyboard’. The complete code is available in my GitHub repository.

Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs — and take actions or make recommendations based on that information. If AI enables computers to think, computer vision enables them to see, observe and understand.

Computer vision works much the same as human vision, except humans have a head start. Human sight has the advantage of lifetimes of context to train how to tell objects apart, how far away they are, whether they are moving and whether there is something wrong in an image.

Computer vision is not new, but its capabilities are growing rapidly. You may wonder what can be achieved with this technology. The sky is the limit.

Applications range from tasks such as industrial machine vision systems which, say, inspect bottles speeding by on a production line, to research into artificial intelligence and computers or robots that can comprehend the world around them. The computer vision and machine vision fields have significant overlap. Computer vision covers the core technology of automated image analysis which is used in many fields. Machine vision usually refers to a process of combining automated image analysis with other methods and technologies to provide automated inspection and robot guidance in industrial applications. In many computer-vision applications, the computers are pre-programmed to solve a particular task, but methods based on learning are now becoming increasingly common.

Building the project:

The idea for this project came to mind when I wanted to do something unique with computer vision.

In this project, I resized the webcam footage to 1280x720 pixels using imutils. This matters because some laptop webcams only provide a fixed height and width, which can be too small for this project.

I commented out the calls to the set() method and used imutils to resize each frame instead.
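Here is a minimal sketch of that capture loop, not the exact code from the repository. The function names, the window title, and the camera index 0 are my assumptions for illustration. imutils.resize keeps the aspect ratio, so the small helper shows what size a given webcam frame would end up with.

```python
def resized_dimensions(width, height, target_width=1280):
    """Return (new_width, new_height) for an aspect-preserving resize,
    the same way imutils.resize(frame, width=target_width) scales a frame."""
    ratio = target_width / float(width)
    return target_width, int(height * ratio)


def run_camera(target_width=1280):
    """Show the webcam feed, scaled up to target_width pixels wide."""
    # Imported here so the helper above can be used without OpenCV installed.
    import cv2
    import imutils

    cap = cv2.VideoCapture(0)  # camera index 0 is an assumption
    # Some webcams ignore these requests, so they stay commented out:
    # cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    # cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Resize every frame in software instead of relying on the driver.
            frame = imutils.resize(frame, width=target_width)
            cv2.imshow("AI Virtual Keyboard", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Calling run_camera() starts the preview. Note that a 640x360 frame becomes 1280x720, while a 4:3 webcam frame such as 640x480 becomes 1280x960, because only the width is fixed.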

I used HandDetector to find the hand and its position. HandDetector comes from cvzone and is built on top of the MediaPipe library. The position data is especially useful because MediaPipe assigns a number to each hand landmark, including the tip of every finger, which was helpful for this project.
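To make the numbering concrete, here is a small sketch that picks the fingertips out of a landmark list. MediaPipe reports 21 landmarks per hand, and the tips are numbers 4, 8, 12, 16 and 20. The function name and the shape of the input are my assumptions: in recent cvzone versions the list is typically available as hands[0]["lmList"] after calling detector.findHands(img), while older versions expose it through findPosition, so check the version you have installed.

```python
# MediaPipe hand landmark numbers for the five fingertips.
FINGERTIP_IDS = {"thumb": 4, "index": 8, "middle": 12, "ring": 16, "pinky": 20}


def fingertip_positions(lm_list):
    """Map each finger name to the (x, y) pixel position of its tip.

    lm_list is assumed to hold 21 entries, one per landmark, each starting
    with x and y (a third z value, if present, is ignored).
    """
    if len(lm_list) < 21:
        raise ValueError("expected 21 hand landmarks, got %d" % len(lm_list))
    return {name: tuple(lm_list[i][:2]) for name, i in FINGERTIP_IDS.items()}
```

For example, fingertip_positions(lm_list)["index"] gives the pixel position of landmark 8.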

In MediaPipe's numbering, the tips of the index and middle fingers are 8 and 12. A click is registered when the distance between these two fingertips becomes small while both fingers hover over a button.
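The click check can be sketched in a few lines. This is an illustration rather than the code from the repository: the helper names and the 40 pixel threshold are my assumptions, and the right threshold depends on the frame size and how far the hand is from the camera.

```python
import math

INDEX_TIP = 8
MIDDLE_TIP = 12
CLICK_DISTANCE = 40  # pixels; an assumed value, tune it for your setup


def fingertip_distance(lm_list, a=INDEX_TIP, b=MIDDLE_TIP):
    """Euclidean distance in pixels between two hand landmarks."""
    ax, ay = lm_list[a][:2]
    bx, by = lm_list[b][:2]
    return math.hypot(bx - ax, by - ay)


def is_click(lm_list, threshold=CLICK_DISTANCE):
    """True when the index and middle fingertips are pinched together."""
    return fingertip_distance(lm_list) < threshold
```

In the main loop, a key press would only be accepted when is_click(lm_list) is true and the index fingertip is inside a button.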

In addition, I used cvzone.cornerRect to highlight the border of each button.

I used cv2.rectangle to draw the shape of each button and cv2.putText to write the text inside it.
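Putting the drawing calls together, a keyboard could be sketched roughly as follows. The Button class, the key rows, the sizes, and the colours are all my assumptions for illustration; only cv2.rectangle, cv2.putText and cvzone.cornerRect come from the project description.

```python
from dataclasses import dataclass

# Assumed layout: three rows of ten keys each.
KEY_ROWS = ["QWERTYUIOP", "ASDFGHJKL;", "ZXCVBNM,./"]


@dataclass
class Button:
    x: int
    y: int
    text: str
    w: int = 85
    h: int = 85

    def contains(self, px, py):
        """True when the point (px, py) lies inside this button."""
        return self.x < px < self.x + self.w and self.y < py < self.y + self.h


def build_keyboard(rows=KEY_ROWS, origin=(50, 50), step=100):
    """Create one Button per key, laid out on a regular grid."""
    ox, oy = origin
    return [
        Button(ox + col * step, oy + row * step, key)
        for row, keys in enumerate(rows)
        for col, key in enumerate(keys)
    ]


def draw_keyboard(img, buttons):
    """Draw every button on img (a BGR frame) and return the frame."""
    # Imported here so the layout helpers work without OpenCV installed.
    import cv2
    import cvzone

    for b in buttons:
        # Filled rectangle for the key body.
        cv2.rectangle(img, (b.x, b.y), (b.x + b.w, b.y + b.h),
                      (255, 0, 255), cv2.FILLED)
        # Corner highlight around the key; bbox is (x, y, w, h).
        cvzone.cornerRect(img, (b.x, b.y, b.w, b.h))
        # Key label, offset so it sits roughly in the middle of the key.
        cv2.putText(img, b.text, (b.x + 20, b.y + 65),
                    cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
    return img
```

With the default values, the ten keys of the widest row end at x = 1035, so the layout fits inside a 1280 pixel wide frame. The contains method is what lets the main loop decide which key the fingertip is hovering over.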

After experimenting with positions and text adjustments, I arrived at the final code, which completes the current goal of the project.

The output is shown below.

This project was a fun learning experience. You can refer to my GitHub repository for the complete code.

Thank you for taking the time to read this article.

If you liked this article, please like, share, and follow my channel, and stay tuned for upcoming posts.

Machine Learning Enthusiast