Part 15: Artificial Intelligence with Computer Vision in Python

Computer Vision

Computer vision is a discipline that studies how to reconstruct, interpret, and understand a 3D scene from its 2D images, in terms of the properties of the structures present in the scene.

Computer Vision Hierarchy

Computer vision is divided into three basic categories as follows −

  • Low-level vision − It includes processing images for feature extraction.
  • Intermediate-level vision − It includes object recognition and 3D scene interpretation.
  • High-level vision − It includes the conceptual description of a scene, such as activity, intention, and behavior.

Computer Vision Vs Image Processing

  • Image processing studies image-to-image transformation. The input and output of image processing are both images.
  • Computer vision is the construction of explicit, meaningful descriptions of physical objects from their images. The output of computer vision is a description or an interpretation of structures in a 3D scene.


Computer vision finds applications in the following fields −


Robotics Application

  • Localization − determining the robot's location automatically
  • Navigation
  • Obstacle avoidance
  • Assembly (peg-in-hole, welding, painting)
  • Manipulation (e.g. PUMA robot manipulator)
  • Human Robot Interaction (HRI): Intelligent robotics to interact with and serve people


Medicine Application

  • Classification and detection (e.g. lesion or cell classification and tumor detection)
  • 2D/3D segmentation
  • 3D human organ reconstruction (MRI or ultrasound)
  • Vision-guided robotic surgery


Security Application

  • Biometrics (iris, fingerprint, face recognition)
  • Surveillance − detecting certain suspicious activities or behaviors


Transportation Application

  • Autonomous vehicles
  • Safety, e.g., driver vigilance monitoring

Industrial Automation Application

  • Industrial inspection (defect detection)
  • Assembly
  • Barcode and package label reading
  • Object sorting
  • Document understanding (e.g. OCR)

Installing Useful Packages

For computer vision with Python, you can use a popular library called OpenCV (Open Source Computer Vision). It is a library of programming functions mainly aimed at real-time computer vision. It is written in C++, and its primary interface is also in C++; Python access is provided through the cv2 module. You can install this package with the help of the following command −

pip install opencv_python-X.X-cp36-cp36m-winX.whl

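Here X.X stands for the OpenCV version of the wheel file, cp36 must match the Python version installed on your machine (3.6 in this case), and winX indicates whether the wheel targets 32-bit or 64-bit Windows. On most platforms, pip install opencv-python fetches a matching prebuilt wheel without naming the file.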

If you are using the anaconda environment, then use the following command to install OpenCV −

conda install -c conda-forge opencv

Reading, Writing and Displaying an Image

Most CV applications need to take images as input and produce images as output. In this section, you will learn how to read and write an image file with the help of functions provided by OpenCV.

OpenCV functions for Reading, Showing, Writing an Image File

OpenCV provides the following functions for this purpose −

  • imread() function − This is the function for reading an image. OpenCV imread() supports various image formats like PNG, JPEG, JPG, TIFF, etc.
  • imshow() function − This is the function for showing an image in a window. The window automatically fits to the image size. Because imshow() displays an image that is already loaded in memory, it works regardless of the file format the image came from.
  • imwrite() function − This is the function for writing an image. OpenCV imwrite() supports various image formats like PNG, JPEG, JPG, TIFF, etc. The output format is chosen from the file name extension.


This example shows the Python code for reading an image in one format, showing it in a window, and writing the same image in another format. Consider the steps shown below −

Import the OpenCV package as shown −

import cv2

Now, for reading a particular image, use the imread() function −

image = cv2.imread('image_flower.jpg')

For showing the image, use the imshow() function. The name of the window in which you can see the image would be image_flower.


Now, we can write the same image in another format, say .png, by using the imwrite() function −

The output True means that the image has been successfully written as a .png file in the same folder.

Note − The function destroyAllWindows() simply destroys all the windows we created.

Color Space Conversion

In OpenCV, images are not stored in the conventional RGB order; rather, they are stored in the reverse order, i.e. BGR. Hence the default color code while reading an image is BGR. The cvtColor() color conversion function is used for converting an image from one color code to another.


Consider this example, which converts an image from BGR to grayscale.

Import the OpenCV package as shown −

import cv2

Now, for reading a particular image, use the imread() function −

image = cv2.imread('image_flower.jpg')

Now, if we see this image using imshow() function, then we can see that this image is in BGR.



Now, use cvtColor() function to convert this image to grayscale.

image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
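
To see what this conversion does to each pixel, the sketch below reproduces in plain Python the weighting that the OpenCV documentation gives for color-to-gray conversion, Y = 0.299 R + 0.587 G + 0.114 B. The function names and the tiny 2x2 image are hypothetical, and OpenCV's own integer arithmetic may differ from this by one gray level.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a single gray level in 0..255."""
    b, g, r = pixel  # note the BGR channel order used by OpenCV
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def image_to_gray(image):
    """image is a list of rows; each row is a list of (B, G, R) tuples."""
    return [[bgr_to_gray(p) for p in row] for row in image]


# Pure blue, pure green, pure red, and white pixels
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
gray = image_to_gray(tiny)
print(gray)
```

Green contributes most to the gray level and blue least, which is why a pure blue pixel comes out much darker than a pure green one.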

Edge Detection

Humans, after seeing a rough sketch, can easily recognize many object types and their poses. That is why edges play an important role in the life of humans as well as in the applications of computer vision. OpenCV provides a very simple and useful function called Canny() for detecting edges.


The following example shows clear identification of the edges.

Import OpenCV package as shown −

import cv2
import numpy as np

Now, for reading a particular image, use the imread() function.

image = cv2.imread('Penguins.jpg')

Now, use the Canny() function for detecting the edges of the already read image.


Now, for showing the image with edges, use the imshow() function.

cv2.imshow('edges', cv2.imread('edges_Penguins.jpg'))

This Python program will create an image named edges_Penguins.jpg with edge detection.

Face Detection

Face detection is one of the fascinating applications of computer vision which makes it more realistic as well as futuristic. OpenCV has a built-in facility to perform face detection. We are going to use the Haar cascade classifier for face detection.

Haar Cascade Data

We need data to use the Haar cascade classifier. You can find this data in the OpenCV package. After installing OpenCV, you can see a folder named haarcascades. It contains .xml files for different applications. Copy them and paste them into a new folder under the current project.


The following is the Python code using Haar Cascade to detect the face of Amitabh Bachan shown in the following image −

ab face

Import the OpenCV package as shown −

import cv2
import numpy as np

Now, use the Haar cascade classifier (cv2.CascadeClassifier) for detecting the face −


Now, for reading a particular image, use the imread() function −

img = cv2.imread('AB.jpg')

Now, convert it into grayscale, because the classifier works on grayscale images −

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Now, using face_detection.detectMultiScale, perform the actual face detection −

faces = face_detection.detectMultiScale(gray, 1.3, 5)

Now, draw a rectangle around the whole face −

for (x,y,w,h) in faces:
   img = cv2.rectangle(img,(x,y),(x+w, y+h),(255,0,0),3)

This Python program will create an image named Face_AB.jpg with face detection as shown

Face AB

Eye Detection

Eye detection is another fascinating application of computer vision which makes it more realistic as well as futuristic. OpenCV has a built-in facility to perform eye detection. We are going to use the Haar cascade classifier for eye detection.


The following example gives the Python code using Haar Cascade to detect the eyes of Amitabh Bachan in the following image −

Haar AB Face

Import OpenCV package as shown −

import cv2
import numpy as np

Now, use the Haar cascade classifier (cv2.CascadeClassifier) for detecting the eyes −

eye_cascade = cv2.CascadeClassifier('D:/ProgramData/cascadeclassifier/haarcascade_eye.xml')

Now, for reading a particular image, use the imread() function

img = cv2.imread('AB_Eye.jpg')

Now, convert it into grayscale, because the classifier works on grayscale images −

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Now, with the help of eye_cascade.detectMultiScale, perform the actual eye detection −

eyes = eye_cascade.detectMultiScale(gray, 1.03, 5)

Now, draw a rectangle around each detected eye −

for (ex,ey,ew,eh) in eyes:
   img = cv2.rectangle(img,(ex,ey),(ex+ew, ey+eh),(0,255,0),2)

This Python program will create an image named Eye_AB.jpg with eye detection as shown −

Eye AB

Part 5: Artificial Intelligence on Expert Systems and Robotics

What are Expert Systems?

Expert systems are computer applications developed to solve complex problems in a particular domain, at the level of extraordinary human intelligence and expertise.

Characteristics of Expert Systems

  • High performance
  • Understandable
  • Reliable
  • Highly responsive

Capabilities of Expert Systems

The expert systems are capable of −

  • Advising
  • Instructing and assisting humans in decision making
  • Demonstrating
  • Deriving a solution
  • Diagnosing
  • Explaining
  • Interpreting input
  • Predicting results
  • Justifying the conclusion
  • Suggesting alternative options to a problem

They are incapable of −

  • Substituting human decision makers
  • Possessing human capabilities
  • Producing accurate output from an inadequate knowledge base
  • Refining their own knowledge

Components of Expert Systems

The components of ES include −

  • Knowledge Base
  • Inference Engine
  • User Interface

Let us see them one by one briefly −

Knowledge Base

It contains domain-specific and high-quality knowledge.

Knowledge is required to exhibit intelligence. The success of any ES depends largely on collecting highly accurate and precise knowledge.

What is Knowledge?

Data is a collection of facts. Information is data and facts organized around the task domain. Data, information, and past experience combined together are termed knowledge.

Components of Knowledge Base

The knowledge base of an ES is a store of both factual and heuristic knowledge.

  • Factual Knowledge − It is the information widely accepted by the Knowledge Engineers and scholars in the task domain.
  • Heuristic Knowledge − It is about practice, accurate judgement, one’s ability of evaluation, and guessing.

Knowledge representation

It is the method used to organize and formalize the knowledge in the knowledge base. It is in the form of IF-THEN-ELSE rules.
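
As an illustration of this representation, an IF-THEN-ELSE rule can be stored as a small data object. The sketch below is one possible encoding, not a standard format; the Rule class, its field names, and the toy car-diagnosis rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """IF all conditions hold THEN conclude `then`, ELSE conclude `otherwise`."""
    name: str
    conditions: Tuple[str, ...]
    then: str
    otherwise: Optional[str] = None

    def evaluate(self, facts):
        # The THEN branch applies only when every condition is a known fact
        if all(c in facts for c in self.conditions):
            return self.then
        return self.otherwise


# Hypothetical rules for a toy car-diagnosis domain
knowledge_base = [
    Rule("R1", ("engine cranks", "no spark"), "check ignition coil"),
    Rule("R2", ("battery voltage low",), "charge battery", "battery is fine"),
]

facts = {"engine cranks", "no spark"}
conclusions = [rule.evaluate(facts) for rule in knowledge_base]
print(conclusions)
```

Keeping rules as data rather than as hard-coded if statements lets the knowledge base be extended without touching the code that evaluates it.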

Knowledge Acquisition

The success of any expert system majorly depends on the quality, completeness, and accuracy of the information stored in the knowledge base.

The knowledge base is formed by readings from various experts, scholars, and the knowledge engineers. The knowledge engineer is a person with the qualities of empathy, quick learning, and case-analyzing skills.

The knowledge engineer acquires information from the subject expert by recording, interviewing, and observing the expert at work. The engineer then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the development of the ES.

Inference Engine

Use of efficient procedures and rules by the Inference Engine is essential in deducing a correct, flawless solution.

In case of knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution.

In case of rule based ES, it −

  • Applies rules repeatedly to the facts, including facts obtained from earlier rule applications.
  • Adds new knowledge into the knowledge base if required.
  • Resolves rule conflicts when multiple rules are applicable to a particular case.
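
Conflict resolution can be handled in several ways. One common choice, sketched below as an assumption rather than as the method prescribed here, prefers the rule with the highest priority and breaks ties in favor of the most specific rule, that is, the one with the most conditions. The rule dictionaries and their contents are hypothetical toy data.

```python
def applicable(rules, facts):
    """Rules whose conditions are all present in the current facts."""
    return [r for r in rules if r["if"] <= facts]


def resolve_conflict(candidates):
    """Pick one rule: highest priority first, then the most conditions."""
    return max(candidates, key=lambda r: (r["priority"], len(r["if"])))


rules = [
    {"name": "R1", "if": {"fever"}, "then": "suspect infection", "priority": 1},
    {"name": "R2", "if": {"fever", "rash"}, "then": "suspect measles", "priority": 1},
    {"name": "R3", "if": {"fever", "cough"}, "then": "suspect flu", "priority": 1},
]
facts = {"fever", "rash"}

candidates = applicable(rules, facts)
chosen = resolve_conflict(candidates)
print([r["name"] for r in candidates], "->", chosen["name"])
```

Here R1 and R2 both match the facts, and R2 wins because it tests more conditions at equal priority.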

To recommend a solution, the Inference Engine uses the following strategies −

  • Forward Chaining
  • Backward Chaining

Forward Chaining

It is a strategy of an expert system to answer the question, “What can happen next?”

Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the facts and rules, and sorts them before arriving at a solution.

This strategy is followed for working on conclusion, result, or effect. For example, prediction of share market status as an effect of changes in interest rates.
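
A minimal forward-chaining loop might look like the sketch below. It starts from known facts and keeps firing rules until nothing new can be derived. The function name and the three rules, loosely modeled on the interest-rate example, are hypothetical and are not meant as financial guidance.

```python
def forward_chain(rules, initial_facts):
    """Apply rules repeatedly until no new facts appear.
    Returns the final facts and the names of the rules fired, in order."""
    facts = set(initial_facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            # Fire a rule only if it adds something new
            if conclusion not in facts and set(conditions) <= facts:
                facts.add(conclusion)
                fired.append(name)
                changed = True
    return facts, fired


rules = [
    ("R1", ["interest rates fall"], "borrowing gets cheaper"),
    ("R2", ["borrowing gets cheaper"], "company investment rises"),
    ("R3", ["company investment rises"], "share prices tend to rise"),
]
facts, fired = forward_chain(rules, ["interest rates fall"])
print(fired)
```

The list of fired rule names doubles as a simple explanation of how the conclusion was reached.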

Backward Chaining

With this strategy, an expert system finds out the answer to the question, “Why did this happen?”

On the basis of what has already happened, the Inference Engine tries to find out which conditions could have happened in the past for this result. This strategy is followed for finding out cause or reason. For example, diagnosis of blood cancer in humans.
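
Backward chaining can be sketched as a recursive search that starts from a goal and works back to known facts. The function below is a minimal illustration; its name, the rule format, and the toy car rules are hypothetical.

```python
def backward_chain(goal, rules, known_facts, _seen=frozenset()):
    """Return True if `goal` can be established from known_facts by
    working backwards through the rules."""
    if goal in known_facts:
        return True
    if goal in _seen:  # avoid looping forever on circular rules
        return False
    seen = _seen | {goal}
    for conditions, conclusion in rules:
        # A rule supports the goal if all of its conditions can be proven
        if conclusion == goal and all(
            backward_chain(c, rules, known_facts, seen) for c in conditions
        ):
            return True
    return False


rules = [
    (["battery is dead"], "replace battery"),
    (["headlights dim", "engine does not crank"], "battery is dead"),
]
known = {"headlights dim", "engine does not crank"}

print(backward_chain("replace battery", rules, known))
print(backward_chain("replace spark plugs", rules, known))
```

Unlike forward chaining, only the rules relevant to the goal are examined, which suits diagnosis tasks where the outcome is already known.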

User Interface

The user interface provides interaction between the user of the ES and the ES itself. It generally uses natural language processing so that it can be used by someone who is well-versed in the task domain. The user of the ES need not necessarily be an expert in Artificial Intelligence.

It explains how the ES has arrived at a particular recommendation. The explanation may appear in the following forms −

  • Natural language displayed on screen.
  • Verbal narrations in natural language.
  • Listing of rule numbers displayed on the screen.

The user interface makes it easy to trace the credibility of the deductions.

Requirements of Efficient ES User Interface

  • It should help users accomplish their goals in the shortest possible way.
  • It should be designed to work with the user’s existing or desired work practices.
  • Its technology should be adaptable to the user’s requirements, not the other way round.
  • It should make efficient use of user input.

Expert Systems Limitations

No technology can offer an easy and complete solution. Large systems are costly and require significant development time and computing resources. ESs have their limitations, which include −

  • Limitations of the technology
  • Difficult knowledge acquisition
  • ES are difficult to maintain
  • High development costs

Applications of Expert System

The following table shows where ES can be applied.

Application | Description
Design Domain | Camera lens design, automobile design.
Medical Domain | Diagnosis systems to deduce the cause of disease from observed data; conducting medical operations on humans.
Monitoring Systems | Comparing data continuously with the observed system or with prescribed behavior, such as leakage monitoring in a long petroleum pipeline.
Process Control Systems | Controlling a physical process based on monitoring.
Knowledge Domain | Finding faults in vehicles and computers.
Finance/Commerce | Detection of possible fraud and suspicious transactions, stock market trading, airline scheduling, cargo scheduling.

Expert System Technology

There are several levels of ES technologies available. Expert systems technologies include −

  • Expert System Development Environment − The ES development environment includes hardware and tools. They are −
    • Workstations, minicomputers, mainframes.
    • High-level symbolic programming languages such as LISt Processing (LISP) and PROgrammation en LOGique (PROLOG).
    • Large databases.
  • Tools − They reduce the effort and cost involved in developing an expert system to a large extent. They provide −
    • Powerful editors and debugging tools with multi-windows.
    • Rapid prototyping.
    • Inbuilt definitions of model, knowledge representation, and inference design.
  • Shells − A shell is an expert system without a knowledge base. A shell provides the developers with knowledge acquisition, inference engine, user interface, and explanation facility. A few shells are given below −
    • Java Expert System Shell (JESS) that provides fully developed Java API for creating an expert system.
    • Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in 1993. It enables knowledge encoding in the form of IF-THEN rules.

Development of Expert Systems: General Steps

The process of ES development is iterative. Steps in developing the ES include −

Identify Problem Domain

  • The problem must be suitable for an expert system to solve it.
  • Find the experts in the task domain for the ES project.
  • Establish cost-effectiveness of the system.

Design the System

  • Identify the ES Technology
  • Know and establish the degree of integration with the other systems and databases.
  • Realize how the concepts can represent the domain knowledge best.

Develop the Prototype

Form the Knowledge Base: The knowledge engineer works to −

  • Acquire domain knowledge from the expert.
  • Represent it in the form of IF-THEN-ELSE rules.

Test and Refine the Prototype

  • The knowledge engineer uses sample cases to test the prototype for any deficiencies in performance.
  • End users test the prototypes of the ES.

Develop and Complete the ES

  • Test and ensure the interaction of the ES with all elements of its environment, including end users, databases, and other information systems.
  • Document the ES project well.
  • Train the users to use the ES.

Maintain the System

  • Keep the knowledge base up-to-date by regular review and update.
  • Cater for new interfaces with other information systems, as those systems evolve.

Benefits of Expert Systems

  • Availability − They are easily available due to mass production of software.
  • Less Production Cost − Production cost is reasonable. This makes them affordable.
  • Speed − They offer great speed. They reduce the amount of work an individual puts in.
  • Less Error Rate − Error rate is low as compared to human errors.
  • Reducing Risk − They can work in the environment dangerous to humans.
  • Steady response − They work steadily without getting emotional, tense, or fatigued.


What are Robots?

Robots are artificial agents acting in a real-world environment.

Robots are aimed at manipulating objects by perceiving, picking, moving, or destroying them, or by modifying their physical properties, thereby freeing manpower from repetitive functions, which robots can perform without getting bored, distracted, or exhausted.

What is Robotics?

Robotics is a branch of AI that draws on Electrical Engineering, Mechanical Engineering, and Computer Science for the design, construction, and application of robots.

Aspects of Robotics

  • The robots have mechanical construction, form, or shape designed to accomplish a particular task.
  • They have electrical components which power and control the machinery.
  • They contain some level of computer program that determines what, when and how a robot does something.

Difference between Robot Systems and Other AI Programs

Here is the difference between the two −

AI Programs | Robots
They usually operate in computer-simulated worlds. | They operate in the real physical world.
The input to an AI program is in symbols and rules. | The input to robots is an analog signal in the form of a speech waveform or images.
They need general-purpose computers to operate on. | They need special hardware with sensors and effectors.

Robot Locomotion

Locomotion is the mechanism that makes a robot capable of moving in its environment. There are various types of locomotion −

  • Legged
  • Wheeled
  • Combination of Legged and Wheeled Locomotion
  • Tracked slip/skid

Legged Locomotion

  • This type of locomotion consumes more power while demonstrating walk, jump, trot, hop, climb up or down, etc.
  • It requires more motors to accomplish a movement. It is suited to rough as well as smooth terrain, where an irregular or overly smooth surface would make wheeled locomotion consume more power. It is somewhat difficult to implement because of stability issues.
  • It comes in varieties with one, two, four, and six legs. If a robot has multiple legs, then leg coordination is necessary for locomotion.

The total number of possible gaits (a gait is a periodic sequence of lift and release events for each of the legs) in which a robot can travel depends on the number of its legs.

If a robot has k legs, then the number of possible events N = (2k-1)!.

In case of a two-legged robot (k=2), the number of possible events is N = (2k-1)! = (2*2-1)! = 3! = 6.

Hence there are six possible different events −

  • Lifting the Left leg
  • Releasing the Left leg
  • Lifting the Right leg
  • Releasing the Right leg
  • Lifting both the legs together
  • Releasing both the legs together

In case of k=6 legs, there are (2*6-1)! = 11! = 39916800 possible events. Hence the complexity of legged robots grows factorially, far faster than linearly, with the number of legs.
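
The formula N = (2k-1)! is easy to check numerically. The short sketch below evaluates it for the leg counts mentioned above; the function name possible_events is a hypothetical choice.

```python
from math import factorial


def possible_events(k):
    """Number of possible events N = (2k - 1)! for a robot with k legs."""
    if k < 1:
        raise ValueError("a legged robot needs at least one leg")
    return factorial(2 * k - 1)


for legs in (1, 2, 4, 6):
    print(legs, possible_events(legs))
```

Going from two legs to six raises the count from 6 to 39916800, which is why leg coordination becomes the dominant design problem as legs are added.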


Wheeled Locomotion

It requires fewer motors to accomplish a movement. It is relatively easy to implement, as there are fewer stability issues when the robot has more wheels. It is power efficient compared to legged locomotion.

  • Standard wheel − Rotates around the wheel axle and around the contact point.
  • Castor wheel − Rotates around the wheel axle and the offset steering joint.
  • Swedish 45° and Swedish 90° wheels − Omni-wheels, which rotate around the contact point, around the wheel axle, and around the rollers.
  • Ball or spherical wheel − Omnidirectional wheel, technically difficult to implement.


Slip/Skid Locomotion

In this type, the vehicles use tracks as in a tank. The robot is steered by moving the tracks with different speeds in the same or opposite direction. It offers stability because of large contact area of track and ground.
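
The steering principle can be illustrated with the idealized differential-drive model, in which forward speed is the average of the two track speeds and the turn rate is their difference divided by the distance between the tracks. This is a simplification that ignores the track slip real skid-steer vehicles depend on; the function and parameter names are hypothetical.

```python
def track_speeds_to_motion(v_left, v_right, track_separation):
    """Return (forward speed in m/s, turn rate in rad/s) for the given
    track speeds in m/s and track separation in metres.
    Idealized model: no slip between track and ground."""
    forward = (v_left + v_right) / 2.0
    turn_rate = (v_right - v_left) / track_separation  # positive = turn left
    return forward, turn_rate


print(track_speeds_to_motion(1.0, 1.0, 0.5))   # equal speeds: straight line
print(track_speeds_to_motion(0.5, 1.0, 0.5))   # same direction: gentle turn
print(track_speeds_to_motion(-1.0, 1.0, 0.5))  # opposite directions: spin in place
```

Equal speeds give zero turn rate, and equal but opposite speeds give zero forward speed, which corresponds to turning on the spot.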


Components of a Robot

Robots are constructed with the following:

  • Power Supply: The robots are powered by batteries, solar power, hydraulic, or pneumatic power sources.
  • Actuators: They convert energy into movement.
    • Electric motors (AC/DC): They are required for rotational movement.
    • Pneumatic Air Muscles: They contract by almost 40% when air is sucked into them.
    • Muscle Wires: They contract by 5% when electric current is passed through them.
    • Piezo Motors and Ultrasonic Motors: Best for industrial robots.
  • Sensors: They provide real-time information about the task environment. Robots are equipped with vision sensors to be able to compute depth in the environment. A tactile sensor imitates the mechanical properties of the touch receptors of human fingertips.

Computer Vision

This is a technology of AI with which robots can see. Computer vision plays a vital role in the domains of safety, security, health, access, and entertainment. Computer vision automatically extracts, analyzes, and comprehends useful information from a single image or an array of images. This process involves the development of algorithms to accomplish automatic visual comprehension.

Hardware of Computer Vision System

This involves:

  • Power supply
  • Image acquisition device such as camera
  • A processor
  • Software
  • A display device for monitoring the system
  • Accessories such as camera stands, cables, and connectors

Tasks of Computer Vision

  • OCR: Optical Character Reader software, which often accompanies a scanner, converts scanned documents into editable text.
  • Face Detection: Many state-of-the-art cameras come with this feature, which enables them to read the face and take the picture at that perfect expression. It is also used to grant a user access to software on a correct match.
  • Object Recognition: Object recognition systems are installed in supermarkets, cameras, and high-end cars such as BMW, GM, and Volvo.
  • Estimating Position: It is estimating the position of an object with respect to the camera, as in the position of a tumor in a human body.

Application Domains of Computer Vision

  • Agriculture
  • Autonomous vehicles
  • Biometrics
  • Character recognition
  • Forensics, security, and surveillance
  • Industrial quality inspection
  • Face recognition
  • Gesture analysis
  • Geoscience
  • Medical imagery
  • Pollution monitoring
  • Process control
  • Remote sensing
  • Robotics
  • Transport

Applications of Robotics

Robotics has been instrumental in various domains such as −

  • Industries − Robots are used for handling material, cutting, welding, color coating, drilling, polishing, etc.
  • Military − Autonomous robots can reach inaccessible and hazardous zones during war. A robot named Daksh, developed by Defense Research and Development Organization (DRDO), is in service to destroy life-threatening objects safely.
  • Medicine − The robots are capable of carrying out hundreds of clinical tests simultaneously, rehabilitating permanently disabled people, and performing complex surgeries such as brain tumor removal.
  • Exploration − Robot rock climbers used for space exploration and underwater drones used for ocean exploration are a few examples.
  • Entertainment − Disney’s engineers have created hundreds of robots for movie making.