
Practical Image Processing with OpenCV

Image processing is essentially the set of operations that lets us extract features from images. It is applied to both still images and videos, and these procedures are frequently used to make training more successful in deep learning structures.

Image Processing

Image processing begins with the computer's recognition of the data. First, a matrix is created for the data in image format, and each pixel value of the image is written into this matrix. For example, a 200×200 matrix is created for a 200×200 picture; if the image is in color, the dimensions become 200×200×3, one layer per color channel. In fact, every manipulation in image processing is a matrix operation. Suppose a blur is desired on the image: a particular filter moves over the entire matrix, changing either all of the matrix elements or only a part of them. As a result, the required part or the whole of the image becomes blurred.
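To make the "image is a matrix" idea concrete, here is a minimal sketch with a hypothetical 4×4 grayscale array (not the road image used below); blurring it is nothing more than sliding a small averaging filter over the matrix.

import cv2
import numpy as np

# A tiny hypothetical 4x4 grayscale "image": every pixel is simply a matrix entry.
tiny = np.array([[ 10,  10, 200, 200],
                 [ 10,  10, 200, 200],
                 [ 10,  10, 200, 200],
                 [ 10,  10, 200, 200]], dtype=np.uint8)

# Blurring is a matrix operation: a 3x3 averaging filter slides over the matrix.
blurred = cv2.blur(tiny, (3, 3))
print(blurred)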

The processing of images is needed in many cases [1]. Generally, these operations are applied to image data that will be used in deep learning models. For example, in some projects the color of the data does not matter, and using color images for training would only cause performance losses. One of the most widely used deep learning structures for image processing is the Convolutional Neural Network, which extracts the attributes required for training with convolutional layers applied to the image. At this point, only certain parts of the images used for training may need to be processed, and emphasizing rounded lines rather than sharp ones can sometimes improve the success of the training.

In such cases, image processing techniques are used.

The image optimization programs used in daily life are based on the same logic, in addition to the situations described above. Image processing covers many operations, such as improving image quality, restoring images, removing noise, and histogram equalization.

OpenCV

OpenCV is one of the most popular libraries used for image processing [2]. Many companies use OpenCV, such as Microsoft, Intel, Google, and Yahoo. OpenCV supports a wide variety of programming languages, including Java, C++, Python, and MATLAB. All of the samples in this work are coded in Python.

import cv2
from matplotlib import pyplot as plt
import numpy as np

First, the libraries are imported. Some functions in OpenCV do not work stably in every version; one of them is imshow, which lets us display the result of our operations on the image. For readers who have such problems, the matplotlib library will be used as an alternative in this work.
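For readers whose cv2.imshow does work, the native call is shown in the comment at the end of the sketch below; for everyone else, a small helper like this hypothetical show function (an assumption, not part of the original code) avoids two common pitfalls: OpenCV stores color images in BGR channel order while matplotlib expects RGB, and single-channel images need a gray colormap.

# A minimal display helper (sketch).
def show(image, title=""):
    if image.ndim == 3:
        # Color image: OpenCV keeps channels as BGR, matplotlib expects RGB.
        plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    else:
        # Single-channel (grayscale) image.
        plt.imshow(image, cmap="gray")
    plt.title(title)
    plt.axis("off")
    plt.show()

# Native OpenCV alternative, where imshow works reliably:
# cv2.imshow("window", image); cv2.waitKey(0); cv2.destroyAllWindows()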

Figure 1. Standard Image

The processes to be performed will be applied on the image shown above (Figure 1). The image is read initially so that it can be processed.

img_path = "/Users/..../opencv/road.jpeg"
img = cv2.imread(img_path)
print(img.shape)
# >>> (960, 1280, 3)

The image in Figure 1 is 960 × 1280 pixels, and printing its shape after reading gives 960 × 1280 × 3. In other words, a matrix matching the dimensions of the image was created, and each pixel value of the image was assigned to it. The third dimension holds the three color channels because the image is in color (note that OpenCV stores the channels in BGR order rather than RGB).

If we want to convert the image to grayscale (black and white), the cvtColor function is used.

gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

If we want to see the change that occurs as a result of this function, we use the imshow function from matplotlib.

gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.imshow(gray_image)
plt.show()
print(gray_image.shape)
# >>> (960, 1280)
Figure 2. Black-White Image

As shown in Figure 2, the image has been converted to black and white. When we check its dimensions, the third dimension for the color channels is gone.

Looking at the matrix values of the image, we see that they range between 0 and 255. In some cases, we may want this matrix to consist of only the values 0 and 255 [3]. The threshold function is used in such cases.

# Each call uses a different threshold value; plotting after each call produces
# Images 1-4 of Figure 3 (as written, only the last result is kept).
(thresh, blackAndWhiteImage) = cv2.threshold(gray_image, 20, 255, cv2.THRESH_BINARY)   # Image 1
(thresh, blackAndWhiteImage) = cv2.threshold(gray_image, 80, 255, cv2.THRESH_BINARY)   # Image 2
(thresh, blackAndWhiteImage) = cv2.threshold(gray_image, 160, 255, cv2.THRESH_BINARY)  # Image 3
(thresh, blackAndWhiteImage) = cv2.threshold(gray_image, 200, 255, cv2.THRESH_BINARY)  # Image 4
plt.imshow(blackAndWhiteImage)
plt.show()
Figure 3. Image with Threshold Function Applied

The first parameter required by the threshold function in OpenCV is the image to be processed. The second parameter is the threshold value, and the third is the value assigned to matrix elements that exceed the threshold. The effects of four different threshold values can be seen in Figure 3. In the first image (Image 1), the threshold was set to 20: all values above 20 were assigned 255 and the remaining values were set to 0, so only black and very dark pixels stayed black while every other shade became pure white. The thresholds for Image 2 and Image 3 were 80 and 160. Finally, the threshold was set to 200 in Image 4: only white and very light pixels were assigned 255, while all remaining values were set to 0. The threshold value must be chosen specifically for each image and each case.
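As a side note beyond the original text: when picking the value by hand is impractical, OpenCV can estimate it automatically with Otsu's method by passing 0 as the threshold and adding the THRESH_OTSU flag. A minimal sketch:

# Otsu's method computes a threshold from the image histogram and returns it.
(otsu_thresh, otsu_image) = cv2.threshold(gray_image, 0, 255,
                                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(otsu_thresh)   # the automatically selected threshold value
plt.imshow(otsu_image)
plt.show()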

Another method used in image processing is blurring, which can be accomplished with more than one function.

output2 = cv2.blur(gray_image, (10, 10))
plt.imshow(output2)
plt.show()
Figure 4. Blurred Image with blur Function
output2 = cv2.GaussianBlur(gray_image, (9, 9), 5)
plt.imshow(output2)
plt.show()
Figure 5. Blurred Image with GaussianBlur Function

As seen in Figures 4 and 5, the black and white image is blurred with the specified filters and blur strengths. This process is usually used to remove noise from images. In some cases, training is also adversely affected by sharp lines in the images, and blurring is used for that reason as well.

In some cases, the data may need to be rotated for augmentation, or the images to be used as data may be skewed. The following functions can be used in these cases.

(h, w) = img.shape[:2]
center = (w / 2, h / 2)
M = cv2.getRotationMatrix2D(center, 13, scale=1.1)
rotated = cv2.warpAffine(gray_image, M, (w, h))
plt.imshow(rotated)
plt.show()
Figure 6. Rotated Image with getRotationMatrix2D Function

First of all, the center of the image is determined, and the rotation is performed around this center. The first parameter of the getRotationMatrix2D function is the calculated center, the second is the rotation angle, and the third is the scaling factor applied after the rotation. If this value is set to 1, the image is only rotated by the given angle without any scaling.

Sample 1

The methods mentioned above are often used together in projects. Let's build a sample project for a better understanding of these structures and processes.

Let's say we want to train an autonomous driving pilot for vehicles [4]. When the image in Figure 1 is examined for this problem, our autonomous pilot should be able to recognize the road and the lanes, and we can use OpenCV for this. Since color does not matter here, the image is converted to black and white, and the matrix elements are set to 0 or 255 according to the chosen threshold value. As mentioned in the explanation of the threshold function, selecting the threshold is critical; it is set to 200 for this problem. We can discard other details, since it is enough to focus on the roadsides and lanes. To get rid of noise, blurring is performed with the GaussianBlur function. The steps up to here can be examined in detail in Figures 1 to 5.

After these processes, Canny edge detection is applied.

img = cv2.imread(img_path)
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(thresh, output2) = cv2.threshold(gray_image, 200, 255, cv2.THRESH_BINARY)
output2 = cv2.GaussianBlur(output2, (5, 5), 3)
output2 = cv2.Canny(output2, 180, 255)
plt.imshow(output2)
plt.show()
Figure 7. Image of Canny Function Result

The first parameter the Canny function takes is the image to which the operation will be applied. The second parameter is the low threshold value and the third is the high threshold value. The image is scanned pixel by pixel using its intensity gradient: pixels whose gradient exceeds the high threshold are accepted as edges, pixels below the low threshold are discarded, and pixels in between are kept only if they are connected to a pixel above the high threshold. For this reason, the threshold parameters must be determined for each image and each problem. In order to better observe the effect of GaussianBlur, let's perform the same operations without blurring this time.

img = cv2.imread(img_path)
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(thresh, output2) = cv2.threshold(gray_image, 200, 255, cv2.THRESH_BINARY)
output2 = cv2.Canny(output2, 180, 255)
plt.imshow(output2)
plt.show()
Figure 8. Non-Blur Image

When the GaussianBlur function is not applied, the noise is clearly visible in Figure 8. This noise may not be a problem for our project, but it can have a great impact on training success in other projects and situations. After this stage, lines are drawn on the real (original) image based on the detected edges, using the HoughLinesP and line functions.

lines = cv2.HoughLinesP(output2, 1, np.pi/180,30)
for line in lines:
    x1,y1,x2,y2 = line[0]
    cv2.line(img,(x1,y1),(x2,y2),(0,255,0),4)
plt.imshow(img)
Figure 9. Image with HoughLinesP Function applied

As seen in Figure 9, the road boundaries and lanes were captured nicely. However, when Figure 9 is examined carefully, some problems become noticeable: although there was no problem in detecting the lanes and road boundaries, the clouds were also perceived as road boundaries. The masking method should be used to prevent such problems [5].

def mask_of_image(image):
    height = image.shape[0]
    # Triangular region of interest; these coordinates are specific to this image.
    polygons = np.array([[(0,height),(2200,height),(250,100)]])
    mask = np.zeros_like(image)            # all-black mask
    cv2.fillPoly(mask,polygons,255)        # fill the region of interest with white
    masked_image = cv2.bitwise_and(image,mask)   # keep only pixels inside the region
    return masked_image

We can perform the masking with the mask_of_image function. First of all, the area to be kept is defined as a polygon; the coordinate values are completely data-specific.

Figure 10. Determined area for masking

The mask (Figure 10) is applied to the real picture. The regions corresponding to the black area of the mask are not processed, while all of the operations described above are applied to the regions corresponding to the white area.

Figure 11. Masking Applied Image

As shown in Figure 11, the masking process has solved the problem we saw with the clouds.
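For completeness, here is a sketch of how the mask might be wired into the pipeline. The exact placement is an assumption: the mask is applied to the Canny output before the Hough transform, which is a common arrangement.

# Lane pipeline with the mask applied to the edge image before line detection.
img = cv2.imread(img_path)
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(thresh, output2) = cv2.threshold(gray_image, 200, 255, cv2.THRESH_BINARY)
output2 = cv2.GaussianBlur(output2, (5, 5), 3)
output2 = cv2.Canny(output2, 180, 255)
masked = mask_of_image(output2)                  # keep only the road region (Figure 10)
lines = cv2.HoughLinesP(masked, 1, np.pi / 180, 30)
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 4)
plt.imshow(img)
plt.show()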

Sample 2

We solved the lane recognition problem with HoughLinesP. Now let's assume the same kind of problem involves circular shapes [6].

Let's create an image processing pipeline that recognizes the coins in Figure 12. The methods used in the lane recognition project will be used here as well.

img = cv2.imread("/Users/.../coin.png")
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(thresh, output2) = cv2.threshold(gray_image, 120, 255, cv2.THRESH_BINARY)
output2 = cv2.GaussianBlur(output2, (5, 5), 1)
output2 = cv2.Canny(output2, 180, 255)
plt.imshow(output2, cmap=plt.get_cmap("gray"))
circles = cv2.HoughCircles(output2, cv2.HOUGH_GRADIENT, 1, 10,
                           param1=180, param2=27, minRadius=20, maxRadius=60)
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
    # draw the outer circle
    cv2.circle(img,(i[0],i[1]),i[2],(0,255,0),2)
    # draw the center of the circle
    cv2.circle(img,(i[0],i[1]),2,(0,0,255),3)
    
plt.imshow(img)
Figure 13. Final Coins Image

The result of this image processing can be seen in Figure 13.

The image is converted to black and white, then the threshold function is applied, followed by GaussianBlur and Canny edge detection. Finally, the circles are drawn with the HoughCircles function.

Image processing is also applied to text in image format.

Figure 14. Text in Image Format

Let's say we want to train our system with the text seen in Figure 14. We want all words, or some specific words, to be identified by our model as a result of the training, and we may need to teach the system the position information of the words. OpenCV is also used in such problems. First of all, the image in Figure 14 is converted into text using an optical character recognition engine called Tesseract [7].

import pytesseract
from pytesseract import Output

# img here is the text image shown in Figure 14.
data = pytesseract.image_to_data(img, output_type=Output.DICT, config="--psm 6")
n_boxes = len(data['text'])
for i in range(n_boxes):
    (x, y, w, h) = (data['left'][i], data['top'][i], data['width'][i], data['height'][i])
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
plt.imshow(img)
plt.show()
Figure 15. Processing of Word Position Information

The image shown in Figure 15 is obtained by combining the information from Tesseract with OpenCV: each word and each block of words is enclosed in a rectangle. It is also possible to manipulate only certain words in the frame by filtering the information returned by Tesseract. In addition, image processing can be applied to clean noise from the text. However, applying the GaussianBlur function used in the other examples to text adversely affects its quality and legibility, so the medianBlur function will be used instead.

img = cv2.imread(img_path)   # here img_path should point to the text image in Figure 14
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
output2 = cv2.medianBlur(gray_image, ksize=5)
plt.imshow(output2)
plt.show()
Figure 16. medianBlur Function Applied Image

When the image in Figure 14 is examined, dashed lines are clearly visible below some words, which can cause optical character recognition engines to misread them. As a result of the medianBlur operation shown in Figure 16, these dashed lines are gone.

Note: The dimensions of the matrices of black-and-white images must be checked. Often there are still three channels even if the image looks black and white, and this may cause dimension errors in some OpenCV functions.
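A minimal sketch of such a check, reusing the variable names from the snippets above:

# Collapse a 3-channel image that only looks black and white into a single channel.
print(img.shape)                 # e.g. (960, 1280, 3)
if img.ndim == 3 and img.shape[2] == 3:
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img.shape)                 # e.g. (960, 1280)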

The erode and dilate functions can also be used to remove noise from text in image format.

kernel = np.ones((3,3),np.uint8)
output2 = cv2.dilate(gray_image,kernel,iterations = 3)
plt.imshow(output2)
plt.show()
Figure 17. The Image Resulting from the dilate Function

Looking at the text in Figure 14, some point-shaped noise is visible. Figure 17 shows that this noise is largely eliminated by using the dilate function. The degree of thinning of the text can be changed by adjusting the kernel and the iterations parameter; these values must be chosen carefully to preserve the readability of the text. The erode function, in contrast to dilate, thickens the text.

kernel = np.ones((3,3),np.uint8)
output2 = cv2.erode(gray_image,kernel,iterations = 3)
plt.imshow(output2)
plt.show()
Figure 18. The Image Resulting from the Erode Function

As seen in Figure 18, the font thickness was increased with the erode function. This is a method generally used to improve the quality of text written in thin fonts. Another point to note is that here the text is black and the background is white; if the background were black and the text white, the effects of these two functions would be swapped.
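If the polarity were reversed (white text on a black background), one simple option, not covered in the original text, is to invert the image first with bitwise_not so that erode and dilate behave as described above. A sketch with a hypothetical white-on-black input (the file path is illustrative only):

text_on_black = cv2.imread("/Users/.../white_text_on_black.png", cv2.IMREAD_GRAYSCALE)
inverted = cv2.bitwise_not(text_on_black)             # now dark text on a light background
kernel = np.ones((3, 3), np.uint8)
thicker = cv2.erode(inverted, kernel, iterations=3)   # erode thickens the dark strokes
plt.imshow(thicker, cmap="gray")
plt.show()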

OpenCV is also used to increase the quality of some images. For instance, the histogram values of images with poor contrast are concentrated in a narrow range.

To improve the contrast of such an image, its histogram values need to be spread over a wider range. The equalizeHist function is used for this. Let's perform histogram equalization on the image in Figure 19.

Figure 19. Original Image (Histogram Not Yet Equalized)
Figure 20. Histogram Distribution of Original Image

The histogram of the original image (Figure 19) can be seen in Figure 20.

The visibility of the objects in the image is low.

equ = cv2.equalizeHist(gray_image)
plt.imshow(equ)
Figure 21. Histogram Equalized Image
Figure 22. Histogram Distribution of Histogram Equalized Image

The image whose histogram was equalized with the equalizeHist function can be seen in Figure 21; its quality and clarity have increased. The histogram of the equalized image is shown in Figure 22: the values that were concentrated in one region in Figure 20 are spread over a much wider range after equalization. These histogram values can be checked for each image, and the image quality can be increased by applying histogram equalization when necessary.
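The histogram plots in Figures 20 and 22 can be reproduced with a short sketch like the following; the plotting code itself is an assumption, not part of the original article.

# Compare the gray-level histograms before and after equalization.
hist_before = cv2.calcHist([gray_image], [0], None, [256], [0, 256])
hist_after = cv2.calcHist([equ], [0], None, [256], [0, 256])
plt.plot(hist_before, label="original")
plt.plot(hist_after, label="equalized")
plt.xlabel("pixel intensity")
plt.ylabel("pixel count")
plt.legend()
plt.show()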

References

[1] P. Erbao, Z. Guotong, "Image Processing Technology Research of On-Line Thread Processing", 2012 International Conference on Future Electrical Power and Energy System, April 2012.

[2] H. Singh, Practical Machine Learning and Image Processing, pp. 63–88, January 2019.

[3] R. H. Moss, S. E. Watkins, T. Jones, D. Apel, "Image thresholding in the high resolution target movement monitor", Proceedings of SPIE — The International Society for Optical Engineering, March 2009.

[4] Y. Xu, L. Zhang, "Research on Lane Detection Technology Based on OPENCV", 2015 3rd International Conference on Mechanical Engineering and Intelligent Systems, January 2015.

[5] F. J. M. Lizan, F. Llorens, M. Pujol, R. R. Aldeguer, C. Villagrá, "Working with OpenCV and Intel Image Processing Libraries. Processing image data tools", Informática Industrial e Inteligencia Artificial, July 2002.

[6] Q. R. Zhang, P. Peng, Y. M. Jin, "Cherry Picking Robot Vision Recognition System Based on OpenCV", MATEC Web of Conferences, January 2016.

[7] R. Smith, "An Overview of the Tesseract OCR Engine", Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Volume 2, October 2007.

[8] https://www.mathworks.com/help/examples/images/win64/GetAxesContainingImageExample_01.png

