This example uses OpenCV to transform the image and Google's Tesseract to perform OCR (optical character recognition) on it.
import cv2
import pytesseract
from IPython.display import display
from PIL import Image
The image is then read into a NumPy array using OpenCV:
fname = 'age.png'
img = cv2.imread(fname)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Converting BGR to RGB
display(Image.fromarray(img))
The goal is to find out what the villagers are doing, which is displayed in the bottom-right corner. To aid the OCR, the image needs to be cropped and pre-processed. This processing converts the image to greyscale and applies a blur and threshold to remove the background, significantly reducing the noise in the image and making OCR much easier.
crop = img[540:-135, 1260:]
gray = cv2.cvtColor(crop, cv2.COLOR_RGB2GRAY)  # the image is already RGB after the earlier conversion
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
display(Image.fromarray(thresh))
Finally, pytesseract, a Python wrapper for Tesseract, can be used to perform OCR on the image. The OCR has been successful with only one error, mistaking the 'r' for a 'c' in 'Miner'. On inspection of the cropped and processed image, the 'r' does look like a 'c' to the human eye. Optimising the greyscale, blur and threshold parameters may help to produce a cleaner image. Another option would be to crop the image to only the numbers.
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)
Mince: 2
Lumberjack: 1
Hunter: 1
Forager: 1
Shepherds: 2
Sample usage:
types = ['Miner', 'Lumberjack', 'Hunter', 'Forager', 'Shepherds']
numbers = [int(x) for x in data if x.isdigit()]
counts = dict(zip(types, numbers))
print(counts)
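As suggested above, another option is to sidestep the misread letters entirely: since the villager types are already known, Tesseract can be restricted to digits with its tessedit_char_whitelist setting. A minimal sketch (the config string uses a standard Tesseract option; the variable names are illustrative):
# Only recognise digits; the labels are known in advance, so just the counts are needed
digit_config = '--psm 6 -c tessedit_char_whitelist=0123456789'
digits_only = pytesseract.image_to_string(thresh, lang='eng', config=digit_config)
print(digits_only)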
This process can be written into a handy function for future use, as sketched below. PyAutoGUI could be used to automatically take screenshots and feed them into the function.
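A minimal sketch of such a function, reusing the crop coordinates, pre-processing steps and villager types from above (the name count_villagers and the commented-out pyautogui call are illustrative, not part of the original notebook):
def count_villagers(img):
    """Return a dict mapping villager type to count from an RGB screenshot array."""
    crop = img[540:-135, 1260:]  # bottom-right corner of the screen
    gray = cv2.cvtColor(crop, cv2.COLOR_RGB2GRAY)
    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    text = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
    types = ['Miner', 'Lumberjack', 'Hunter', 'Forager', 'Shepherds']
    numbers = [int(x) for x in text if x.isdigit()]
    return dict(zip(types, numbers))

# Example: feed in a live screenshot taken with pyautogui
# (assumes the game is running at the same resolution as the saved image)
# import numpy as np, pyautogui
# print(count_villagers(np.array(pyautogui.screenshot())))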
! jupyter nbconvert --to html ORC.ipynb
[NbConvertApp] Converting notebook ORC.ipynb to html
[NbConvertApp] Writing 3516469 bytes to ORC.html