Table of Contents
- What is Detectron2?
- Project Setup
- Build the Model
- Training and Evaluation
- Results
- Resources
What is Detectron2?
Detectron2 is an open-source object detection and segmentation system from Facebook AI Research (FAIR) that implements state-of-the-art algorithms. It is a ground-up rewrite of its predecessor, Detectron, in PyTorch, and it originates from maskrcnn-benchmark.
You can learn about the benchmark comparisons, different applications, customizations, and the nuts and bolts of how it works from the PyTorch DevCon19 talk.
Detectron2 also ships a large collection of pre-trained baselines; you can access them from the Model Zoo to get started. Use the notebook provided by FAIR to play with Detectron2.
Project Setup
We are going to develop character recognition and segmentation for Telugu characters. To build this project we need data suitable for this machine-learning task; since sufficient data is not publicly available, we need to collect a custom dataset.
Dataset
See the article “How to Prepare a Custom Dataset for Character Recognition and Segmentation?” to create your own custom dataset; it covers the tools and programs for collecting the data, preprocessing it, and building the annotations.
Project Structure
Here is the project structure. The train_data and test_data directories contain the images and annotations used for training and testing the models.
```
│   Detectron2_Telugu_Characters.ipynb
│   label2coco.py
│   resize.py
│   test.json
│   train.json
│   Viewer.ipynb
│
├───results
│       res1.PNG … res7.PNG
│
├───test_data
│       img1.json, img1.png … img20.json, img20.png   (20 image/annotation pairs)
│
├───test_images
│       18 assorted unseen images (.jpg / .png / .jfif) for inference
│
└───train_data
        img1.json, img1.png … img200.json, img200.png   (200 image/annotation pairs)
```
Build the Model
In this section, we build a model that performs Telugu character recognition and segmentation using Detectron2.
Register COCO Dataset
Since we are following the Common Objects in Context (COCO) dataset format, we need to register the train and test data as COCO instances. Here is the code:
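A minimal sketch of the registration step. The dataset names `telugu_train` and `telugu_test` are names I am assuming for illustration; the JSON files and image directories match the project structure shown above.

```python
from detectron2.data.datasets import register_coco_instances

# Register the COCO-format annotation files together with their image folders.
# Arguments: dataset name, extra metadata dict, path to COCO JSON, image root.
register_coco_instances("telugu_train", {}, "train.json", "train_data")
register_coco_instances("telugu_test", {}, "test.json", "test_data")
```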
We need the metadata of the training set and the dataset_dicts, i.e., the internal format of the annotations of the training images.
Let’s see some of the training samples:
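The sampling step can be sketched as follows, assuming the dataset was registered under the name `telugu_train` as above. It fetches the metadata and dataset_dicts and draws the annotations on a few random training images:

```python
import random
import cv2
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.utils.visualizer import Visualizer

telugu_metadata = MetadataCatalog.get("telugu_train")   # class names, colors, etc.
dataset_dicts = DatasetCatalog.get("telugu_train")       # Detectron2's internal annotation format

for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    # Visualizer expects RGB; OpenCV loads BGR, hence the channel flip.
    visualizer = Visualizer(img[:, :, ::-1], metadata=telugu_metadata, scale=0.5)
    vis = visualizer.draw_dataset_dict(d)
    cv2.imwrite(f"sample_{d['image_id']}.png", vis.get_image()[:, :, ::-1])
```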
We use the COCO pre-trained R50-FPN Mask R-CNN model from the Model Zoo. The next step is to configure the model: load the configuration file and the weights, and set a score threshold, in this case 0.5.
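A sketch of that configuration step, using the standard Model Zoo config for the R50-FPN Mask R-CNN (3x schedule); the exact config name used by the author is an assumption:

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
# Load the R50-FPN Mask R-CNN config and its COCO pre-trained weights.
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# Only keep detections scoring above 0.5 at inference time.
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

predictor = DefaultPredictor(cfg)
```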
Training and Evaluation
Setting Parameters
Here we use the previously registered COCO datasets (train and test), 4 data-loader workers, a path to the pre-trained weights, 2 images per batch, a base learning rate of 0.001, a maximum of 200 iterations, and 6 classes.
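The parameters above map onto the Detectron2 config as sketched below; the dataset names are the ones assumed in the registration step:

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

cfg.DATASETS.TRAIN = ("telugu_train",)
cfg.DATASETS.TEST = ("telugu_test",)
cfg.DATALOADER.NUM_WORKERS = 4
cfg.SOLVER.IMS_PER_BATCH = 2       # images per batch
cfg.SOLVER.BASE_LR = 0.001         # base learning rate
cfg.SOLVER.MAX_ITER = 200          # maximum iterations
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```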
Here you can see the details of the training data:
Here values such as total loss, classification loss, and different metrics are depicted as graphs and they are shown using tensorboard.dev. Here is the link:
Evaluation
The COCOEvaluator is used to evaluate the test dataset, and the evaluation results are saved in the ‘output’ directory. The inference_on_dataset function also provides accurate speed benchmarks for the given model and dataset.
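The evaluation step can be sketched like this, assuming the trained `cfg` and `trainer` from the training step are still in scope:

```python
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

# Evaluate on the registered test set; COCO-style metrics land in ./output.
evaluator = COCOEvaluator("telugu_test", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "telugu_test")

# Prints AP metrics and per-image inference speed for the model.
print(inference_on_dataset(trainer.model, val_loader, evaluator))
```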
Here is the output:
Results
Even though the mean Average Precision (AP) is not greater than 0.81, getting nearly accurate predictions from only 200 training images is a good sign and quite promising.
Here are a few unseen samples given to the model, along with the results:
Using this method you can implement your own object recognition and segmentation system using Detectron2.
Resources