My test images are 1200 x 1696, with up to 500 text instances per image (roughly comparable to one page of a book).
I trained PGNet; the training epochs run fine, but the eval is really slow.
I have investigated the code, and the issue comes from the get_socre_A function in E2EMetric. When e2e_info_list and gt_info_list contain more than about 100 entries, the process takes a very long time to complete (about 13 s per page with 120 annotated text blocks, and 31 s for 199 text blocks).
This function runs on the CPU, doesn't it?
Do you have any solution for this? Thank you.
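
From what I can tell, the time goes into building pairwise sigma/tau overlap matrices between every ground-truth polygon and every detected polygon, so the work grows roughly quadratically with the number of text blocks. A minimal sketch of that kind of computation (shapely-based, hypothetical names, not the actual Deteval code):

import numpy as np
from shapely.geometry import Polygon

def pairwise_overlap(gt_polys, det_polys):
    # sigma: intersection / gt area, tau: intersection / det area,
    # one entry per (gt, det) pair -> N_gt * N_det polygon intersections
    sigma = np.zeros((len(gt_polys), len(det_polys)))
    tau = np.zeros((len(gt_polys), len(det_polys)))
    for i, gt_pts in enumerate(gt_polys):
        g = Polygon(gt_pts)
        for j, det_pts in enumerate(det_polys):
            d = Polygon(det_pts)
            inter = g.intersection(d).area
            sigma[i, j] = inter / g.area if g.area > 0 else 0.0
            tau[i, j] = inter / d.area if d.area > 0 else 0.0
    return sigma, tau

With ~500 boxes per page that is on the order of 250,000 polygon intersections per image, all single-threaded on the CPU, which would be consistent with the 13 s vs 31 s growth above.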
2 answers
8iwquhpp1#
My config file:
Global:
  use_gpu: True
  epoch_num: 600
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/pgnet_r50_vd_totaltext/
  save_epoch_step: 10
  eval_batch_step: [ 0, 1000 ]
  cal_metric_during_train: False
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img:
  valid_set: partvgg # two modes: totaltext valid curved words, partvgg valid non-curved words
  save_res_path: ./output/pgnet_r50_vd_totaltext/predicts_pgnet.txt
  character_dict_path: ppocr/utils/ko_2463.txt
  character_type: korean
  max_text_length: 25 # the max length in seq
  max_text_nums: 1000 # the max seq nums in a pic
  tcl_len: 64
Architecture:
  model_type: e2e
  algorithm: PGNet
  Transform:
  Backbone:
    name: ResNet
    layers: 50
  Neck:
    name: PGFPN
  Head:
    name: PGHead
    out_channels: 2464 # Loss.pad_num + 1
Loss:
  name: PGLoss
  tcl_bs: 64
  max_text_length: 25 # the same as Global: max_text_length
  max_text_nums: 1000 # the same as Global: max_text_nums
  pad_num: 2463 # the length of dict for pad
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001
  regularizer:
    name: 'L2'
    factor: 0.
PostProcess:
  name: PGPostProcess
  score_thresh: 0.5
  mode: fast # fast or slow, two ways
Metric:
  name: E2EMetric
  mode: A # two ways for eval, A: label from txt, B: label from gt_mat
  gt_mat_dir: ./Synthetic_ko_total_text/gt # the dir of gt_mat
  character_dict_path: ppocr/utils/ko_2463.txt
  main_indicator: f_score_e2e
Train:
  dataset:
    name: PGDataSet
    data_dir: /home/gridone/TextRecognitionDataGenerator/out/document_v6/train
    label_file_list: [/home/gridone/TextRecognitionDataGenerator/out/document_v6/train/train.txt]
    ratio_list: [1.0]
    transforms:
      img_mode: BGR
      channel_first: False
      augmenter_args:
      batch_size: 4 # same as loader: batch_size_per_card
      min_crop_size: 24
      min_text_size: 4
      max_text_size: 512
      keep_keys: [ 'images', 'tcl_maps', 'tcl_label_maps', 'border_maps', 'direction_maps', 'training_masks', 'label_list', 'pos_list', 'pos_mask' ] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: True
    batch_size_per_card: 4
    num_workers: 16
Eval:
  dataset:
    name: PGDataSet
    data_dir: /home/gridone/TextRecognitionDataGenerator/out/document_v6/test
    label_file_list: [/home/gridone/TextRecognitionDataGenerator/out/document_v6/test/test.txt]
    transforms:
      img_mode: BGR
      channel_first: False
      max_side_len: 768
      scale: 1./255.
      mean: [ 0.485, 0.456, 0.406 ]
      std: [ 0.229, 0.224, 0.225 ]
      order: 'hwc'
      keep_keys: [ 'image', 'shape', 'polys', 'texts', 'ignore_tags', 'img_id' ]
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 16
OS: Ubuntu 18.04
GPU: NVIDIA Titan X
xxe27gdn2#
I think I solved it by using joblib.Parallel in the sigma and tau calculation steps :)
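
Roughly, the idea is to split the sigma/tau computation row-wise (one row per ground-truth polygon) and let joblib run the rows in separate processes. A hedged sketch of the approach, with illustrative helper names rather than the actual Deteval functions:

import numpy as np
from joblib import Parallel, delayed
from shapely.geometry import Polygon

def row_overlap(gt_pts, det_polys):
    # Sigma/tau entries for one ground-truth polygon against all detections.
    g = Polygon(gt_pts)
    sigma_row, tau_row = [], []
    for det_pts in det_polys:
        d = Polygon(det_pts)
        inter = g.intersection(d).area
        sigma_row.append(inter / g.area if g.area > 0 else 0.0)
        tau_row.append(inter / d.area if d.area > 0 else 0.0)
    return sigma_row, tau_row

def pairwise_overlap_parallel(gt_polys, det_polys, n_jobs=8):
    # One joblib task per ground-truth polygon; the rows are independent,
    # so they can be computed on separate CPU cores.
    rows = Parallel(n_jobs=n_jobs)(
        delayed(row_overlap)(gt_pts, det_polys) for gt_pts in gt_polys
    )
    sigma = np.array([r[0] for r in rows])
    tau = np.array([r[1] for r in rows])
    return sigma, tau

Since the rows are independent, the speedup should scale roughly with the number of CPU cores, minus process start-up and pickling overhead.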