Skip to content
This repository was archived by the owner on Feb 29, 2024. It is now read-only.

Refactor LabelFileFormat object and create abstract class for Reader and Writer objects #973

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,12 +1,15 @@
resources/icons/.DS_Store
resources.py
labelImg.egg-info*
zachary_notes.txt

*.pyc
.*.swp

build/
dist/
uml/
venv/

tags
cscope*
Expand Down
113 changes: 113 additions & 0 deletions dev_log.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
Todo:
refactor labelImg repository, mainly focus on io operations
refactoring classes: LabelFile, LabelFileFormat, LabelIO

2022/10/30:
In labelImg.py:
add comments to note some code smells for refactory

In labelImg.py, def save_labels():
code smell:
the function originally uses multiple if else statements to check the label format,
and then call the corresponding method in LabelFile object to write the file

solving:
making a IOMAP(dictionary) in LabelFile.py
whenever a label needs to be save, map the label format to the corresponding reader object
so we have to allign the init method of all writers
also the save() method in each writers
for future extentions, writers must follow the format

result:
unit test passed

futher work:
add interface for writer that restrict the save() method and init() method
(done)link labelformat with labelfile to remove dictionary mapping

2022/10/31:
In labelImg.py, def set_format():
code smell:
using if else statements to set each format

solving:
making LabelFileFomat an abstract class, contain corresponding writer/reader(io), icon, text
then implement the LabelFileFomat object with current available formats

whenever the labelformat is specified, then the corresponding icon, text, io will be noticed,
because the abstract class force the derived class to have the corresponding attributes,
the only if-else statement remain is the toggle button

result:
unit test passed

futher work:
solve if-else in def change_format(), probably using a list,
or making a patch to search all derived class of LabelFileFormat

2022/11/6:
In labelFile.py
code smell:
Extracting abstract class for labelFileFormat does not make sence.
Ideally, abstract class provide a set of abstract method
and concrete class must implement them.
But for the fileformat design, we only need every format to own corresponding
io as attribute. not method

solving:
thus, I redesign the labelFileFormat class as a concrete class
Each format is a instance of this class, the init method force each instance to have
the required attributes assigned

result:
unit test passed

futher work:
explore some design pattern to apply on format/io interfaces

2022/11/26:
Debugging for application:
though we passed the unit test, I found that the application will crash if some functions are used
1. the toggle label file format button
2. save button
3. openfile button

the toggle label file format button:
this error occur when we compare two LabelFileFormat objects, in python the compairson is based on pointer address
however, here we want to compare only the attributes for the identity.
Thus, a __eq__ method is written into labelFileFormat class, then the issue is solved

save button:
for each save operator, we initialize a new writer based on the format
it is important to keep each writer consistent of input parameters
currently, we cannot fuse writer into label file format operation, this will be left to foward work

openfile button:
this error occur when detecting if the file is available for import
we are checking validity by checking image format and label suffix
the suffix of all labels are not saved, we can only see one suffix at a time
to solve this issue, we add suffixes attribute in labelFileFormat class
thus we can now see all the suffixes of the objects

futher work:
fuse writer into labelFileFormat object

2022/12/09:
Today's work:
1. solving if-else statements in load_file_by_filename()
2. redefining clear parameter for writer objects
3. E2E debugging for labelfile io operation

Code smell:
long if-else statements for loading label file

Solve:
search for matching suffix in LabelFileFormat object
then use the method of LabelFileFormat.read to load any file

futher work: abstract interface for writer and reader objects

2022/12/24:
Today's work:
build abstract class for file reader and file writer
for future extension, any format should inherit the writer and reader class for implementation
161 changes: 36 additions & 125 deletions labelImg.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,8 @@
from libs.lightWidget import LightWidget
from libs.labelDialog import LabelDialog
from libs.colorDialog import ColorDialog
from libs.labelFile import LabelFile, LabelFileError, LabelFileFormat
from libs.labelFile import LabelFile, LabelFileError
from libs.labelFileFormat import LabelFileFormat, PascalVoc, Yolo, CreateML
from libs.toolBar import ToolBar
from libs.pascal_voc_io import PascalVocReader
from libs.pascal_voc_io import XML_EXT
Expand Down Expand Up @@ -90,7 +91,7 @@ def __init__(self, default_filename=None, default_prefdef_class_file=None, defau

# Save as Pascal voc xml
self.default_save_dir = default_save_dir
self.label_file_format = settings.get(SETTING_LABEL_FILE_FORMAT, LabelFileFormat.PASCAL_VOC)
self.label_file_format = settings.get(SETTING_LABEL_FILE_FORMAT, PascalVoc)

# For loading all image under a directory
self.m_img_list = []
Expand Down Expand Up @@ -242,20 +243,9 @@ def __init__(self, default_filename=None, default_prefdef_class_file=None, defau
save = action(get_str('save'), self.save_file,
'Ctrl+S', 'save', get_str('saveDetail'), enabled=False)

def get_format_meta(format):
"""
returns a tuple containing (title, icon_name) of the selected format
"""
if format == LabelFileFormat.PASCAL_VOC:
return '&PascalVOC', 'format_voc'
elif format == LabelFileFormat.YOLO:
return '&YOLO', 'format_yolo'
elif format == LabelFileFormat.CREATE_ML:
return '&CreateML', 'format_createml'

save_format = action(get_format_meta(self.label_file_format)[0],
save_format = action(self.label_file_format.text,
self.change_format, 'Ctrl+Y',
get_format_meta(self.label_file_format)[1],
self.label_file_format.icon,
get_str('changeSaveFormat'), enabled=True)

save_as = action(get_str('saveAs'), self.save_file_as,
Expand Down Expand Up @@ -548,35 +538,20 @@ def keyPressEvent(self, event):
# Draw rectangle if Ctrl is pressed
self.canvas.set_drawing_shape_to_square(True)

# Support Functions #
def set_format(self, save_format):
if save_format == FORMAT_PASCALVOC:
self.actions.save_format.setText(FORMAT_PASCALVOC)
self.actions.save_format.setIcon(new_icon("format_voc"))
self.label_file_format = LabelFileFormat.PASCAL_VOC
LabelFile.suffix = XML_EXT

elif save_format == FORMAT_YOLO:
self.actions.save_format.setText(FORMAT_YOLO)
self.actions.save_format.setIcon(new_icon("format_yolo"))
self.label_file_format = LabelFileFormat.YOLO
LabelFile.suffix = TXT_EXT

elif save_format == FORMAT_CREATEML:
self.actions.save_format.setText(FORMAT_CREATEML)
self.actions.save_format.setIcon(new_icon("format_createml"))
self.label_file_format = LabelFileFormat.CREATE_ML
LabelFile.suffix = JSON_EXT

self.actions.save_format.setText(save_format.text)
self.actions.save_format.setIcon(new_icon(save_format.icon))
self.label_file_format = save_format

def change_format(self):
if self.label_file_format == LabelFileFormat.PASCAL_VOC:
self.set_format(FORMAT_YOLO)
elif self.label_file_format == LabelFileFormat.YOLO:
self.set_format(FORMAT_CREATEML)
elif self.label_file_format == LabelFileFormat.CREATE_ML:
self.set_format(FORMAT_PASCALVOC)
if self.label_file_format in LabelFileFormat.formats:
index = self.label_file_format.formats.index(self.label_file_format)
self.label_file_format = LabelFileFormat.formats[index+1 if index+1 < len(self.label_file_format.formats) else 0]
self.set_format(self.label_file_format)
else:
raise ValueError('Unknown label file format.')
#! todo: error when only label file is change then save
#! this should not be dirty if no image is imported/ no label is plotted
self.set_dirty()

def no_shapes(self):
Expand Down Expand Up @@ -881,6 +856,7 @@ def save_labels(self, annotation_file_path):
if self.label_file is None:
self.label_file = LabelFile()
self.label_file.verified = self.canvas.verified
self.label_file.label_file_format = self.label_file_format

def format_shape(s):
return dict(label=s.label,
Expand All @@ -893,24 +869,8 @@ def format_shape(s):
shapes = [format_shape(shape) for shape in self.canvas.shapes]
# Can add different annotation formats here
try:
if self.label_file_format == LabelFileFormat.PASCAL_VOC:
if annotation_file_path[-4:].lower() != ".xml":
annotation_file_path += XML_EXT
self.label_file.save_pascal_voc_format(annotation_file_path, shapes, self.file_path, self.image_data,
self.line_color.getRgb(), self.fill_color.getRgb())
elif self.label_file_format == LabelFileFormat.YOLO:
if annotation_file_path[-4:].lower() != ".txt":
annotation_file_path += TXT_EXT
self.label_file.save_yolo_format(annotation_file_path, shapes, self.file_path, self.image_data, self.label_hist,
self.line_color.getRgb(), self.fill_color.getRgb())
elif self.label_file_format == LabelFileFormat.CREATE_ML:
if annotation_file_path[-5:].lower() != ".json":
annotation_file_path += JSON_EXT
self.label_file.save_create_ml_format(annotation_file_path, shapes, self.file_path, self.image_data,
self.label_hist, self.line_color.getRgb(), self.fill_color.getRgb())
else:
self.label_file.save(annotation_file_path, shapes, self.file_path, self.image_data,
self.line_color.getRgb(), self.fill_color.getRgb())
self.label_file.save(annotation_file_path, shapes, self.file_path, self.image_data,
self.label_hist)
print('Image:{0} -> Annotation:{1}'.format(self.file_path, annotation_file_path))
return True
except LabelFileError as e:
Expand Down Expand Up @@ -1180,32 +1140,11 @@ def counter_str(self):
def show_bounding_box_from_annotation_file(self, file_path):
if self.default_save_dir is not None:
basename = os.path.basename(os.path.splitext(file_path)[0])
xml_path = os.path.join(self.default_save_dir, basename + XML_EXT)
txt_path = os.path.join(self.default_save_dir, basename + TXT_EXT)
json_path = os.path.join(self.default_save_dir, basename + JSON_EXT)

"""Annotation file priority:
PascalXML > YOLO
"""
if os.path.isfile(xml_path):
self.load_pascal_xml_by_filename(xml_path)
elif os.path.isfile(txt_path):
self.load_yolo_txt_by_filename(txt_path)
elif os.path.isfile(json_path):
self.load_create_ml_json_by_filename(json_path, file_path)

else:
xml_path = os.path.splitext(file_path)[0] + XML_EXT
txt_path = os.path.splitext(file_path)[0] + TXT_EXT
json_path = os.path.splitext(file_path)[0] + JSON_EXT

if os.path.isfile(xml_path):
self.load_pascal_xml_by_filename(xml_path)
elif os.path.isfile(txt_path):
self.load_yolo_txt_by_filename(txt_path)
elif os.path.isfile(json_path):
self.load_create_ml_json_by_filename(json_path, file_path)

for suffix in LabelFileFormat.suffixes:
label_path = os.path.join(self.default_save_dir, basename + suffix)
if os.path.isfile(label_path):
self.load_label_by_filename(label_path)
break

def resizeEvent(self, event):
if self.canvas and not self.image.isNull()\
Expand Down Expand Up @@ -1321,15 +1260,15 @@ def open_annotation_dialog(self, _value=False):

path = os.path.dirname(ustr(self.file_path))\
if self.file_path else '.'
if self.label_file_format == LabelFileFormat.PASCAL_VOC:
if self.label_file_format == PascalVoc:
filters = "Open Annotation XML file (%s)" % ' '.join(['*.xml'])
filename = ustr(QFileDialog.getOpenFileName(self, '%s - Choose a xml file' % __appname__, path, filters))
if filename:
if isinstance(filename, (tuple, list)):
filename = filename[0]
self.load_pascal_xml_by_filename(filename)

elif self.label_file_format == LabelFileFormat.CREATE_ML:
elif self.label_file_format == CreateML:

filters = "Open Annotation JSON file (%s)" % ' '.join(['*.json'])
filename = ustr(QFileDialog.getOpenFileName(self, '%s - Choose a json file' % __appname__, path, filters))
Expand Down Expand Up @@ -1358,8 +1297,6 @@ def open_dir_dialog(self, _value=False, dir_path=None, silent=False):
self.last_open_dir = target_dir_path
self.import_dir_images(target_dir_path)
self.default_save_dir = target_dir_path
if self.file_path:
self.show_bounding_box_from_annotation_file(file_path=self.file_path)

def import_dir_images(self, dir_path):
if not self.may_continue() or not dir_path:
Expand Down Expand Up @@ -1455,7 +1392,7 @@ def open_file(self, _value=False):
return
path = os.path.dirname(ustr(self.file_path)) if self.file_path else '.'
formats = ['*.%s' % fmt.data().decode("ascii").lower() for fmt in QImageReader.supportedImageFormats()]
filters = "Image & Label files (%s)" % ' '.join(formats + ['*%s' % LabelFile.suffix])
filters = "Image & Label files (%s)" % ' '.join(formats + ['*%s' % suffix for suffix in LabelFileFormat.suffixes])
filename,_ = QFileDialog.getOpenFileName(self, '%s - Choose Image or Label file' % __appname__, path, filters)
if filename:
if isinstance(filename, (tuple, list)):
Expand Down Expand Up @@ -1485,10 +1422,10 @@ def save_file_as(self, _value=False):

def save_file_dialog(self, remove_ext=True):
caption = '%s - Choose File' % __appname__
filters = 'File (*%s)' % LabelFile.suffix
filters = 'File (*%s)' % self.label_file_format.suffix
open_dialog_path = self.current_path()
dlg = QFileDialog(self, caption, open_dialog_path, filters)
dlg.setDefaultSuffix(LabelFile.suffix[1:])
dlg.setDefaultSuffix(self.label_file_format.suffix[1:])
dlg.setAcceptMode(QFileDialog.AcceptSave)
filename_without_extension = os.path.splitext(self.file_path)[0]
dlg.selectFile(filename_without_extension)
Expand Down Expand Up @@ -1616,44 +1553,18 @@ def load_predefined_classes(self, predef_classes_file):
else:
self.label_hist.append(line)

def load_pascal_xml_by_filename(self, xml_path):
if self.file_path is None:
return
if os.path.isfile(xml_path) is False:
return

self.set_format(FORMAT_PASCALVOC)

t_voc_parse_reader = PascalVocReader(xml_path)
shapes = t_voc_parse_reader.get_shapes()
self.load_labels(shapes)
self.canvas.verified = t_voc_parse_reader.verified

def load_yolo_txt_by_filename(self, txt_path):
if self.file_path is None:
return
if os.path.isfile(txt_path) is False:
return

self.set_format(FORMAT_YOLO)
t_yolo_parse_reader = YoloReader(txt_path, self.image)
shapes = t_yolo_parse_reader.get_shapes()
print(shapes)
self.load_labels(shapes)
self.canvas.verified = t_yolo_parse_reader.verified

def load_create_ml_json_by_filename(self, json_path, file_path):
def load_label_by_filename(self, label_path):
if self.file_path is None:
return
if os.path.isfile(json_path) is False:
if os.path.isfile(label_path) is False:
return

self.set_format(FORMAT_CREATEML)

create_ml_parse_reader = CreateMLReader(json_path, file_path)
shapes = create_ml_parse_reader.get_shapes()
suffix = os.path.splitext(label_path)[1]
format_id = LabelFileFormat.suffixes.index(suffix)
file_format = LabelFileFormat.formats[format_id]
self.set_format(file_format)
shapes = self.label_file_format.read(label_path, self.image)
self.load_labels(shapes)
self.canvas.verified = create_ml_parse_reader.verified
self.canvas.verified = self.label_file_format.file_reader.verified

def copy_previous_bounding_boxes(self):
current_index = self.m_img_list.index(self.file_path)
Expand Down
Loading