LANGUAGE-INDEPENDENT DETECTOR FOR DETECTING AND ELIMINATING REPETITIONS AND EXCESSES OF SOFTWARE CODE

Authors

  • N. Pravorska Khmelnytskyi National University
  • L. Bedratyuk Khmelnytskyi National University
  • Yu. Forkun Khmelnytskyi National University
  • O. Yashyna Khmelnytskyi National University

DOI:

https://doi.org/10.31891/2219-9365-2021-67-1-8

Keywords:

Program code, language-independent detector, incremental approach, locality-sensitive hashing

Abstract

During software development, mistakes made even by experienced developers can later disrupt the normal operation of the software product. Corrections can usually be made at any stage of the software life cycle. However, detecting and correcting errors in program code at the final stages of development can significantly increase the cost (in both money and time) of software development and maintenance. In addition, some errors endanger human life and health if they surface during operation. For this reason, a variety of tools that analyze program code and help identify defects have become widely used throughout the software life cycle. Many problems in a system's source code arise precisely because a significant share of the code is duplicated. In particular, duplication increases the size of the code base and, accordingly, maintenance costs. In addition, when an error is found in one instance of a duplicated block (a clone), every other instance must be checked for the same error and possibly corrected. Solving this problem requires not only a list of the duplicated blocks but also a significant amount of time to go through all the instances. Finally, duplication makes the program code harder to understand and complicates future refactoring. Automatic detection of clones (blocks with repetitions and redundancies) in modern software projects is therefore an important foundation for future manual or automatic refactoring, which leads to cleaner and more maintainable code. However, most of the methods proposed to date for detecting repeated blocks work with the entire code base of the system.
Regardless of the magnitude of the changes, such methods take the entire system as input for each version of the source code. This approach may work well for stable legacy systems that are only occasionally updated. However, it is poorly suited to the agile software development that is common today. Each subsequent check for repeated and redundant blocks repeats calculations that have already been done, and these add to the total execution time.
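
The contrast between re-scanning the whole system and processing only what changed can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name IncrementalCloneIndex, the fixed window of normalized lines, and the use of SHA-1 as an exact fingerprint are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class IncrementalCloneIndex:
    """Hypothetical index that maps block fingerprints to their locations."""

    def __init__(self, window=3):
        self.window = window                 # lines per block (assumed value)
        self.by_hash = defaultdict(set)      # fingerprint -> {(path, block index)}
        self.by_file = defaultdict(set)      # path -> {fingerprint}

    def _blocks(self, text):
        # Normalize by stripping whitespace and dropping blank lines.
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        for i in range(len(lines) - self.window + 1):
            chunk = "\n".join(lines[i:i + self.window])
            yield i, hashlib.sha1(chunk.encode("utf-8")).hexdigest()

    def remove(self, path):
        # Forget every fingerprint contributed by this file.
        for fp in self.by_file.pop(path, set()):
            self.by_hash[fp] = {loc for loc in self.by_hash[fp] if loc[0] != path}
            if not self.by_hash[fp]:
                del self.by_hash[fp]

    def update(self, path, text):
        # Only the changed file is re-hashed; other files stay untouched.
        self.remove(path)
        for start, fp in self._blocks(text):
            self.by_hash[fp].add((path, start))
            self.by_file[path].add(fp)

    def clones(self):
        # A fingerprint seen at more than one location is a clone candidate.
        return {fp: sorted(locs) for fp, locs in self.by_hash.items() if len(locs) > 1}
```

With such an index, a new commit would call update() only for the files it touches, so the cost of each check follows the size of the change rather than the size of the code base.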

During software development, errors may enter the program code even when it is written by experienced developers, for example through duplicated parts of the code. To prevent future failures in the functioning of the software, there are a number of automated tools capable of evaluating maintainability against predefined criteria such as the size and complexity of the code and the coupling of modules. Automatic detection of repetitions and redundancies in the program code of modern projects becomes the basis for future manual or automatic refactoring, which leads to cleaner code that is easier to maintain. One such tool is the proposed language-independent detector, which uses an incremental approach improved with locality-sensitive hashing.
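
As a rough illustration of how locality-sensitive hashing can narrow down clone candidates, the Python sketch below uses MinHash signatures with banding, one common LSH scheme. The abstract does not say which hash family the detector uses, so this choice, the function names, and the parameters (token shingles of length 3, 8 bands of 4 rows) are assumptions.

```python
import hashlib
import re
from collections import defaultdict


def shingles(code, k=3):
    # Split into word and punctuation tokens so whitespace does not matter.
    tokens = re.findall(r"\w+|[^\w\s]", code)
    return {" ".join(tokens[i:i + k]) for i in range(max(1, len(tokens) - k + 1))}


def minhash(items, num_perm=32):
    # One seeded 64-bit hash per "permutation"; keep the minimum for each seed.
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{s}".encode("utf-8"), digest_size=8).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def lsh_candidates(blocks, bands=8, rows=4):
    # blocks: mapping of block name -> source text (hypothetical input shape).
    buckets = defaultdict(list)
    for name, code in blocks.items():
        sig = minhash(shingles(code), bands * rows)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].append(name)
    # Blocks sharing at least one band bucket become candidate pairs.
    pairs = set()
    for names in buckets.values():
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                pairs.add(tuple(sorted((names[i], names[j]))))
    return pairs
```

Because only blocks that land in the same bucket are compared, similar blocks can be found without comparing every pair. Working on generic tokens rather than a parsed syntax tree is what would keep a sketch like this independent of the programming language.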

Published

2021-05-27

How to Cite

Pravorska N., Bedratyuk L., Forkun Yu., & Yashyna O. (2021). LANGUAGE-INDEPENDENT DETECTOR FOR DETECTING AND ELIMINATING REPETITIONS AND EXCESSES OF SOFTWARE CODE. MEASURING AND COMPUTING DEVICES IN TECHNOLOGICAL PROCESSES, (1), 56–61. https://doi.org/10.31891/2219-9365-2021-67-1-8