Identification of Structural Clones Using Association Rule and Clustering
Code clones are program structures of considerable size that are significantly similar to one another. A simple clone set is formed by similar code fragments in software. The problem is that clone detection tools typically report a huge number of simple clones. We observed that recurring patterns of simple clones, so-called structural clones, often indicate the presence of interesting design-level similarities. We propose a technique to detect some specific types of structural clones from repeated combinations of co-located simple clones. We find the patterns of co-occurring clones in different files using the frequent itemset mining (FIM) technique. Finally, we perform file clustering to detect clusters of highly similar files that are likely to contribute to a design-level similarity pattern. We implement the structural clone detection technique in a tool called CCFinder. Detecting clones provides several benefits for maintenance, program understanding, reengineering, and reuse.
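The two steps named in the abstract, mining co-occurring simple clones across files and then clustering similar files, can be illustrated with a minimal sketch. It is not the tool's implementation: the input data, the function names (frequent_itemsets, jaccard, cluster_files), the support and similarity thresholds, the Jaccard measure, and the single-link grouping are all assumptions chosen for illustration. The itemset step enumerates combinations by brute force, which only suits tiny inputs; a practical FIM implementation would use an algorithm such as Apriori or FP-growth.

```python
from itertools import combinations

# Hypothetical input: each file maps to the set of simple clone class IDs
# that have at least one fragment located in that file.
files = {
    "A.java": {1, 2, 3},
    "B.java": {1, 2, 3, 4},
    "C.java": {1, 2},
    "D.java": {4, 5},
}

def frequent_itemsets(transactions, min_support=2, min_size=2):
    """Map each set of clone class IDs to the files that contain all of them,
    keeping only sets found in at least min_support files (brute force)."""
    support = {}
    for name, items in transactions.items():
        for size in range(min_size, len(items) + 1):
            for combo in combinations(sorted(items), size):
                support.setdefault(frozenset(combo), set()).add(name)
    return {s: f for s, f in support.items() if len(f) >= min_support}

def jaccard(a, b):
    """Jaccard similarity of two sets of clone class IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_files(transactions, threshold=0.6):
    """Single-link grouping: a file joins (and merges) every cluster that
    holds at least one member whose similarity reaches the threshold."""
    clusters = []
    for name in sorted(transactions):
        matched = [
            c for c in clusters
            if any(jaccard(transactions[name], transactions[m]) >= threshold
                   for m in c)
        ]
        merged = {name}
        for c in matched:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters

patterns = frequent_itemsets(files)
groups = cluster_files(files)

for itemset, holders in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), "co-occur in", sorted(holders))
for group in groups:
    print("cluster:", sorted(group))
```

In this toy data, clone classes 1 and 2 co-occur in three files, which is the kind of recurring combination the abstract calls a structural clone candidate. Lowering the threshold in cluster_files merges more files into one cluster, while raising it keeps only near-identical files together.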