CloneWorks
CloneWorks is a fast and flexible near-miss clone detector for large-scale clone detection experiments. It gives the user full control over source normalization, transformation and processing before clone detection, for general or domain-specific clone detection experiments. CloneWorks is very fast, processing 250 MLOC in just 4 hours on an average workstation. Its input partitioning strategy allows it to handle any input size within the memory constraints of an average workstation.
CloneWorks is available for download. You can download just the tool, or the tool pre-configured within a virtual machine.
Acknowledgements
CloneWorks makes use of techniques and technology from previous and related work.
CloneWorks uses TXL-based source transformation code, which was originally used with the NiCad clone detector (Chanchal Roy, James Cordy). CloneWorks contains a number of new transformations implemented in TXL, and can also make use of transformations implemented in other languages.
CloneWorks makes use of the sub-block filtering heuristic along with a partial indexing approach for scalability. This approach was introduced with SourcererCC (ICSE'16: Hitesh Sajnani, Vaibhav Saini, Jeffrey Svajlenko, Chanchal K. Roy, Cristina V. Lopes). The latest version of SourcererCC is available on GitHub.
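As a rough illustration of the idea, the sketch below shows prefix-based sub-block filtering over a partial index in Python. It is not the CloneWorks or SourcererCC implementation: the function names, the token-bag representation and the tie-breaking order are assumptions made for the example. Each block's tokens are sorted rarest first, only a short prefix of each block is indexed, and two blocks are compared in full only if their prefixes share a token.

```python
import math
from collections import Counter, defaultdict


def prefix_length(size, threshold):
    # If two blocks must share ceil(threshold * size) tokens, at least one
    # shared token has to fall within this many leading tokens.
    return size - math.ceil(threshold * size) + 1


def detect_clones(blocks, threshold):
    """blocks maps a block id to its list of tokens (hypothetical format)."""
    # Global token frequencies define one consistent ordering, rarest first.
    freq = Counter(t for tokens in blocks.values() for t in tokens)
    ordered = {
        bid: sorted(tokens, key=lambda t: (freq[t], t))
        for bid, tokens in blocks.items()
    }

    # Partial index: only the prefix of each block is indexed.
    index = defaultdict(set)
    for bid, tokens in ordered.items():
        for t in tokens[:prefix_length(len(tokens), threshold)]:
            index[t].add(bid)

    pairs = set()
    for bid, tokens in ordered.items():
        # Candidates are blocks whose indexed prefix shares a token with ours.
        candidates = set()
        for t in tokens[:prefix_length(len(tokens), threshold)]:
            candidates |= index[t]
        candidates.discard(bid)

        # Verify each candidate with the full multiset overlap.
        bag = Counter(tokens)
        for other in candidates:
            other_tokens = ordered[other]
            needed = math.ceil(threshold * max(len(tokens), len(other_tokens)))
            shared = sum((bag & Counter(other_tokens)).values())
            if shared >= needed:
                pairs.add(tuple(sorted((bid, other))))
    return pairs


example = {
    "a": ["int", "x", "=", "0", ";"],
    "b": ["int", "y", "=", "0", ";"],
    "c": ["return", "z"],
}
clones = detect_clones(example, 0.7)
```

With a 70% threshold, blocks "a" and "b" share four of five tokens and are reported as a pair, while "c" is never compared against either because its prefix shares no token with theirs.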
CloneWorks uses an inverted index with an input partitioning strategy from our Shuffling Framework to keep its memory requirements scalable.
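The general shape of partitioned indexing can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Shuffling Framework itself: `detect_partitioned`, the `partition_size` parameter and the token-bag format are hypothetical. Only one partition is indexed in memory at a time, and every block is streamed past each partition's index as a query, so peak index size depends on the partition size rather than on the total input size.

```python
import math
from collections import Counter, defaultdict


def detect_partitioned(blocks, threshold, partition_size):
    """blocks maps a block id to its list of tokens (hypothetical format)."""
    items = sorted(blocks.items())
    pairs = set()

    for start in range(0, len(items), partition_size):
        partition = dict(items[start:start + partition_size])

        # Inverted index over the current partition only: token -> block ids.
        index = defaultdict(set)
        for bid, tokens in partition.items():
            for t in set(tokens):
                index[t].add(bid)

        # Every block in the input queries the current partition's index.
        for qid, qtokens in items:
            qbag = Counter(qtokens)
            candidates = set()
            for t in qbag:
                candidates |= index.get(t, set())

            for cid in candidates:
                if cid >= qid:
                    continue  # each unordered pair is checked exactly once
                ctokens = partition[cid]
                needed = math.ceil(threshold * max(len(qtokens), len(ctokens)))
                shared = sum((qbag & Counter(ctokens)).values())
                if shared >= needed:
                    pairs.add((cid, qid))
        # The index goes out of scope here before the next partition is built.
    return pairs


sample = {
    "a": ["int", "x", "=", "0", ";"],
    "b": ["int", "y", "=", "0", ";"],
    "c": ["return", "z"],
}
result_small = detect_partitioned(sample, 0.7, partition_size=1)
result_large = detect_partitioned(sample, 0.7, partition_size=10)
```

The partition size trades memory for passes over the input: smaller partitions need less memory but require more query passes, and the detected pairs are the same either way.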
Downloads
GitHub - Latest source code.
Version-0.3 - New output formatter. ICSE'17 demonstration version.
Version-0.2 - Bug and performance fixes.
Version-0.1 - Initial release of CloneWorks.
VM Version
Version-0.2 - Username: 'cloneworks', password: 'clones'.
Manual
While CloneWorks usage is completely described in the readme files, an in-progress formatted manual is available here.
Use-cases
We are working on a document of example use-cases and inspiration for CloneWorks usage here.
Problems or Questions?
Please contact us if you have questions about CloneWorks and its usage, or have encountered a problem, bug or performance issue: [email protected]