Zirui Fang
October 2024 (Updated November 2025)
This research project implements and evaluates an automated testing methodology for verifying data structure reliability through Randoop-based test generation. The study involves:
- Go-to-Java Translation: Using LLM-assisted translation of Go data structure implementations
- Automated Test Generation: Applying Randoop for systematic test generation with representation invariants
- Coverage Analysis: Comprehensive evaluation of test effectiveness across multiple metrics
- Bug Detection: Verification of implementation correctness through error-revealing tests
- Dynamic List - Generic, array-based dynamic list with automatic resizing
- AVL Tree - Self-balancing binary search tree with guaranteed O(log n) operations
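To make "array-based dynamic list with automatic resizing" concrete, here is a minimal Java sketch. It is not the code in `List.java`: the class name `SketchList`, the initial capacity of 4, and the doubling growth policy are all assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch; names and growth policy are illustrative, not taken from List.java.
final class SketchList<T> {
    private Object[] items = new Object[4]; // initial capacity chosen arbitrarily
    private int size = 0;

    /** Appends a value, doubling the backing array when it is full. */
    public void add(T value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, items.length * 2);
        }
        items[size++] = value;
    }

    /** Returns the element at index, or throws if the index is out of range. */
    @SuppressWarnings("unchecked")
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return (T) items[index];
    }

    public int size() {
        return size;
    }

    /** Capacity of the backing array, exposed only to make resizing observable. */
    public int capacity() {
        return items.length;
    }
}
```

Under these assumptions, adding a fifth element to a fresh list triggers one resize, so the capacity grows from 4 to 8 while the size is 5.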
Randoop-Go-Bug-Detection-Automated-Testing-of-Go-Data-Structures-via-Java-Translation/
├── go_to_java/                          # Translated Java implementations
│   ├── List.java                        # Dynamic List implementation
│   ├── AvlTree.java                     # AVL Tree implementation
│   ├── array_list.go                    # Original Go List source
│   └── avltree.go                       # Original Go AVL Tree source
├── repok/                               # Representation invariant implementations
│   ├── list_rep.java                    # List repOK method
│   └── avltree_rep.java                 # AVL Tree repOK method
├── tests/                               # Generated test suites
│   ├── RegressionTest*.java             # Randoop regression tests
│   └── ErrorTest.java                   # Error-revealing tests (if any)
├── docs/
│   └── Randoop_Project_Report_V2.pdf    # Complete research report
└── README.md
| Data Structure | Regression Tests | Error Tests | Flaky Methods | Invalid Tests |
|---|---|---|---|---|
| List | 12 files | 0 | 0 | 0 |
| AVL Tree | 14 files | 0 | 0 | 0 |
| Data Structure | Method Coverage | Line Coverage | Branch Coverage | Condition Coverage |
|---|---|---|---|---|
| List | 100% (3/3) | 94% (32/34) | 89% (104/116) | 68% (81/118) |
| AVL Tree | 72% (16/22) | 61% (51/100) | 51% (34/66) | - |
- ✅ Zero error-revealing tests generated, suggesting robust implementations
- ✅ No flaky methods detected across multiple test runs
- ✅ High line coverage for List (94%) and moderate line coverage for AVL Tree (61%)
- ✅ Effective repOK validation of representation invariants
- ✅ Successful LLM-assisted translation from Go to Java
- Used DeepSeek LLM for Go-to-Java translation
- Manual validation and minor corrections applied
- Preservation of original semantics and behavior
- Implemented comprehensive repOK methods for both data structures
- Verified BST properties, balance conditions, and structural integrity
- Used `@CheckRep` annotations for Randoop integration
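The invariant checks listed above can be sketched as a single recursive pass. This is a minimal illustration, not the contents of `avltree_rep.java`: the names `SketchNode` and `SketchAvlTree`, the `int` keys, and the cached `height` field are assumptions. With the Randoop JAR on the classpath, the `repOK` method would carry the `@CheckRep` annotation; it is left as a comment here so the sketch compiles on its own.

```java
// Hypothetical node type; field names are illustrative, not taken from AvlTree.java.
final class SketchNode {
    int key;
    int height = 1; // cached height; a leaf has height 1
    SketchNode left;
    SketchNode right;

    SketchNode(int key) {
        this.key = key;
    }
}

final class SketchAvlTree {
    SketchNode root;

    // @CheckRep  (from randoop.CheckRep; commented out so no Randoop JAR is needed to compile)
    public boolean repOK() {
        return check(root, Long.MIN_VALUE, Long.MAX_VALUE) >= 0;
    }

    /** Returns the real height of the subtree, or -1 if any invariant is violated. */
    private static int check(SketchNode node, long low, long high) {
        if (node == null) {
            return 0;
        }
        // BST ordering with strict bounds, so duplicate keys are rejected.
        if (node.key <= low || node.key >= high) {
            return -1;
        }
        int leftHeight = check(node.left, low, node.key);
        if (leftHeight < 0) {
            return -1;
        }
        int rightHeight = check(node.right, node.key, high);
        if (rightHeight < 0) {
            return -1;
        }
        // AVL balance condition: sibling subtree heights differ by at most one.
        if (Math.abs(leftHeight - rightHeight) > 1) {
            return -1;
        }
        // The cached height must agree with the recomputed height.
        int height = 1 + Math.max(leftHeight, rightHeight);
        return node.height == height ? height : -1;
    }
}
```

Returning the height and the validity verdict from the same traversal keeps the check linear in the number of nodes. Whether the project's repOK tracks a cached height or a balance factor is not stated here, so that part of the check is a guess at one reasonable design.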
- Applied Randoop 4.3.3 for feedback-directed random testing
- Generated both regression and error-revealing tests
- Used systematic sequence generation and test filtering
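The idea behind feedback-directed generation can be illustrated with a toy loop. This is not Randoop's implementation or API, and Randoop's real classification of exceptions is more nuanced and configurable. The class `ToyFeedbackGenerator`, the string-encoded operations, and the use of `java.util.ArrayDeque` as the class under test are all assumptions made to keep the sketch self-contained.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy illustration only: this is not Randoop's implementation or API.
final class ToyFeedbackGenerator {
    enum Outcome { NORMAL, ILLEGAL, INVARIANT_VIOLATION }

    final List<List<String>> regression = new ArrayList<>();     // sequences that ran cleanly
    final List<List<String>> errorRevealing = new ArrayList<>(); // sequences that broke the invariant

    /** Replays a sequence of operations on a fresh deque and classifies the result. */
    static Outcome execute(List<String> sequence) {
        ArrayDeque<Integer> underTest = new ArrayDeque<>();
        int expectedSize = 0; // simple size model standing in for a repOK check
        try {
            for (String op : sequence) {
                if (op.equals("pop")) {
                    underTest.pop(); // throws NoSuchElementException when empty
                    expectedSize--;
                } else {
                    underTest.push(Integer.parseInt(op.substring("push:".length())));
                    expectedSize++;
                }
                if (underTest.size() != expectedSize) {
                    return Outcome.INVARIANT_VIOLATION;
                }
            }
        } catch (RuntimeException e) {
            return Outcome.ILLEGAL;
        }
        return Outcome.NORMAL;
    }

    /** Grows sequences one call at a time, reusing only those that executed normally. */
    void generate(long seed, int attempts) {
        Random rng = new Random(seed);
        List<List<String>> pool = new ArrayList<>();
        pool.add(new ArrayList<>()); // the empty sequence seeds the pool
        for (int i = 0; i < attempts; i++) {
            List<String> candidate = new ArrayList<>(pool.get(rng.nextInt(pool.size())));
            candidate.add(rng.nextBoolean() ? "push:" + rng.nextInt(100) : "pop");
            switch (execute(candidate)) {
                case NORMAL:
                    pool.add(candidate); // feedback: this sequence may be extended later
                    regression.add(candidate);
                    break;
                case INVARIANT_VIOLATION:
                    errorRevealing.add(candidate);
                    break;
                default:
                    break; // illegal sequences are discarded and never extended
            }
        }
    }
}
```

The feedback step is the `pool.add(candidate)` line: only sequences that ran without error become building blocks for longer ones, which steers generation away from states that cannot be reached legally.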
- Java 17+
- Randoop 4.3.3
- IntelliJ IDEA (recommended)
- Clone the repository

      git clone https://github.com/Ray221f/Randoop-Go-Bug-Detection-Automated-Testing-of-Go-Data-Structures-via-Java-Translation.git
      cd Randoop-Go-Bug-Detection-Automated-Testing-of-Go-Data-Structures-via-Java-Translation

- Download and Configure Randoop

      # Download Randoop JAR
      wget https://github.com/randoop/randoop/releases/download/v4.3.3/randoop-all-4.3.3.jar

      # Add to project dependencies (IntelliJ IDEA)
      # Project Structure → Modules → Dependencies → Add JAR (Compile scope)

- Generate Tests

      # Compile with Randoop
      javac -cp .:randoop-all-4.3.3.jar List.java AvlTree.java

      # Generate tests for List
      java -cp .:randoop-all-4.3.3.jar randoop.main.Main gentests \
        --testclass=List \
        --junit-output-dir=tests

      # Generate tests for AVL Tree
      java -cp .:randoop-all-4.3.3.jar randoop.main.Main gentests \
        --testclass=AvlTree \
        --junit-output-dir=tests

- Run Coverage Analysis (IntelliJ IDEA)
  - Right-click test files → "Run with Coverage"
  - Analyze coverage metrics in IDE coverage tool
  - Identify uncovered branches and conditions
For comprehensive implementation details, refer to the project report (docs/Randoop_Project_Report_V2.pdf).
The report includes:
- Randoop Setup Guide: Step-by-step configuration instructions
- repOK Implementation: Technical specifications and examples
- Debugging Methodology: Systematic test failure analysis
- Coverage Analysis: Detailed metrics interpretation
- LLM Translation Insights: Code translation challenges and solutions
This project demonstrates:
- Cross-Language Verification: Practical methodology for verifying data structures across programming languages
- Representation Invariant Effectiveness: Enhanced test generation through explicit specification of intended behavior
- LLM-Assisted Analysis: Emerging applications of large language models in software engineering research
- Coverage Insights: Identification of limitations in automated test generation for complex data structures
- Translation Quality: LLMs can effectively handle data structure translation with minor manual corrections
- Test Effectiveness: Randoop achieves high line coverage but struggles with complex branch conditions
- Invariant Value: repOK methods give Randoop an explicit oracle for flagging invariant violations as error-revealing tests
- Implementation Robustness: The generated tests revealed no defects in either translated data structure
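The gap between line coverage and branch or condition coverage noted in these findings can be seen in a very small example. The helper below is hypothetical and not taken from the project's code; it only illustrates how the metrics diverge on a compound condition.

```java
// Hypothetical helper used only to illustrate coverage metrics; not taken from List.java.
final class CoverageGapSketch {
    /** Returns true when index is a valid position in a list of the given size. */
    static boolean withinRange(int index, int size) {
        return index >= 0 && index < size;
    }
}
```

A single call such as `withinRange(1, 3)` executes every line of the method, so line coverage is complete, yet both conditions have only evaluated to true. Covering the remaining outcomes needs a negative index, which makes the first condition false, and an index equal to the size, which makes the second one false. Random generation reaches such boundary inputs less reliably as conditions become more deeply nested.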
Contributions are welcome in the following areas:
- Improving test coverage for complex branch conditions
- Enhancing repOK implementations for additional invariant checks
- Adding new data structures to the analysis framework
- Optimizing the LLM translation and validation pipeline
- Developing automated validation for translated code
If you use this work in your research, please cite:
@techreport{fang2024randoop,
title={Randoop-Based Automated Testing of Go Data Structures via Java Translation},
author={Fang, Zirui},
year={2025},
institution={Academic Research Project}
}
- Randoop: Automatic Unit Test Generation for Java - ICSE 2007
- DeepSeek LLM: https://www.deepseek.com/
- Go Programming Language Specification
- Java Collections Framework Documentation
This project demonstrates the practical application of automated test generation techniques to verify the reliability of fundamental software components across programming language boundaries.