README.md: 31 additions & 12 deletions
@@ -15,17 +15,13 @@ This is the repo for our [TMLR](https://jmlr.org/tmlr/) survey [Unifying the Per
## News
-🔥🔥🔥 [2025/09/22] Featured papers:
+🔥🔥🔥 [2025/09/26] Featured papers:
- 🔥🔥 [CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects](https://arxiv.org/abs/2509.14856) from Ant Group.
-- 🔥🔥 [SWE-QA: Can Language Models Answer Repository-level Code Questions?](https://arxiv.org/abs/2509.14635) from Shanghai Jiao Tong University.
+- 🔥🔥 [SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?](https://arxiv.org/abs/2509.16941) from Scale AI.
-- 🔥 [LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering](https://arxiv.org/abs/2509.09614) from Salesforce AI Research.
-
-- 🔥 [Astra: A Multi-Agent System for GPU Kernel Performance Optimization](https://arxiv.org/abs/2509.07506) from Stanford University.
-
-- 🔥 [GRACE: Graph-Guided Repository-Aware Code Completion through Hierarchical Code Fusion](https://arxiv.org/abs/2509.05980) from Zhejiang University.
+- 🔥 [SWE-QA: Can Language Models Answer Repository-level Code Questions?](https://arxiv.org/abs/2509.14635) from Shanghai Jiao Tong University.
🔥🔥🔥 [2025/08/24] 29 papers from ICML 2025 have been added. Search for the keyword "ICML 2025"!
@@ -35,14 +31,10 @@ This is the repo for our [TMLR](https://jmlr.org/tmlr/) survey [Unifying the Per
🔥🔥🔥 [2025/09/22] News from Codefuse
-- [CGM (Code Graph Model)](https://arxiv.org/abs/2505.16901) is accepted to NeurIPS 2025. CGM currently ranks 1st among open-source models on [SWE-Bench leaderboard](https://www.swebench.com/). [[repo](https://github.com/codefuse-ai/CodeFuse-CGM)]
+- [CGM (Code Graph Model)](https://arxiv.org/abs/2505.16901) is accepted to NeurIPS 2025. CGM currently ranks 1st among open-weight models on [SWE-Bench-Lite leaderboard](https://www.swebench.com/). [[repo](https://github.com/codefuse-ai/CodeFuse-CGM)]
- [GALLa: Graph Aligned Large Language Models](https://arxiv.org/abs/2409.04183) is accepted by ACL 2025 main conference. [[repo](https://github.com/codefuse-ai/GALLa)]
If you find a paper to be missing from this repository, misplaced in a category, or lacking a reference to its journal/conference information, please do not hesitate to create an issue.
@@ -693,6 +685,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
67. "SCoGen: Scenario-Centric Graph-Based Synthesis of Real-World Code Problems" [2025-09][[paper](https://arxiv.org/abs/2509.14281)]
1. **CompCoder**: "Compilable Neural Code Generation with Compiler Feedback" [2022-03][ACL 2022][[paper](https://arxiv.org/abs/2203.05132)]
@@ -761,6 +755,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
33. "Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization" [2025-09][[paper](https://arxiv.org/abs/2509.12434)]
+34. "DELTA-Code: How Does RL Unlock and Transfer New Programming Algorithms in LLMs?" [2025-09][[paper](https://arxiv.org/abs/2509.21016)]
+
## 3. When Coding Meets Reasoning
### 3.1 Coding for Reasoning
@@ -1077,6 +1073,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
76. "GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging" [2025-08][[paper](https://arxiv.org/abs/2508.18993)]
+77. **MapCoder-Lite**: "MapCoder-Lite: Squeezing Multi-Agent Coding into a Single Small LLM" [2025-09][[paper](https://arxiv.org/abs/2509.17489)]
+
### 3.4 Interactive Coding
- "Interactive Program Synthesis" [2017-03][[paper](https://arxiv.org/abs/1703.03539)]
@@ -1185,6 +1183,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
- "SR-Eval: Evaluating LLMs on Code Generation under Stepwise Requirement Refinement" [2025-09][[paper](https://arxiv.org/abs/2509.18808)]
1187
+
1188
1188
### 3.5 Frontend Navigation
- "MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding" [2021-10][ACL 2022][[paper](https://arxiv.org/abs/2110.08518)]
@@ -1295,6 +1295,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
- "UI-Venus Technical Report: Building High-performance UI Agents with RFT" [2025-08][[paper](https://arxiv.org/abs/2508.10833)]
## 4. Code LLM for Low-Resource, Low-Level, and Domain-Specific Languages
- [**Ruby**] "On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages" [2022-04][ICPC 2022][[paper](https://arxiv.org/abs/2204.09653)]
@@ -1483,6 +1485,8 @@ These models apply Instruction Fine-Tuning techniques to enhance the capacities
- [**CUDA**] "Astra: A Multi-Agent System for GPU Kernel Performance Optimization" [2025-09][[paper](https://arxiv.org/abs/2509.07506)]
+- [**LaTeX**] "Table2LaTeX-RL: High-Fidelity LaTeX Code Generation from Table Images via Reinforced Multimodal Language Models" [2025-09][[paper](https://arxiv.org/abs/2509.17589)]
+
## 5. Methods/Models for Downstream Tasks
For each task, the first column contains non-neural methods (e.g. n-gram, TF-IDF, and (occasionally) static program analysis); the second column contains non-Transformer neural methods (e.g. LSTM, CNN, GNN); the third column contains Transformer-based methods (e.g. BERT, GPT, T5).
@@ -2225,6 +2229,10 @@ For each task, the first column contains non-neural methods (e.g. n-gram, TF-IDF
- "CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion" [2025-09][[paper](https://arxiv.org/abs/2509.16112)]
2233
+
2234
+
- "RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation" [2025-09][[paper](https://arxiv.org/abs/2509.16198)]
2235
+
2228
2236
### Issue Resolution
- "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?" [2023-10][ICLR 2024][[paper](https://arxiv.org/abs/2310.06770)]
@@ -3183,6 +3191,8 @@ For each task, the first column contains non-neural methods (e.g. n-gram, TF-IDF
- "Code-SPA: Style Preference Alignment to Large Language Models for Effective and Robust Code Debugging" [2025-07][ACL 2025 Findings][[paper](https://aclanthology.org/2025.findings-acl.912/)]
+- "LLaVul: A Multimodal LLM for Interpretable Vulnerability Reasoning about Source Code" [2025-09][[paper](https://arxiv.org/abs/2509.17337)]
@@ -3417,6 +3429,8 @@ For each task, the first column contains non-neural methods (e.g. n-gram, TF-IDF
- "CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects" [2025-09][[paper](https://arxiv.org/abs/2509.14856)]
+- "Fine-Tuning LLMs to Analyze Multiple Dimensions of Code Review: A Maximum Entropy Regulated Long Chain-of-Thought Approach" [2025-09][[paper](https://arxiv.org/abs/2509.21170)]
+
### Log Analysis
- "LogStamp: Automatic Online Log Parsing Based on Sequence Labelling" [2022-08][[paper](https://arxiv.org/abs/2208.10282)]
@@ -3707,6 +3721,8 @@ For each task, the first column contains non-neural methods (e.g. n-gram, TF-IDF
- "A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code" [2025-08][[paper](https://arxiv.org/abs/2508.18106)]
+- "Localizing Malicious Outputs from CodeLLM" [2025-09][[paper](https://arxiv.org/abs/2509.17070)]