This is the repo for our [TMLR](https://jmlr.org/tmlr/) survey [Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code](https://arxiv.org/abs/2311.07989) - a comprehensive review of LLM research for code. Works in each category are ordered chronologically. If you have a basic understanding of machine learning but are new to NLP, we also provide a list of recommended readings in [section 9](#9-recommended-readings). If you refer to this repo, please cite:
```
@article{zhang2024unifying,
    title={Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code},
    author={Ziyin Zhang and Chaoyu Chen and Bingchang Liu and Cong Liao and Zi Gong and Hang Yu and Jianguo Li and Rui Wang},
    journal={Transactions on Machine Learning Research},
    year={2024},
    url={https://arxiv.org/abs/2311.07989}
}
```
🔥🔥🔥 [2025/10/13] Featured papers:

- 🔥 [LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?](https://arxiv.org/abs/2510.09595) from University of Michigan.

- 🔥 [Scaling Laws for Code: A More Data-Hungry Regime](https://arxiv.org/abs/2510.08702) from Harbin Institute of Technology.

- 🔥 [BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution](https://arxiv.org/abs/2510.08697) from Monash University.

🔥🔥 [2025/08/24] 29 papers from ICML 2025 have been added. Search for the keyword "ICML 2025"!

🔥🔥🔥 [2025/09/22] News from Codefuse

- We released [F2LLM](https://arxiv.org/abs/2510.02294), a fully open embedding model striking a strong balance between model size, training data, and embedding performance (see the usage sketch after this list). [[code](https://github.com/codefuse-ai/CodeFuse-Embeddings)][[model & data](https://huggingface.co/collections/codefuse-ai/codefuse-embeddings-68d4b32da791bbba993f8d14)]

- We released a new benchmark focusing on code review: [CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects](https://arxiv.org/abs/2509.14856)

- [CGM (Code Graph Model)](https://arxiv.org/abs/2505.16901) has been accepted to NeurIPS 2025. CGM currently ranks 1st among open-weight models on the [SWE-Bench-Lite leaderboard](https://www.swebench.com/). [[repo](https://github.com/codefuse-ai/CodeFuse-CGM)]

- [GALLa: Graph Aligned Large Language Models](https://arxiv.org/abs/2409.04183) has been accepted to the ACL 2025 main conference. [[repo](https://github.com/codefuse-ai/GALLa)]
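For readers new to embedding models, below is a minimal retrieval-style sketch of how such a model is typically consumed. It assumes the F2LLM checkpoints load through the standard `sentence-transformers` API; the model ID in the snippet is illustrative, so check the model collection linked above for the actual checkpoint names and any recommended query prompts or pooling settings.

```python
# Minimal usage sketch -- NOT the official F2LLM API.
# Assumptions: the checkpoint ID below is hypothetical, and the model ships
# a sentence-transformers config; consult the model card for specifics.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("codefuse-ai/F2LLM-0.6B")  # hypothetical model ID

docs = [
    "def add(a, b): return a + b",
    "def product(a, b): return a * b",
]
query = "a function that sums two numbers"

doc_emb = model.encode(docs)     # shape: (len(docs), hidden_dim)
query_emb = model.encode(query)  # shape: (hidden_dim,)

# Rank documents by cosine similarity to the query
# (model.similarity requires sentence-transformers >= 3.0).
scores = model.similarity(query_emb, doc_emb)
print(scores)
```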
- "Speed Up Your Code: Progressive Code Acceleration Through Bidirectional Tree Editing" [2025-07][ACL 2025][[paper](https://aclanthology.org/2025.acl-long.1387/)]

- "ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs" [2025-10][[paper](https://arxiv.org/abs/2510.10517)]

### Binary Analysis and Decompilation

- "Using recurrent neural networks for decompilation" [2018-03][SANER 2018][[paper](https://ieeexplore.ieee.org/document/8330222)]
- "Optimizing Token Choice for Code Watermarking: A RL Approach" [2025-08][[paper](https://arxiv.org/abs/2508.11925)]

- "Large Language Models Are Effective Code Watermarkers" [2025-10][[paper](https://arxiv.org/abs/2510.11251)]

### Others

- "Code Membership Inference for Detecting Unauthorized Data Use in Code Pre-trained Language Models" [2023-12][EMNLP 2024 Findings][[paper](https://arxiv.org/abs/2312.07200)]