codegen-units=1 + LTO causes 3-5% performance regression for sequential code #148670

@naoNao89

Code

I tried this code:

https://github.com/uutils/coreutils/blob/b2d0773356063da0d8cade4d7d14b5392df75556/src/uu/seq/src/seq.rs#L374-L385

The bug trigger:

Line 374: BigUint variable
    ↓
Line 377: Loop (1M+ iterations)
    ↓
Line 383: Arithmetic operation
    ↓
With codegen-units=1 + LTO
    ↓
LLVM over-inlines Line 383
    ↓
Register pressure (16 GPRs on x86_64)
    ↓
Stack spilling
    ↓
Up to ~5% performance regression
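
For readers who do not follow the link, the pattern described in the chain above (an arbitrary-precision value, a hot loop, one arithmetic step per iteration) can be sketched as follows. This is a hypothetical stand-in, not the code from `seq.rs`: `TinyBigUint`, `add_assign`, `to_u128`, and `print_sequence` are names invented for illustration, and the sketch only shows the shape of the loop. It does not demonstrate the regression or confirm the inlining explanation. To compare builds, one would compile it with and without `codegen-units = 1` and `lto` enabled in the Cargo release profile (the report does not state which `lto` value was used).

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for an arbitrary-precision unsigned integer,
/// stored as little-endian base-2^32 limbs. Not the type used by `seq`.
struct TinyBigUint {
    limbs: Vec<u32>,
}

impl TinyBigUint {
    fn from_u64(v: u64) -> Self {
        TinyBigUint { limbs: vec![v as u32, (v >> 32) as u32] }
    }

    /// Adds `step` in place, propagating carries across limbs.
    fn add_assign(&mut self, step: &TinyBigUint) {
        let n = self.limbs.len().max(step.limbs.len());
        self.limbs.resize(n, 0);
        let mut carry = 0u64;
        for i in 0..n {
            let rhs = *step.limbs.get(i).unwrap_or(&0) as u64;
            let sum = self.limbs[i] as u64 + rhs + carry;
            self.limbs[i] = sum as u32; // keep the low 32 bits
            carry = sum >> 32;
        }
        if carry != 0 {
            self.limbs.push(carry as u32);
        }
    }

    /// Converts to u128 for printing; only exact for values of up to 4 limbs.
    fn to_u128(&self) -> u128 {
        self.limbs.iter().rev().fold(0u128, |acc, &l| (acc << 32) | l as u128)
    }
}

/// The hot loop: write the current value, then advance it by `step`.
/// Returns the value after the last increment.
fn print_sequence<W: Write>(out: &mut W, first: u64, step: u64, count: u64) -> io::Result<TinyBigUint> {
    let mut value = TinyBigUint::from_u64(first);
    let step = TinyBigUint::from_u64(step);
    for _ in 0..count {
        writeln!(out, "{}", value.to_u128())?;
        value.add_assign(&step); // the per-iteration arithmetic operation
    }
    Ok(value)
}

fn main() -> io::Result<()> {
    // Output goes to a sink so the run exercises the loop without flooding the terminal.
    let mut out = io::sink();
    let last = print_sequence(&mut out, 1, 1, 1_000_000)?;
    println!("value after loop: {}", last.to_u128());
    Ok(())
}
```

Whether the optimizer treats this sketch the same way it treats the real `BigUint` code is an open question; inspecting the generated assembly of both builds would be the way to check for extra stack spills.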

I expected to see this happen: "may improve performance"

Instead, this happened: 3-5% slower

Benchmark results from uutils/coreutils#9161 (negative = slower, positive = faster):

  • seq_integers: -5.06% (26.1ms → 27.5ms)
  • seq_with_step: -4.98% (13.3ms → 14.0ms)
  • expand_custom_tabstops: -2.73% (36.6ms → 37.6ms)
  • cut_fields_custom_delim: +32.29% (40.7ms → 30.8ms)
  • cut_fields_tab: +26.13% (34.1ms → 27.0ms)
  • Overall: -10.02% (22 improvements, 10 regressions)
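
The per-benchmark percentages can be roughly reproduced from the listed timings. The sketch below assumes the figures are computed as `before / after - 1`, which is an inference from the numbers and not something the report states; the helper name `speed_change_percent` is made up for this example. Because the timings are rounded to 0.1 ms, the recomputed values differ slightly from the reported ones.

```rust
/// Relative speed change in percent, assuming the convention
/// `before / after - 1`: negative means slower, positive means faster.
fn speed_change_percent(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms / after_ms - 1.0) * 100.0
}

fn main() {
    // (benchmark name, time before in ms, time after in ms), copied from the list above.
    let cases = [
        ("seq_integers", 26.1, 27.5),
        ("seq_with_step", 13.3, 14.0),
        ("expand_custom_tabstops", 36.6, 37.6),
        ("cut_fields_custom_delim", 40.7, 30.8),
        ("cut_fields_tab", 34.1, 27.0),
    ];
    for &(name, before, after) in cases.iter() {
        println!("{}: {:+.2}%", name, speed_change_percent(before, after));
    }
}
```

The alternative convention `after / before - 1` would give about +5.4% for `seq_integers`, which does not match the reported -5.06%, so the first convention looks like the better fit.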

Labels

  • A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
  • A-LTO: Area: Link-time optimization (LTO)
  • C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
  • I-prioritize: Issue: Indicates that prioritization has been requested for this issue.
  • T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.