[VectorCombine] foldShuffleOfBinops - failure to track OperandValueInfo data in getArithmeticInstrCost calls #170500

@RKSimon

define <16 x i16> @concat_uniform_shift_v16i16_v8i16(<8 x i16> %a0, <8 x i16> %a1) {
  %v0 = shl <8 x i16> %a0, splat (i16 7)
  %v1 = shl <8 x i16> %a1, splat (i16 7)
  %res  = shufflevector <8 x i16> %v0, <8 x i16> %v1, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  ret <16 x i16> %res
}

Running opt with -passes=vector-combine -mtriple=x86_64-- -mcpu=znver2 currently leaves the IR unchanged:

define <16 x i16> @concat_uniform_shift_v16i16_v8i16(<8 x i16> %a0, <8 x i16> %a1) #0 {
  %v0 = shl <8 x i16> %a0, splat (i16 7)
  %v1 = shl <8 x i16> %a1, splat (i16 7)
  %res = shufflevector <8 x i16> %v0, <8 x i16> %v1, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  ret <16 x i16> %res
}

when it should be:

define <16 x i16> @concat_uniform_shift_v16i16_v8i16(<8 x i16> %a0, <8 x i16> %a1) #0 {
  %v = shufflevector <8 x i16> %a0, <8 x i16> %a1, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %res = shl <16 x i16> %v, splat (i16 7)
  ret <16 x i16> %res
}

VectorCombine fails to concatenate these shifts because the "NewCost" getArithmeticInstrCost calls don't pass any TTI::OperandValueInfo data. Both of the shift amounts are OK_UniformConstantValue, so at the very least we should merge these into OK_NonUniformConstantValue.

A possible starting point would be to add TTI::getOperandInfo calls in the binop NewCost calculation for both LHS/RHS ops, and create some kind of OperandValueInfo::mergeInfo(X, Y) static helper that merges the common info data (all constant, all power-of-2 etc. - although all uniform can't be done this way!).

But for this particular case, the RHS operands are identical - so we don't need to merge, we can just use the original OperandValueInfo.
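To make the merge idea concrete, here is a rough standalone sketch of what such a helper might look like. It does not use LLVM headers: the enum names mirror the ones in TargetTransformInfo, but the struct is a simplified model, and mergeInfo, infoForConcat and checkExample are hypothetical names used only for illustration. It assumes OP_PowerOf2 means "every element is a power of 2", so that property survives concatenation while uniformity does not.

```cpp
// Standalone model, NOT the real LLVM types. Enum names mirror
// TargetTransformInfo; mergeInfo/infoForConcat are hypothetical helpers.
#include <cassert>

enum OperandValueKind {
  OK_AnyValue,
  OK_UniformValue,
  OK_UniformConstantValue,
  OK_NonUniformConstantValue
};

enum OperandValueProperties { OP_None, OP_PowerOf2, OP_NegatedPowerOf2 };

struct OperandValueInfo {
  OperandValueKind Kind;
  OperandValueProperties Properties;

  bool isConstant() const {
    return Kind == OK_UniformConstantValue ||
           Kind == OK_NonUniformConstantValue;
  }
};

// Merge the info of two operands that will be concatenated into one vector.
// Constness is kept if both sides are constant; uniformity is dropped,
// because two different splats do not concatenate into a splat.
OperandValueInfo mergeInfo(OperandValueInfo X, OperandValueInfo Y) {
  OperandValueInfo R = {OK_AnyValue, OP_None};
  if (X.isConstant() && Y.isConstant())
    R.Kind = OK_NonUniformConstantValue;
  // A per-element property holds for the concat only if both sides have it.
  if (X.Properties == Y.Properties)
    R.Properties = X.Properties;
  return R;
}

// If both binops share the very same operand, nothing needs merging:
// the original info (including uniformity) still applies.
OperandValueInfo infoForConcat(bool SameOperand, OperandValueInfo X,
                               OperandValueInfo Y) {
  return SameOperand ? X : mergeInfo(X, Y);
}

// Mirrors the example in this issue: both shift amounts are splat(7).
void checkExample() {
  OperandValueInfo Splat7 = {OK_UniformConstantValue, OP_None};
  assert(infoForConcat(true, Splat7, Splat7).Kind == OK_UniformConstantValue);
  assert(mergeInfo(Splat7, Splat7).Kind == OK_NonUniformConstantValue);
}
```

In the real fold, the "same operand" check would presumably be a comparison of the two RHS Values, and the resulting info would be passed to the NewCost getArithmeticInstrCost call.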

foldPermuteOfBinops has a similar problem but it looks like OldCost and NewCost both use getArithmeticInstrCost - which is weird as I'd expect OldCost to use getInstructionCost at least.
