From fca9b3e513d357e28d90617da958ec9e05a7ac1a Mon Sep 17 00:00:00 2001
From: erfang
Date: Mon, 27 Oct 2025 09:49:59 +0000
Subject: [PATCH 1/2] 8370863: VectorAPI: Optimize the VectorMaskCast chain in specific patterns
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`VectorMaskCastNode` casts a vector mask from one type to another. The cast may be generated by calling the vector API `cast` or generated by the compiler. For example, some vector mask operations like `trueCount` require the input mask to be of an integer type, so for floating-point masks the compiler automatically casts the mask to the corresponding integer-type mask before doing the mask operation. This kind of cast is very common.

If the vector element size is not changed, the `VectorMaskCastNode` doesn't generate code; otherwise code is generated to extend or narrow the mask. This IR node is not free regardless of whether it generates code, because it may block some optimizations. For example:
1. `(VectorStoreMask (VectorMaskCast (VectorLoadMask x)))`
   The middle `VectorMaskCast` prevents the following optimization: `(VectorStoreMask (VectorLoadMask x)) => (x)`
2. `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`, which blocks the optimization `(VectorMaskToLong (VectorLongToMask x)) => (x)`.

In these IR patterns, the value of the input `x` is not changed, so we can safely do the optimization. But if the input value is changed, we can't eliminate the cast.

The general idea of this PR is to introduce an `uncast_mask` helper function, which can be used to uncast a chain of `VectorMaskCastNode`, like the existing `Node::uncast(bool)` function. The function returns the first non-`VectorMaskCastNode`.

The intended use case is when the IR pattern to be optimized may contain one or more consecutive `VectorMaskCastNode` and this does not affect the correctness of the optimization.
Then this function can be called to eliminate the `VectorMaskCastNode` chain.

Current optimizations related to `VectorMaskCastNode` include:
1. `(VectorMaskCast (VectorMaskCast x)) => (x)`, see JDK-8356760.
2. `(XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) => (VectorMaskCast (VectorMaskCmp src1 src2 ncond))`, see JDK-8354242.

This PR does the following optimizations:
1. Extends the optimization pattern `(VectorMaskCast (VectorMaskCast x)) => (x)` to `(VectorMaskCast (VectorMaskCast ... (VectorMaskCast x))) => (x)`. As long as the types of the head and tail `VectorMaskCastNode` are consistent, the optimization is correct.
2. Supports a new optimization pattern `(VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => (x)`. Since the value before and after the pattern is a boolean vector, it remains unchanged as long as the vector length remains the same, and this is guaranteed at the API level.

I conducted some simple research on different mask generation methods and mask operations, and obtained the following table, which includes some potential optimization opportunities that may use this `uncast_mask` function.

```
mask_gen\op   toLong  anyTrue  allTrue  trueCount  firstTrue  lastTrue
compare       N/A     N/A      N/A      N/A        N/A        N/A
maskAll       TBI     TBI      TBI      TBI        TBI        TBI
fromLong      TBI     TBI      N/A      TBI        TBI        TBI

mask_gen\op   and     or       xor      andNot     not        laneIsSet
compare       N/A     N/A      N/A      N/A        TBI        N/A
maskAll       TBI     TBI      TBI      TBI        TBI        TBI
fromLong      N/A     N/A      N/A      N/A        TBI        TBI
```

`TBI` indicates that there may be potential optimizations here that require further investigation.
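The two identities described above can be modeled outside HotSpot with a toy IR. The following is a minimal sketch in Java, not the C2 implementation: `Node`, `Op`, `uncastMask`, `storeMaskIdentity`, and `maskCastIdentity` are hypothetical names chosen for illustration, and the type check is reduced to comparing an element type name and a lane count.

```java
public class UncastMaskSketch {
    // Hypothetical opcodes that only mirror the C2 node names from the description.
    enum Op { BOOL_VECTOR, VECTOR_LOAD_MASK, VECTOR_MASK_CAST, VECTOR_STORE_MASK }

    // Toy IR node: opcode, element type name, lane count, and one input (null for a leaf).
    static final class Node {
        final Op op;
        final String elemType;
        final int length;
        final Node in;

        Node(Op op, String elemType, int length, Node in) {
            this.op = op;
            this.elemType = elemType;
            this.length = length;
            this.in = in;
        }
    }

    // Skips a chain of mask casts and returns the first node that is not a cast.
    static Node uncastMask(Node n) {
        while (n.op == Op.VECTOR_MASK_CAST) {
            n = n.in;
        }
        return n;
    }

    // (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => x,
    // applied only when the lane count is unchanged.
    static Node storeMaskIdentity(Node store) {
        Node in1 = uncastMask(store.in);
        if (in1.op == Op.VECTOR_LOAD_MASK && store.length == in1.length) {
            return in1.in;
        }
        return store;
    }

    // (VectorMaskCast (VectorMaskCast ... x)) => x,
    // applied only when the head of the chain has the same type as x.
    static Node maskCastIdentity(Node cast) {
        Node n = uncastMask(cast);
        if (cast.elemType.equals(n.elemType) && cast.length == n.length) {
            return n;
        }
        return cast;
    }

    public static void main(String[] args) {
        Node x = new Node(Op.BOOL_VECTOR, "boolean", 4, null);
        Node load = new Node(Op.VECTOR_LOAD_MASK, "float", 4, x);
        Node toInt = new Node(Op.VECTOR_MASK_CAST, "int", 4, load);
        Node toFloat = new Node(Op.VECTOR_MASK_CAST, "float", 4, toInt);
        Node store = new Node(Op.VECTOR_STORE_MASK, "boolean", 4, toFloat);

        System.out.println(storeMaskIdentity(store) == x);      // true
        System.out.println(maskCastIdentity(toFloat) == load);  // true
        System.out.println(maskCastIdentity(toInt) == toInt);   // true
    }
}
```

In this model the float-to-int-to-float chain collapses because its head and tail types agree, while the single float-to-int cast is kept because the types differ.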
Benchmarks:

On an Nvidia Grace machine with 128-bit SVE2:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  59.23   0.21   148.12  0.07   2.50
microMaskLoadCastStoreDouble128  ops/us  2.43    0.00   38.31   0.01   15.73
microMaskLoadCastStoreFloat128   ops/us  6.19    0.00   75.67   0.11   12.22
microMaskLoadCastStoreInt128     ops/us  6.19    0.00   75.67   0.03   12.22
microMaskLoadCastStoreLong128    ops/us  2.43    0.00   38.32   0.01   15.74
microMaskLoadCastStoreShort64    ops/us  28.89   0.02   75.60   0.09   2.62
```

On an Nvidia Grace machine with 128-bit NEON:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  75.75   0.19   149.74  0.08   1.98
microMaskLoadCastStoreDouble128  ops/us  8.71    0.03   38.71   0.05   4.44
microMaskLoadCastStoreFloat128   ops/us  24.05   0.03   76.49   0.05   3.18
microMaskLoadCastStoreInt128     ops/us  24.06   0.02   76.51   0.05   3.18
microMaskLoadCastStoreLong128    ops/us  8.72    0.01   38.71   0.02   4.44
microMaskLoadCastStoreShort64    ops/us  24.64   0.01   76.43   0.06   3.10
```

On an AMD EPYC 9124 16-Core Processor with AVX3:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  82.13   0.31   115.14  0.08   1.40
microMaskLoadCastStoreDouble128  ops/us  0.32    0.00   0.32    0.00   1.01
microMaskLoadCastStoreFloat128   ops/us  42.18   0.05   57.56   0.07   1.36
microMaskLoadCastStoreInt128     ops/us  42.19   0.01   57.53   0.08   1.36
microMaskLoadCastStoreLong128    ops/us  0.30    0.01   0.32    0.00   1.05
microMaskLoadCastStoreShort64    ops/us  42.18   0.05   57.59   0.01   1.37
```

On an AMD EPYC 9124 16-Core Processor with AVX2:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  73.53   0.20   114.98  0.03   1.56
microMaskLoadCastStoreDouble128  ops/us  0.29    0.01   0.30    0.00   1.00
microMaskLoadCastStoreFloat128   ops/us  30.78   0.14   57.50   0.01   1.87
microMaskLoadCastStoreInt128     ops/us  30.65   0.26   57.50   0.01   1.88
microMaskLoadCastStoreLong128    ops/us  0.30    0.00   0.30    0.00   0.99
microMaskLoadCastStoreShort64    ops/us  24.92   0.00   57.49   0.01   2.31
```

On an AMD EPYC 9124 16-Core Processor with AVX1:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  79.68   0.01   248.49  0.91   3.12
microMaskLoadCastStoreDouble128  ops/us  0.28    0.00   0.28    0.00   1.00
microMaskLoadCastStoreFloat128   ops/us  31.11   0.04   95.48   2.27   3.07
microMaskLoadCastStoreInt128     ops/us  31.10   0.03   99.94   1.87   3.21
microMaskLoadCastStoreLong128    ops/us  0.28    0.00   0.28    0.00   0.99
microMaskLoadCastStoreShort64    ops/us  31.11   0.02   94.97   2.30   3.05
```

This PR was tested on 128-bit, 256-bit, and 512-bit (QEMU) aarch64 environments, and on two 512-bit x64 machines with various configurations, including sve2, sve1, neon, avx3, avx2, avx1, sse4 and sse3. All tests passed.
---
 src/hotspot/share/opto/vectornode.cpp         |  37 ++-
 src/hotspot/share/opto/vectornode.hpp         |   1 +
 .../vectorapi/VectorMaskCastIdentityTest.java |  51 ++-
 .../vectorapi/VectorMaskCastTest.java         | 188 ++++++++----
 .../vectorapi/VectorMaskToLongTest.java       |  16 +-
 .../VectorStoreMaskIdentityTest.java          | 290 ++++++++++++++++++
 .../vector/VectorStoreMaskBenchmark.java      |  83 +++++
 7 files changed, 582 insertions(+), 84 deletions(-)
 create mode 100644 test/hotspot/jtreg/compiler/vectorapi/VectorStoreMaskIdentityTest.java
 create mode 100644 test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java

diff --git a/src/hotspot/share/opto/vectornode.cpp b/src/hotspot/share/opto/vectornode.cpp
index a49f3d24fd4ac..bc79399900c30 100644
--- a/src/hotspot/share/opto/vectornode.cpp
+++ b/src/hotspot/share/opto/vectornode.cpp
@@ -1046,6 +1046,25 @@ Node* VectorNode::Ideal(PhaseGVN* phase, bool can_reshape) {
   return nullptr;
 }
 
+// Traverses a chain of VectorMaskCast and returns the first non VectorMaskCast node.
+//
+// Due to the unique nature of vector masks, for specific IR patterns,
+// VectorMaskCast does not affect the output results. For example:
+//   (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => (x)
+// x remains to be a bool vector with no changes.
+// This function can be used to eliminate the VectorMaskCast in such patterns.
+Node* VectorNode::uncast_mask(Node* n) {
+  while (n->Opcode() == Op_VectorMaskCast) {
+    Node* in1 = n->in(1);
+    if (!in1->isa_Vector()) {
+      break;
+    }
+    assert(n->as_Vector()->length() == in1->as_Vector()->length(), "vector length must match");
+    n = in1;
+  }
+  return n;
+}
+
 // Return initial Pack node. Additional operands added with add_opd() calls.
 PackNode* PackNode::make(Node* s, uint vlen, BasicType bt) {
   const TypeVect* vt = TypeVect::make(bt, vlen);
@@ -1452,10 +1471,11 @@ Node* VectorLoadMaskNode::Identity(PhaseGVN* phase) {
 
 Node* VectorStoreMaskNode::Identity(PhaseGVN* phase) {
   // Identity transformation on boolean vectors.
-  //   VectorStoreMask (VectorLoadMask bv) elem_size ==> bv
+  //   VectorStoreMask (VectorMaskCast ... VectorLoadMask bv) elem_size ==> bv
   //   vector[n]{bool} => vector[n]{t} => vector[n]{bool}
-  if (in(1)->Opcode() == Op_VectorLoadMask) {
-    return in(1)->in(1);
+  Node* in1 = VectorNode::uncast_mask(in(1));
+  if (in1->Opcode() == Op_VectorLoadMask && length() == in1->as_Vector()->length()) {
+    return in1->in(1);
   }
   return this;
 }
@@ -1885,11 +1905,12 @@ Node* VectorMaskOpNode::Ideal(PhaseGVN* phase, bool can_reshape) {
 }
 
 Node* VectorMaskCastNode::Identity(PhaseGVN* phase) {
-  Node* in1 = in(1);
-  // VectorMaskCast (VectorMaskCast x) => x
-  if (in1->Opcode() == Op_VectorMaskCast &&
-      vect_type()->eq(in1->in(1)->bottom_type())) {
-    return in1->in(1);
+  // (VectorMaskCast (VectorMaskCast ... x)) => (x)
+  // If the types of the input and output nodes in a VectorMaskCast chain are
+  // exactly the same, the intermediate VectorMaskCast nodes can be eliminated.
+  Node* n = VectorNode::uncast_mask(this);
+  if (vect_type()->eq(n->bottom_type())) {
+    return n;
   }
   return this;
 }
diff --git a/src/hotspot/share/opto/vectornode.hpp b/src/hotspot/share/opto/vectornode.hpp
index 427aeff53fcf9..91e369830ac6a 100644
--- a/src/hotspot/share/opto/vectornode.hpp
+++ b/src/hotspot/share/opto/vectornode.hpp
@@ -138,6 +138,7 @@ class VectorNode : public TypeNode {
 
   static bool is_scalar_op_that_returns_int_but_vector_op_returns_long(int opc);
   static bool is_reinterpret_opcode(int opc);
+  static Node* uncast_mask(Node* n);
 
   static void trace_new_vector(Node* n, const char* context) {
 #ifdef ASSERT
diff --git a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastIdentityTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastIdentityTest.java
index e66b16f053b77..133cc1198b0c7 100644
--- a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastIdentityTest.java
+++ b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastIdentityTest.java
@@ -23,9 +23,9 @@
 
 /*
 * @test
-* @bug 8356760
+* @bug 8356760 8370863
 * @library /test/lib /
-* @summary Optimize VectorMask.fromLong for all-true/all-false cases
+* @summary test VectorMaskCast Identity() optimizations
 * @modules jdk.incubator.vector
 *
 * @run driver compiler.vectorapi.VectorMaskCastIdentityTest
@@ -49,9 +49,14 @@ public class VectorMaskCastIdentityTest {
     }
 
     @Test
-    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 2" }, applyIfCPUFeatureOr = {"asimd", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 2" },
+        applyIfCPUFeature = {"sve", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureAnd = {"asimd", "true", "sve", "false"})
     public static int testTwoCastToDifferentType() {
         // The types before and after the two casts are not the same, so the cast cannot be eliminated.
+        // On NEON, VectorMaskCast nodes are eliminated by optimization pattern
+        // (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => x
         VectorMask mFloat64 = VectorMask.fromArray(FloatVector.SPECIES_64, mr, 0);
         VectorMask mDouble128 = mFloat64.cast(DoubleVector.SPECIES_128);
         VectorMask mInt64 = mDouble128.cast(IntVector.SPECIES_64);
@@ -66,7 +71,10 @@ public static void testTwoCastToDifferentType_runner() {
     }
 
     @Test
-    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 2" }, applyIfCPUFeatureOr = {"avx2", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 2" },
+        applyIfCPUFeature = {"avx512", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureAnd = {"avx2", "true", "avx512", "false"})
     public static int testTwoCastToDifferentType2() {
         // The types before and after the two casts are not the same, so the cast cannot be eliminated.
         VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0);
@@ -83,10 +91,11 @@ public static void testTwoCastToDifferentType2_runner() {
     }
 
     @Test
-    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static int testTwoCastToSameType() {
         // The types before and after the two casts are the same, so the cast will be eliminated.
-        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0);
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
         VectorMask mFloat128 = mInt128.cast(FloatVector.SPECIES_128);
         VectorMask mInt128_2 = mFloat128.cast(IntVector.SPECIES_128);
         return mInt128_2.trueCount();
@@ -95,23 +104,41 @@ public static int testTwoCastToSameType() {
     @Run(test = "testTwoCastToSameType")
     public static void testTwoCastToSameType_runner() {
         int count = testTwoCastToSameType();
-        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0);
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
         Asserts.assertEquals(count, mInt128.trueCount());
     }
 
     @Test
-    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 1" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 1" },
+        applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static int testOneCastToDifferentType() {
         // The types before and after the only cast are different, the cast will not be eliminated.
-        VectorMask mFloat128 = VectorMask.fromArray(FloatVector.SPECIES_128, mr, 0).not();
-        VectorMask mInt128 = mFloat128.cast(IntVector.SPECIES_128);
-        return mInt128.trueCount();
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
+        VectorMask mShort64 = mInt128.cast(ShortVector.SPECIES_64);
+        return mShort64.trueCount();
     }
 
     @Run(test = "testOneCastToDifferentType")
     public static void testOneCastToDifferentType_runner() {
         int count = testOneCastToDifferentType();
-        VectorMask mInt128 = VectorMask.fromArray(FloatVector.SPECIES_128, mr, 0).not();
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
+        Asserts.assertEquals(count, mInt128.trueCount());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
+    public static int testOneCastToSameType() {
+        // The types before and after the only cast are the same, the cast will be eliminated.
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
+        mInt128 = mInt128.cast(IntVector.SPECIES_128);
+        return mInt128.trueCount();
+    }
+
+    @Run(test = "testOneCastToSameType")
+    public static void testOneCastToSameType_runner() {
+        int count = testOneCastToSameType();
+        VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mr, 0).not();
         Asserts.assertEquals(count, mInt128.trueCount());
     }
 
diff --git a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastTest.java
index 25594d5caf941..69b354d7551bd 100644
--- a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastTest.java
+++ b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastTest.java
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2021, 2023, Arm Limited. All rights reserved.
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -76,13 +77,15 @@ public class VectorMaskCastTest {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static VectorMask testByte64ToShort128(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_128);
+        // A not operation is introduced to prevent the cast from being optimized away.
+        return v.not().cast(ShortVector.SPECIES_128);
     }
 
     @Run(test = "testByte64ToShort128")
     public static void testByte64ToShort128_runner() {
         VectorMask mByte64 = VectorMask.fromArray(ByteVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testByte64ToShort128(mByte64);
+        mByte64 = mByte64.not();
         Asserts.assertEquals(res.toString(), mByte64.toString());
         Asserts.assertEquals(res.trueCount(), mByte64.trueCount());
     }
@@ -90,13 +93,14 @@ public static void testByte64ToShort128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testByte64ToInt256(VectorMask v) {
-        return v.cast(IntVector.SPECIES_256);
+        return v.not().cast(IntVector.SPECIES_256);
     }
 
     @Run(test = "testByte64ToInt256")
     public static void testByte64ToInt256_runner() {
         VectorMask mByte64 = VectorMask.fromArray(ByteVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testByte64ToInt256(mByte64);
+        mByte64 = mByte64.not();
         Asserts.assertEquals(res.toString(), mByte64.toString());
         Asserts.assertEquals(res.trueCount(), mByte64.trueCount());
     }
@@ -104,13 +108,14 @@ public static void testByte64ToInt256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testByte64ToFloat256(VectorMask v) {
-        return v.cast(FloatVector.SPECIES_256);
+        return v.not().cast(FloatVector.SPECIES_256);
     }
 
     @Run(test = "testByte64ToFloat256")
     public static void testByte64ToFloat256_runner() {
         VectorMask mByte64 = VectorMask.fromArray(ByteVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testByte64ToFloat256(mByte64);
+        mByte64 = mByte64.not();
         Asserts.assertEquals(res.toString(), mByte64.toString());
         Asserts.assertEquals(res.trueCount(), mByte64.trueCount());
     }
@@ -118,13 +123,14 @@ public static void testByte64ToFloat256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testByte64ToLong512(VectorMask v) {
-        return v.cast(LongVector.SPECIES_512);
+        return v.not().cast(LongVector.SPECIES_512);
     }
 
     @Run(test = "testByte64ToLong512")
     public static void testByte64ToLong512_runner() {
         VectorMask mByte64 = VectorMask.fromArray(ByteVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testByte64ToLong512(mByte64);
+        mByte64 = mByte64.not();
         Asserts.assertEquals(res.toString(), mByte64.toString());
         Asserts.assertEquals(res.trueCount(), mByte64.trueCount());
     }
@@ -132,13 +138,14 @@ public static void testByte64ToLong512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testByte64ToDouble512(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_512);
+        return v.not().cast(DoubleVector.SPECIES_512);
     }
 
     @Run(test = "testByte64ToDouble512")
     public static void testByte64ToDouble512_runner() {
         VectorMask mByte64 = VectorMask.fromArray(ByteVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testByte64ToDouble512(mByte64);
+        mByte64 = mByte64.not();
         Asserts.assertEquals(res.toString(), mByte64.toString());
         Asserts.assertEquals(res.trueCount(), mByte64.trueCount());
     }
@@ -146,13 +153,14 @@ public static void testByte64ToDouble512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testByte128ToShort256(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_256);
+        return v.not().cast(ShortVector.SPECIES_256);
     }
 
     @Run(test = "testByte128ToShort256")
     public static void testByte128ToShort256_runner() {
         VectorMask mByte128 = VectorMask.fromArray(ByteVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testByte128ToShort256(mByte128);
+        mByte128 = mByte128.not();
         Asserts.assertEquals(res.toString(), mByte128.toString());
         Asserts.assertEquals(res.trueCount(), mByte128.trueCount());
     }
@@ -160,13 +168,14 @@ public static void testByte128ToShort256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testByte128ToInt512(VectorMask v) {
-        return v.cast(IntVector.SPECIES_512);
+        return v.not().cast(IntVector.SPECIES_512);
     }
 
     @Run(test = "testByte128ToInt512")
     public static void testByte128ToInt512_runner() {
         VectorMask mByte128 = VectorMask.fromArray(ByteVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testByte128ToInt512(mByte128);
+        mByte128 = mByte128.not();
         Asserts.assertEquals(res.toString(), mByte128.toString());
         Asserts.assertEquals(res.trueCount(), mByte128.trueCount());
     }
@@ -174,13 +183,14 @@ public static void testByte128ToInt512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testByte128ToFloat512(VectorMask v) {
-        return v.cast(FloatVector.SPECIES_512);
+        return v.not().cast(FloatVector.SPECIES_512);
     }
 
     @Run(test = "testByte128ToFloat512")
     public static void testByte128ToFloat512_runner() {
         VectorMask mByte128 = VectorMask.fromArray(ByteVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testByte128ToFloat512(mByte128);
+        mByte128 = mByte128.not();
         Asserts.assertEquals(res.toString(), mByte128.toString());
         Asserts.assertEquals(res.trueCount(), mByte128.trueCount());
     }
@@ -188,13 +198,14 @@ public static void testByte128ToFloat512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testByte256ToShort512(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_512);
+        return v.not().cast(ShortVector.SPECIES_512);
     }
 
     @Run(test = "testByte256ToShort512")
     public static void testByte256ToShort512_runner() {
         VectorMask mByte256 = VectorMask.fromArray(ByteVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testByte256ToShort512(mByte256);
+        mByte256 = mByte256.not();
         Asserts.assertEquals(res.toString(), mByte256.toString());
         Asserts.assertEquals(res.trueCount(), mByte256.trueCount());
     }
@@ -203,13 +214,14 @@ public static void testByte256ToShort512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static VectorMask testShort64ToInt128(VectorMask v) {
-        return v.cast(IntVector.SPECIES_128);
+        return v.not().cast(IntVector.SPECIES_128);
     }
 
     @Run(test = "testShort64ToInt128")
     public static void testShort64ToInt128_runner() {
         VectorMask mShort64 = VectorMask.fromArray(ShortVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testShort64ToInt128(mShort64);
+        mShort64 = mShort64.not();
         Asserts.assertEquals(res.toString(), mShort64.toString());
         Asserts.assertEquals(res.trueCount(), mShort64.trueCount());
     }
@@ -217,13 +229,14 @@ public static void testShort64ToInt128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static VectorMask testShort64ToFloat128(VectorMask v) {
-        return v.cast(FloatVector.SPECIES_128);
+        return v.not().cast(FloatVector.SPECIES_128);
     }
 
     @Run(test = "testShort64ToFloat128")
     public static void testShort64ToFloat128_runner() {
         VectorMask mShort64 = VectorMask.fromArray(ShortVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testShort64ToFloat128(mShort64);
+        mShort64 = mShort64.not();
         Asserts.assertEquals(res.toString(), mShort64.toString());
         Asserts.assertEquals(res.trueCount(), mShort64.trueCount());
     }
@@ -231,13 +244,14 @@ public static void testShort64ToFloat128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testShort64ToLong256(VectorMask v) {
-        return v.cast(LongVector.SPECIES_256);
+        return v.not().cast(LongVector.SPECIES_256);
     }
 
     @Run(test = "testShort64ToLong256")
     public static void testShort64ToLong256_runner() {
         VectorMask mShort64 = VectorMask.fromArray(ShortVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testShort64ToLong256(mShort64);
+        mShort64 = mShort64.not();
         Asserts.assertEquals(res.toString(), mShort64.toString());
         Asserts.assertEquals(res.trueCount(), mShort64.trueCount());
     }
@@ -245,13 +259,14 @@ public static void testShort64ToLong256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testShort64ToDouble256(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_256);
+        return v.not().cast(DoubleVector.SPECIES_256);
     }
 
     @Run(test = "testShort64ToDouble256")
     public static void testShort64ToDouble256_runner() {
         VectorMask mShort64 = VectorMask.fromArray(ShortVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testShort64ToDouble256(mShort64);
+        mShort64 = mShort64.not();
         Asserts.assertEquals(res.toString(), mShort64.toString());
         Asserts.assertEquals(res.trueCount(), mShort64.trueCount());
     }
@@ -259,13 +274,14 @@ public static void testShort64ToDouble256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static VectorMask testShort128ToByte64(VectorMask v) {
-        return v.cast(ByteVector.SPECIES_64);
+        return v.not().cast(ByteVector.SPECIES_64);
     }
 
     @Run(test = "testShort128ToByte64")
     public static void testShort128ToByte64_runner() {
         VectorMask mShort128 = VectorMask.fromArray(ShortVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testShort128ToByte64(mShort128);
+        mShort128 = mShort128.not();
         Asserts.assertEquals(res.toString(), mShort128.toString());
         Asserts.assertEquals(res.trueCount(), mShort128.trueCount());
     }
@@ -273,13 +289,14 @@ public static void testShort128ToByte64_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testShort128ToInt256(VectorMask v) {
-        return v.cast(IntVector.SPECIES_256);
+        return v.not().cast(IntVector.SPECIES_256);
     }
 
     @Run(test = "testShort128ToInt256")
     public static void testShort128ToInt256_runner() {
         VectorMask mShort128 = VectorMask.fromArray(ShortVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testShort128ToInt256(mShort128);
+        mShort128 = mShort128.not();
         Asserts.assertEquals(res.toString(), mShort128.toString());
         Asserts.assertEquals(res.trueCount(), mShort128.trueCount());
     }
@@ -287,13 +304,14 @@ public static void testShort128ToInt256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testShort128ToFloat256(VectorMask v) {
-        return v.cast(FloatVector.SPECIES_256);
+        return v.not().cast(FloatVector.SPECIES_256);
     }
 
     @Run(test = "testShort128ToFloat256")
     public static void testShort128ToFloat256_runner() {
         VectorMask mShort128 = VectorMask.fromArray(ShortVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testShort128ToFloat256(mShort128);
+        mShort128 = mShort128.not();
         Asserts.assertEquals(res.toString(), mShort128.toString());
         Asserts.assertEquals(res.trueCount(), mShort128.trueCount());
     }
@@ -301,13 +319,14 @@ public static void testShort128ToFloat256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testShort128ToLong512(VectorMask v) {
-        return v.cast(LongVector.SPECIES_512);
+        return v.not().cast(LongVector.SPECIES_512);
     }
 
     @Run(test = "testShort128ToLong512")
     public static void testShort128ToLong512_runner() {
         VectorMask mShort128 = VectorMask.fromArray(ShortVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testShort128ToLong512(mShort128);
+        mShort128 = mShort128.not();
         Asserts.assertEquals(res.toString(), mShort128.toString());
         Asserts.assertEquals(res.trueCount(), mShort128.trueCount());
     }
@@ -315,13 +334,14 @@ public static void testShort128ToLong512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testShort128ToDouble512(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_512);
+        return v.not().cast(DoubleVector.SPECIES_512);
     }
 
     @Run(test = "testShort128ToDouble512")
     public static void testShort128ToDouble512_runner() {
         VectorMask mShort128 = VectorMask.fromArray(ShortVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testShort128ToDouble512(mShort128);
+        mShort128 = mShort128.not();
         Asserts.assertEquals(res.toString(), mShort128.toString());
         Asserts.assertEquals(res.trueCount(), mShort128.trueCount());
     }
@@ -329,13 +349,14 @@ public static void testShort128ToDouble512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testShort256ToByte128(VectorMask v) {
-        return v.cast(ByteVector.SPECIES_128);
+        return v.not().cast(ByteVector.SPECIES_128);
     }
 
     @Run(test = "testShort256ToByte128")
     public static void testShort256ToByte128_runner() {
         VectorMask mShort256 = VectorMask.fromArray(ShortVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testShort256ToByte128(mShort256);
+        mShort256 = mShort256.not();
         Asserts.assertEquals(res.toString(), mShort256.toString());
         Asserts.assertEquals(res.trueCount(), mShort256.trueCount());
     }
@@ -343,13 +364,14 @@ public static void testShort256ToByte128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testShort256ToInt512(VectorMask v) {
-        return v.cast(IntVector.SPECIES_512);
+        return v.not().cast(IntVector.SPECIES_512);
     }
 
     @Run(test = "testShort256ToInt512")
     public static void testShort256ToInt512_runner() {
         VectorMask mShort256 = VectorMask.fromArray(ShortVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testShort256ToInt512(mShort256);
+        mShort256 = mShort256.not();
         Asserts.assertEquals(res.toString(), mShort256.toString());
         Asserts.assertEquals(res.trueCount(), mShort256.trueCount());
     }
@@ -357,13 +379,14 @@ public static void testShort256ToInt512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testShort256ToFloat512(VectorMask v) {
-        return v.cast(FloatVector.SPECIES_512);
+        return v.not().cast(FloatVector.SPECIES_512);
     }
 
     @Run(test = "testShort256ToFloat512")
     public static void testShort256ToFloat512_runner() {
         VectorMask mShort256 = VectorMask.fromArray(ShortVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testShort256ToFloat512(mShort256);
+        mShort256 = mShort256.not();
         Asserts.assertEquals(res.toString(), mShort256.toString());
         Asserts.assertEquals(res.trueCount(), mShort256.trueCount());
     }
@@ -371,13 +394,14 @@ public static void testShort256ToFloat512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testShort512ToByte256(VectorMask v) {
-        return v.cast(ByteVector.SPECIES_256);
+        return v.not().cast(ByteVector.SPECIES_256);
     }
 
     @Run(test = "testShort512ToByte256")
     public static void testShort512ToByte256_runner() {
         VectorMask mShort512 = VectorMask.fromArray(ShortVector.SPECIES_512, mask_arr, 0);
         VectorMask res = testShort512ToByte256(mShort512);
+        mShort512 = mShort512.not();
         Asserts.assertEquals(res.toString(), mShort512.toString());
         Asserts.assertEquals(res.trueCount(), mShort512.trueCount());
     }
@@ -386,13 +410,14 @@ public static void testShort512ToByte256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"})
     public static VectorMask testInt64ToLong128(VectorMask v) {
-        return v.cast(LongVector.SPECIES_128);
+        return v.not().cast(LongVector.SPECIES_128);
     }
 
     @Run(test = "testInt64ToLong128")
     public static void testInt64ToLong128_runner() {
         VectorMask mInt64 = VectorMask.fromArray(IntVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testInt64ToLong128(mInt64);
+        mInt64 = mInt64.not();
         Asserts.assertEquals(res.toString(), mInt64.toString());
         Asserts.assertEquals(res.trueCount(), mInt64.trueCount());
     }
@@ -400,13 +425,14 @@ public static void testInt64ToLong128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"})
     public static VectorMask testInt64ToDouble128(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_128);
+        return v.not().cast(DoubleVector.SPECIES_128);
     }
 
     @Run(test = "testInt64ToDouble128")
     public static void testInt64ToDouble128_runner() {
         VectorMask mInt64 = VectorMask.fromArray(IntVector.SPECIES_64, mask_arr, 0);
         VectorMask res = testInt64ToDouble128(mInt64);
+        mInt64 = mInt64.not();
         Asserts.assertEquals(res.toString(), mInt64.toString());
         Asserts.assertEquals(res.trueCount(), mInt64.trueCount());
     }
@@ -414,13 +440,14 @@ public static void testInt64ToDouble128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"})
     public static VectorMask testInt128ToShort64(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_64);
+        return v.not().cast(ShortVector.SPECIES_64);
     }
 
     @Run(test = "testInt128ToShort64")
     public static void testInt128ToShort64_runner() {
         VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testInt128ToShort64(mInt128);
+        mInt128 = mInt128.not();
         Asserts.assertEquals(res.toString(), mInt128.toString());
         Asserts.assertEquals(res.trueCount(), mInt128.trueCount());
     }
@@ -428,13 +455,14 @@ public static void testInt128ToShort64_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testInt128ToLong256(VectorMask v) {
-        return v.cast(LongVector.SPECIES_256);
+        return v.not().cast(LongVector.SPECIES_256);
     }
 
     @Run(test = "testInt128ToLong256")
     public static void testInt128ToLong256_runner() {
         VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testInt128ToLong256(mInt128);
+        mInt128 = mInt128.not();
         Asserts.assertEquals(res.toString(), mInt128.toString());
         Asserts.assertEquals(res.trueCount(), mInt128.trueCount());
     }
@@ -442,13 +470,14 @@ public static void testInt128ToLong256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testInt128ToDouble256(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_256);
+        return v.not().cast(DoubleVector.SPECIES_256);
     }
 
     @Run(test = "testInt128ToDouble256")
     public static void testInt128ToDouble256_runner() {
         VectorMask mInt128 = VectorMask.fromArray(IntVector.SPECIES_128, mask_arr, 0);
         VectorMask res = testInt128ToDouble256(mInt128);
+        mInt128 = mInt128.not();
         Asserts.assertEquals(res.toString(), mInt128.toString());
         Asserts.assertEquals(res.trueCount(), mInt128.trueCount());
     }
@@ -456,13 +485,14 @@ public static void testInt128ToDouble256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testInt256ToShort128(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_128);
+        return v.not().cast(ShortVector.SPECIES_128);
     }
 
     @Run(test = "testInt256ToShort128")
     public static void testInt256ToShort128_runner() {
         VectorMask mInt256 = VectorMask.fromArray(IntVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testInt256ToShort128(mInt256);
+        mInt256 = mInt256.not();
         Asserts.assertEquals(res.toString(), mInt256.toString());
         Asserts.assertEquals(res.trueCount(), mInt256.trueCount());
     }
@@ -470,13 +500,14 @@ public static void testInt256ToShort128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"})
     public static VectorMask testInt256ToByte64(VectorMask v) {
-        return v.cast(ByteVector.SPECIES_64);
+        return v.not().cast(ByteVector.SPECIES_64);
     }
 
     @Run(test = "testInt256ToByte64")
     public static void testInt256ToByte64_runner() {
         VectorMask mInt256 = VectorMask.fromArray(IntVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testInt256ToByte64(mInt256);
+        mInt256 = mInt256.not();
         Asserts.assertEquals(res.toString(), mInt256.toString());
         Asserts.assertEquals(res.trueCount(), mInt256.trueCount());
     }
@@ -484,13 +515,14 @@ public static void testInt256ToByte64_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testInt256ToLong512(VectorMask v) {
-        return v.cast(LongVector.SPECIES_512);
+        return v.not().cast(LongVector.SPECIES_512);
     }
 
     @Run(test = "testInt256ToLong512")
     public static void testInt256ToLong512_runner() {
         VectorMask mInt256 = VectorMask.fromArray(IntVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testInt256ToLong512(mInt256);
+        mInt256 = mInt256.not();
         Asserts.assertEquals(res.toString(), mInt256.toString());
         Asserts.assertEquals(res.trueCount(), mInt256.trueCount());
     }
@@ -498,13 +530,14 @@ public static void testInt256ToLong512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testInt256ToDouble512(VectorMask v) {
-        return v.cast(DoubleVector.SPECIES_512);
+        return v.not().cast(DoubleVector.SPECIES_512);
     }
 
     @Run(test = "testInt256ToDouble512")
     public static void testInt256ToDouble512_runner() {
         VectorMask mInt256 = VectorMask.fromArray(IntVector.SPECIES_256, mask_arr, 0);
         VectorMask res = testInt256ToDouble512(mInt256);
+        mInt256 = mInt256.not();
         Asserts.assertEquals(res.toString(), mInt256.toString());
         Asserts.assertEquals(res.trueCount(), mInt256.trueCount());
     }
@@ -512,13 +545,14 @@ public static void testInt256ToDouble512_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testInt512ToShort256(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_256);
+        return v.not().cast(ShortVector.SPECIES_256);
     }
@Run(test = "testInt512ToShort256") public static void testInt512ToShort256_runner() { VectorMask mInt512 = VectorMask.fromArray(IntVector.SPECIES_512, mask_arr, 0); VectorMask res = testInt512ToShort256(mInt512); + mInt512 = mInt512.not(); Asserts.assertEquals(res.toString(), mInt512.toString()); Asserts.assertEquals(res.trueCount(), mInt512.trueCount()); } @@ -526,13 +560,14 @@ public static void testInt512ToShort256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testInt512ToByte128(VectorMask v) { - return v.cast(ByteVector.SPECIES_128); + return v.not().cast(ByteVector.SPECIES_128); } @Run(test = "testInt512ToByte128") public static void testInt512ToByte128_runner() { VectorMask mInt512 = VectorMask.fromArray(IntVector.SPECIES_512, mask_arr, 0); VectorMask res = testInt512ToByte128(mInt512); + mInt512 = mInt512.not(); Asserts.assertEquals(res.toString(), mInt512.toString()); Asserts.assertEquals(res.trueCount(), mInt512.trueCount()); } @@ -541,13 +576,14 @@ public static void testInt512ToByte128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testFloat64ToLong128(VectorMask v) { - return v.cast(LongVector.SPECIES_128); + return v.not().cast(LongVector.SPECIES_128); } @Run(test = "testFloat64ToLong128") public static void testFloat64ToLong128_runner() { VectorMask mFloat64 = VectorMask.fromArray(FloatVector.SPECIES_64, mask_arr, 0); VectorMask res = testFloat64ToLong128(mFloat64); + mFloat64 = mFloat64.not(); Asserts.assertEquals(res.toString(), mFloat64.toString()); Asserts.assertEquals(res.trueCount(), mFloat64.trueCount()); } @@ -555,13 +591,14 @@ public static void testFloat64ToLong128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testFloat64ToDouble128(VectorMask v) { - return v.cast(DoubleVector.SPECIES_128); 
+ return v.not().cast(DoubleVector.SPECIES_128); } @Run(test = "testFloat64ToDouble128") public static void testFloat64ToDouble128_runner() { VectorMask mFloat64 = VectorMask.fromArray(FloatVector.SPECIES_64, mask_arr, 0); VectorMask res = testFloat64ToDouble128(mFloat64); + mFloat64 = mFloat64.not(); Asserts.assertEquals(res.toString(), mFloat64.toString()); Asserts.assertEquals(res.trueCount(), mFloat64.trueCount()); } @@ -569,13 +606,14 @@ public static void testFloat64ToDouble128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeatureOr = {"avx2", "true", "asimd", "true"}) public static VectorMask testFloat128ToShort64(VectorMask v) { - return v.cast(ShortVector.SPECIES_64); + return v.not().cast(ShortVector.SPECIES_64); } @Run(test = "testFloat128ToShort64") public static void testFloat128ToShort64_runner() { VectorMask mFloat128 = VectorMask.fromArray(FloatVector.SPECIES_128, mask_arr, 0); VectorMask res = testFloat128ToShort64(mFloat128); + mFloat128 = mFloat128.not(); Asserts.assertEquals(res.toString(), mFloat128.toString()); Asserts.assertEquals(res.trueCount(), mFloat128.trueCount()); } @@ -583,13 +621,14 @@ public static void testFloat128ToShort64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testFloat128ToLong256(VectorMask v) { - return v.cast(LongVector.SPECIES_256); + return v.not().cast(LongVector.SPECIES_256); } @Run(test = "testFloat128ToLong256") public static void testFloat128ToLong256_runner() { VectorMask mFloat128 = VectorMask.fromArray(FloatVector.SPECIES_128, mask_arr, 0); VectorMask res = testFloat128ToLong256(mFloat128); + mFloat128 = mFloat128.not(); Asserts.assertEquals(res.toString(), mFloat128.toString()); Asserts.assertEquals(res.trueCount(), mFloat128.trueCount()); } @@ -597,13 +636,14 @@ public static void testFloat128ToLong256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", 
"true"}) public static VectorMask testFloat128ToDouble256(VectorMask v) { - return v.cast(DoubleVector.SPECIES_256); + return v.not().cast(DoubleVector.SPECIES_256); } @Run(test = "testFloat128ToDouble256") public static void testFloat128ToDouble256_runner() { VectorMask mFloat128 = VectorMask.fromArray(FloatVector.SPECIES_128, mask_arr, 0); VectorMask res = testFloat128ToDouble256(mFloat128); + mFloat128 = mFloat128.not(); Asserts.assertEquals(res.toString(), mFloat128.toString()); Asserts.assertEquals(res.trueCount(), mFloat128.trueCount()); } @@ -611,13 +651,14 @@ public static void testFloat128ToDouble256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testFloat256ToShort128(VectorMask v) { - return v.cast(ShortVector.SPECIES_128); + return v.not().cast(ShortVector.SPECIES_128); } @Run(test = "testFloat256ToShort128") public static void testFloat256ToShort128_runner() { VectorMask mFloat256 = VectorMask.fromArray(FloatVector.SPECIES_256, mask_arr, 0); VectorMask res = testFloat256ToShort128(mFloat256); + mFloat256 = mFloat256.not(); Asserts.assertEquals(res.toString(), mFloat256.toString()); Asserts.assertEquals(res.trueCount(), mFloat256.trueCount()); } @@ -625,13 +666,14 @@ public static void testFloat256ToShort128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testFloat256ToByte64(VectorMask v) { - return v.cast(ByteVector.SPECIES_64); + return v.not().cast(ByteVector.SPECIES_64); } @Run(test = "testFloat256ToByte64") public static void testFloat256ToByte64_runner() { VectorMask mFloat256 = VectorMask.fromArray(FloatVector.SPECIES_256, mask_arr, 0); VectorMask res = testFloat256ToByte64(mFloat256); + mFloat256 = mFloat256.not(); Asserts.assertEquals(res.toString(), mFloat256.toString()); Asserts.assertEquals(res.trueCount(), mFloat256.trueCount()); } @@ -639,13 +681,14 @@ public static void 
testFloat256ToByte64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testFloat256ToLong512(VectorMask v) { - return v.cast(LongVector.SPECIES_512); + return v.not().cast(LongVector.SPECIES_512); } @Run(test = "testFloat256ToLong512") public static void testFloat256ToLong512_runner() { VectorMask mFloat256 = VectorMask.fromArray(FloatVector.SPECIES_256, mask_arr, 0); VectorMask res = testFloat256ToLong512(mFloat256); + mFloat256 = mFloat256.not(); Asserts.assertEquals(res.toString(), mFloat256.toString()); Asserts.assertEquals(res.trueCount(), mFloat256.trueCount()); } @@ -653,13 +696,14 @@ public static void testFloat256ToLong512_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testFloat256ToDouble512(VectorMask v) { - return v.cast(DoubleVector.SPECIES_512); + return v.not().cast(DoubleVector.SPECIES_512); } @Run(test = "testFloat256ToDouble512") public static void testFloat256ToDouble512_runner() { VectorMask mFloat256 = VectorMask.fromArray(FloatVector.SPECIES_256, mask_arr, 0); VectorMask res = testFloat256ToDouble512(mFloat256); + mFloat256 = mFloat256.not(); Asserts.assertEquals(res.toString(), mFloat256.toString()); Asserts.assertEquals(res.trueCount(), mFloat256.trueCount()); } @@ -667,13 +711,14 @@ public static void testFloat256ToDouble512_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testFloat512ToShort256(VectorMask v) { - return v.cast(ShortVector.SPECIES_256); + return v.not().cast(ShortVector.SPECIES_256); } @Run(test = "testFloat512ToShort256") public static void testFloat512ToShort256_runner() { VectorMask mFloat512 = VectorMask.fromArray(FloatVector.SPECIES_512, mask_arr, 0); VectorMask res = testFloat512ToShort256(mFloat512); + mFloat512 = mFloat512.not(); Asserts.assertEquals(res.toString(), 
mFloat512.toString()); Asserts.assertEquals(res.trueCount(), mFloat512.trueCount()); } @@ -681,13 +726,14 @@ public static void testFloat512ToShort256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testFloat512ToByte128(VectorMask v) { - return v.cast(ByteVector.SPECIES_128); + return v.not().cast(ByteVector.SPECIES_128); } @Run(test = "testFloat512ToByte128") public static void testFloat512ToByte128_runner() { VectorMask mFloat512 = VectorMask.fromArray(FloatVector.SPECIES_512, mask_arr, 0); VectorMask res = testFloat512ToByte128(mFloat512); + mFloat512 = mFloat512.not(); Asserts.assertEquals(res.toString(), mFloat512.toString()); Asserts.assertEquals(res.trueCount(), mFloat512.trueCount()); } @@ -696,13 +742,14 @@ public static void testFloat512ToByte128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testLong128ToInt64(VectorMask v) { - return v.cast(IntVector.SPECIES_64); + return v.not().cast(IntVector.SPECIES_64); } @Run(test = "testLong128ToInt64") public static void testLong128ToInt64_runner() { VectorMask mLong128 = VectorMask.fromArray(LongVector.SPECIES_128, mask_arr, 0); VectorMask res = testLong128ToInt64(mLong128); + mLong128 = mLong128.not(); Asserts.assertEquals(res.toString(), mLong128.toString()); Asserts.assertEquals(res.trueCount(), mLong128.trueCount()); } @@ -710,13 +757,14 @@ public static void testLong128ToInt64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testLong128ToFloat64(VectorMask v) { - return v.cast(FloatVector.SPECIES_64); + return v.not().cast(FloatVector.SPECIES_64); } @Run(test = "testLong128ToFloat64") public static void testLong128ToFloat64_runner() { VectorMask mLong128 = VectorMask.fromArray(LongVector.SPECIES_128, mask_arr, 0); VectorMask res = testLong128ToFloat64(mLong128); 
+ mLong128 = mLong128.not(); Asserts.assertEquals(res.toString(), mLong128.toString()); Asserts.assertEquals(res.trueCount(), mLong128.trueCount()); } @@ -724,13 +772,14 @@ public static void testLong128ToFloat64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testLong256ToInt128(VectorMask v) { - return v.cast(IntVector.SPECIES_128); + return v.not().cast(IntVector.SPECIES_128); } @Run(test = "testLong256ToInt128") public static void testLong256ToInt128_runner() { VectorMask mLong256 = VectorMask.fromArray(LongVector.SPECIES_256, mask_arr, 0); VectorMask res = testLong256ToInt128(mLong256); + mLong256 = mLong256.not(); Asserts.assertEquals(res.toString(), mLong256.toString()); Asserts.assertEquals(res.trueCount(), mLong256.trueCount()); } @@ -738,13 +787,14 @@ public static void testLong256ToInt128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testLong256ToFloat128(VectorMask v) { - return v.cast(FloatVector.SPECIES_128); + return v.not().cast(FloatVector.SPECIES_128); } @Run(test = "testLong256ToFloat128") public static void testLong256ToFloat128_runner() { VectorMask mLong256 = VectorMask.fromArray(LongVector.SPECIES_256, mask_arr, 0); VectorMask res = testLong256ToFloat128(mLong256); + mLong256 = mLong256.not(); Asserts.assertEquals(res.toString(), mLong256.toString()); Asserts.assertEquals(res.trueCount(), mLong256.trueCount()); } @@ -752,13 +802,14 @@ public static void testLong256ToFloat128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testLong256ToShort64(VectorMask v) { - return v.cast(ShortVector.SPECIES_64); + return v.not().cast(ShortVector.SPECIES_64); } @Run(test = "testLong256ToShort64") public static void testLong256ToShort64_runner() { VectorMask mLong256 = VectorMask.fromArray(LongVector.SPECIES_256, 
mask_arr, 0); VectorMask res = testLong256ToShort64(mLong256); + mLong256 = mLong256.not(); Asserts.assertEquals(res.toString(), mLong256.toString()); Asserts.assertEquals(res.trueCount(), mLong256.trueCount()); } @@ -766,13 +817,14 @@ public static void testLong256ToShort64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testLong512ToInt256(VectorMask v) { - return v.cast(IntVector.SPECIES_256); + return v.not().cast(IntVector.SPECIES_256); } @Run(test = "testLong512ToInt256") public static void testLong512ToInt256_runner() { VectorMask mLong512 = VectorMask.fromArray(LongVector.SPECIES_512, mask_arr, 0); VectorMask res = testLong512ToInt256(mLong512); + mLong512 = mLong512.not(); Asserts.assertEquals(res.toString(), mLong512.toString()); Asserts.assertEquals(res.trueCount(), mLong512.trueCount()); } @@ -780,13 +832,14 @@ public static void testLong512ToInt256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testLong512ToFloat256(VectorMask v) { - return v.cast(FloatVector.SPECIES_256); + return v.not().cast(FloatVector.SPECIES_256); } @Run(test = "testLong512ToFloat256") public static void testLong512ToFloat256_runner() { VectorMask mLong512 = VectorMask.fromArray(LongVector.SPECIES_512, mask_arr, 0); VectorMask res = testLong512ToFloat256(mLong512); + mLong512 = mLong512.not(); Asserts.assertEquals(res.toString(), mLong512.toString()); Asserts.assertEquals(res.trueCount(), mLong512.trueCount()); } @@ -794,13 +847,14 @@ public static void testLong512ToFloat256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testLong512ToShort128(VectorMask v) { - return v.cast(ShortVector.SPECIES_128); + return v.not().cast(ShortVector.SPECIES_128); } @Run(test = "testLong512ToShort128") public static void 
testLong512ToShort128_runner() { VectorMask mLong512 = VectorMask.fromArray(LongVector.SPECIES_512, mask_arr, 0); VectorMask res = testLong512ToShort128(mLong512); + mLong512 = mLong512.not(); Asserts.assertEquals(res.toString(), mLong512.toString()); Asserts.assertEquals(res.trueCount(), mLong512.trueCount()); } @@ -808,13 +862,14 @@ public static void testLong512ToShort128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testLong512ToByte64(VectorMask v) { - return v.cast(ByteVector.SPECIES_64); + return v.not().cast(ByteVector.SPECIES_64); } @Run(test = "testLong512ToByte64") public static void testLong512ToByte64_runner() { VectorMask mLong512 = VectorMask.fromArray(LongVector.SPECIES_512, mask_arr, 0); VectorMask res = testLong512ToByte64(mLong512); + mLong512 = mLong512.not(); Asserts.assertEquals(res.toString(), mLong512.toString()); Asserts.assertEquals(res.trueCount(), mLong512.trueCount()); } @@ -823,13 +878,14 @@ public static void testLong512ToByte64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testDouble128ToInt64(VectorMask v) { - return v.cast(IntVector.SPECIES_64); + return v.not().cast(IntVector.SPECIES_64); } @Run(test = "testDouble128ToInt64") public static void testDouble128ToInt64_runner() { VectorMask mDouble128 = VectorMask.fromArray(DoubleVector.SPECIES_128, mask_arr, 0); VectorMask res = testDouble128ToInt64(mDouble128); + mDouble128 = mDouble128.not(); Asserts.assertEquals(res.toString(), mDouble128.toString()); Asserts.assertEquals(res.trueCount(), mDouble128.trueCount()); } @@ -837,13 +893,14 @@ public static void testDouble128ToInt64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"asimd", "true"}) public static VectorMask testDouble128ToFloat64(VectorMask v) { - return v.cast(FloatVector.SPECIES_64); + return 
v.not().cast(FloatVector.SPECIES_64); } @Run(test = "testDouble128ToFloat64") public static void testDouble128ToFloat64_runner() { VectorMask mDouble128 = VectorMask.fromArray(DoubleVector.SPECIES_128, mask_arr, 0); VectorMask res = testDouble128ToFloat64(mDouble128); + mDouble128 = mDouble128.not(); Asserts.assertEquals(res.toString(), mDouble128.toString()); Asserts.assertEquals(res.trueCount(), mDouble128.trueCount()); } @@ -851,13 +908,14 @@ public static void testDouble128ToFloat64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testDouble256ToInt128(VectorMask v) { - return v.cast(IntVector.SPECIES_128); + return v.not().cast(IntVector.SPECIES_128); } @Run(test = "testDouble256ToInt128") public static void testDouble256ToInt128_runner() { VectorMask mDouble256 = VectorMask.fromArray(DoubleVector.SPECIES_256, mask_arr, 0); VectorMask res = testDouble256ToInt128(mDouble256); + mDouble256 = mDouble256.not(); Asserts.assertEquals(res.toString(), mDouble256.toString()); Asserts.assertEquals(res.trueCount(), mDouble256.trueCount()); } @@ -865,13 +923,14 @@ public static void testDouble256ToInt128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx2", "true"}) public static VectorMask testDouble256ToFloat128(VectorMask v) { - return v.cast(FloatVector.SPECIES_128); + return v.not().cast(FloatVector.SPECIES_128); } @Run(test = "testDouble256ToFloat128") public static void testDouble256ToFloat128_runner() { VectorMask mDouble256 = VectorMask.fromArray(DoubleVector.SPECIES_256, mask_arr, 0); VectorMask res = testDouble256ToFloat128(mDouble256); + mDouble256 = mDouble256.not(); Asserts.assertEquals(res.toString(), mDouble256.toString()); Asserts.assertEquals(res.trueCount(), mDouble256.trueCount()); } @@ -879,13 +938,14 @@ public static void testDouble256ToFloat128_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = 
{"avx2", "true"}) public static VectorMask testDouble256ToShort64(VectorMask v) { - return v.cast(ShortVector.SPECIES_64); + return v.not().cast(ShortVector.SPECIES_64); } @Run(test = "testDouble256ToShort64") public static void testDouble256ToShort64_runner() { VectorMask mDouble256 = VectorMask.fromArray(DoubleVector.SPECIES_256, mask_arr, 0); VectorMask res = testDouble256ToShort64(mDouble256); + mDouble256 = mDouble256.not(); Asserts.assertEquals(res.toString(), mDouble256.toString()); Asserts.assertEquals(res.trueCount(), mDouble256.trueCount()); } @@ -893,13 +953,14 @@ public static void testDouble256ToShort64_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testDouble512ToInt256(VectorMask v) { - return v.cast(IntVector.SPECIES_256); + return v.not().cast(IntVector.SPECIES_256); } @Run(test = "testDouble512ToInt256") public static void testDouble512ToInt256_runner() { VectorMask mDouble512 = VectorMask.fromArray(DoubleVector.SPECIES_512, mask_arr, 0); VectorMask res = testDouble512ToInt256(mDouble512); + mDouble512 = mDouble512.not(); Asserts.assertEquals(res.toString(), mDouble512.toString()); Asserts.assertEquals(res.trueCount(), mDouble512.trueCount()); } @@ -907,13 +968,14 @@ public static void testDouble512ToInt256_runner() { @Test @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"}) public static VectorMask testDouble512ToFloat256(VectorMask v) { - return v.cast(FloatVector.SPECIES_256); + return v.not().cast(FloatVector.SPECIES_256); } @Run(test = "testDouble512ToFloat256") public static void testDouble512ToFloat256_runner() { VectorMask mDouble512 = VectorMask.fromArray(DoubleVector.SPECIES_512, mask_arr, 0); VectorMask res = testDouble512ToFloat256(mDouble512); + mDouble512 = mDouble512.not(); Asserts.assertEquals(res.toString(), mDouble512.toString()); Asserts.assertEquals(res.trueCount(), mDouble512.trueCount()); } @@ -921,13 
+983,14 @@ public static void testDouble512ToFloat256_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testDouble512ToShort128(VectorMask v) {
-        return v.cast(ShortVector.SPECIES_128);
+        return v.not().cast(ShortVector.SPECIES_128);
     }
 
     @Run(test = "testDouble512ToShort128")
     public static void testDouble512ToShort128_runner() {
         VectorMask mDouble512 = VectorMask.fromArray(DoubleVector.SPECIES_512, mask_arr, 0);
         VectorMask res = testDouble512ToShort128(mDouble512);
+        mDouble512 = mDouble512.not();
         Asserts.assertEquals(res.toString(), mDouble512.toString());
         Asserts.assertEquals(res.trueCount(), mDouble512.trueCount());
     }
@@ -935,13 +998,14 @@ public static void testDouble512ToShort128_runner() {
     @Test
     @IR(counts = { IRNode.VECTOR_MASK_CAST, "> 0" }, applyIfCPUFeature = {"avx512vl", "true"})
     public static VectorMask testDouble512ToByte64(VectorMask v) {
-        return v.cast(ByteVector.SPECIES_64);
+        return v.not().cast(ByteVector.SPECIES_64);
     }
 
     @Run(test = "testDouble512ToByte64")
     public static void testDouble512ToByte64_runner() {
         VectorMask mDouble512 = VectorMask.fromArray(DoubleVector.SPECIES_512, mask_arr, 0);
         VectorMask res = testDouble512ToByte64(mDouble512);
+        mDouble512 = mDouble512.not();
         Asserts.assertEquals(res.toString(), mDouble512.toString());
         Asserts.assertEquals(res.trueCount(), mDouble512.trueCount());
     }
diff --git a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskToLongTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskToLongTest.java
index 35a5aca966acc..082854ac9446d 100644
--- a/test/hotspot/jtreg/compiler/vectorapi/VectorMaskToLongTest.java
+++ b/test/hotspot/jtreg/compiler/vectorapi/VectorMaskToLongTest.java
@@ -237,9 +237,15 @@ public static void testFromLongToLongLong() {
     }
 
     @Test
+    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
+                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
+        applyIfCPUFeature = { "svebitperm", "true" })
     @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 1",
                    IRNode.VECTOR_MASK_TO_LONG, "= 1" },
-        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
+        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
+    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
+                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
+        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
     @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                    IRNode.VECTOR_MASK_TO_LONG, "= 1" },
         applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
@@ -251,9 +257,15 @@ public static void testFromLongToLongFloat() {
     }
 
     @Test
+    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
+                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
+        applyIfCPUFeature = { "svebitperm", "true" })
     @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 1",
                    IRNode.VECTOR_MASK_TO_LONG, "= 1" },
-        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
+        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
+    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
+                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
+        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
     @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                    IRNode.VECTOR_MASK_TO_LONG, "= 1" },
         applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
diff --git a/test/hotspot/jtreg/compiler/vectorapi/VectorStoreMaskIdentityTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorStoreMaskIdentityTest.java
new file mode 100644
index 0000000000000..222e1035acf15
--- /dev/null
+++ b/test/hotspot/jtreg/compiler/vectorapi/VectorStoreMaskIdentityTest.java
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+/*
+* @test
+* @bug 8370863
+* @library /test/lib /
+* @summary VectorStoreMaskNode Identity optimization tests
+* @modules jdk.incubator.vector
+*
+* @run driver compiler.vectorapi.VectorStoreMaskIdentityTest
+*/
+
+package compiler.vectorapi;
+
+import compiler.lib.ir_framework.*;
+import jdk.incubator.vector.*;
+import jdk.test.lib.Asserts;
+
+public class VectorStoreMaskIdentityTest {
+    private static final int LENGTH = 256; // large enough
+    private static boolean[] mask_in;
+    private static boolean[] mask_out;
+    static {
+        mask_in = new boolean[LENGTH];
+        mask_out = new boolean[LENGTH];
+        for (int i = 0; i < LENGTH; i++) {
+            mask_in[i] = (i & 3) == 0;
+        }
+    }
+
+    @ForceInline
+    private static void testOneCastKernel(VectorSpecies from_species,
+                                          VectorSpecies to_species) {
+        VectorMask.fromArray(from_species, mask_in, 0)
+                  .cast(to_species).intoArray(mask_out, 0);
+    }
+
+    @ForceInline
+    private static void testTwoCastsKernel(VectorSpecies from_species,
+                                           VectorSpecies to_species1,
+                                           VectorSpecies to_species2) {
+        VectorMask.fromArray(from_species, mask_in, 0)
+                  .cast(to_species1)
+                  .cast(to_species2).intoArray(mask_out, 0);
+    }
+
+    @ForceInline
+    private static void testThreeCastsKernel(VectorSpecies from_species,
+                                             VectorSpecies to_species1,
+                                             VectorSpecies to_species2,
+                                             VectorSpecies to_species3) {
+        VectorMask.fromArray(from_species, mask_in, 0)
+                  .cast(to_species1)
+                  .cast(to_species2)
+                  .cast(to_species3).intoArray(mask_out, 0);
+    }
+
+    @DontInline
+    private static void verifyResult(int vlen) {
+        for (int i = 0; i < vlen; i++) {
+            Asserts.assertEquals(mask_in[i], mask_out[i], "index " + i);
+        }
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityByte() {
+        testOneCastKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128);
+        verifyResult(ByteVector.SPECIES_64.length());
+
+        testTwoCastsKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128, ByteVector.SPECIES_64);
+        verifyResult(ByteVector.SPECIES_64.length());
+
+        testThreeCastsKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128, ByteVector.SPECIES_64, ShortVector.SPECIES_128);
+        verifyResult(ByteVector.SPECIES_64.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" },
+        applyIf = { "MaxVectorSize", ">16" })
+    public static void testVectorMaskStoreIdentityByte256() {
+        testOneCastKernel(ByteVector.SPECIES_64, IntVector.SPECIES_256);
+        verifyResult(ByteVector.SPECIES_64.length());
+
+        testTwoCastsKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128, IntVector.SPECIES_256);
+        verifyResult(ByteVector.SPECIES_64.length());
+
+        testThreeCastsKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128, FloatVector.SPECIES_256, IntVector.SPECIES_256);
+        verifyResult(ByteVector.SPECIES_64.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityShort() {
+        testOneCastKernel(ShortVector.SPECIES_128, ByteVector.SPECIES_64);
+        verifyResult(ShortVector.SPECIES_128.length());
+
+        testTwoCastsKernel(ShortVector.SPECIES_64, IntVector.SPECIES_128, ShortVector.SPECIES_64);
+        verifyResult(ShortVector.SPECIES_64.length());
+
+        testThreeCastsKernel(ShortVector.SPECIES_128, ByteVector.SPECIES_64, ShortVector.SPECIES_128, ByteVector.SPECIES_64);
+        verifyResult(ShortVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" },
+        applyIf = { "MaxVectorSize", ">16" })
+    public static void testVectorMaskStoreIdentityShort256() {
+        testOneCastKernel(ShortVector.SPECIES_128, IntVector.SPECIES_256);
+        verifyResult(ShortVector.SPECIES_128.length());
+
+        testTwoCastsKernel(ShortVector.SPECIES_64, IntVector.SPECIES_128, LongVector.SPECIES_256);
+        verifyResult(ShortVector.SPECIES_64.length());
+
+        testThreeCastsKernel(ShortVector.SPECIES_128, ByteVector.SPECIES_64, FloatVector.SPECIES_256, IntVector.SPECIES_256);
+        verifyResult(ShortVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityInt() {
+        testOneCastKernel(IntVector.SPECIES_MAX, FloatVector.SPECIES_MAX);
+        verifyResult(IntVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(IntVector.SPECIES_128, ShortVector.SPECIES_64, FloatVector.SPECIES_128);
+        verifyResult(IntVector.SPECIES_128.length());
+
+        testThreeCastsKernel(IntVector.SPECIES_128, ShortVector.SPECIES_64, FloatVector.SPECIES_128, ShortVector.SPECIES_64);
+        verifyResult(IntVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" },
+        applyIf = { "MaxVectorSize", ">16" })
+    public static void testVectorMaskStoreIdentityInt256() {
+        testOneCastKernel(IntVector.SPECIES_MAX, FloatVector.SPECIES_MAX);
+        verifyResult(IntVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(IntVector.SPECIES_128, ShortVector.SPECIES_64, LongVector.SPECIES_256);
+        verifyResult(IntVector.SPECIES_128.length());
+
+        testThreeCastsKernel(IntVector.SPECIES_128, ShortVector.SPECIES_64, FloatVector.SPECIES_128, LongVector.SPECIES_256);
+        verifyResult(IntVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityLong() {
+        testOneCastKernel(LongVector.SPECIES_MAX, DoubleVector.SPECIES_MAX);
+        verifyResult(LongVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(LongVector.SPECIES_128, IntVector.SPECIES_64, DoubleVector.SPECIES_128);
+        verifyResult(LongVector.SPECIES_128.length());
+
+        testThreeCastsKernel(LongVector.SPECIES_128, IntVector.SPECIES_64, DoubleVector.SPECIES_128, IntVector.SPECIES_64);
+        verifyResult(LongVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" },
+        applyIf = { "MaxVectorSize", ">16" })
+    public static void testVectorMaskStoreIdentityLong256() {
+        testOneCastKernel(LongVector.SPECIES_MAX, DoubleVector.SPECIES_MAX);
+        verifyResult(LongVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(LongVector.SPECIES_256, IntVector.SPECIES_128, ShortVector.SPECIES_64);
+        verifyResult(LongVector.SPECIES_256.length());
+
+        testThreeCastsKernel(LongVector.SPECIES_256, IntVector.SPECIES_128, FloatVector.SPECIES_128, ShortVector.SPECIES_64);
+        verifyResult(LongVector.SPECIES_256.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityFloat() {
+        testOneCastKernel(FloatVector.SPECIES_MAX, IntVector.SPECIES_MAX);
+        verifyResult(FloatVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(FloatVector.SPECIES_128, ShortVector.SPECIES_64, IntVector.SPECIES_128);
+        verifyResult(FloatVector.SPECIES_128.length());
+
+        testThreeCastsKernel(FloatVector.SPECIES_128, ShortVector.SPECIES_64, IntVector.SPECIES_128, ShortVector.SPECIES_64);
+        verifyResult(FloatVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" },
+        applyIf = { "MaxVectorSize", ">16" })
+    public static void testVectorMaskStoreIdentityFloat256() {
+        testOneCastKernel(FloatVector.SPECIES_MAX, IntVector.SPECIES_MAX);
+        verifyResult(FloatVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(FloatVector.SPECIES_128, ShortVector.SPECIES_64, LongVector.SPECIES_256);
+        verifyResult(FloatVector.SPECIES_128.length());
+
+        testThreeCastsKernel(FloatVector.SPECIES_128, ShortVector.SPECIES_64, IntVector.SPECIES_128, LongVector.SPECIES_256);
+        verifyResult(FloatVector.SPECIES_128.length());
+    }
+
+    @Test
+    @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0",
+                   IRNode.VECTOR_STORE_MASK, "= 0",
+                   IRNode.VECTOR_MASK_CAST, "= 0" },
+        applyIfCPUFeatureOr = { "asimd", "true", "avx", "true" })
+    public static void testVectorMaskStoreIdentityDouble() {
+        testOneCastKernel(DoubleVector.SPECIES_MAX, LongVector.SPECIES_MAX);
+        verifyResult(DoubleVector.SPECIES_MAX.length());
+
+        testTwoCastsKernel(DoubleVector.SPECIES_128, IntVector.SPECIES_64, LongVector.SPECIES_128);
+        verifyResult(DoubleVector.SPECIES_128.length());
+
testThreeCastsKernel(DoubleVector.SPECIES_128, IntVector.SPECIES_64, LongVector.SPECIES_128, IntVector.SPECIES_64); + verifyResult(DoubleVector.SPECIES_128.length()); + } + + @Test + @IR(counts = { IRNode.VECTOR_LOAD_MASK, "= 0", + IRNode.VECTOR_STORE_MASK, "= 0", + IRNode.VECTOR_MASK_CAST, "= 0" }, + applyIfCPUFeatureOr = { "asimd", "true", "avx2", "true" }, + applyIf = { "MaxVectorSize", ">16" }) + public static void testVectorMaskStoreIdentityDouble256() { + testOneCastKernel(DoubleVector.SPECIES_MAX, LongVector.SPECIES_MAX); + verifyResult(DoubleVector.SPECIES_MAX.length()); + + testTwoCastsKernel(DoubleVector.SPECIES_256, ShortVector.SPECIES_64, IntVector.SPECIES_128); + verifyResult(DoubleVector.SPECIES_256.length()); + + testThreeCastsKernel(DoubleVector.SPECIES_256, ShortVector.SPECIES_64, LongVector.SPECIES_256, IntVector.SPECIES_128); + verifyResult(DoubleVector.SPECIES_256.length()); + } + + public static void main(String[] args) { + TestFramework testFramework = new TestFramework(); + testFramework.setDefaultWarmup(10000) + .addFlags("--add-modules=jdk.incubator.vector") + .start(); + } +} \ No newline at end of file diff --git a/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java b/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java new file mode 100644 index 0000000000000..7b5a589e09d4a --- /dev/null +++ b/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java @@ -0,0 +1,83 @@ +/* + * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. 
+ * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA + * or visit www.oracle.com if you need additional information or have any + * questions. + * + */ + +package org.openjdk.bench.jdk.incubator.vector; + +import jdk.incubator.vector.*; +import java.util.concurrent.TimeUnit; +import org.openjdk.jmh.annotations.*; + +@OutputTimeUnit(TimeUnit.MICROSECONDS) +@State(Scope.Thread) +@Warmup(iterations = 10, time = 1) +@Measurement(iterations = 10, time = 1) +@Fork(value = 1, jvmArgs = {"--add-modules=jdk.incubator.vector"}) +public class VectorStoreMaskBenchmark { + static final int LENGTH = 256; + static final boolean[] mask_arr = new boolean[LENGTH]; + static { + for (int i = 0; i < LENGTH; i++) { + mask_arr[i] = (i & 1) == 0; + } + } + + @CompilerControl(CompilerControl.Mode.INLINE) + public void maskLoadCastStoreKernel(VectorSpecies species_from, VectorSpecies species_to) { + for (int i = 0; i < LENGTH; i += species_from.length()) { + VectorMask mask_from = VectorMask.fromArray(species_from, mask_arr, i); + VectorMask mask_to = mask_from.cast(species_to); + mask_to.intoArray(mask_arr, i); + } + } + + @Benchmark + public void microMaskLoadCastStoreByte64() { + maskLoadCastStoreKernel(ByteVector.SPECIES_64, ShortVector.SPECIES_128); + } + + @Benchmark + public void microMaskLoadCastStoreShort64() { + maskLoadCastStoreKernel(ShortVector.SPECIES_64, IntVector.SPECIES_128); + } + + @Benchmark + public 
void microMaskLoadCastStoreInt128() { + maskLoadCastStoreKernel(IntVector.SPECIES_128, ShortVector.SPECIES_64); + } + + @Benchmark + public void microMaskLoadCastStoreLong128() { + maskLoadCastStoreKernel(LongVector.SPECIES_128, IntVector.SPECIES_64); + } + + @Benchmark + public void microMaskLoadCastStoreFloat128() { + maskLoadCastStoreKernel(FloatVector.SPECIES_128, ShortVector.SPECIES_64); + } + + @Benchmark + public void microMaskLoadCastStoreDouble128() { + maskLoadCastStoreKernel(DoubleVector.SPECIES_128, IntVector.SPECIES_64); + } +} \ No newline at end of file From 3b0ff7d66d055060674a9ad76accbd3b7b41658d Mon Sep 17 00:00:00 2001 From: erfang Date: Thu, 20 Nov 2025 03:53:32 +0000 Subject: [PATCH 2/2] Don't read and write the same memory in the JMH benchmarks --- .../jdk/incubator/vector/VectorStoreMaskBenchmark.java | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java b/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java index 7b5a589e09d4a..6fb21ca0838d1 100644 --- a/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java +++ b/test/micro/org/openjdk/bench/jdk/incubator/vector/VectorStoreMaskBenchmark.java @@ -35,19 +35,20 @@ @Fork(value = 1, jvmArgs = {"--add-modules=jdk.incubator.vector"}) public class VectorStoreMaskBenchmark { static final int LENGTH = 256; - static final boolean[] mask_arr = new boolean[LENGTH]; + static final boolean[] mask_arr_input = new boolean[LENGTH]; + static final boolean[] mask_arr_output = new boolean[LENGTH]; static { for (int i = 0; i < LENGTH; i++) { - mask_arr[i] = (i & 1) == 0; + mask_arr_input[i] = (i & 1) == 0; } } @CompilerControl(CompilerControl.Mode.INLINE) public void maskLoadCastStoreKernel(VectorSpecies species_from, VectorSpecies species_to) { for (int i = 0; i < LENGTH; i += species_from.length()) { - VectorMask mask_from = VectorMask.fromArray(species_from, 
mask_arr, i); + VectorMask mask_from = VectorMask.fromArray(species_from, mask_arr_input, i); VectorMask mask_to = mask_from.cast(species_to); - mask_to.intoArray(mask_arr, i); + mask_to.intoArray(mask_arr_output, i); } }