-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[libc++] Optimize std::find_if #167697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[libc++] Optimize std::find_if #167697
Conversation
| for ([[maybe_unused]] auto _ : state) { | ||
| benchmark::DoNotOptimize(c); | ||
| std::vector<short>::iterator result; | ||
| [[clang::noinline]] result = find_if(c.begin(), c.end(), [](short i) { return i == 1; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that #167965 has landed, I think no inline can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhahn Looks like it's not quite ready: https://godbolt.org/z/9G38nKrsK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also noticed that this fails to vectorize: https://godbolt.org/z/s3rGvEW4K.
a17c365 to
2395272
Compare
|
@llvm/pr-subscribers-libcxx Author: Nikolas Klauser (philnik777) ChangesFull diff: https://github.com/llvm/llvm-project/pull/167697.diff 9 Files Affected:
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index 4b2713191c1c0..535edf7b47b81 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -604,6 +604,7 @@ set(files
__memory/unique_temporary_buffer.h
__memory/uses_allocator.h
__memory/uses_allocator_construction.h
+ __memory/valid_range.h
__memory_resource/memory_resource.h
__memory_resource/monotonic_buffer_resource.h
__memory_resource/polymorphic_allocator.h
@@ -924,7 +925,6 @@ set(files
__utility/in_place.h
__utility/integer_sequence.h
__utility/is_pointer_in_range.h
- __utility/is_valid_range.h
__utility/lazy_synth_three_way_comparator.h
__utility/move.h
__utility/no_destroy.h
diff --git a/libcxx/include/__algorithm/find_if.h b/libcxx/include/__algorithm/find_if.h
index fd63bcc3a50dd..9f7fa480a5571 100644
--- a/libcxx/include/__algorithm/find_if.h
+++ b/libcxx/include/__algorithm/find_if.h
@@ -11,6 +11,7 @@
#define _LIBCPP___ALGORITHM_FIND_IF_H
#include <__config>
+#include <__memory/valid_range.h>
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
# pragma GCC system_header
@@ -21,6 +22,8 @@ _LIBCPP_BEGIN_NAMESPACE_STD
template <class _InputIterator, class _Predicate>
[[__nodiscard__]] inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _InputIterator
find_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
+ std::__assume_valid_range(__first, __last);
+
for (; __first != __last; ++__first)
if (__pred(*__first))
break;
diff --git a/libcxx/include/__memory/valid_range.h b/libcxx/include/__memory/valid_range.h
new file mode 100644
index 0000000000000..fc6675fcfacd0
--- /dev/null
+++ b/libcxx/include/__memory/valid_range.h
@@ -0,0 +1,58 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___MEMORY_VALID_RANGE_H
+#define _LIBCPP___MEMORY_VALID_RANGE_H
+
+#include <__algorithm/comp.h>
+#include <__assert>
+#include <__config>
+#include <__iterator/iterator_traits.h>
+#include <__memory/assume_aligned.h>
+#include <__memory/pointer_traits.h>
+#include <__type_traits/is_constant_evaluated.h>
+#include <__type_traits/is_same.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+# pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+template <class _Tp>
+_LIBCPP_CONSTEXPR_SINCE_CXX14 _LIBCPP_HIDE_FROM_ABI _LIBCPP_NO_SANITIZE("address") bool
+__is_valid_range(const _Tp* __first, const _Tp* __last) {
+ if (__libcpp_is_constant_evaluated()) {
+ // If this is not a constant during constant evaluation, that is because __first and __last are not
+ // part of the same allocation. If they are part of the same allocation, we must still make sure they
+ // are ordered properly.
+ return __builtin_constant_p(__first <= __last) && __first <= __last;
+ }
+
+ return !__less<>()(__last, __first);
+}
+
+template <class _Iter, class _Sent>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 void __assume_valid_range(_Iter&& __first, _Sent&& __last) {
+#if __has_builtin(__builtin_assume_dereferenceable) && !defined(_LIBCPP_CXX03_LANG)
+ if constexpr (__libcpp_is_contiguous_iterator<_Iter>::value && is_same<_Iter, _Sent>::value) {
+ _LIBCPP_ASSERT_INTERNAL(std::__is_valid_range(std::__to_address(__first), std::__to_address(__last)),
+ "Valid range assumption does not hold");
+ if (!__libcpp_is_constant_evaluated()) {
+ using __value_type = typename iterator_traits<_Iter>::value_type;
+ __builtin_assume_dereferenceable(std::__to_address(__first), (__last - __first) * sizeof(__value_type));
+ (void)std::__assume_aligned<_LIBCPP_ALIGNOF(__value_type)>(std::__to_address(__first));
+ (void)std::__assume_aligned<_LIBCPP_ALIGNOF(__value_type)>(std::__to_address(__last));
+ }
+ }
+#endif
+}
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___MEMORY_VALID_RANGE_H
diff --git a/libcxx/include/__utility/is_pointer_in_range.h b/libcxx/include/__utility/is_pointer_in_range.h
index 55fac6256b74e..8e1b86b16df8b 100644
--- a/libcxx/include/__utility/is_pointer_in_range.h
+++ b/libcxx/include/__utility/is_pointer_in_range.h
@@ -12,12 +12,12 @@
#include <__algorithm/comp.h>
#include <__assert>
#include <__config>
+#include <__memory/valid_range.h>
#include <__type_traits/enable_if.h>
#include <__type_traits/integral_constant.h>
#include <__type_traits/is_constant_evaluated.h>
#include <__type_traits/void_t.h>
#include <__utility/declval.h>
-#include <__utility/is_valid_range.h>
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
# pragma GCC system_header
diff --git a/libcxx/include/__utility/is_valid_range.h b/libcxx/include/__utility/is_valid_range.h
deleted file mode 100644
index 7286662dbf309..0000000000000
--- a/libcxx/include/__utility/is_valid_range.h
+++ /dev/null
@@ -1,37 +0,0 @@
-//===----------------------------------------------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#ifndef _LIBCPP___UTILITY_IS_VALID_RANGE_H
-#define _LIBCPP___UTILITY_IS_VALID_RANGE_H
-
-#include <__algorithm/comp.h>
-#include <__config>
-#include <__type_traits/is_constant_evaluated.h>
-
-#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
-# pragma GCC system_header
-#endif
-
-_LIBCPP_BEGIN_NAMESPACE_STD
-
-template <class _Tp>
-_LIBCPP_CONSTEXPR_SINCE_CXX14 _LIBCPP_HIDE_FROM_ABI _LIBCPP_NO_SANITIZE("address") bool
-__is_valid_range(const _Tp* __first, const _Tp* __last) {
- if (__libcpp_is_constant_evaluated()) {
- // If this is not a constant during constant evaluation, that is because __first and __last are not
- // part of the same allocation. If they are part of the same allocation, we must still make sure they
- // are ordered properly.
- return __builtin_constant_p(__first <= __last) && __first <= __last;
- }
-
- return !__less<>()(__last, __first);
-}
-
-_LIBCPP_END_NAMESPACE_STD
-
-#endif // _LIBCPP___UTILITY_IS_VALID_RANGE_H
diff --git a/libcxx/include/module.modulemap.in b/libcxx/include/module.modulemap.in
index 57d66cd1ccaef..d4f9f5f931375 100644
--- a/libcxx/include/module.modulemap.in
+++ b/libcxx/include/module.modulemap.in
@@ -2161,7 +2161,6 @@ module std [system] {
}
module integer_sequence { header "__utility/integer_sequence.h" }
module is_pointer_in_range { header "__utility/is_pointer_in_range.h" }
- module is_valid_range { header "__utility/is_valid_range.h" }
module lazy_synth_three_way_comparator { header "__utility/lazy_synth_three_way_comparator.h" }
module move { header "__utility/move.h" }
module no_destroy { header "__utility/no_destroy.h" }
diff --git a/libcxx/include/streambuf b/libcxx/include/streambuf
index 7dc4e31cc2324..e2d19eae10b31 100644
--- a/libcxx/include/streambuf
+++ b/libcxx/include/streambuf
@@ -117,8 +117,8 @@ protected:
# include <__assert>
# include <__fwd/streambuf.h>
# include <__locale>
+# include <__memory/valid_range.h>
# include <__type_traits/is_same.h>
-# include <__utility/is_valid_range.h>
# include <__utility/scope_guard.h>
# include <climits>
# include <ios>
diff --git a/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp b/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
index afea31fb59e95..cd49567ae911b 100644
--- a/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
+++ b/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
@@ -21,30 +21,18 @@
int main(int argc, char** argv) {
auto std_find = [](auto first, auto last, auto const& value) { return std::find(first, last, value); };
auto std_find_if = [](auto first, auto last, auto const& value) {
- return std::find_if(first, last, [&](auto element) {
- benchmark::DoNotOptimize(element);
- return element == value;
- });
+ return std::find_if(first, last, [&](auto element) { return element == value; });
};
auto std_find_if_not = [](auto first, auto last, auto const& value) {
- return std::find_if_not(first, last, [&](auto element) {
- benchmark::DoNotOptimize(element);
- return element != value;
- });
+ return std::find_if_not(first, last, [&](auto element) { return element != value; });
};
auto ranges_find = [](auto first, auto last, auto const& value) { return std::ranges::find(first, last, value); };
auto ranges_find_if = [](auto first, auto last, auto const& value) {
- return std::ranges::find_if(first, last, [&](auto element) {
- benchmark::DoNotOptimize(element);
- return element == value;
- });
+ return std::ranges::find_if(first, last, [&](auto element) { return element == value; });
};
auto ranges_find_if_not = [](auto first, auto last, auto const& value) {
- return std::ranges::find_if_not(first, last, [&](auto element) {
- benchmark::DoNotOptimize(element);
- return element != value;
- });
+ return std::ranges::find_if_not(first, last, [&](auto element) { return element != value; });
};
auto register_benchmarks = [&](auto bm, std::string comment) {
diff --git a/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp b/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp
new file mode 100644
index 0000000000000..142676e8e4476
--- /dev/null
+++ b/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp
@@ -0,0 +1,51 @@
+#include <benchmark/benchmark.h>
+
+template <class Iter, class ValueT>
+Iter my_find(Iter first, Iter last, const ValueT& i) {
+ for (; first != last; ++first) {
+ if (*first == i)
+ break;
+ }
+ return first;
+}
+
+static auto bm_find_if_no_vectorization(benchmark::State& state) {
+ std::size_t const size = 8192;
+ std::vector<short> c(size, 0);
+
+ for ([[maybe_unused]] auto _ : state) {
+ benchmark::DoNotOptimize(c);
+ std::vector<short>::iterator result;
+ [[clang::noinline]] result = my_find(c.begin(), c.end(), 1);
+ benchmark::DoNotOptimize(result);
+ }
+}
+BENCHMARK(bm_find_if_no_vectorization);
+
+static auto bm_find_if_autovectorization(benchmark::State& state) {
+ std::size_t const size = 8192;
+ std::vector<short> c(size, 0);
+
+ for ([[maybe_unused]] auto _ : state) {
+ benchmark::DoNotOptimize(c);
+ std::vector<short>::iterator result;
+ [[clang::noinline]] result = find_if(c.begin(), c.end(), [](short i) { return i == 1; });
+ benchmark::DoNotOptimize(result);
+ }
+}
+BENCHMARK(bm_find_if_autovectorization);
+
+static auto bm_find_manual_vectorization(benchmark::State& state) {
+ std::size_t const size = 8192;
+ std::vector<short> c(size, 0);
+
+ for ([[maybe_unused]] auto _ : state) {
+ benchmark::DoNotOptimize(c);
+ std::vector<short>::iterator result;
+ [[clang::noinline]] result = find(c.begin(), c.end(), 1);
+ benchmark::DoNotOptimize(result);
+ }
+}
+BENCHMARK(bm_find_manual_vectorization);
+
+BENCHMARK_MAIN();
|
2395272 to
55e9c69
Compare
| // This functions allows the compiler to assume that [__first, __last) is a valid range to be given to an algortihm. | ||
| // Specifically, this means that | ||
| // - [__first, __last) is dereferenceable | ||
| // - __first and __last are correctly aligned according to the language rules | ||
| // This allows (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // This functions allows the compiler to assume that [__first, __last) is a valid range to be given to an algortihm. | |
| // Specifically, this means that | |
| // - [__first, __last) is dereferenceable | |
| // - __first and __last are correctly aligned according to the language rules | |
| // This allows (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns. | |
| // This function allows the compiler to assume that [__first, __last) is a valid range as defined by the C++ Standard. | |
| // Specifically, this means that | |
| // - [__first, __last) is dereferenceable | |
| // - __last is reachable from __first | |
| // - if __first and __last are contiguous iterators, the pointers they "decay to" are correctly aligned according to the language rules for pointers | |
| // | |
| // In practice, we only add explicit assumptions for properties (1) and (3). | |
| // These assumptions allow (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns. |
Uh oh!
There was an error while loading. Please reload this page.