
Conversation

@RoDmitry
Copy link
Collaborator

@RoDmitry RoDmitry commented Nov 9, 2025

Fewer String clones -> more performance 😊
Also includes derive optimizations.

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 9, 2025

@haydenhoang check it out. Does it look good?

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 9, 2025

There is a problem in the OpenApi generation which I have skipped:

pub struct UpdateNlSearchModelParams<'p> {
    /// The ID of the NL search model to update
    pub model_id: Cow<'p, str>,
    /// The NL search model fields to update
    pub body: models::NlSearchModelCreateSchema, // missing <'p>
    pub _phantom: PhantomData<&'p ()>,
}

Somehow the generator thinks that models::NlSearchModelCreateSchema is `isPrimitiveType`; maybe there is a mistake in openapi.yml.
And actually I don't even care about it. It's fixable by hand 😏
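
For reference, a hand-fixed sketch of how the struct could look once the lifetime propagates into the body type, with no PhantomData needed (the inner field of NlSearchModelCreateSchema is assumed here purely for illustration):

```rust
use std::borrow::Cow;

// Hypothetical module standing in for the generated `models`; the real
// schema has more fields, this one is assumed for illustration.
mod models {
    use std::borrow::Cow;

    #[derive(Debug, Clone)]
    pub struct NlSearchModelCreateSchema<'p> {
        pub model_name: Cow<'p, str>,
    }
}

pub struct UpdateNlSearchModelParams<'p> {
    /// The ID of the NL search model to update
    pub model_id: Cow<'p, str>,
    /// The NL search model fields to update
    pub body: models::NlSearchModelCreateSchema<'p>, // lifetime fixed by hand
}

fn main() {
    let params = UpdateNlSearchModelParams {
        model_id: Cow::Borrowed("nl-1"),
        body: models::NlSearchModelCreateSchema {
            model_name: Cow::Borrowed("openai/gpt-4"),
        },
    };
    println!("{}", params.model_id); // prints "nl-1"
}
```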

@haydenhoang
Copy link
Contributor

Hey @RoDmitry, do you have any benchmarks showing the performance gain from replacing String with Cow<'_, str>?

Doing that makes the code more complex and harder to maintain (lifetimes, phantom data), and I think users might prefer the simplicity of String if there is no significant performance gain.

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 10, 2025

Benchmark of the clones? I don't think that's necessary; it's an obvious optimization. There will be a significant performance gain if you benchmark it. I don't want to, because I understand how slow memory allocations are.
The code is more complex, but not to the point that it's harder to maintain.

Users will not be bothered by the lifetimes much: it's either a local lifetime or 'static. The lifetimes might also enable other future optimizations, like borrowing slices.
A String can be wrapped in a Cow using Cow::from, or just .into(), which replaces the .to_owned() in the current code, so users will barely notice it.
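
The conversions described above, as a small sketch (takes_id is a made-up function standing in for any generated param that would now accept Cow<'_, str>):

```rust
use std::borrow::Cow;

// `takes_id` is hypothetical, standing in for any API param that now
// accepts Cow<'_, str> instead of String.
fn takes_id(id: Cow<'_, str>) -> usize {
    id.len()
}

fn main() {
    // An owned String converts with `.into()` (the replacement for `.to_owned()`):
    let owned = String::from("model-123");
    let a = takes_id(owned.into());

    // A &str is borrowed as-is, with no allocation:
    let b = takes_id(Cow::from("model-123"));

    // And `.into_owned()` unwraps a Cow back into a regular String:
    let s: String = Cow::<str>::Borrowed("abc").into_owned();

    println!("{a} {b} {s}"); // prints "9 9 abc"
}
```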

Although I agree that Cow<'_, str> does not look as solid as String, especially to a user who is new to Rust.
In my codebase I have clones of big models::SearchParameters in loops, so it's very useful in my case.

@haydenhoang
Copy link
Contributor

Wait, isn't the deserialized string owned anyway, since the content does not outlive the returned model?

let content = resp.text().await?;
match content_type {
    ContentType::Json => serde_json::from_str(&content).map_err(Error::from),
    ContentType::Text => {
        return Err(Error::from(serde_json::Error::custom(
            "Received `text/plain` content type response that cannot be converted to `models::ApiKey`",
        )));
    }
    ContentType::Unsupported(unknown_type) => {
        return Err(Error::from(serde_json::Error::custom(format!(
            "Received `{unknown_type}` content type response that cannot be converted to `models::ApiKey`"
        ))));
    }
}

So Cow<'_, str> is only useful on the request/input side?

I don't think the performance benefit is worth it: we'd save nanoseconds by avoiding clones while adding complexity to the code (especially the public API models), and a typical API request already has milliseconds of latency.

In my codebase I have clones of big models::SearchParameters in loops, so it's very useful in my case.

If the loop runs only a small number of times, which is typical because each iteration makes an API call, the cost of cloning SearchParameters is negligible compared to the network I/O. I think the optimization only matters when you clone the struct thousands of times in a tight loop, which really isn't a common pattern for an HTTP client.

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 11, 2025

So Cow<'_, str> is only useful on the request/input side?

Yes.
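
(Sketching the point being agreed to here, with a plain String standing in for the resp.text().await? buffer: the body is local to the request function, so a returned model cannot borrow from it and must own its strings.)

```rust
// `content` stands in for the `resp.text().await?` buffer in the generated
// client. A model borrowing from `content` could not be returned, because
// `content` is dropped when the function ends, so response models end up
// owning their strings regardless of any lifetime on the type.
fn get_api_key() -> String {
    let content = String::from(r#"{"value":"secret"}"#);
    let borrowed: &str = &content; // borrowing is fine locally...
    borrowed.to_owned() // ...but the data must become owned before it can escape
}

fn main() {
    println!("{}", get_api_key()); // prints {"value":"secret"}
}
```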

a typical API request already has milliseconds of latency.

Yes, but the API request is async, while the clones are synchronous and consume CPU time.

added complexity to the code (especially the public API models)

What complexity in the public API? Try it yourself; it doesn't change much. It's just an enum wrapper around a borrowed/owned string. How is it hard to use? Cow has a good API for working with strings. Call .into_owned() and it unwraps into a regular String.

the cost of cloning SearchParameters is negligible compared to network I/O.

Basically anything is negligible compared to network I/O. Does it mean that any optimizations are useless?

the optimization only matters when you clone the struct thousands of times in a tight loop

I clone it hundreds of times in a tight loop. Does that mean I must struggle? Or just buy a better CPU? 😅
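
For illustration, a minimal stand-in for models::SearchParameters showing why the borrowed form is cheap to clone in a loop (the field names are assumed, not the real schema):

```rust
use std::borrow::Cow;

// Hypothetical stand-in for models::SearchParameters; the real struct is
// much bigger, which is exactly why the clones add up.
#[derive(Clone)]
struct SearchParameters<'p> {
    q: Cow<'p, str>,
    query_by: Cow<'p, str>,
}

fn main() {
    let base = SearchParameters {
        q: Cow::Borrowed("wireless headphones"),
        query_by: Cow::Borrowed("title,description"),
    };

    let mut total = 0usize;
    // Cloning a Cow::Borrowed copies only the pointer and length, not the
    // string data, so hundreds of iterations allocate nothing; with String
    // fields every iteration would heap-allocate and memcpy each field.
    for _ in 0..300 {
        let params = base.clone();
        total += params.q.len();
    }
    println!("{total}"); // prints 5700
}
```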

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 11, 2025

Or maybe you want to separate the request models from the response models? The response models contain only owned Strings. I haven't tried that, and I have no idea how to do it in mustache. Is it possible? If you want to improve it, I'm open to it and can push your commits (because the repo is not mine).

Or another idea is to generate a second model with all of the fields as owned types (Strings), and set that owned model as the response type in the API. But that would leave half of the models unused.

@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 12, 2025

I think the following logic might be possible: preprocess openapi.yml to mark models used as input; then, using that mark like {{#vendorExtensions.x-rust-is-used-as-input}}, it would be possible to generate the models conditionally.

@haydenhoang
Copy link
Contributor

haydenhoang commented Nov 12, 2025

Optimizations are good as long as the gains outweigh the trade-offs. I found a post that explains this pretty well.

The trade-offs here are adding more code and making it less clear for a performance gain that we haven't even measured yet. I don't want the public API models to carry unnecessary PhantomData and lifetimes; they are cleaner without them.

I think the following logic might be possible: preprocess openapi.yml to mark models used as input, then using that mark like {{#vendorExtensions.x-rust-is-used-as-input}} it would be possible to conditionally generate the models.

Well, this attempt at optimization really does make our codebase more complex.

Or maybe you want to separate the request models from the response models?

I think we should keep the type consistent across models.

It would be nice if you could open a smaller PR for the internal derive optimization first. This big PR is slowing us down; right now I'm working on writing the documentation for the derive.

@RoDmitry RoDmitry marked this pull request as draft November 12, 2025 21:43
@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 12, 2025

I know that PhantomData does not look good in the API input params. I will think about that a little more.

I found a post that explain this pretty well.

anything that isn't a trivially clear optimization should be avoided until it can be measured.

But it is a trivially clear optimization!

And: "Premature optimization is the practice of trying to improve a program's performance before it's finished."
Do you think the APIs we have now are not finished?

@RoDmitry RoDmitry marked this pull request as ready for review November 13, 2025 11:42
@RoDmitry
Copy link
Collaborator Author

RoDmitry commented Nov 13, 2025

I have removed all of the PhantomData and all unused lifetimes. @haydenhoang does it look ok now?

Code generation leaves a few extra lifetimes, which I have removed manually. After the merge, I will open an issue for it; I currently don't have extra time to fix it.

@RoDmitry RoDmitry force-pushed the optimize-cow branch 6 times, most recently from a22f31a to c272f23 Compare November 13, 2025 12:08
@RoDmitry RoDmitry force-pushed the optimize-cow branch 2 times, most recently from 1b24c60 to 5bf1a13 Compare November 13, 2025 12:22
@haydenhoang
Copy link
Contributor

Will look into this later 👍
