Skip to content

Conversation

@estolfo
Copy link
Contributor

@estolfo estolfo commented Oct 28, 2025

This PR adds support for two query params on the root api: wait_for_status and timeout.
They mirror what the same query params do on the elasticsearch cluster health status endpoint.

wait_for_status: One of green, yellow or red.timeout is required along with a status. Will wait (until the timeout provided) until the status of the service changes to the one provided or better, i.e. green > yellow > red.

timeout: Period to wait for the status to reach the requested target status. If the target status is not reached before the timeout expires, the request returns http status 408.

The status of the service will be checked with an exponential backoff until the timeout is reached.

Short description of the behavior:

  • valid timeout is provided with no status: return immediately
  • valid status is provided with no timeout: return error response that timeout is required with status and http status 400
  • invalid status is provided (i.e. not one of [green, yellow, red] - return error response and http status 400
  • invalid timeout is provided (required input is that it's an integer, and with units) - return error response and http status 400
  • valid status is provided with a valid timeout: wait for the given status or a better one (green > yellow > red). When target status or a better one is reached, return normal response
  • valid status is provided with a valid timeout: wait for the given status or a better one (green > yellow > red). When the timeout is reached and neither the target status nor a better one is reached: return error response and http status 408
  • neither status nor timeout provided: return normal response

Open Questions/ToDo:

  • Right now, the implementation doesn't wait for a status that is "better, i.e. green > yellow > red", as the Elasticsearch implementation does. Do we want to adjust our implementation to also have this behavior or is that overkill? The implementation will do the same as Elasticsearch-- it will wait for a status that matches the target or "better".
  • If the timeout is expired on the Elasticsearch cluster health endpoint before the target status is reached (or a better one), the request fails and returns an error. This implementation currently just returns as normal. Do we want to have the same behavior as Elasticsearch? The request will return 503 if the request times out and the target status is not reached, as does Elasticsearch
  • What examples should be used for "host" and "name" in the openapi documentation? The guidelines suggest api.example.com but that doesn't seem to fit the example of calling logstash's root api. Update: used logstash-pipelines, logstash-pipelines.example.com
  • Confirm that the status code 503 should be used when the request times out. Define message based on what elasticsearch returns. Update: testing showed that status code 408 is returned when the request times out.
  • When an invalid timeout or status are provided, status 400 should be used with an error message.
  • Should a timeout unit be required, like for ES? i.e. "1s" for the timeout.

Resolves #17457

@github-actions
Copy link
Contributor

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@mergify
Copy link
Contributor

mergify bot commented Oct 28, 2025

This pull request does not have a backport label. Could you fix it @estolfo? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the origial requirement is to have a non-200 status code if the status is not met.

@estolfo
Copy link
Contributor Author

estolfo commented Oct 29, 2025

note: handle the unknown status being in the Status enum. Should it be removed from the HEALTH_STATUS constant?
Update: removed unknown as a valid Status.

required: false
schema:
type: string
description: Period to wait for the status to reach the requested target status. If the target status is not reached before the timeout expires, the request returns status code 503.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to external documentation, as do the ES docs

@karenzone
Copy link
Contributor

@lcawl, please take a look.

@lcawl
Copy link
Contributor

lcawl commented Dec 3, 2025

@lcawl, please take a look.

I pushed a commit that addressed errors caught by redocly lint logstash-api.yaml --config redocly.yaml. There was a typo in ephermeral_id, some misaligned tabs, and I changed the response example to examples. The content now displays successfully in Bump.sh, for example:

image

However, IMO we ought to provide a better display name than root and we ought to have descriptions for all of those response body properties. If they're the same descriptions as appear in the Common schema definition, we should just re-use those. If I can help again, lmk!

@elasticmachine
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

History

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Set HTTP status code based on status in health report API

6 participants