-
Notifications
You must be signed in to change notification settings - Fork 124
RHAIENG-780: Sanity check to find the diff in Dockerfiles #2587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RHAIENG-780: Sanity check to find the diff in Dockerfiles #2587
Conversation
|
Hi @ysok. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
WalkthroughAdds a new Bash script that compares konflux Dockerfiles to their originals by stripping comments and multi-line LABEL blocks, and updates the Makefile test target to run this script after quick static tests; the script exits non-zero if any differences are found. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Here is the result against upstream (Note it passed because there were no Dockerfile.konflux.*)Here is the result against downstream |
|
/ok-to-test |
|
Interesting, I'm looking forward to seeing in what direction you'd be taking this next. |
|
/ok-to-test |
|
@ysok: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Hi @ysok great! I took a look on the script and I noticed that we need also to include as well:
Recently we synced the downstream Dockerfiles in between them via pr1640 and pr1651 so now the bash script shows: So digging more into it, I took the In human friendly translation: If i align the labels the misalignment issue disappear. So I would suggest to fix the order on the Labels to avoid strange notifications or make a modification on the script to avoid them completely. |
7d9047d to
155dc39
Compare
|
@atheo89 here is another push to cover these issues, glad you've found that. Trimming leading and trailing whitespace. It is not a bullet proof solution, specially if the user accidentally introduced more line breaks in between the code. Here is another test output I ran against the latest downstream. `Running quick static tests
|
|
Nice! I’m now thinking about how we could take this a step further and automate the alignment part — so that the script doesn’t just detect differences but can also fix them automatically. The reason is that, in practice, the code usually flows from upstream to downstream, and developers might not always follow up manually in rhds. What do you think about introducing a special keyword or comment marker in the Dockerfile (e.g., # no-align) that tells the script to skip alignment for that specific section, while everything else gets synced automatically? That way, we can safely automate most cases but still allow intentional differences where needed. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jiridanek The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
scripts/check_dockerfile_alignment.sh (1)
14-26: Add validation for start directories.Before invoking
find_docker_dirs, the script should verify that each start directory exists. Currently,findwill silently fail or error if a directory doesn't exist, andmapfilewith a non-existent result may produce unexpected behavior. Add a guard to ensure directories are present.main() { # Define multiple starting directories local start_dirs=("./jupyter" "./codeserver" "./runtimes") + + # Verify start directories exist + for dir in "${start_dirs[@]}"; do + if [ ! -d "$dir" ]; then + echo "⚠️ Warning: directory '$dir' not found, skipping..." >&2 + continue + fi + done + echo "Scanning ${start_dirs[*]} for directories containing Dockerfile.konflux.*"
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
Makefile(1 hunks)scripts/check_dockerfile_alignment.sh(1 hunks)
🔇 Additional comments (1)
Makefile (1)
516-519: Script integration into test target looks correct.The new alignment check is properly integrated as a post-test validation step. It will correctly fail the
make testtarget if Dockerfile differences are detected (due toset -euo pipefailin the script and the script's exit 1 on differences). ✅Please run
make testlocally to verify that:
- The script executes after
pytest- Failures in the alignment check properly cause the test target to fail
- The three start directories (jupyter, codeserver, runtimes) are all scanned correctly
| @@ -0,0 +1,173 @@ | |||
| #!/usr/bin/env bash | |||
| #========================================================= | |||
| # Script: dockerfile_diff_checker.sh | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Script name mismatch in header comment.
Line 3 states the script name is dockerfile_diff_checker.sh, but the actual file is check_dockerfile_alignment.sh. Update the header to match the actual filename.
-# Script: dockerfile_diff_checker.sh
+# Script: check_dockerfile_alignment.sh📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Script: dockerfile_diff_checker.sh | |
| # Script: check_dockerfile_alignment.sh |
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around line 3 the header comment
incorrectly names the script as "dockerfile_diff_checker.sh"; update that
comment to the correct filename by changing the header to "Script:
check_dockerfile_alignment.sh" and scan for any other in-file references or
comments that mention the old name to update them as well.
| local dirs=("$@") | ||
| local diff_found=0 | ||
|
|
||
| #dirs=("./jupyter/datascience/ubi9-python-3.12") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out debug code.
Line 143 contains a commented debug statement that should be removed before merge.
- #dirs=("./jupyter/datascience/ubi9-python-3.12")
-
# Iterate over each directory📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #dirs=("./jupyter/datascience/ubi9-python-3.12") | |
| # Iterate over each directory |
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around line 143, remove the
commented-out debug line "#dirs=(\"./jupyter/datascience/ubi9-python-3.12\")" so
the file contains no leftover commented debug code; simply delete that line and
ensure no trailing blank lines or stray comments remain.
Scan Jupyter image Dockerfiles to detect differences (diffs) between the upstream and downstream (Konflux) versions (i.e., between Dockerfile.x and Dockerfile.konflux.x). This script has been integrated into the make test target. This script performs the sanity check and will fail (error out) if any differences are found, requiring immediate attention to align the Konflux Dockerfiles with their upstream counterparts.
155dc39 to
3ac32cc
Compare
|
New changes are detected. LGTM label has been removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
scripts/check_dockerfile_alignment.sh (1)
3-3: Fix script name in header comment.Line 3 states the script name as
dockerfile_diff_checker.sh, but the actual filename ischeck_dockerfile_alignment.sh. Update the header to match.-# Script: dockerfile_diff_checker.sh +# Script: check_dockerfile_alignment.sh
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
Makefile(1 hunks)scripts/check_dockerfile_alignment.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- Makefile
🔇 Additional comments (3)
scripts/check_dockerfile_alignment.sh (3)
65-93: Well-implemented LABEL and comment stripping logic.The function correctly handles both single-line and multi-line LABEL instructions, captures continuation lines via backslash detection, and trims whitespace to reduce false positives. Aligns with the PR's objective to ignore cosmetic differences like label ordering and blank lines.
139-168: Solid error handling and control flow.The function properly iterates over directories, derives original filenames, validates existence, and gracefully handles missing files with warnings. The accumulation of diff_found and proper return code semantics ensure the script exits with the correct status.
1-35: Strong overall structure and error handling.The script follows bash best practices:
set -euo pipefailensures safety, array handling is correct, and the exit semantics are appropriate for integration into the Makefile test target. The three-directory scan (jupyter, codeserver, runtimes) aligns with the PR objectives.
| # Uncomment the next line to see detailed differences | ||
| echo "$diff_output" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove stale/misleading comment.
Line 120 says "Uncomment the next line" but line 121 is already active. The comment is stale; either remove it or update it if the behavior has changed. Since the script's purpose is to report differences, printing the diff output unconditionally makes sense.
- # Uncomment the next line to see detailed differences
echo "$diff_output"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Uncomment the next line to see detailed differences | |
| echo "$diff_output" | |
| echo "$diff_output" |
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around lines 120-121, the comment
"Uncomment the next line to see detailed differences" is stale because the echo
command on the following line is already active; remove or update the comment so
it reflects current behavior (e.g., delete the comment entirely or change it to
"Print detailed differences"); ensure the remaining code prints diff_output
unconditionally as intended.
Since the script can generate diffs for only the parts of the file we want to keep in sync already, e.g. this is probably already something that can go as input to the |
|
/ok-to-test |
|
I feel that trying to automatically fix these differences will result in a complicated script that constantly chases the manual errors humans might introduce to the original Dockerfile.* My current thought is whether we should re-generate the Dockerfile.konflux.* files directly from the Dockerfile.cpu to avoid the need to manually touch the Konflux-specific files. For example, if we define the LABEL in build-args (similar to jupyter/datascience/ubi9-python-3.12/build-args/cpu.conf), we should be able to either automatically generate a new set of Dockerfile.konflux.* files, or the LABEL arguments could be made available for Konflux to access directly. What do you think? |
I think I like it. One build feature of Konflux we haven't been using yet and we have to use eventually is https://konflux-ci.dev/docs/building/prefetching-dependencies/. The reason why we have Dockerfile and Dockerfile.konflux is (I believe) that @riprasad thinks we want to have Dockerfile nonhermetic and Dockerfile.konflux be hermetic, eventually. It might be best to have both hermetic; but it means we'd have to figure out (in Makefile, local build scripting) how to build a Dockerfile that uses cachi2 locally, outside of Konflux. Imo it may be worth it. We have until middle of next year to figure it out. |
At Openstack, we prefetched both RPM(Using |
Did not try it either, but with PyPI cachi2 will force you to prefetch source archives and then build from sources. The AI python packages are not easy to build from sources, therefore AIPCC comes in and provides prebuilt artifacts to just pip install, same as RPMs from brew can be just installed and need not be rebuilt from source.
Probably. |
Scan Jupyter image Dockerfiles to detect differences (diffs) between the upstream and downstream (Konflux) versions (i.e., between Dockerfile.x and Dockerfile.konflux.x).
This script has been integrated into the make test target. This script performs the sanity check and will fail (error out) if any differences are found, requiring immediate attention to align the Konflux Dockerfiles with their upstream counterparts.
Tested locally (gmake test) against Downstream repos https://github.com/red-hat-data-services/notebooks and found multiple diff in each Jupyter's Dockerfile.
Related to: https://issues.redhat.com/browse/RHAIENG-780
Summary by CodeRabbit