Skip to content

Conversation

@ysok
Copy link
Contributor

@ysok ysok commented Oct 16, 2025

Scan Jupyter image Dockerfiles to detect differences (diffs) between the upstream and downstream (Konflux) versions (i.e., between Dockerfile.x and Dockerfile.konflux.x).

This script has been integrated into the make test target. This script performs the sanity check and will fail (error out) if any differences are found, requiring immediate attention to align the Konflux Dockerfiles with their upstream counterparts.

Tested locally (gmake test) against Downstream repos https://github.com/red-hat-data-services/notebooks and found multiple diff in each Jupyter's Dockerfile.

Related to: https://issues.redhat.com/browse/RHAIENG-780

Summary by CodeRabbit

  • Tests
    • Added an automated validation step to the test target that runs a new alignment-check script and fails the run if discrepancies are found.
  • Chores
    • Introduced a new standalone script integrated into the build/test flow that scans relevant files, reports any alignment differences, and emits warnings for missing originals.

@github-actions github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Oct 16, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 16, 2025

Hi @ysok. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 16, 2025

Walkthrough

Adds a new Bash script that compares konflux Dockerfiles to their originals by stripping comments and multi-line LABEL blocks, and updates the Makefile test target to run this script after quick static tests; the script exits non-zero if any differences are found.

Changes

Cohort / File(s) Summary
Makefile update
Makefile
Appends invocation of ./scripts/check_dockerfile_alignment.sh to the test target so the alignment check runs after quick static tests.
Dockerfile alignment checker
scripts/check_dockerfile_alignment.sh
New Bash script that discovers Dockerfile.konflux.* files, derives corresponding original Dockerfiles, strips comments and multi-line LABEL blocks (via awk), diffs stripped contents, prints per-file results, warns and skips if original missing, and exits with non-zero status if any diffs exist.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "RHAIENG-780: Sanity check to find the diff in Dockerfiles" clearly and directly summarizes the main change in the changeset. The modifications introduce a new script that scans Jupyter image Dockerfiles to detect differences between upstream and downstream (Konflux) versions, and integrate this script into the make test target. The title accurately captures this primary objective in concise, specific language that would help a reviewer quickly understand the purpose of the changes. The issue reference (RHAIENG-780) provides additional context without adding clutter.
Description Check ✅ Passed The PR description covers the substantive content required to understand the changes: it explains what the script does (scans for Dockerfile diffs between upstream and downstream versions), how it's integrated (into the make test target), what testing was performed (locally tested with gmake test against the downstream repository), and provides a reference to the related issue. However, the description does not follow the template structure completely—the "How Has This Been Tested?" section is not formatted as a dedicated section, and the required self-checklist items and merge criteria checkboxes are entirely absent from the description. Despite these formatting gaps, the core information needed for review and merge is substantively present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ysok
Copy link
Contributor Author

ysok commented Oct 16, 2025

Here is the result against upstream (Note it passed because there were no Dockerfile.konflux.*)

+ ./scripts/check_dockerfile_alignment.sh
Scanning ./jupyter for directories containing Dockerfile.konflux.*
Comparing Dockerfiles, ignoring comments and LABEL blocks...
✅ All Dockerfiles are in sync.

Here is the result against downstream

+ ./scripts/check_dockerfile_alignment.sh
Scanning ./jupyter for directories containing Dockerfile.konflux.*
Comparing Dockerfiles, ignoring comments and LABEL blocks...
=== Processing ./jupyter/datascience/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
3,4d2
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
17c15
<         mkdir -p /tmp && printf '#!/bin/sh\necho "mongocli not supported on s390x"\n' > /tmp/mongocli && \
---
>         mkdir -p /tmp && echo -e '#!/bin/sh\necho "mongocli not supported on s390x"' > /tmp/mongocli && \
34c32
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
50,64c48,58
< RUN /bin/bash <<'EOF'
< set -Eeuxo pipefail
< if [ "$TARGETARCH" = "s390x" ]; then
<     mkdir -p /opt/.cargo
<     export HOME=/root
<     curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o rustup-init.sh
<     chmod +x rustup-init.sh
<     CARGO_HOME=/opt/.cargo HOME=/root ./rustup-init.sh -y --no-modify-path
<     rm -f rustup-init.sh
<     chown -R 1001:0 /opt/.cargo
<     cat > /etc/profile.d/cargo.sh <<'CARGO_EOF'
< export PATH=/opt/.cargo/bin:$PATH
< export CARGO_HOME=/opt/.cargo
< export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1
< CARGO_EOF
---
> RUN if [ "$TARGETARCH" = "s390x" ]; then \
>     mkdir -p /opt/.cargo && \
>     export HOME=/root && \
>     curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o rustup-init.sh && \
>     chmod +x rustup-init.sh && \
>     CARGO_HOME=/opt/.cargo HOME=/root ./rustup-init.sh -y --no-modify-path && \
>     rm -f rustup-init.sh && \
>     chown -R 1001:0 /opt/.cargo && \
>     echo 'export PATH=/opt/.cargo/bin:$PATH' >> /etc/profile.d/cargo.sh && \
>     echo 'export CARGO_HOME=/opt/.cargo' >> /etc/profile.d/cargo.sh && \
>     echo 'export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1' >> /etc/profile.d/cargo.sh; \
66d59
< EOF
68,73c61,64
< RUN /bin/bash <<'EOF'
< set -Eeuxo pipefail
< if [ "$TARGETARCH" = "s390x" ]; then
<     alternatives --install /usr/bin/python python /usr/bin/python3.12 1
<     alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1
<     python --version && python3 --version
---
> RUN if [ "$TARGETARCH" = "s390x" ]; then \
>     alternatives --install /usr/bin/python python /usr/bin/python3.12 1 && \
>     alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1 && \
>     python --version && python3 --version; \
75d65
< EOF
113,114d102
<               -DARROW_S3=ON \
<               -DARROW_SUBSTRAIT=ON \
140c128
< RUN /bin/bash <<'EOF'
---
> RUN <<'EOF'
162,164c150,151
<     pip install --no-cache-dir -r requirements.txt
<     CMAKE_ARGS="-DPython3_EXECUTABLE=$(which python3.12)"
<     export CMAKE_ARGS
---
>     pip install -r requirements.txt
>     export CMAKE_ARGS="-DPython3_EXECUTABLE=$(which python3.12)"
179c166
<     wget --progress=dot:giga https://github.com/OpenMathLib/OpenBLAS/releases/download/v${OPENBLAS_VERSION}/OpenBLAS-${OPENBLAS_VERSION}.zip
---
>     wget https://github.com/OpenMathLib/OpenBLAS/releases/download/v${OPENBLAS_VERSION}/OpenBLAS-${OPENBLAS_VERSION}.zip
188d174
<
242c228
< RUN /bin/bash <<'EOF'
---
> RUN <<'EOF'
245c231
<     pip install --no-cache-dir /onnxwheels/*.whl
---
>     pip install /onnxwheels/*.whl
252c238
< RUN /bin/bash <<'EOF'
---
> RUN <<'EOF'
261c247
< RUN /bin/bash <<'EOF'
---
> RUN <<'EOF'
=== Processing ./jupyter/minimal/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
3c3
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
---
> FROM registry.access.redhat.com/ubi9/python-312:latest AS pdf-builder
4a5,15
> WORKDIR /opt/app-root/bin
>
> USER 0
>
> COPY jupyter/utils/install_texlive.sh ./install_texlive.sh
> COPY jupyter/utils/install_pandoc.sh ./install_pandoc.sh
> RUN chmod +x install_texlive.sh install_pandoc.sh
>
> RUN ./install_texlive.sh
> RUN ./install_pandoc.sh
>
11c22
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
5,6d4
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
13c11
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
59a58
>
---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
❌ Differences found between Dockerfile.rocm and Dockerfile.konflux.rocm
3,4d2
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
11c9
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
32d29
<
60a58
>
=== Processing ./jupyter/pytorch/ubi9-python-3.12 ===
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
5,6d4
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
23c21
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
86d83
<
100a98
>
=== Processing ./jupyter/pytorch+llmcompressor/ubi9-python-3.12 ===
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
5,6d4
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
23c21
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
86d83
<
100a98
>
=== Processing ./jupyter/rocm/pytorch/ubi9-python-3.12 ===
---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
❌ Differences found between Dockerfile.rocm and Dockerfile.konflux.rocm
3,4d2
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
21c19
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
84d81
<
103a101
>
=== Processing ./jupyter/rocm/tensorflow/ubi9-python-3.12 ===
---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
❌ Differences found between Dockerfile.rocm and Dockerfile.konflux.rocm
3,4d2
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
21c19
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
79d76
< ARG JUPYTER_REUSABLE_UTILS=jupyter/utils
85d81
<
102,103d97
< COPY ${JUPYTER_REUSABLE_UTILS}/usercustomize.pth ${JUPYTER_REUSABLE_UTILS}/monkey_patch_protobuf_6x.py /opt/app-root/lib/python3.12/site-packages/
<
104a99
>
=== Processing ./jupyter/tensorflow/ubi9-python-3.12 ===
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
5,6d4
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
23c21
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
72a71,74
> RUN dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
>     dnf install -y hdf5-devel && \
>     dnf clean all
>
81d82
< ARG JUPYTER_REUSABLE_UTILS=jupyter/utils
87d87
<
101,102d100
< COPY ${JUPYTER_REUSABLE_UTILS}/usercustomize.pth ${JUPYTER_REUSABLE_UTILS}/monkey_patch_protobuf_6x.py /opt/app-root/lib/python3.12/site-packages/
<
103a102
>
=== Processing ./jupyter/trustyai/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
3,4d2
< FROM registry.access.redhat.com/ubi9/ubi AS ubi-repos
<
27c25
<     pip install --no-cache-dir uv && \
---
>     pip install --no-cache uv && \
37c35
< COPY --from=ubi-repos /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
---
> COPY --from=registry.access.redhat.com/ubi9/ubi /etc/yum.repos.d/ubi.repo /etc/yum.repos.d/ubi.repo
134,135c132
<     chown -R 1001:0 /opt/app-root/ && \
<     chmod -R g=u /opt/app-root
---
>     chown -R 1001:0 /opt/app-root/
❌ Differences were found. Please inspect.
gmake: *** [Makefile:517: test] Error 1

@jiridanek
Copy link
Member

/ok-to-test

@jiridanek
Copy link
Member

Interesting, I'm looking forward to seeing in what direction you'd be taking this next.

@ysok
Copy link
Contributor Author

ysok commented Oct 17, 2025

/ok-to-test

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 17, 2025

@ysok: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/images 7d9047d link true /test images

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@atheo89
Copy link
Member

atheo89 commented Oct 20, 2025

Hi @ysok great! I took a look on the script and I noticed that we need also to include as well:

  • The runtimes directory
  • Codeserver
  • and in the future also the RStudio (no .konflux file for this yet) add a placeholder and ignore it for now.

Recently we synced the downstream Dockerfiles in between them via pr1640 and pr1651 so now the bash script shows:

+ ./scripts/check_dockerfile_alignment.sh
Scanning ./jupyter for directories containing Dockerfile.konflux.*
Comparing Dockerfiles, ignoring comments and LABEL blocks...
=== Processing ./jupyter/datascience/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
✅ No differences
=== Processing ./jupyter/minimal/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
45d44
< 
76a76
> 
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
59a60
> 
...
(Removed some output)
...
=== Processing ./jupyter/tensorflow/ubi9-python-3.12 ===
---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
❌ Differences found between Dockerfile.cuda and Dockerfile.konflux.cuda
87d86
< 
103a103
> 
=== Processing ./jupyter/trustyai/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
✅ No differences
❌ Differences were found. Please inspect.

So digging more into it, I took the ./jupyter/minimal/ubi9-python-3.12 for inspection (I use meld tool), I see the only diff is on the label and specifically the order. On the .cpu the Label is in the line 75 and on the .konflux.cpu is in the line 116 in contrary the check_dockerfile.aligment.sh points:

45d44
< 
76a76
> 

In human friendly translation:
“One blank line was removed at line 45.”
“One blank line was added at line 76.”
“2 lines changed (RUN instruction modified).”

If i align the labels the misalignment issue disappear. So I would suggest to fix the order on the Labels to avoid strange notifications or make a modification on the script to avoid them completely.

@ysok ysok force-pushed the RHAIENG-780-diff-dockerfiles branch from 7d9047d to 155dc39 Compare October 21, 2025 03:15
@openshift-ci openshift-ci bot added size/l and removed size/l labels Oct 21, 2025
@ysok
Copy link
Contributor Author

ysok commented Oct 21, 2025

@atheo89 here is another push to cover these issues, glad you've found that. Trimming leading and trailing whitespace. It is not a bullet proof solution, specially if the user accidentally introduced more line breaks in between the code.

Here is another test output I ran against the latest downstream.

`Running quick static tests

  • ./scripts/check_dockerfile_alignment.sh
    Scanning ./jupyter ./codeserver ./runtimes for directories containing Dockerfile.konflux.*
    Comparing Dockerfiles, ignoring comments and LABEL blocks...
    === Processing ./jupyter/datascience/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ✅ No differences
    === Processing ./jupyter/minimal/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ✅ No differences
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    ---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
    ✅ No differences
    === Processing ./jupyter/pytorch/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    === Processing ./jupyter/pytorch+llmcompressor/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    === Processing ./jupyter/rocm/pytorch/ubi9-python-3.12 ===
    ---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
    ✅ No differences
    === Processing ./jupyter/rocm/tensorflow/ubi9-python-3.12 ===
    ---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
    ✅ No differences
    === Processing ./jupyter/tensorflow/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    === Processing ./jupyter/trustyai/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ✅ No differences
    === Processing ./codeserver/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ✅ No differences
    === Processing ./runtimes/datascience/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
    252,253d251
    < chown -R 1001:0 /opt/app-root/ &&
    < chmod -R g=u /opt/app-root &&
    === Processing ./runtimes/minimal/ubi9-python-3.12 ===
    ---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
    ✅ No differences
    === Processing ./runtimes/pytorch/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    === Processing ./runtimes/pytorch+llmcompressor/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    === Processing ./runtimes/rocm-pytorch/ubi9-python-3.12 ===
    ---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
    ✅ No differences
    === Processing ./runtimes/rocm-tensorflow/ubi9-python-3.12 ===
    ---- diff Dockerfile.rocm Dockerfile.konflux.rocm ----
    ✅ No differences
    === Processing ./runtimes/tensorflow/ubi9-python-3.12 ===
    ---- diff Dockerfile.cuda Dockerfile.konflux.cuda ----
    ✅ No differences
    ❌ Differences were found. Please inspect.
    gmake: *** [Makefile:517: test] Error 1`

@openshift-ci openshift-ci bot added size/l and removed size/l labels Oct 21, 2025
@atheo89
Copy link
Member

atheo89 commented Oct 21, 2025

Nice! I’m now thinking about how we could take this a step further and automate the alignment part — so that the script doesn’t just detect differences but can also fix them automatically. The reason is that, in practice, the code usually flows from upstream to downstream, and developers might not always follow up manually in rhds.

What do you think about introducing a special keyword or comment marker in the Dockerfile (e.g., # no-align) that tells the script to skip alignment for that specific section, while everything else gets synced automatically? That way, we can safely automate most cases but still allow intentional differences where needed.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 21, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiridanek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ysok ysok marked this pull request as ready for review October 21, 2025 14:54
@openshift-ci openshift-ci bot requested review from atheo89 and dibryant October 21, 2025 14:54
@openshift-ci openshift-ci bot added size/l and removed size/l labels Oct 21, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
scripts/check_dockerfile_alignment.sh (1)

14-26: Add validation for start directories.

Before invoking find_docker_dirs, the script should verify that each start directory exists. Currently, find will silently fail or error if a directory doesn't exist, and mapfile with a non-existent result may produce unexpected behavior. Add a guard to ensure directories are present.

 main() {
 
     # Define multiple starting directories
     local start_dirs=("./jupyter" "./codeserver" "./runtimes")
+    
+    # Verify start directories exist
+    for dir in "${start_dirs[@]}"; do
+        if [ ! -d "$dir" ]; then
+            echo "⚠️  Warning: directory '$dir' not found, skipping..." >&2
+            continue
+        fi
+    done
+    
     echo "Scanning ${start_dirs[*]} for directories containing Dockerfile.konflux.*"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8adf444 and 155dc39.

📒 Files selected for processing (2)
  • Makefile (1 hunks)
  • scripts/check_dockerfile_alignment.sh (1 hunks)
🔇 Additional comments (1)
Makefile (1)

516-519: Script integration into test target looks correct.

The new alignment check is properly integrated as a post-test validation step. It will correctly fail the make test target if Dockerfile differences are detected (due to set -euo pipefail in the script and the script's exit 1 on differences). ✅

Please run make test locally to verify that:

  1. The script executes after pytest
  2. Failures in the alignment check properly cause the test target to fail
  3. The three start directories (jupyter, codeserver, runtimes) are all scanned correctly

@@ -0,0 +1,173 @@
#!/usr/bin/env bash
#=========================================================
# Script: dockerfile_diff_checker.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Script name mismatch in header comment.

Line 3 states the script name is dockerfile_diff_checker.sh, but the actual file is check_dockerfile_alignment.sh. Update the header to match the actual filename.

-# Script: dockerfile_diff_checker.sh
+# Script: check_dockerfile_alignment.sh
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Script: dockerfile_diff_checker.sh
# Script: check_dockerfile_alignment.sh
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around line 3 the header comment
incorrectly names the script as "dockerfile_diff_checker.sh"; update that
comment to the correct filename by changing the header to "Script:
check_dockerfile_alignment.sh" and scan for any other in-file references or
comments that mention the old name to update them as well.

local dirs=("$@")
local diff_found=0

#dirs=("./jupyter/datascience/ubi9-python-3.12")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove commented-out debug code.

Line 143 contains a commented debug statement that should be removed before merge.

-    #dirs=("./jupyter/datascience/ubi9-python-3.12")
-
     # Iterate over each directory
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#dirs=("./jupyter/datascience/ubi9-python-3.12")
# Iterate over each directory
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around line 143, remove the
commented-out debug line "#dirs=(\"./jupyter/datascience/ubi9-python-3.12\")" so
the file contains no leftover commented debug code; simply delete that line and
ensure no trailing blank lines or stray comments remain.

Scan Jupyter image Dockerfiles to detect differences (diffs) between the upstream and downstream (Konflux) versions (i.e., between Dockerfile.x and Dockerfile.konflux.x).

This script has been integrated into the make test target. This script performs the sanity check and will fail (error out) if any differences are found, requiring immediate attention to align the Konflux Dockerfiles with their upstream counterparts.
@ysok ysok force-pushed the RHAIENG-780-diff-dockerfiles branch from 155dc39 to 3ac32cc Compare October 21, 2025 15:01
@openshift-ci openshift-ci bot removed the lgtm label Oct 21, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 21, 2025

New changes are detected. LGTM label has been removed.

@openshift-ci openshift-ci bot added size/l and removed size/l labels Oct 21, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
scripts/check_dockerfile_alignment.sh (1)

3-3: Fix script name in header comment.

Line 3 states the script name as dockerfile_diff_checker.sh, but the actual filename is check_dockerfile_alignment.sh. Update the header to match.

-# Script: dockerfile_diff_checker.sh
+# Script: check_dockerfile_alignment.sh
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 155dc39 and 3ac32cc.

📒 Files selected for processing (2)
  • Makefile (1 hunks)
  • scripts/check_dockerfile_alignment.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • Makefile
🔇 Additional comments (3)
scripts/check_dockerfile_alignment.sh (3)

65-93: Well-implemented LABEL and comment stripping logic.

The function correctly handles both single-line and multi-line LABEL instructions, captures continuation lines via backslash detection, and trims whitespace to reduce false positives. Aligns with the PR's objective to ignore cosmetic differences like label ordering and blank lines.


139-168: Solid error handling and control flow.

The function properly iterates over directories, derives original filenames, validates existence, and gracefully handles missing files with warnings. The accumulation of diff_found and proper return code semantics ensure the script exits with the correct status.


1-35: Strong overall structure and error handling.

The script follows bash best practices: set -euo pipefail ensures safety, array handling is correct, and the exit semantics are appropriate for integration into the Makefile test target. The three-directory scan (jupyter, codeserver, runtimes) aligns with the PR objectives.

Comment on lines +120 to +121
# Uncomment the next line to see detailed differences
echo "$diff_output"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove stale/misleading comment.

Line 120 says "Uncomment the next line" but line 121 is already active. The comment is stale; either remove it or update it if the behavior has changed. Since the script's purpose is to report differences, printing the diff output unconditionally makes sense.

-        # Uncomment the next line to see detailed differences
         echo "$diff_output"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Uncomment the next line to see detailed differences
echo "$diff_output"
echo "$diff_output"
🤖 Prompt for AI Agents
In scripts/check_dockerfile_alignment.sh around lines 120-121, the comment
"Uncomment the next line to see detailed differences" is stale because the echo
command on the following line is already active; remove or update the comment so
it reflects current behavior (e.g., delete the comment entirely or change it to
"Print detailed differences"); ensure the remaining code prints diff_output
unconditionally as intended.

@jiridanek
Copy link
Member

Nice! I’m now thinking about how we could take this a step further and automate the alignment part — so that the script doesn’t just detect differences but can also fix them automatically. The reason is that, in practice, the code usually flows from upstream to downstream, and developers might not always follow up manually in rhds.

Since the script can generate diffs for only the parts of the file we want to keep in sync already, e.g.

=== Processing ./runtimes/datascience/ubi9-python-3.12 ===
---- diff Dockerfile.cpu Dockerfile.konflux.cpu ----
❌ Differences found between Dockerfile.cpu and Dockerfile.konflux.cpu
252,253d251
< chown -R 1001:0 /opt/app-root/ &&
< chmod -R g=u /opt/app-root &&

this is probably already something that can go as input to the patch utility (or git am) to apply the change

@jiridanek
Copy link
Member

/ok-to-test

@jiridanek jiridanek merged commit 0f8f86c into opendatahub-io:main Oct 21, 2025
13 of 15 checks passed
@ysok
Copy link
Contributor Author

ysok commented Oct 21, 2025

@jiridanek @atheo89

I feel that trying to automatically fix these differences will result in a complicated script that constantly chases the manual errors humans might introduce to the original Dockerfile.*

My current thought is whether we should re-generate the Dockerfile.konflux.* files directly from the Dockerfile.cpu to avoid the need to manually touch the Konflux-specific files.

For example, if we define the LABEL in build-args (similar to jupyter/datascience/ubi9-python-3.12/build-args/cpu.conf), we should be able to either automatically generate a new set of Dockerfile.konflux.* files, or the LABEL arguments could be made available for Konflux to access directly. What do you think?

@jiridanek
Copy link
Member

What do you think?

I think I like it. One build feature of Konflux we haven't been using yet and we have to use eventually is https://konflux-ci.dev/docs/building/prefetching-dependencies/.

The reason why we have Dockerfile and Dockerfile.konflux is (I believe) that @riprasad thinks we want to have Dockerfile nonhermetic and Dockerfile.konflux be hermetic, eventually.

It might be best to have both hermetic; but it means we'd have to figure out (in Makefile, local build scripting) how to build a Dockerfile that uses cachi2 locally, outside of Konflux. Imo it may be worth it. We have until middle of next year to figure it out.

@ysok
Copy link
Contributor Author

ysok commented Oct 21, 2025

RE: hermetic;

At Openstack, we prefetched both RPM(Using rpm-lockfile-prototype) and Golang dependencies and got them built in hermetic environment, network isolated. I haven't tried the pip dependencies yet, but it should be supported as advertised by Konflux. But we also can't really use any custom curl command to fetch anything out of Konflux. We'll most likely need to maintain a separate Dockerfile for Konflux.

@jiridanek
Copy link
Member

jiridanek commented Oct 21, 2025

I haven't tried the pip dependencies yet, but it should be supported as advertised by Konflux.

Did not try it either, but with PyPI cachi2 will force you to prefetch source archives and then build from sources. The AI python packages are not easy to build from sources, therefore AIPCC comes in and provides prebuilt artifacts to just pip install, same as RPMs from brew can be just installed and need not be rebuilt from source.

We'll most likely need to maintain a separate Dockerfile for Konflux.

Probably.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved ok-to-test review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel size/l

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants