
@srcejon
Contributor

@srcejon srcejon commented Nov 8, 2025

This PR adds cv::cuda::cvtColorTwoPlane, similar to cv::cvtColorTwoPlane.

Currently it supports COLOR_YUV2BGR_NV12 and COLOR_YUV2RGB_NV12 for CV_8U only. Unfortunately, there doesn't appear to be an NPP function that supports RGBA output.

It requires NPP v13 or greater, as nppiNV12ToRGB_8u_ColorTwist32f_P2C3R_Ctx is buggy in earlier versions of NPP.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [x] I agree to contribute to the project under Apache 2 License.
  • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • [x] The PR is proposed to the proper branch
  • [ ] There is a reference to the original bug report and related work
  • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • [x] The feature is well documented and sample code can be built with the project CMake

There is no patch to opencv_extra, but a test is included that compares cv::cuda::cvtColorTwoPlane to cv::cvtColorTwoPlane.

@cudawarped
Contributor

cudawarped commented Nov 8, 2025

@srcejon Would it be better to use cv::cudacodec::NVSurfaceToColorConverter, as NV12 is a video format? The only difference is that NVSurfaceToColorConverter accepts NV12 as a single plane; however, it supports

  1. conversion to BGR, RGB, BGRA, RGBA and gray,
  2. 8 and 16 bit input/output,
  3. packed and planar formats,
  4. full and limited ranges, and
  5. different color space standards.

e.g.

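#include <opencv2/core.hpp>
#include <opencv2/cudacodec.hpp>

// Single-plane NV12: `rows` lines of Y followed by rows/2 lines of interleaved UV.
int rows = 1080;
int cols = 1920;
int type = CV_8U;
cv::Mat nv12(rows * 3 / 2, cols, type);
cv::randu(nv12, 16, 235);  // fill with values inside the limited-range luma interval
cv::cuda::GpuMat bgr;
cv::Ptr<cv::cudacodec::NVSurfaceToColorConverter> yuvConverter = cv::cudacodec::createNVSurfaceToColorConverter(cv::cudacodec::ColorSpaceStandard::BT601);
yuvConverter->convert(nv12, bgr, cv::cudacodec::SurfaceFormat::SF_NV12, cv::cudacodec::ColorFormat::BGR);
cv::Mat bgrHost;
bgr.download(bgrHost);  // copy the converted frame back to the host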

If you agree, then the requirement that the Nvidia Video Codec SDK be installed for this function to be available should be removed.

@srcejon
Contributor Author

srcejon commented Nov 8, 2025

> @srcejon Would it be better to use cv::cudacodec::NVSurfaceToColorConverter as NV12 is a video format? The only difference being NVSurfaceToColorConverter accepts NV12 as a single plane

In my application (which uses FFmpeg with a HW decoder; pixfmt=AV_PIX_FMT_CUDA, so it should be using NVDEC), it appears to give me two-plane NV12. As far as I can see, the UV plane isn't contiguous with the Y plane: there's a small gap in between.

(Why aren't I using cv::cudacodec? Because I have an mp4 with multiple video + audio streams, which doesn't appear to be supported)

Both cudacodec and FFmpeg are using NVDEC, so presumably there should be a way for it to work, but I can't currently see it.

Maybe it's a useful function anyway - as it's in the main opencv library.

@cudawarped
Contributor

cudawarped commented Nov 8, 2025

I guess that's how FFmpeg wants to present the output; the Nvidia decoder uses

cudaVideoSurfaceFormat_NV12=0, /**< Semi-Planar YUV [Y plane followed by interleaved UV plane] */

Anyway, what's your use case for outputting NV12 and then converting to BGR/RGB, and why only from SD files using the BT.601 color space?

@asmorkalov Should we have additional CUDA functions for different data layouts?
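
For readers less familiar with the semi-planar layout quoted above, here is a minimal sketch of how one pixel's samples might be located when the Y and UV planes are passed separately. `YuvSample` and `sampleNv12` are hypothetical names used only for illustration; they are not part of OpenCV or the Video Codec SDK, and the sketch assumes 8-bit data with U stored before V in each pair.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Raw samples for one pixel, before any color conversion.
struct YuvSample {
    std::uint8_t y, u, v;
};

// Hypothetical helper: fetch the Y, U and V samples for pixel (x, row)
// from an 8-bit NV12 frame whose two planes are passed separately.
//   yPlane / uvPlane : start of the luma and interleaved-chroma planes
//   yPitch / uvPitch : bytes per row of each plane (may exceed the width)
YuvSample sampleNv12(const std::uint8_t* yPlane, std::size_t yPitch,
                     const std::uint8_t* uvPlane, std::size_t uvPitch,
                     int x, int row)
{
    assert(x >= 0 && row >= 0);
    const std::size_t col = static_cast<std::size_t>(x);
    const std::size_t line = static_cast<std::size_t>(row);

    // Luma is stored at full resolution: one byte per pixel.
    const std::uint8_t y = yPlane[line * yPitch + col];

    // Chroma is subsampled 2x2, and each chroma row holds U,V byte pairs.
    const std::uint8_t* uv = uvPlane + (line / 2) * uvPitch + (col / 2) * 2;
    return {y, uv[0], uv[1]};
}
```

When the frame is one contiguous buffer, the same helper would apply with `uvPlane` pointing just past the last allocated luma row.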

@srcejon
Contributor Author

srcejon commented Nov 9, 2025

> I guess that's how FFmpeg wants to present the output, the Nvidia decoder uses
>
> cudaVideoSurfaceFormat_NV12=0, /**< Semi-Planar YUV [Y plane followed by interleaved UV plane] */

I will dig into it more to find out why it appears different and whether a two-plane conversion is really necessary.

> Anyway what's your use case for outputting NV12 and then converting to BGR/RGB and why only from SD files using BT.601 color space?

Good question, that's possibly not what I want. I just blindly copied the BT.601 coeffs from cv::cvtColorTwoPlane so the functionality is the same. (cv::cvtColor() and cv::cvtColorTwoPlane() don't have color space & range as parameters).

The source mp4 is actually created using cudacodec::VideoWriter with 4096x3000 BGRA input, h264 and mostly default EncoderParams. It's not clear from the OpenCV or NVEnc docs which color space is used by default. It looks like NvEncoder.cpp doesn't set h264VUIParameters explicitly.

edit: Running a few tests shows that it defaults to the BT.709 color space. EncoderParams.videoFullRangeFlag determines whether full or limited range is used for the YUV values.
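
To make the effect of the color space and range flags concrete, here is a rough floating-point sketch of a single-sample YUV to BGR conversion. `LumaWeights` and `yuvToBgr` are hypothetical names; the luma weights are the published BT.601 and BT.709 values, but this is not the fixed-point arithmetic that cv::cvtColorTwoPlane, NPP or cudacodec actually use, so results may differ by a code value or so.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Red and blue luma weights; the green weight is 1 - kr - kb.
struct LumaWeights {
    double kr, kb;
};
constexpr LumaWeights kBt601{0.299, 0.114};
constexpr LumaWeights kBt709{0.2126, 0.0722};

// Hypothetical helper: convert one 8-bit YUV sample to 8-bit {B, G, R}.
std::array<std::uint8_t, 3> yuvToBgr(std::uint8_t y, std::uint8_t u, std::uint8_t v,
                                     LumaWeights w, bool fullRange)
{
    // Limited range stores luma in 16..235 and chroma in 16..240;
    // full range uses 0..255 for both.
    const double yn = fullRange ? y / 255.0 : (y - 16) / 219.0;
    const double chromaScale = fullRange ? 255.0 : 224.0;
    const double cb = (u - 128) / chromaScale;
    const double cr = (v - 128) / chromaScale;

    const double r = yn + 2.0 * (1.0 - w.kr) * cr;
    const double b = yn + 2.0 * (1.0 - w.kb) * cb;
    const double g = (yn - w.kr * r - w.kb * b) / (1.0 - w.kr - w.kb);

    // Clamp to [0, 1] and scale to 8 bits with rounding.
    auto to8 = [](double c) {
        const double clamped = std::min(1.0, std::max(0.0, c));
        return static_cast<std::uint8_t>(std::lround(clamped * 255.0));
    };
    return {{to8(b), to8(g), to8(r)}};
}
```

Decoding BT.709 content with BT.601 weights changes the output only for samples with non-neutral chroma, so an error of this kind shows up as a color shift rather than a brightness shift.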

@srcejon
Contributor Author

srcejon commented Nov 9, 2025

> I will dig into it more to find out why it appears different and whether a two-plane conversion is really necessary.

The issue is that my frame height is 3000, which isn't divisible by 16, so it allocates a 3008 line YUV buffer. Those extra 8 lines account for the gap between the Y and UV when viewed as a 3000 line frame.
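
The arithmetic behind that gap can be sketched as follows. `alignUp`, `Nv12Layout` and `describeNv12` are hypothetical helpers, the 16-row alignment is taken from the observation above rather than from any documented guarantee, and the pitch value in the usage note is only an example.

```cpp
#include <cassert>
#include <cstddef>

// Round value up to the next multiple of alignment.
constexpr int alignUp(int value, int alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

struct Nv12Layout {
    int paddedHeight;      // luma rows actually allocated
    int gapRows;           // padding rows between the visible luma and the chroma
    std::size_t uvOffset;  // byte offset of the UV plane from the start of the buffer
};

// Hypothetical helper: describe where the UV plane starts for a frame whose
// luma plane is padded to a multiple of heightAlignment rows.
Nv12Layout describeNv12(int visibleHeight, std::size_t pitch, int heightAlignment = 16)
{
    assert(visibleHeight > 0 && heightAlignment > 0);
    const int padded = alignUp(visibleHeight, heightAlignment);
    return {padded, padded - visibleHeight, static_cast<std::size_t>(padded) * pitch};
}
```

For example, `describeNv12(3000, 4096)` yields a padded height of 3008 and a gap of 8 rows, whereas a height that is already a multiple of 16 yields no gap.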

So it seems useful to have a two-plane conversion function, but it should also have the other functionality of NVSurfaceToColorConverter. Perhaps NVSurfaceToColorConverter::convert could have an overloaded version with separate y and uv parameters? This would be preferable to using the NPP functions, as it could support RGBA and also wouldn't depend on NPP v13.

@cudawarped
Contributor

Not sure I understand. You are encoding a 4096x3000 video to h264 using cudacodec::VideoWriter, then encapsulating it in an mp4 with audio and other encoded streams. Then you are decoding with FFmpeg (because OpenCV only supports a single video stream) to two-plane NV12, which you then want to convert back to BGR. Why are you decoding to NV12 with FFmpeg and not directly decoding to BGR?

> The issue is that my frame height is 3000, which isn't divisible by 16, so it allocates a 3008 line YUV buffer. Those extra 8 lines account for the gap between the Y and UV when viewed as a 3000 line frame.

If it's a single plane, you should still be able to convert it using the function I mentioned, where the NV12 input frame has 1.5*3008 rows.

@srcejon
Contributor Author

srcejon commented Nov 9, 2025

> Why are you decoding to NV12 with FFmpeg and not directly decoding to BGR?

I can't see an option to get BGR when the decoder is NVDEC. To clarify, I'm using libavcodec directly rather than the ffmpeg executable.

> The issue is that my frame height is 3000, which isn't divisible by 16, so it allocates a 3008 line YUV buffer. Those extra 8 lines account for the gap between the Y and UV when viewed as a 3000 line frame.

> If it's a single plane, you should still be able to convert it using the function I mentioned, where the NV12 input frame has 1.5*3008 rows.

Yes, if I oversize the buffers with the extra lines, then it works. Thanks.
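
A small sketch of that workaround, under the assumption that the padded buffer is converted as one single-plane NV12 image and the surplus output rows are dropped afterwards. `nv12SinglePlaneRows` and `cropRows` are hypothetical helpers operating on a tightly packed host buffer; with OpenCV types, taking `rowRange(0, 3000)` of the converted image would presumably achieve the same without a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rows in a single-plane NV12 image: all luma rows plus half as many chroma rows.
constexpr int nv12SinglePlaneRows(int paddedHeight)
{
    return paddedHeight + paddedHeight / 2;
}

// Hypothetical helper: keep only the first visibleRows rows of a tightly
// packed image, discarding the rows produced from the padding.
std::vector<std::uint8_t> cropRows(const std::vector<std::uint8_t>& image,
                                   std::size_t bytesPerRow, int visibleRows)
{
    assert(visibleRows >= 0);
    const std::size_t keep = bytesPerRow * static_cast<std::size_t>(visibleRows);
    assert(keep <= image.size());
    return std::vector<std::uint8_t>(image.begin(),
                                     image.begin() + static_cast<std::ptrdiff_t>(keep));
}
```

For the 3008-line buffer discussed here, `nv12SinglePlaneRows(3008)` gives 4512 input rows, and the converted output would be cropped from 3008 rows back to 3000.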
