Skip to content

Conversation

@lifegpc
Copy link

@lifegpc lifegpc commented Nov 6, 2025

Releated issue: #30487

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 6, 2025
@github-actions github-actions bot added the modifies/go Pull requests that update Go code label Nov 6, 2025
@dangjinghao
Copy link

As far as I know, this issue is specific to PDF LFS files. The application/octet-stream type works correctly in other cases. Therefore, I recommend a minimal change to address this edge case.

Additionally, have we considered using mime.TypeByExtension? This would eliminate the need to maintain a custom MIME type map.

@dangjinghao
Copy link

What is the expected behavior if we rely on file extension name for LFS type detection and the extension name is wrong? Specifically, will the file content still be displayed correctly?

@lifegpc
Copy link
Author

lifegpc commented Nov 6, 2025

What is the expected behavior if we rely on file extension name for LFS type detection and the extension name is wrong? Specifically, will the file content still be displayed correctly?

For images, wrong mimetype still works.

@silverwind
Copy link
Member

For images, wrong mimetype still works.

It depends on the client. Some like browsers can handle wrong mime type and do another content type detection based on the data.

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Nov 8, 2025
@wxiaoguang
Copy link
Contributor

This is a quick fix for the "View Raw File" on the Web UI, which have the filename in the URL. So it should work.

@silverwind
Copy link
Member

silverwind commented Nov 8, 2025

Why not just exclude image/svg? Having to maintain a list of "safe" image types seems less maintainable because new image formats are added all the time. If we keep it, define it as a constant in a shared place.

@wxiaoguang
Copy link
Contributor

Why not just exclude image/svg? Having to maintain a list of "safe" image types seems less maintainable because new image formats are added all the time. If we keep it, define it as a constant in a shared place.

Will you also remember to exclude html?

@silverwind
Copy link
Member

silverwind commented Nov 8, 2025

Why not just exclude image/svg? Having to maintain a list of "safe" image types seems less maintainable because new image formats are added all the time. If we keep it, define it as a constant in a shared place.

Will you also remember to exclude html?

Just use

if mimetype == "application/pdf" || (strings.HasPrefix(mimetype, "image/") && mimetype != "image/svg+xml")

We have similar code in other places already so it could be moved to a isSafeImageMimeType.

@silverwind
Copy link
Member

silverwind commented Nov 8, 2025

Also, what about audio/video files? See

// IsBrowsableBinaryType returns whether a non-text type can be displayed in a browser
func (ct SniffedType) IsBrowsableBinaryType() bool {
return ct.IsImage() || ct.IsSvgImage() || ct.IsPDF() || ct.IsVideo() || ct.IsAudio()
}

@wxiaoguang
Copy link
Contributor

Why not just exclude image/svg? Having to maintain a list of "safe" image types seems less maintainable because new image formats are added all the time. If we keep it, define it as a constant in a shared place.

Will you also remember to exclude html?

Just use

if mimetype == "application/pdf" || (strings.HasPrefix(mimetype, "image/") && mimetype != "image/svg+xml")

We have similar code in other places already so it could be moved to a isSafeImageMimeType.

I don't like it, it doesn't feel safe, I don't want to spend more time on fixing security bugs in the future.

If you can fix the security problems when they appear in the future, feel free to make your changes.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Nov 8, 2025

Also, what about audio/video files? See

// IsBrowsableBinaryType returns whether a non-text type can be displayed in a browser
func (ct SniffedType) IsBrowsableBinaryType() bool {
return ct.IsImage() || ct.IsSvgImage() || ct.IsPDF() || ct.IsVideo() || ct.IsAudio()
}

.

No, I don't want to use it at the moment. There are still more logic after IsBrowsableBinaryType call, and the function name itself is not accurate.


This PR aims to fix the clear problems as it stated. To fix more, the comment keyword HINT: PDF-RENDER-SANDBOX will lead developers to more related places.

@wxiaoguang
Copy link
Contributor

To fully refactor the related code and add more supports, it needs to use the approach from "Fix mime-type detection for HTTP server (#18370)" (and also refactor the existing code, a lot of changes)

Then it will introduce much more changes in this PR. I don't think these works need to be in this PR's scope (since this PR's aim is clear)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm/need 1 This PR needs approval from one additional maintainer to be merged. modifies/go Pull requests that update Go code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants