Merged
Changes from 2 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v3.0.0.rst
@@ -1032,13 +1032,13 @@ Bug fixes
Categorical
^^^^^^^^^^^
- Bug in :class:`Categorical` where constructing from a pandas :class:`Series` or :class:`Index` with ``dtype='object'`` did not preserve the categories' dtype as ``object``; now the ``categories.dtype`` is preserved as ``object`` for these cases, while numpy arrays and Python sequences with ``dtype='object'`` continue to infer the most specific dtype (for example, ``str`` if all elements are strings) (:issue:`61778`)
- Bug in :class:`pandas.Categorical` displaying string categories without quotes when constructed from a Series with dtype "string" (:issue:`63045`)
Member

Suggested change
- Bug in :class:`pandas.Categorical` displaying string categories without quotes when constructed from a Series with dtype "string" (:issue:`63045`)
- Bug in :class:`pandas.Categorical` displaying string categories without quotes when using "string" dtype (:issue:`63045`)

The issue is not so much that the Categorical was created from a Series, but that its categories use the string dtype (you can construct the same categorical in other ways as well).
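To illustrate the point about other construction paths, here is a minimal sketch, assuming pandas is installed. The variable names are illustrative only, and the printed dtype label may differ between pandas versions.

```python
import pandas as pd

values = ["apple", "banana", "cherry", "cherry"]

# 1. From a Series with the "string" dtype (the case in the original report).
from_series = pd.Categorical(pd.Series(values, dtype="string"))

# 2. From an extension array, with no Series involved.
from_array = pd.Categorical(pd.array(values, dtype="string"))

# 3. From plain Python strings, passing string-dtype categories explicitly.
cats = pd.Index(["apple", "banana", "cherry"], dtype="string")
from_dtype = pd.Categorical(values, dtype=pd.CategoricalDtype(categories=cats))

# In each case the categories carry the string dtype, so the repr bug
# does not depend on a Series being the input.
for cat in (from_series, from_array, from_dtype):
    print(cat.categories.dtype)
```

All three objects end up with string-dtype categories, which is why the whatsnew entry reads better when it names the dtype rather than the constructor input.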

Contributor Author

Thanks for your review. I have updated this doc accordingly.

- Bug in :func:`Series.apply` where ``nan`` was ignored for :class:`CategoricalDtype` (:issue:`59938`)
- Bug in :func:`bdate_range` raising ``ValueError`` with frequency ``freq="cbh"`` (:issue:`62849`)
- Bug in :func:`testing.assert_index_equal` raising ``TypeError`` instead of ``AssertionError`` for incomparable ``CategoricalIndex`` when ``check_categorical=True`` and ``exact=False`` (:issue:`61935`)
- Bug in :meth:`Categorical.astype` where ``copy=False`` would still trigger a copy of the codes (:issue:`62000`)
- Bug in :meth:`DataFrame.pivot` and :meth:`DataFrame.set_index` raising an ``ArrowNotImplementedError`` for columns with pyarrow dictionary dtype (:issue:`53051`)
- Bug in :meth:`Series.convert_dtypes` with ``dtype_backend="pyarrow"`` where empty :class:`CategoricalDtype` :class:`Series` raised an error or got converted to ``null[pyarrow]`` (:issue:`59934`)
-

Datetimelike
^^^^^^^^^^^^
2 changes: 1 addition & 1 deletion pandas/core/arrays/categorical.py
@@ -2280,7 +2280,7 @@ def _repr_categories(self) -> list[str]:
from pandas.io.formats import format as fmt

formatter = None
-if self.categories.dtype == "str":
+if self.categories.dtype == "str" or self.categories.dtype == "string":
# the extension array formatter defaults to boxed=True in format_array
# override here to boxed=False to be consistent with QUOTE_NONNUMERIC
formatter = cast(ExtensionArray, self.categories._values)._formatter(
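The comment in this hunk about `boxed=True` versus `boxed=False` can be sketched in plain Python. This is a hedged illustration, not the pandas implementation: `make_formatter` is a hypothetical stand-in, and it assumes the formatter hook returns `str` when boxed and `repr` otherwise.

```python
from typing import Callable


def make_formatter(boxed: bool = False) -> Callable[[object], str]:
    """Hypothetical stand-in for an extension array's formatter hook."""
    # boxed=True: the value is rendered inside a container, so plain str().
    # boxed=False: the value is rendered on its own, so repr() keeps quotes.
    return str if boxed else repr


categories = ["apple", "banana", "cherry"]

boxed_fmt = make_formatter(boxed=True)
unboxed_fmt = make_formatter(boxed=False)

boxed_repr = "[" + ", ".join(boxed_fmt(c) for c in categories) + "]"
unboxed_repr = "[" + ", ".join(unboxed_fmt(c) for c in categories) + "]"

print(boxed_repr)    # [apple, banana, cherry]
print(unboxed_repr)  # ['apple', 'banana', 'cherry']
```

Under that assumption, the unquoted output is what a dtype gets when the `boxed=False` override is skipped, which is the symptom the extra `"string"` comparison addresses.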
8 changes: 8 additions & 0 deletions pandas/tests/arrays/categorical/test_repr.py
@@ -545,3 +545,11 @@ def test_categorical_str_repr(self):
result = repr(Categorical([1, "2", 3, 4]))
expected = "[1, '2', 3, 4]\nCategories (4, object): [1, 3, 4, '2']"
assert result == expected

def test_categorical_with_pandas_series(self):
Member

Suggested change
def test_categorical_with_pandas_series(self):
def test_categorical_with_string_dtype(self):

# GH 63045
s = Series(["apple", "banana", "cherry", "cherry"], dtype="string")
Member

Suggested change
def test_categorical_with_pandas_series(self):
# GH 63045
s = Series(["apple", "banana", "cherry", "cherry"], dtype="string")
def test_categorical_with_pandas_series(self, string_dtype_no_object):
# GH 63045
s = Series(["apple", "banana", "cherry", "cherry"], dtype=string_dtype_no_object)

You could use this fixture here, which tests the different string dtype variations, to make sure we now do this consistently for all string-like dtypes.

The only thing you will have to update is the "string" in the expected result below (you can include str(string_dtype_no_object) in the expected value).
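A rough sketch of that idea, assuming pandas is installed. The plain loop and the hand-picked dtype list are stand-ins for the fixture, not its real parameters, and `expected_repr` is a hypothetical helper.

```python
import pandas as pd

VALUES = ["apple", "banana", "cherry", "cherry"]


def expected_repr(dtype) -> str:
    """Build the expected repr, taking the dtype label from the dtype itself."""
    return (
        "['apple', 'banana', 'cherry', 'cherry']\n"
        f"Categories (3, {dtype}): ['apple', 'banana', 'cherry']"
    )


# Stand-in for the fixture: a hand-picked list of dtype specifications.
for dtype in ["string", "string[python]"]:
    s = pd.Series(VALUES, dtype=dtype)
    result = repr(pd.Categorical(s))
    # The comparison is expected to hold only on a pandas build with this fix.
    print(dtype, result == expected_repr(s.dtype))
```

Deriving the label from `s.dtype` instead of hard-coding "string" keeps the expected value in step with whichever dtype variant is under test.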

Contributor Author

Thanks for your review. I have updated this test accordingly.

result = repr(Categorical(s))
expected = "['apple', 'banana', 'cherry', 'cherry']\nCategories (3, string): ['apple', 'banana', 'cherry']" # noqa: E501

assert result == expected