With object dtype, using ``.values`` on a Series will return the underlying NumPy array.
322
+
323
+
.. code-block:: python
324
+
325
+
>>> ser = pd.Series(["a", "b", np.nan], dtype="object")
326
+
>>> type(ser.values)
327
+
<class 'numpy.ndarray'>
328
+
329
+
However, with the new string dtype, the underlying ExtensionArray is returned instead.
330
+
331
+
.. code-block:: python
332
+
333
+
>>> ser = pd.Series(["a", "b", pd.NA], dtype="str")
334
+
>>> ser.values
335
+
<ArrowStringArray>
336
+
['a', 'b', nan]
337
+
Length: 3, dtype: str
338
+
339
+
If your code requires a NumPy array, you should use :meth:`Series.to_numpy`.
340
+
341
+
.. code-block:: python
342
+
343
+
>>> ser = pd.Series(["a", "b", pd.NA], dtype="str")
344
+
>>> ser.to_numpy()
345
+
array(['a', 'b', nan], dtype=object)
346
+
347
+
In general, prefer :meth:`Series.to_numpy` to get a NumPy array, or :attr:`Series.array` to get an ExtensionArray, over :attr:`Series.values`.
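As a minimal sketch of that recommendation (the variable names are illustrative only), the following builds a Series of strings without specifying a dtype, so it ends up as object dtype on pandas 2.x and as the new string dtype on pandas 3.x, and then retrieves the data in a way that does not depend on which of the two it is:

```python
import numpy as np
import pandas as pd

# The inferred dtype differs between pandas versions (object vs. str),
# which is exactly the situation where ``.values`` becomes ambiguous.
ser = pd.Series(["a", "b", None])

# to_numpy() returns a NumPy array regardless of the Series dtype.
as_numpy = ser.to_numpy()

# .array returns an ExtensionArray regardless of the Series dtype
# (a thin wrapper around the NumPy data for NumPy-backed dtypes).
as_extension = ser.array

print(type(as_numpy).__name__)
print(isinstance(as_extension, pd.api.extensions.ExtensionArray))
```

Code written this way states explicitly which kind of container it expects, instead of relying on what ``.values`` happens to return for a given dtype.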
348
+
310
349
Notable bug fixes
311
350
~~~~~~~~~~~~~~~~~
312
351
352
+
.. _string_migration_guide-astype_str:
353
+
313
354
``astype(str)`` preserving missing values
314
355
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
315
356
316
-
This is a long standing "bug" or misfeature, as discussed in https://github.com/pandas-dev/pandas/issues/25353.
357
+
The stringifying of missing values is a long standing "bug" or misfeature, as
358
+
discussed in https://github.com/pandas-dev/pandas/issues/25353, but fixing it
359
+
introduces a significant behaviour change.
317
360
318
-
With pandas < 3, when using ``astype(str)`` (using the built-in :func:`str`, not
319
-
``astype("str")``!), the operation would convert every element to a string,
320
-
including the missing values:
361
+
With pandas < 3, when using ``astype(str)`` or ``astype("str")``, the operation
362
+
would convert every element to a string, including the missing values:
321
363
322
364
.. code-block:: python
323
365
324
366
# OLD behavior in pandas < 3
325
-
>>> ser = pd.Series(["a", np.nan], dtype=object)
367
+
>>> ser = pd.Series([1.5, np.nan])
326
368
>>> ser
327
-
0 a
369
+
0    1.5
328
370
1    NaN
329
-
dtype: object
330
-
>>> ser.astype(str)
331
-
0 a
371
+
dtype: float64
372
+
>>> ser.astype("str")
373
+
0    1.5
332
374
1    nan
333
375
dtype: object
334
-
>>> ser.astype(str).to_numpy()
335
-
array(['a', 'nan'], dtype=object)
376
+
>>> ser.astype("str").to_numpy()
377
+
array(['1.5', 'nan'], dtype=object)
336
378
337
379
Note how ``NaN`` (``np.nan``) was converted to the string ``"nan"``. This was
338
380
not the intended behavior, and it was inconsistent with how other dtypes handled
339
381
missing values.
340
382
341
-
With pandas 3, this behavior has been fixed, and now ``astype(str)`` is an alias
342
-
for ``astype("str")``, i.e. casting to the new string dtype, which will preserve
343
-
the missing values:
383
+
With pandas 3, this behavior has been fixed, and now ``astype("str")`` will cast
384
+
to the new string dtype, which preserves the missing values:
344
385
345
386
.. code-block:: python
346
387
347
388
# NEW behavior in pandas 3
348
389
>>> pd.options.future.infer_string = True
349
-
>>> ser = pd.Series(["a", np.nan], dtype=object)
350
-
>>> ser.astype(str)
351
-
0 a
390
+
>>> ser = pd.Series([1.5, np.nan])
391
+
>>> ser.astype("str")
392
+
0    1.5
352
393
1    NaN
353
394
dtype: str
354
-
>>> ser.astype(str).values
355
-
array(['a', nan], dtype=object)
395
+
>>> ser.astype("str").to_numpy()
396
+
array(['1.5', nan], dtype=object)
356
397
357
398
If you want to preserve the old behaviour of converting every object to a
358
-
string, you can use ``ser.map(str)`` instead.
399
+
string, you can use ``ser.map(str)`` instead. If you want to do such a conversion
400
+
while preserving the missing values in a way that works with both pandas 2.x and
401
+
3.x, you can use ``ser.map(str, na_action="ignore")`` (for pandas 3.x only, you
402
+
can do ``ser.astype("str")``).
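A short sketch contrasting the two ``map`` variants mentioned above, using the same float Series as the earlier examples (the variable names are illustrative only). The dtype of the results depends on the pandas version, so the sketch only inspects values and missingness:

```python
import numpy as np
import pandas as pd

ser = pd.Series([1.5, np.nan])

# Stringify every element, including the missing value,
# which becomes the literal string "nan".
all_strings = ser.map(str)

# Stringify only the non-missing elements; the missing value
# is skipped by ``map`` and stays missing in the result.
keep_missing = ser.map(str, na_action="ignore")

print(all_strings.tolist())
print(keep_missing.isna().tolist())
```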
403
+
404
+
If you want to convert to object or string dtype for pandas 2.x and 3.x,
405
+
respectively, without needing to stringify each individual element, you will
406
+
have to use a conditional check on the pandas version.
407
+
For example, to convert a categorical Series with string categories to its
408
+
dense non-categorical version with object or string dtype:
409
+
410
+
.. code-block:: python
411
+
412
+
>>> import pandas as pd
413
+
>>> ser = pd.Series(["a", np.nan], dtype="category")
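One possible shape for such a version check is sketched below. It is not necessarily the form the guide itself uses: the ``major_version`` and ``target_dtype`` names are hypothetical, and the sketch assumes that the leading component of ``pd.__version__`` is enough to tell pandas 2.x from 3.x.

```python
import numpy as np
import pandas as pd

ser = pd.Series(["a", np.nan], dtype="category")

# Assumption: the major version number alone distinguishes 2.x from 3.x.
major_version = int(pd.__version__.split(".")[0])

# Cast to the new string dtype on pandas 3.x and to object dtype on 2.x.
target_dtype = "str" if major_version >= 3 else object

# Dense, non-categorical result; the missing value is preserved either way.
dense = ser.astype(target_dtype)
print(dense.dtype)
```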
doc/source/whatsnew/v2.3.2.rst (8 additions, 3 deletions)
@@ -1,6 +1,6 @@
1
1
.. _whatsnew_232:
2
2
3
-
What's new in 2.3.2 (August XX, 2025)
3
+
What's new in 2.3.2 (August 21, 2025)
4
4
-------------------------------------
5
5
6
6
These are the changes in pandas 2.3.2. See :ref:`release` for a full changelog
@@ -25,11 +25,16 @@ Bug fixes
25
25
- Fix :meth:`~DataFrame.to_json` with ``orient="table"`` to correctly use the
26
26
"string" type in the JSON Table Schema for :class:`StringDtype` columns
27
27
(:issue:`61889`)
28
-
- Fixed ``~Series.str.match``, ``~Series.str.fullmatch`` and ``~Series.str.contains``
29
-
with compiled regex for the Arrow-backed string dtype (:issue:`61964`, :issue:`61942`)
28
+
- Boolean operations (``|``, ``&``, ``^``) with bool-dtype objects on the left and :class:`StringDtype` objects on the right now cast the string to bool, with a deprecation warning (:issue:`60234`)
29
+
- Fixed :meth:`~Series.str.match`, :meth:`~Series.str.fullmatch` and :meth:`~Series.str.contains`
30
+
string methods with compiled regex for the Arrow-backed string dtype (:issue:`61964`, :issue:`61942`)
31
+
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` inconsistently
32
+
replacing matching values when missing values are present for string dtypes (:issue:`56599`)