Commit a8bcf68

BUG: Fix MultiIndex construction in pd.concat() with Int64Dtype NA
1 parent f4851e5 commit a8bcf68

3 files changed: +22 -2 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1140,6 +1140,7 @@ Indexing
 - Bug in :meth:`DataFrame.loc` with inconsistent behavior of loc-set with 2 given indexes to Series (:issue:`59933`)
 - Bug in :meth:`Index.equals` when comparing between :class:`Series` with string dtype :class:`Index` (:issue:`61099`)
 - Bug in :meth:`Index.get_indexer` and similar methods when ``NaN`` is located at or after position 128 (:issue:`58924`)
+- Bug in :func:`pandas.concat` incorrectly constructing the :class:`MultiIndex` when an inner level contained :obj:`pandas.NA` with :class:`pandas.Int64Dtype`, causing a :exc:`KeyError` on lookup (:issue:`62903`)
 - Bug in :meth:`MultiIndex.insert` when a new value inserted to a datetime-like level gets cast to ``NaT`` and fails indexing (:issue:`60388`)
 - Bug in :meth:`Series.__setitem__` when assigning boolean series with boolean indexer will raise ``LossySetitemError`` (:issue:`57338`)
 - Bug in printing :attr:`Index.names` and :attr:`MultiIndex.levels` would not escape single quotes (:issue:`60190`)

pandas/core/reshape/concat.py

Lines changed: 3 additions & 2 deletions
@@ -990,8 +990,9 @@ def _make_concat_multiindex(indexes, keys, levels=None, names=None) -> MultiInde
         new_levels.extend(new_index.levels)
         new_codes.extend(np.tile(lab, kpieces) for lab in new_index.codes)
     else:
-        new_levels.append(new_index.unique())
-        single_codes = new_index.unique().get_indexer(new_index)
+        levels_for_index = new_index.unique().dropna()
+        new_levels.append(levels_for_index)
+        single_codes = levels_for_index.get_indexer(new_index)
         new_codes.append(np.tile(single_codes, kpieces))
 
     if len(new_names) < len(new_levels):
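
The hunk above changes how the inner level is encoded: missing values are dropped from the level, so their codes become -1 instead of pointing at a level entry. A minimal pure-Python sketch of that encoding follows. It is an illustration only, not pandas' implementation; `encode_level` and the `MISSING` sentinel are hypothetical names, and `None` stands in for `pd.NA`.

```python
# Illustrative sketch only: `encode_level` and `MISSING` are hypothetical
# names, and None stands in for pd.NA. This is not pandas' implementation.
MISSING = None


def encode_level(values):
    """Return (levels, codes) for one index level.

    levels: unique non-missing values in order of first appearance.
    codes:  position of each value in `levels`, or -1 if it is missing.
    """
    levels = []
    position = {}
    for value in values:
        if value is MISSING:
            continue  # missing values never become a level entry
        if value not in position:
            position[value] = len(levels)
            levels.append(value)
    codes = [-1 if value is MISSING else position[value] for value in values]
    return levels, codes


# Inner level of each concatenated piece, as in the commit's test: 1, 2, NA.
inner_values = [1, 2, MISSING]
levels, codes = encode_level(inner_values)

# Two pieces (keys 'a' and 'b') share the inner index, so the codes repeat,
# analogous to np.tile(single_codes, kpieces) in the patched function.
kpieces = 2
tiled_codes = codes * kpieces

print(levels)       # [1, 2]
print(tiled_codes)  # [0, 1, -1, 0, 1, -1]
```

Positions 2 and 5 of the tiled codes are -1, which matches what the new test asserts for `series2.index.codes[1]`.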

pandas/tests/indexes/multi/test_constructors.py

Lines changed: 18 additions & 0 deletions
@@ -343,6 +343,24 @@ def test_from_arrays_respects_none_names():
 
     tm.assert_index_equal(result, expected)
 
+def test_concat_int64dtype_na_multiindex_lookup():
+    levels1 = ['a', 'b']
+    levels2 = pd.Series([1, 2, pd.NA], dtype=pd.Int64Dtype())
+    index1 = MultiIndex.from_product([levels1, levels2], names=['one', 'two'])
+    series1 = pd.Series([f'{i1}-{i2}' for i1, i2 in index1], index=index1)
+    series2 = pd.concat(
+        [series1.loc[i1] for i1 in levels1],
+        keys=levels1,
+        names=['one']
+    )
+    lookup_key = ('a', pd.NA)
+    result = series2.at[lookup_key]
+    assert result == 'a-<NA>'
+    level_two = series2.index.levels[1]
+    codes_two = series2.index.codes[1]
+    assert level_two.hasnans is False
+    assert codes_two[2] == -1
+    assert codes_two[5] == -1
 
 # ----------------------------------------------------------------------------
 # from_tuples
