Closed as not planned
Labels
Enhancement; Needs Triage (issue that has not been reviewed by a pandas team member)
Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
I wanted to get the nunique for each column in my df, but some columns contained unhashable values like lists, so I got TypeError: unhashable type: 'list'. It would be nice if df.nunique() could skip columns like that, putting NaN for them.
I got around the problem myself like this:
    def nunique_if_hashable(s: pd.Series) -> float:
        try:
            return s.nunique()
        except TypeError:
            return np.nan

    df.apply(nunique_if_hashable)

With a result like this:
    A    0.0
    B    1.0
    C    3.0
    D    NaN
    dtype: float64
Since D contains at least one list, and lists aren't hashable, it's skipped.
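To make the hashability point concrete, here is a minimal sketch of how one could check a single value. The helper name is_hashable is hypothetical and not part of pandas; it only relies on the built-in hash() raising TypeError for unhashable objects such as lists.

```python
def is_hashable(value: object) -> bool:
    """Return True if hash(value) succeeds (hypothetical helper, not a pandas API)."""
    try:
        hash(value)
    except TypeError:
        # Lists, dicts, and sets raise TypeError: unhashable type
        return False
    return True


print(is_hashable([]))      # a list, like the values in column D
print(is_hashable(None))    # the missing value in column D
print(is_hashable((1, 2)))  # a tuple of hashable items
```

The list is the only one of the three that is not hashable, which is why a single list in a column is enough to make nunique fail on it.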
Setup:
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        'A': [np.nan] * 4,
        'B': [1] * 4,
        'C': [5, 5, 6, 7],
        'D': [[], [], [], None]})

Feature Description
I'm imagining a parameter like, say, skip_unhashable: bool = False, that would do the equivalent of the above:
    >>> df.nunique(skip_unhashable=True)
    A    0
    B    1
    C    3
    D    NaN
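As a rough sketch of what the proposed behavior might look like outside pandas, the function below loops over the columns and records NaN whenever nunique raises TypeError. The name nunique_skip_unhashable and its dropna argument are assumptions for illustration only; skip_unhashable is not an existing pandas parameter. One consequence worth noting: because NaN is a float, the result comes back as float64 rather than integers.

```python
import numpy as np
import pandas as pd


def nunique_skip_unhashable(df: pd.DataFrame, dropna: bool = True) -> pd.Series:
    """Hypothetical stand-in for the proposed df.nunique(skip_unhashable=True)."""
    counts = {}
    for name, column in df.items():
        try:
            counts[name] = column.nunique(dropna=dropna)
        except TypeError:
            # Raised when the column holds unhashable values such as lists.
            counts[name] = np.nan
    # NaN forces a float result, so the dtype is stated explicitly.
    return pd.Series(counts, dtype="float64")


df = pd.DataFrame({
    "A": [np.nan] * 4,
    "B": [1] * 4,
    "C": [5, 5, 6, 7],
    "D": [[], [], [], None],
})

result = nunique_skip_unhashable(df)
print(result)
```

Catching only TypeError keeps other failures visible, at the cost of also hiding any unrelated TypeError raised inside nunique.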
Alternative Solutions
The helper function I wrote above isn't so bad. It's not crucial to put this functionality in Pandas, it would just be nice is all.
Additional Context
I loaded this df from JSON.