.. _howto_performance_diagnosis:

Diagnosing performance bottlenecks
==================================

This page shows practical steps to *measure first*, then fix the most common
pandas bottlenecks (slow ``apply``, heavy memory usage, joins/groupbys, and
unintended dtype conversions). It complements :doc:`enhancingperf` with quick
checklists you can try in any notebook or script.

Quick checklist
---------------

- Inspect dtypes and memory: ``df.info(memory_usage="deep")`` and
  ``df.memory_usage(deep=True)``.
- Measure time with ``time.perf_counter`` or IPython's ``%timeit``.
- Prefer vectorized operations over ``DataFrame.apply`` / row loops.
- Ensure join/groupby keys have the same dtype; consider categoricals for
  low-cardinality keys.
- For arithmetic-heavy expressions, consider :func:`pandas.eval` or moving
  work to specialized libraries (e.g., Numba) if appropriate.

Measure time
------------

.. code-block:: python

    import time
    start = time.perf_counter()

    # your operation here, e.g. df.groupby("key")["value"].mean()

    elapsed = time.perf_counter() - start
    print(f"{elapsed:.3f}s")  # wall-clock timing

In notebooks, ``%timeit`` provides robust micro-benchmarks (avoid it inside
documentation examples that are executed at build time).

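If you prefer plain Python over the ``%timeit`` magic, the standard-library
``timeit`` module gives comparable numbers. A minimal sketch, assuming ``df``
already exists with ``"key"`` and ``"value"`` columns:

.. code-block:: python

    import timeit

    # run the statement 10 times per trial, 5 trials, and report the best trial
    runs = timeit.repeat(
        lambda: df.groupby("key")["value"].mean(), repeat=5, number=10
    )
    print(f"best of 5 trials: {min(runs) / 10:.4f}s per call")
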
Measure memory
--------------

.. code-block:: python

    df.info(memory_usage="deep")
    df.memory_usage(deep=True)

Look for large object-dtype columns; consider converting to ``string``,
nullable integer/float (``Int64`` / ``Float64``), or ``category``.

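A minimal sketch of the kind of conversions meant here (the column names are
hypothetical; apply whichever conversions fit your data):

.. code-block:: python

    before = df.memory_usage(deep=True).sum()

    df["name"] = df["name"].astype("string")        # object -> string dtype
    df["count"] = df["count"].astype("Int64")       # nullable integer
    df["status"] = df["status"].astype("category")  # low-cardinality labels

    after = df.memory_usage(deep=True).sum()
    print(f"{before:,} -> {after:,} bytes")
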
Vectorize instead of apply
--------------------------

.. code-block:: python

    # Slow: apply calls Python's len() once per element
    s = df["name"]
    slow = s.apply(len)

    # Faster: the vectorized string accessor
    fast = s.str.len()
    # map(len) works too, but is still a Python-level loop
    fast2 = s.map(len)

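The same principle applies to numeric work; a sketch using hypothetical
``x`` and ``y`` columns:

.. code-block:: python

    import numpy as np

    # Slow: a Python function evaluated once per row
    slow = df.apply(lambda row: row["x"] * 2 if row["y"] > 0 else 0, axis=1)

    # Fast: whole-column arithmetic plus np.where for the conditional
    fast = np.where(df["y"] > 0, df["x"] * 2, 0)
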
Joins and groupbys
------------------

- Align dtypes on keys before ``merge``, e.g.
  ``df1["key"] = df1["key"].astype("int64")`` to match ``df2["key"]``
  (a fuller sketch follows the example below).
- For low-cardinality keys, try categoricals:

.. code-block:: python

    df["key"] = df["key"].astype("category")
    out = df.groupby("key", observed=True)["value"].mean()

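A minimal sketch of the dtype alignment from the first bullet (``df1`` and
``df2`` are hypothetical frames whose ``key`` columns arrived with different
dtypes):

.. code-block:: python

    # e.g. one side parsed the key as object/string, the other as int64
    df1["key"] = df1["key"].astype("int64")
    df2["key"] = df2["key"].astype("int64")

    merged = df1.merge(df2, on="key", how="inner")
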
Arithmetic / expressions
------------------------

For column-wise arithmetic and boolean logic, :meth:`DataFrame.eval` (and the
top-level :func:`pandas.eval`) can reduce temporary objects and speed up some
expressions, especially on large frames:

.. code-block:: python

    df = df.eval("z = (x + y) * 2")

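The same expression engine backs :meth:`DataFrame.query` for boolean
filtering; a small sketch, assuming ``x`` and ``y`` columns:

.. code-block:: python

    # equivalent to df[(df["x"] > 0) & (df["y"] < 5)],
    # and can avoid some intermediate boolean temporaries
    subset = df.query("x > 0 and y < 5")
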
When to scale out
-----------------

If a single-machine DataFrame is too large or the workflow is inherently
parallel, consider external tools (e.g., Dask) or algorithmic changes. Keep
this page about *diagnosis*; see :doc:`enhancingperf` for advanced options.

See also
--------

- :doc:`enhancingperf`
- :doc:`categorical`
- :doc:`missing_data`
- :doc:`pyarrow` (Arrow-backed dtypes and memory behavior)

Notes for contributors
----------------------

Examples use ``.. code-block:: python`` to avoid executed doctests. Keep code
snippets small and runnable; prefer idiomatic pandas over micro-optimizations.