Commit db82fc5

Working on IO with rank of distarrays
1 parent d356553 commit db82fc5

12 files changed, +362 -206 lines changed

docs/source/io.rst

Lines changed: 26 additions & 20 deletions
@@ -4,7 +4,7 @@ Storing datafiles
 mpi4py-fft works with regular Numpy arrays. However, since arrays in parallel
 can become very large, and the arrays live on multiple processors, we require
 parallel IO capabilities that goes beyond Numpys regular methods.
-In the :mod:`mpi4py_fft.utilities` module there are two helper classes for dumping
+In the :mod:`mpi4py_fft.io` module there are two helper classes for dumping
 dataarrays to either `HDF5 <https://www.hdf5.org>`_ or
 `NetCDF <https://www.unidata.ucar.edu/software/netcdf/>`_ format:
 
@@ -17,46 +17,50 @@ reads data in parallel. A simple example of usage is::
     from mpi4py import MPI
     import numpy as np
     from mpi4py_fft import PFFT, HDF5File, NCFile, newDistArray
-
     N = (128, 256, 512)
     T = PFFT(MPI.COMM_WORLD, N)
     u = newDistArray(T, forward_output=False)
     v = newDistArray(T, forward_output=False, val=2)
-    u[:] = np.random.random(N)
-
+    u[:] = np.random.random(u.shape)
+    # Store by first creating output files
     fields = {'u': [u], 'v': [v]}
-    f0 = HDF5File('h5test.h5', T)
-    f1 = NCFile('nctest.nc', T)
+    f0 = HDF5File('h5test.h5', global_shape=N, mode='w')
+    f1 = NCFile('nctest.nc', global_shape=N, mode='w')
     f0.write(0, fields)
     f1.write(0, fields)
     v[:] = 3
     f0.write(1, fields)
     f1.write(1, fields)
+    # Alternatively, just use write method of each distributed array
+    u.write('h5test.h5', 'u', step=2)
+    v.write('h5test.h5', 'v', step=2)
+    u.write('nctest.nc', 'u', step=2)
+    v.write('nctest.nc', 'v', step=2)
 
-Note that we are creating two datafiles ``h5test.h5`` and ``nctest.nc``,
+Note that we are here creating two datafiles ``h5test.h5`` and ``nctest.nc``,
 for storing in HDF5 or NetCDF4 formats respectively. Normally, one would be
 satisfied using only one format, so this is only for illustration. We store
-the fields ``u`` and ``v`` using method ``write`` on two different occasions,
-so the datafiles will contain two snapshots of each field ``u`` and ``v``.
+the fields ``u`` and ``v`` on three different occasions,
+so the datafiles will contain three snapshots of each field ``u`` and ``v``.
 
 The stored dataarrays can be retrieved later on::
 
-    f0 = HDF5File('h5test.h5', T, mode='r')
-    f1 = NCFile('nctest.nc', T, mode='r')
     u0 = newDistArray(T, forward_output=False)
     u1 = newDistArray(T, forward_output=False)
-    f0.read(u0, 'u', 0)
-    f0.read(u1, 'u', 1)
-    f1.read(u0, 'u', 0)
-    f1.read(u1, 'u', 1)
+    u0.read('h5test.h5', 'u', 0)
+    u1.read('h5test.h5', 'u', 1)
+    # or alternatively for netcdf
+    #u0.read('nctest.nc', 'u', 0)
+    #u1.read('nctest.nc', 'u', 1)
+
 
 Note that one does not have to use the same number of processors when
 retrieving the data as when they were stored.
 
 It is also possible to store only parts of the, potentially large, arrays.
 Any chosen slice may be stored, using a *global* view of the arrays::
 
-    f2 = HDF5File('variousfields.h5', T, mode='w')
+    f2 = HDF5File('variousfields.h5', global_shape=N, mode='w')
     fields = {'u': [u,
                     (u, [slice(None), slice(None), 4]),
                     (u, [5, 5, slice(None)])],
@@ -65,6 +69,8 @@ Any chosen slice may be stored, using a *global* view of the arrays::
     f2.write(0, fields)
     f2.write(1, fields)
     f2.write(2, fields)
+    # or, using write method of field, e.g.
+    #u.write('variousfields.h5', 'u', 0, [slice(None), slice(None), 4])
 
 This will lead to an hdf5-file with groups::
 
@@ -107,14 +113,14 @@ two different ways when creating the datafiles:
 originates from the origin, with lengths :math:`\pi, 2\pi, 3\pi`, can be
 given as::
 
-    f0 = HDF5File('filename.h5', T, domain=((0, pi), (0, 2*np.pi), (0, 3*np.pi)))
+    f0 = HDF5File('filename.h5', global_shape=N, domain=((0, pi), (0, 2*np.pi), (0, 3*np.pi)))
 
 2) A sequence of arrays giving the coordinates for each dimension. For example::
 
     d = (np.arange(N[0], dtype=np.float)*1*np.pi/N[0],
          np.arange(N[1], dtype=np.float)*2*np.pi/N[1],
          np.arange(N[2], dtype=np.float)*2*np.pi/N[2])
-    f0 = HDF5File('filename.h5', T, domain=d)
+    f0 = HDF5File('filename.h5', global_shape=N, domain=d)
 
 With NetCDF4 the layout is somewhat different. For ``variousfields`` above,
 if we were using :class:`.NCFile` instead of :class:`.HDF5File`,
@@ -147,9 +153,9 @@ opened with `Visit <https://www.visitusers.org>`_.
 
 To view the HDF5-files we first need to generate some light-weight *xdmf*-files that can
 be understood by both Paraview and Visit. To generate such files, simply throw the
-module :mod:`.utilities.generate_xdmf` on the HDF5-files::
+module :mod:`.io.generate_xdmf` on the HDF5-files::
 
-    from mpi4py_fft.utilities import generate_xdmf
+    from mpi4py_fft.io import generate_xdmf
     generate_xdmf('variousfields.h5')
 
 This will create a number of xdmf-files, one for each group that contains 2D
Lines changed: 10 additions & 10 deletions
@@ -1,45 +1,45 @@
-mpi4py_fft.utilities package
+mpi4py_fft.io package
 =============================
 
 Submodules
 ----------
 
-mpi4py_fft.utilities.generate_xdmf module
+mpi4py_fft.io.generate_xdmf module
 -----------------------------------------
 
-.. automodule:: mpi4py_fft.utilities.generate_xdmf
+.. automodule:: mpi4py_fft.io.generate_xdmf
     :members:
     :undoc-members:
     :show-inheritance:
 
-mpi4py_fft.utilities.h5py_file module
+mpi4py_fft.io.h5py_file module
 -------------------------------------
 
-.. automodule:: mpi4py_fft.utilities.h5py_file
+.. automodule:: mpi4py_fft.io.h5py_file
     :members:
     :undoc-members:
     :show-inheritance:
 
-mpi4py_fft.utilities.nc_file module
+mpi4py_fft.io.nc_file module
 -----------------------------------
 
-.. automodule:: mpi4py_fft.utilities.nc_file
+.. automodule:: mpi4py_fft.io.nc_file
     :members:
     :undoc-members:
     :show-inheritance:
 
-mpi4py_fft.utilities.file_base module
+mpi4py_fft.io.file_base module
 -------------------------------------
 
-.. automodule:: mpi4py_fft.utilities.file_base
+.. automodule:: mpi4py_fft.io.file_base
     :members:
     :undoc-members:
     :show-inheritance:
 
 Module contents
 ---------------
 
-.. automodule:: mpi4py_fft.utilities
+.. automodule:: mpi4py_fft.io
     :members:
     :undoc-members:
     :show-inheritance:

docs/source/mpi4py_fft.rst

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ Subpackages
 .. toctree::
 
     mpi4py_fft.fftw
-    mpi4py_fft.utilities
+    mpi4py_fft.io
 
 
 Submodules

mpi4py_fft/distarray.py

Lines changed: 93 additions & 9 deletions
@@ -3,6 +3,7 @@
 import numpy as np
 from mpi4py import MPI
 from .pencil import Pencil, Subcomm
+from .io import HDF5File, NCFile, FileBase
 
 comm = MPI.COMM_WORLD
 
@@ -54,8 +55,12 @@ class DistArray(np.ndarray):
     """
     def __new__(cls, global_shape, subcomm=None, val=None, dtype=np.float,
                 buffer=None, alignment=None, rank=0):
-        if rank > 0:
-            assert global_shape[:rank] == (len(global_shape[rank:]),)*rank
+        if len(global_shape) < 2:
+            obj = np.ndarray.__new__(cls, global_shape, dtype=dtype, buffer=buffer)
+            if buffer is None and isinstance(val, Number):
+                obj.fill(val)
+            obj._rank = rank
+            return obj
 
         if isinstance(subcomm, Subcomm):
             pass
@@ -139,13 +144,24 @@ def rank(self):
         """Return tensor rank of ``self``"""
         return self._rank
 
+    @property
+    def dimensions(self):
+        """Return dimensions of array not including rank"""
+        return len(self._p0.shape)
+
     def __getitem__(self, i):
         # Return DistArray if the result is a component of a tensor
         # Otherwise return ndarray view
-        if isinstance(i, int) and self.rank > 0:
+        if self.ndim == 1:
+            return np.ndarray.__getitem__(self, i)
+
+        if isinstance(i, (int, slice)) and self.rank > 0:
             v0 = np.ndarray.__getitem__(self, i)
-            v0._rank -= 1
+            v0._rank = self.rank - (self.ndim - v0.ndim)
+            #if v0.ndim < self.ndim:
+            # v0._rank -= 1
             return v0
+
         if isinstance(i, tuple) and len(i) == 2 and self.rank == 2:
             v0 = np.ndarray.__getitem__(self, i)
             v0._rank = 0
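The new rank rule in `__getitem__` above, `self.rank - (self.ndim - v0.ndim)`, can be read as "each axis dropped by the index lowers the tensor rank by one". The following is a minimal sketch of that arithmetic in plain Python. The helper names `ndim_after_index` and `rank_after_index` are hypothetical and not part of mpi4py-fft, and the model assumes basic indexing with integers and slices only (no `None`, `Ellipsis`, or fancy indexing).

```python
def ndim_after_index(ndim, index):
    """Number of axes left after basic indexing (hypothetical helper).

    Assumption: ``index`` is an int, a slice, or a tuple of those.
    An integer removes one axis; a slice keeps its axis.
    """
    items = index if isinstance(index, tuple) else (index,)
    dropped = sum(isinstance(i, int) for i in items)
    return ndim - dropped


def rank_after_index(rank, ndim, index):
    """Apply the rule from the hunk: rank - (ndim - result_ndim)."""
    return rank - (ndim - ndim_after_index(ndim, index))


# A rank-1 field on a 3D mesh has 4 axes: (component, x, y, z).
print(rank_after_index(1, 4, 0))            # integer index selects one component
print(rank_after_index(1, 4, slice(None)))  # a slice keeps the component axis
# A rank-2 field on a 3D mesh has 5 axes.
print(rank_after_index(2, 5, 0))            # one component axis removed
```

Under this rule an integer index on a rank-2 array yields a rank-1 array and a slice leaves the rank unchanged, which is consistent with the old `v0._rank -= 1` for integers while also covering the newly admitted `slice` case.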
@@ -246,14 +262,14 @@ def local_slice(self):
         ...     print(l)''')
         >>> fx.close()
         >>> print(subprocess.getoutput('mpirun -np 4 python ls_script.py'))
-        [slice(0, 16, None), slice(0, 7, None), slice(0, 6, None)]
-        [slice(0, 16, None), slice(0, 7, None), slice(6, 12, None)]
-        [slice(0, 16, None), slice(7, 14, None), slice(0, 6, None)]
-        [slice(0, 16, None), slice(7, 14, None), slice(6, 12, None)]
+        (slice(0, 16, None), slice(0, 7, None), slice(0, 6, None))
+        (slice(0, 16, None), slice(0, 7, None), slice(6, 12, None))
+        (slice(0, 16, None), slice(7, 14, None), slice(0, 6, None))
+        (slice(0, 16, None), slice(7, 14, None), slice(6, 12, None))
         """
         v = [slice(start, start+shape) for start, shape in zip(self._p0.substart,
                                                                self._p0.subshape)]
-        return [slice(0, s) for s in self.shape[:self.rank]] + v
+        return tuple([slice(0, s) for s in self.shape[:self.rank]] + v)
 
     def get_pencil_and_transfer(self, axis):
         """Return pencil and transfer objects for alignment along ``axis``
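The `local_slice` change above switches the return type from a list to a tuple of slices. The commit does not state why; one plausible reason is that NumPy expects a tuple for multidimensional indexing and has deprecated indexing with a non-tuple sequence of slices. The construction itself can be sketched without MPI as follows. The free function and its explicit `substart`/`subshape` arguments are stand-ins for `self._p0.substart` and `self._p0.subshape`, so treat the signature as an assumption.

```python
def local_slice(shape, rank, substart, subshape):
    """Global slice owned by one process (sketch of the hunk above).

    shape    : local array shape, tensor component axes first
    rank     : tensor rank (number of leading component axes)
    substart : global start index of the local block along each mesh axis
    subshape : local block length along each mesh axis
    """
    mesh = [slice(start, start + n) for start, n in zip(substart, subshape)]
    components = [slice(0, s) for s in shape[:rank]]
    return tuple(components + mesh)


# The last of the four processes in the docstring example above.
print(local_slice((16, 7, 6), 0, (0, 7, 6), (16, 7, 6)))
# (slice(0, 16, None), slice(7, 14, None), slice(6, 12, None))
```

For a rank-1 array the component axis is never distributed, so it appears as a full leading slice, for example `slice(0, 3)` for a three-component vector.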
@@ -339,6 +355,74 @@ def redistribute(self, axis=None, out=None):
 
         return out
 
+    def write(self, filename, name='darray', step=0, global_slice=None,
+              as_scalar=False):
+        """Write snapshot ``step`` of ``self`` to file ``filename``
+
+        Parameters
+        ----------
+        filename : str or instance of :class:`.FileBase`
+            The name of the file (or the file itself) that is used to store the
+            requested data in ``self``
+        name : str, optional
+            Name used for storing snapshot in file.
+        step : int, optional
+            Index used for snapshot in file.
+        global_slice : sequence of slices or integers, optional
+            Store only this global slice of ``self``
+        as_scalar : boolean, optional
+            Whether to store rank > 0 arrays as scalars. Default is False.
+
+        Example
+        -------
+        >>> from mpi4py_fft import DistArray
+        >>> u = DistArray((8, 8), val=1)
+        >>> u.write('h5file.h5', 'u', 0)
+        >>> u.write('h5file.h5', 'u', (slice(None), 4))
+        """
+        if isinstance(filename, str):
+            writer = HDF5File if filename.endswith('.h5') else NCFile
+            f = writer(filename, u=self, mode='a')
+        elif isinstance(filename, FileBase):
+            f = filename
+        field = [self] if global_slice is None else [(self, global_slice)]
+        f.write(step, {name: field}, as_scalar=as_scalar)
+
+    def read(self, filename, name='darray', step=0):
+        """Read from file ``filename`` into array ``self``
+
+        Note
+        ----
+        Only whole arrays can be read from file, not slices.
+
+        Parameters
+        ----------
+        filename : str or instance of :class:`.FileBase`
+            The name of the file (or the file itself) holding the data that is
+            loaded into ``self``.
+        name : str, optional
+            Internal name in file of snapshot to be read.
+        step : int, optional
+            Index of field to be read. Default is 0.
+
+        Example
+        -------
+        >>> from mpi4py_fft import DistArray
+        >>> u = DistArray((8, 8), val=1)
+        >>> u.write('h5file.h5', 'u', 0)
+        >>> v = DistArray((8, 8))
+        >>> v.read('h5file.h5', 'u', 0)
+        >>> assert np.allclose(u, v)
+
+        """
+        if isinstance(filename, str):
+            writer = HDF5File if filename.endswith('.h5') else NCFile
+            f = writer(filename, u=self, mode='r')
+        elif isinstance(filename, FileBase):
+            f = filename
+        f.read(self, name, step=step)
+
+
 def newDistArray(pfft, forward_output=True, val=0, rank=0, view=False):
     """Return a new :class:`.DistArray` object for provided :class:`.PFFT` object
 
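Both `write` and `read` above pick a backend from the filename suffix, and both leave `f` unbound when `filename` is neither a `str` nor a `FileBase`, so such a call would fail with an `UnboundLocalError` rather than a descriptive message. A minimal sketch of the same dispatch with an explicit error path is shown below. The three classes are empty stand-ins that perform no I/O, and the helper `open_backend` is hypothetical and does not exist in mpi4py-fft.

```python
class FileBase:
    """Stand-in for mpi4py_fft.io.FileBase (assumption: no real I/O here)."""

    def __init__(self, filename, mode='a'):
        self.filename = filename
        self.mode = mode


class HDF5File(FileBase):
    """Stand-in for the HDF5 backend."""


class NCFile(FileBase):
    """Stand-in for the NetCDF backend."""


def open_backend(filename, mode):
    """Return a file object for ``filename`` (hypothetical helper).

    Mirrors the dispatch in the hunk: an already opened FileBase is used
    as is, a name ending in '.h5' selects HDF5, any other name NetCDF.
    """
    if isinstance(filename, FileBase):
        return filename
    if isinstance(filename, str):
        backend = HDF5File if filename.endswith('.h5') else NCFile
        return backend(filename, mode=mode)
    raise TypeError('filename must be a str or a FileBase instance, got %s'
                    % type(filename).__name__)


f = open_backend('h5test.h5', 'a')
print(type(f).__name__, f.mode)   # HDF5File a
g = open_backend('nctest.nc', 'r')
print(type(g).__name__, g.mode)   # NCFile r
```

A shared helper of this kind would also avoid naming the class `writer` inside `read`; whether the project prefers that over the duplicated branches is a design choice the commit leaves open.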
mpi4py_fft/io/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 from .h5py_file import *
 from .nc_file import *
+from .file_base import *
 from .generate_xdmf import generate_xdmf
