@@ -451,10 +451,10 @@ def __str__(self):
451451 def get_data(self, caching='fill'):
452452 """ Return image data from image with any necessary scaling applied
453453
454- If the image data is a array proxy (data not yet read from disk) then
455- the default behavior (`caching` == "fill") is to read the data, and
456- store in an internal cache. Future calls to ``get_data`` will return
457- the cached copy.
454+ If the image data is an array proxy (an object that knows how to load
455+ the image data from disk) then the default behavior (`caching` ==
456+ "fill") is to read the data from the proxy, and store it in an internal
457+ cache. Future calls to ``get_data`` will return the cached copy.
458458
459459 Once the data has been cached and returned from a proxy array, the
460460 cached array can be modified by modifying the returned array, because
@@ -465,20 +465,84 @@ def get_data(self, caching='fill'):
465465 Parameters
466466 ----------
467467 caching : {'fill', 'unchanged'}, optional
468- This argument has no effect in the case where the image data is an
469- array, or the image data has already been cached. If the image data
470- is an array proxy, and the image data has not yet been cached, then
471- 'fill' (the default) will read the data from the array proxy, and
472- store in an internal cache, so that future calls to ``get_data``
473- will return the cached copy. If 'unchanged' then leave the current
474- state of caching unchanged; return the cached copy if it exists, if
475- not, load the data from disk and return that, but without filling
476- the cache.
468+ See the Notes section for a detailed explanation. This argument
469+ specifies whether the image object should fill in an internal
470+ cached reference to the returned image data array. "fill" specifies
471+ that the image should fill an internal cached reference if
472+ currently empty. Future calls to ``get_data`` will return this
473+ cached reference. You might prefer "fill" to save the image object
474+ from having to reload the array data from disk on each call to
475+ ``get_data``. "unchanged" means that the image should not fill in
476+ the internal cached reference if the cache is currently empty. You
477+ might prefer "unchanged" to "fill" if you want to make sure that
478+ the call to ``get_data`` does not create an extra (cached)
479+ reference to the returned array. In this case it is easier for
480+ Python to free the memory from the returned array.
477481
478482 Returns
479483 -------
480484 data : array
481485 array of image data
486+
487+ See also
488+ --------
489+ uncache: empty the array data cache
490+
491+ Notes
492+ -----
493+ All images have a property ``dataobj`` that represents the image array
494+ data. Images that have been loaded from files usually do not load the
495+ array data from file immediately, in order to reduce image load time
496+ and memory use. For these images, ``dataobj`` is an *array proxy*: an
497+ object that knows how to load the image array data from file. Images
498+ with an array proxy ``dataobj`` are called *proxy images*. In contrast,
499+ images created directly from numpy arrays carry a simple reference to
500+ their array data in ``dataobj``. These are *in-memory images*.
501+
502+ By default (`caching` == "fill"), when you call ``get_data`` on a
503+ proxy image, we load the array data from disk, store (cache) an
504+ internal reference to this array data, and return the array. The next
505+ time you call ``get_data``, you will get the cached reference to the
506+ array, so we don't have to load the array data from disk again.
507+
508+ In-memory images are already in memory, so there is no benefit to
509+ caching, and the `caching` keyword has no effect.
510+
511+ For proxy images, you may not want to fill the cache after reading the
512+ data from disk because the cache will hold onto the array memory until
513+ the image object is deleted, or you use the image ``uncache`` method.
514+ If you don't want to fill the cache, then always use
515+ ``get_data(caching='unchanged')``; in this case ``get_data`` will not
516+ fill the cache (store the reference to the array) if the cache is empty
517+ (no reference to the array). If the cache is full, "unchanged" leaves
518+ the cache full and returns the cached array reference.
519+
520+ The cache can affect the behavior of the image, because if the cache is
521+ full, or you have an in-memory image, then modifying the returned array
522+ will modify the result of future calls to ``get_data()``. For example
523+ you might do this:
524+
525+ img = load('my_image.nii') # a proxy image
526+ data = img.get_data()
527+ data[0, 0, 0] = 99
528+
529+ In this case the cache is full (default `caching` == 'fill'), and the cache
530+ contains a reference to the returned array ``data``, so the next time
531+ you call ``get_data()``:
532+
533+ data_again = img.get_data()
534+ data_again is data # will be True
535+ data_again[0, 0, 0] == 99 # will be True
536+
537+ If you had *initially* used `caching` == 'unchanged' then the returned
538+ ``data`` array is loaded from file, but not cached, and:
539+
540+ img = load('my_image.nii') # a proxy image
541+ data = img.get_data(caching='unchanged')
542+ data[0, 0, 0] = 99
543+ data_again = img.get_data(caching='unchanged')
544+ data_again is data # will be False
545+ data_again[0, 0, 0] == 99 # will be False
482546 """
483547 if caching not in ('fill', 'unchanged'):
484548 raise ValueError('caching value should be "fill" or "unchanged"')
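
The caching rules described in the Notes section can be tried out without any image file. The sketch below is not the library's implementation: `ToyProxy`, `ToyImage`, `load_count` and the `_data_cache` attribute are hypothetical names invented for illustration, plain lists stand in for arrays, and only the proxy-image case is modeled. It assumes that `uncache` simply drops the cached reference, as the "See also" entry suggests.

```python
class ToyProxy:
    """Hypothetical stand-in for an array proxy: returns fresh data on each load."""

    def __init__(self, values):
        self._values = list(values)
        self.load_count = 0  # counts simulated reads from disk

    def load(self):
        self.load_count += 1
        return list(self._values)  # a new list each time, as if re-read from disk


class ToyImage:
    """Hypothetical proxy image following the caching rules in the docstring."""

    def __init__(self, dataobj):
        self.dataobj = dataobj
        self._data_cache = None  # empty cache

    def get_data(self, caching='fill'):
        if caching not in ('fill', 'unchanged'):
            raise ValueError('caching value should be "fill" or "unchanged"')
        if self._data_cache is not None:
            # Cache is full: both modes return the cached reference.
            return self._data_cache
        data = self.dataobj.load()
        if caching == 'fill':
            self._data_cache = data  # keep a reference for future calls
        return data

    def uncache(self):
        self._data_cache = None  # assumed behavior: drop the cached reference


img = ToyImage(ToyProxy([1, 2, 3]))

# 'unchanged' on an empty cache: every call loads a new, independent copy.
first = img.get_data(caching='unchanged')
first[0] = 99
second = img.get_data(caching='unchanged')
print(second is first, second[0])  # False 1

# 'fill' stores the reference, so later calls see in-place modifications.
cached = img.get_data()
cached[0] = 99
again = img.get_data(caching='unchanged')
print(again is cached, again[0])  # True 99

# After uncache the next call loads from the proxy again.
img.uncache()
fresh = img.get_data(caching='unchanged')
print(fresh is cached, fresh[0])  # False 1
```

The design point the toy makes visible is that "unchanged" never changes the cache state: it returns the cached reference when one exists and otherwise hands back a freshly loaded copy that the image does not hold onto.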