xbatcher: Batch Generation from Xarray Datasets
===============================================

Xbatcher is a small library for iterating over Xarray DataArrays and Datasets in
batches. The goal is to make it easy to feed Xarray objects to machine learning
libraries such as Keras_.

.. _Keras: https://keras.io/


Installation
------------

Xbatcher can be installed from PyPI as::

    python -m pip install xbatcher

Or via Conda as::

    conda install -c conda-forge xbatcher

Or from source as::

    python -m pip install git+https://github.com/xarray-contrib/xbatcher.git


Optional Dependencies
~~~~~~~~~~~~~~~~~~~~~

.. note::

   The required dependencies installed with Xbatcher are Xarray,
   Dask, and NumPy.
   You will need to separately install TensorFlow
   or PyTorch to use those data loaders or
   Xarray accessors.

To install Xbatcher and PyTorch via Conda::

    conda install -c conda-forge xbatcher pytorch

Or via PyPI::

    python -m pip install xbatcher[torch]

To install Xbatcher and TensorFlow via Conda::

    conda install -c conda-forge xbatcher tensorflow

Or via PyPI::

    python -m pip install xbatcher[tensorflow]

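
Because TensorFlow and PyTorch are optional, it can be useful to check which
of them is importable before relying on the corresponding data loaders. The
following is a minimal sketch that uses only the Python standard library; the
helper name ``available_backends`` is hypothetical and is not part of Xbatcher.

.. code-block:: python

    import importlib.util


    def available_backends(names=("torch", "tensorflow")):
        """Report which optional packages can be imported in this environment."""
        # find_spec returns None when a top-level package is not installed.
        return {name: importlib.util.find_spec(name) is not None for name in names}


    print(available_backends())

The check only looks up each package without importing it, so it stays cheap
even when a large framework is installed.
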

Basic Usage
-----------

Let's say we have an Xarray DataArray

.. ipython:: python

    import xarray as xr
    import numpy as np
    da = xr.DataArray(np.random.rand(1000, 100, 100), name='foo',
                      dims=['time', 'y', 'x']).chunk({'time': 1})
    da


and we want to create batches along the time dimension. We can do it like this

.. ipython:: python

    import xbatcher
    bgen = xbatcher.BatchGenerator(da, {'time': 10})
    for batch in bgen:
        pass
        # actually feed to machine learning library
    batch


or via a built-in Xarray accessor:

.. ipython:: python

    import xbatcher
    for batch in da.batch.generator({'time': 10}):
        pass
        # actually feed to machine learning library
    batch

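
Conceptually, batching along ``time`` with a batch size of 10 resembles
cutting the array into consecutive, non-overlapping windows of 10 time steps.
The sketch below illustrates that idea with plain NumPy. It is not how
Xbatcher is implemented: the helper ``iter_windows`` is hypothetical, and
dropping a trailing partial window is an assumption of this sketch. A smaller
spatial grid than in the example above keeps it light.

.. code-block:: python

    import numpy as np


    def iter_windows(arr, size, axis=0):
        """Yield consecutive, non-overlapping windows of ``size`` along ``axis``."""
        # Assumption of this sketch: a trailing partial window is dropped.
        n_windows = arr.shape[axis] // size
        for i in range(n_windows):
            index = [slice(None)] * arr.ndim
            index[axis] = slice(i * size, (i + 1) * size)
            yield arr[tuple(index)]


    data = np.random.rand(1000, 10, 10)  # axes: (time, y, x)
    windows = list(iter_windows(data, size=10))
    print(len(windows), windows[0].shape)

With 1000 time steps and a window size of 10, this yields 100 windows of
shape ``(10, 10, 10)``.
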

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api
   tutorials-and-presentations
   roadmap
   contributing