Dataset.from_numpy

classmethod Dataset.from_numpy(*args: ndarray, columns: Optional[Sequence[Hashable]] = None, label_name: Optional[Hashable] = None, **kwargs) → TDataset

Create Dataset instance from numpy arrays.

Parameters
*args: np.ndarray

Numpy array of data columns, and an optional second numpy array of labels.

columns : t.Sequence[Hashable], default: None

Names for the columns. If none are provided, column names are assigned automatically as 1 - n (where n is the number of columns); see the column-name example under Examples below.

label_name : t.Hashable, default: None

Labels column name. If none is provided, the name ‘target’ will be used.

**kwargs : Dict

Additional arguments that will be passed to the main Dataset constructor.

Returns
Dataset

Instance of the Dataset.

Raises
DeepchecksValueError

Raised if zero or more than two numpy arrays are received; if the features array (args[0]) is not a two-dimensional numpy array; if the labels array (args[1]) is not a one-dimensional numpy array; or if the features array or labels array is empty. See the error-handling sketch after the examples below.

Examples

>>> import numpy
>>> from deepchecks.tabular import Dataset
>>> features = numpy.array([[0.25, 0.3, 0.3],
...                        [0.14, 0.75, 0.3],
...                        [0.23, 0.39, 0.1]])
>>> labels = numpy.array([0.1, 0.1, 0.7])
>>> dataset = Dataset.from_numpy(features, labels)

Creating a dataset only from a features array.

>>> dataset = Dataset.from_numpy(features)
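
When the columns argument is omitted, column names are assigned automatically as described under Parameters. A minimal sketch of inspecting the assigned names, assuming the Dataset exposes its underlying dataframe through the data property:

>>> dataset = Dataset.from_numpy(features)
>>> # Inspect the automatically assigned column names (assumes the data
>>> # property returns the underlying pandas DataFrame)
>>> assigned_names = list(dataset.data.columns)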

Passing additional arguments to the main Dataset constructor.

>>> dataset = Dataset.from_numpy(features, labels, max_categorical_ratio=0.5)

Specifying feature and label column names.

>>> dataset = Dataset.from_numpy(
...     features, labels,
...     columns=['sensor-1', 'sensor-2', 'sensor-3'],
...     label_name='labels'
... )
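
As described under Raises, invalid inputs raise DeepchecksValueError. A minimal sketch of catching it, assuming the exception is importable from deepchecks.core.errors (the import path may differ between versions):

>>> from deepchecks.core.errors import DeepchecksValueError
>>> try:
...     Dataset.from_numpy()  # zero arrays is not allowed
... except DeepchecksValueError:
...     pass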