Dataset.train_test_split

Dataset.train_test_split(train_size: Optional[Union[int, float]] = None, test_size: Union[int, float] = 0.25, random_state: int = 42, shuffle: bool = True, stratify: Union[List, Series, ndarray, bool] = False) -> Tuple[TDataset, TDataset]

Split dataset into random train and test datasets.

Parameters
train_size : t.Union[int, float, None], default: None

If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

test_size : t.Union[int, float], default: 0.25

If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples.
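As a rough illustration of how the two size arguments interact, the sketch below resolves them into absolute sample counts. The helper `resolve_split_sizes` is hypothetical and not part of the library. The rounding rule (ceiling for a float test size, floor for a float train size) is an assumption borrowed from scikit-learn's convention and may differ from what this method does internally.

```python
import math
from typing import Optional, Tuple, Union

Size = Union[int, float]


def resolve_split_sizes(
    n_samples: int,
    train_size: Optional[Size] = None,
    test_size: Size = 0.25,
) -> Tuple[int, int]:
    """Return (n_train, n_test) as absolute counts (hypothetical helper)."""
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must be between 0.0 and 1.0")
        # Assumed rounding rule: round the test share up.
        n_test = math.ceil(test_size * n_samples)
    else:
        n_test = test_size  # an int is taken as an absolute count

    if train_size is None:
        # None means "everything that is not in the test split".
        n_train = n_samples - n_test
    elif isinstance(train_size, float):
        if not 0.0 < train_size < 1.0:
            raise ValueError("a float train_size must be between 0.0 and 1.0")
        # Assumed rounding rule: round the train share down.
        n_train = math.floor(train_size * n_samples)
    else:
        n_train = train_size

    if n_train + n_test > n_samples:
        raise ValueError("train and test sizes exceed the number of samples")
    return n_train, n_test


print(resolve_split_sizes(100))                                   # defaults
print(resolve_split_sizes(100, train_size=0.5, test_size=0.25))   # both floats
print(resolve_split_sizes(10, train_size=6, test_size=3))         # both ints
```

When both sizes are given and sum to less than the dataset, the remaining samples belong to neither split in this sketch.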

random_state : int, default: 42

The random state to use for shuffling.

shuffle : bool, default: True

Whether or not to shuffle the data before splitting.
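To illustrate why a fixed `random_state` makes a shuffled split reproducible, here is a minimal standard-library sketch. `shuffled_split` is a hypothetical stand-in, not the library's implementation. It only shows that the same seed yields the same partition and that `shuffle=False` keeps the original row order.

```python
import random
from typing import List, Sequence, Tuple


def shuffled_split(
    rows: Sequence[int],
    n_test: int,
    random_state: int = 42,
    shuffle: bool = True,
) -> Tuple[List[int], List[int]]:
    """Split rows into (train, test); hypothetical stand-in for illustration."""
    order = list(rows)
    if shuffle:
        # A dedicated generator seeded with random_state keeps the
        # permutation identical across calls with the same seed.
        random.Random(random_state).shuffle(order)
    split_at = len(order) - n_test
    return order[:split_at], order[split_at:]


rows = list(range(10))
first = shuffled_split(rows, n_test=3, random_state=42)
second = shuffled_split(rows, n_test=3, random_state=42)
ordered = shuffled_split(rows, n_test=3, shuffle=False)

print(first == second)  # same seed, same partition
print(ordered)          # original order preserved when shuffle=False
```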

stratify : t.Union[t.List, pd.Series, np.ndarray, bool], default: False

If True, data is split in a stratified fashion, using the class labels. If array-like, data is split in a stratified fashion, using this as class labels.
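Stratified splitting keeps the class proportions roughly equal in both splits. The sketch below shows the idea with a hypothetical `stratified_split` helper that shuffles and splits each class separately. It is a simplified illustration under that assumption, not the algorithm the library uses.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable],
    test_size: float = 0.25,
    random_state: int = 42,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(random_state)

    # Group row indices by their class label.
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same share of every class for the test split.
        n_test = round(test_size * len(indices))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_split(labels, test_size=0.25)
test_labels = [labels[i] for i in test_idx]
print(test_labels.count("a"), test_labels.count("b"))
```

With eight rows of class `a` and four of class `b`, a 25% test split takes two `a` rows and one `b` row, so the 2:1 class ratio is the same in both splits.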

Returns
Dataset

Dataset containing train split data.

Dataset

Dataset containing test split data.