Dataset.train_test_split
- Dataset.train_test_split(train_size: Optional[Union[int, float]] = None, test_size: Union[int, float] = 0.25, random_state: int = 42, shuffle: bool = True, stratify: Union[List, Series, ndarray, bool] = False) -> Tuple[TDataset, TDataset]
Split dataset into random train and test datasets.
- Parameters
- train_size : t.Union[int, float, None], default: None
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
- test_size : t.Union[int, float], default: 0.25
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples.
- random_state : int, default: 42
The random state (seed) to use for shuffling. A fixed value makes the split reproducible.
- shuffle : bool, default: True
Whether or not to shuffle the data before splitting.
- stratify : t.Union[t.List, pd.Series, np.ndarray, bool], default: False
If True, the data is split in a stratified fashion, using the dataset's class labels. If array-like, the data is split in a stratified fashion, using the given values as class labels.
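The way `train_size` and `test_size` combine can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `resolve_split_sizes` is a hypothetical helper, and rounding the test count up and the train count down is an assumed convention.

```python
import math
from typing import Optional, Tuple, Union

Size = Union[int, float]


def resolve_split_sizes(
    n_samples: int,
    train_size: Optional[Size] = None,
    test_size: Size = 0.25,
) -> Tuple[int, int]:
    """Hypothetical helper: convert train_size/test_size to absolute row counts."""

    def to_count(size: Size, round_up: bool) -> int:
        # Floats are proportions of the dataset; ints are absolute counts.
        if isinstance(size, float):
            if not 0.0 < size < 1.0:
                raise ValueError("float sizes must be between 0.0 and 1.0")
            exact = size * n_samples
            # Assumed rounding convention: test rounds up, train rounds down.
            return math.ceil(exact) if round_up else math.floor(exact)
        return size

    n_test = to_count(test_size, round_up=True)
    if train_size is None:
        # None means "the complement of the test size".
        n_train = n_samples - n_test
    else:
        n_train = to_count(train_size, round_up=False)

    if n_train < 0 or n_test < 0 or n_train + n_test > n_samples:
        raise ValueError("train and test sizes do not fit the dataset")
    return n_train, n_test


print(resolve_split_sizes(100))                                # defaults
print(resolve_split_sizes(100, train_size=0.5, test_size=20))  # mixed float/int
```

With the defaults, a 100-row dataset resolves to 75 train rows and 25 test rows. Passing both sizes explicitly may leave some rows unused.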
- Returns
- Dataset
Dataset containing train split data.
- Dataset
Dataset containing test split data.
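The roles of `random_state`, `shuffle` and `stratify` can be illustrated with a standard-library sketch that splits row indices rather than a real dataset. Everything here is an assumption made for illustration: `split_indices` is a hypothetical function, it accepts only a fractional test size, and it rounds the test count per class, which is simpler than what a production implementation would do.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Optional, Sequence, Tuple


def split_indices(
    n_samples: int,
    test_fraction: float = 0.25,
    random_state: int = 42,
    shuffle: bool = True,
    stratify: Optional[Sequence[Hashable]] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical sketch: return (train_indices, test_indices)."""
    rng = random.Random(random_state)  # seeded, so the split is reproducible

    if stratify is None:
        groups = [list(range(n_samples))]
    else:
        # Group row indices by class label so each class is split separately.
        by_class = defaultdict(list)
        for idx, label in enumerate(stratify):
            by_class[label].append(idx)
        groups = list(by_class.values())

    train: List[int] = []
    test: List[int] = []
    for group in groups:
        if shuffle:
            rng.shuffle(group)
        # Assumed simplification: round the test count within each group.
        n_test = round(len(group) * test_fraction)
        cut = len(group) - n_test
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = split_indices(len(labels), stratify=labels)
# Both classes keep the same 25% test share: two "a" rows and one "b" row.
print(sorted(labels[i] for i in test_idx))
```

For a `Dataset` instance (here a hypothetical variable `ds`), the documented call would take the form `train_ds, test_ds = ds.train_test_split(test_size=0.25, stratify=True)`, returning the train split first and the test split second.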