load_data

load_data(data_format: str = 'TextData', as_train_test: bool = True, use_full_size: bool = False, include_properties: bool = True, include_embeddings: bool = False) -> Union[Tuple, TextData, DataFrame]

Load and return the Just Dance Comment Analysis dataset (multi-label classification).

Parameters
data_format : str, default: 'TextData'

Format of the returned value. Can be 'TextData' or 'DataFrame'. 'TextData' returns the data as a TextData object; 'DataFrame' returns the data as a pandas DataFrame object.

as_train_test : bool, default: True

If True, the returned data is split into train and test sets, using the same split the toy model was trained on. The first return value is the train data and the second is the test data. To get that model, call the load_fitted_model() function. If False, a single object is returned.

use_full_size : bool, default: False

If True, the returned data is the full dataset; otherwise a subset of the data is returned.

include_properties : bool, default: True

If True, the returned data will include properties of the comments. Incompatible with data_format='DataFrame'.

include_embeddings : bool, default: False

If True, the returned data will include embeddings of the comments. Incompatible with data_format='DataFrame'.
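
Because include_properties defaults to True but is documented as incompatible with data_format='DataFrame', it can help to switch both optional extras off explicitly when asking for a DataFrame. The following is a minimal sketch, not part of the library: build_load_kwargs is a hypothetical helper, and the case-insensitive comparison of format names is an assumption made here rather than documented behaviour.

```python
from typing import Any, Dict


def build_load_kwargs(
    data_format: str = "TextData",
    as_train_test: bool = True,
    use_full_size: bool = False,
    include_properties: bool = True,
    include_embeddings: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for load_data.

    Defaults mirror the documented signature.
    """
    # Assumption: format names are compared case-insensitively in this sketch.
    fmt = data_format.lower()
    if fmt not in ("textdata", "dataframe"):
        raise ValueError("data_format must be 'TextData' or 'DataFrame'")
    if fmt == "dataframe":
        # Both options are documented as incompatible with DataFrame output,
        # so this sketch switches them off instead of passing them through.
        include_properties = False
        include_embeddings = False
    return {
        "data_format": data_format,
        "as_train_test": as_train_test,
        "use_full_size": use_full_size,
        "include_properties": include_properties,
        "include_embeddings": include_embeddings,
    }


# Keyword arguments for a DataFrame request with the incompatible options off.
kwargs = build_load_kwargs(data_format="DataFrame")
```

The resulting dictionary could then be unpacked into the call, for example load_data(**kwargs).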

Returns
dataset : Union[TextData, pd.DataFrame]

The data object, in the format selected by the data_format parameter. Returned when as_train_test is False.

train, test : Tuple[Union[TextData, pd.DataFrame], Union[TextData, pd.DataFrame]]

Returned when as_train_test is True. A tuple of two objects holding the train and test splits of the dataset.
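
The two return shapes described above can be handled as in the sketch below. To avoid hard-coding an import path, the loader is passed in as an argument; load_both_shapes is a hypothetical name introduced for illustration, and the sketch only assumes that the callable follows the signature documented on this page.

```python
from typing import Any, Callable, Tuple


def load_both_shapes(load_data: Callable[..., Any]) -> Tuple[Any, Any, Any]:
    """Call a load_data-style function in both of its return modes."""
    # as_train_test=True: a tuple, train data first and test data second.
    train, test = load_data(
        data_format="DataFrame",
        as_train_test=True,
        include_properties=False,
        include_embeddings=False,
    )
    # as_train_test=False: a single object instead of a tuple.
    dataset = load_data(
        data_format="DataFrame",
        as_train_test=False,
        include_properties=False,
        include_embeddings=False,
    )
    return train, test, dataset
```

With the library installed, the real load_data function would be passed as the argument. Unpacking with `train, test = ...` only works when as_train_test is True; with False, the single returned object should be assigned to one name.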