load_data
- load_data(data_format: str = 'TextData', include_properties: bool = True, include_embeddings: bool = False) → Tuple[Union[TextData, DataFrame], Union[TextData, DataFrame]]
Loads and returns the SCIERC Abstract NER dataset (token classification).
- Parameters
- data_format : str, default: 'TextData'
The format of the returned value. Can be 'TextData' or 'Dict'. 'TextData' returns the data as a TextData object; 'Dict' returns the data as a dict of tokenized texts and IOB NER labels.
- include_properties : bool, default: True
If True, the returned data will include properties of the texts. Incompatible with data_format='Dict'.
- include_embeddings : bool, default: False
If True, the returned data will include embeddings of the texts. Incompatible with data_format='Dict'.
- Returns
- train, test : Tuple[Union[TextData, Dict], Union[TextData, Dict]]
A tuple of two objects representing the dataset split into train and test sets.
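The 'Dict' format is described above as tokenized texts paired with IOB NER labels. As a minimal sketch of how such labels might be consumed, the snippet below groups IOB tags into entity spans. It assumes the common `B-<type>` / `I-<type>` / `O` tag scheme; the helper name `iob_to_spans`, the sample sentence, and the entity type names are invented for illustration and are not taken from the dataset or the library.

```python
from typing import List, Tuple


def iob_to_spans(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Group IOB labels into (start, end_exclusive, entity_type) spans.

    Hypothetical helper: assumes tags look like 'B-Type', 'I-Type' or 'O'.
    """
    spans: List[Tuple[int, int, str]] = []
    start = None    # index where the currently open span began
    current = None  # entity type of the currently open span
    for i, label in enumerate(labels):
        starts_new = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current
        )
        if starts_new:
            # Close any open span, then open a new one at this token.
            if current is not None:
                spans.append((start, i, current))
            start, current = i, label[2:]
        elif label == "O":
            # An 'O' tag ends whatever span was open.
            if current is not None:
                spans.append((start, i, current))
            start, current = None, None
        # Otherwise: an 'I-' tag of the same type extends the open span.
    if current is not None:
        spans.append((start, len(labels), current))
    return spans


# Invented sample; in practice the tokens and labels would come from
# load_data(data_format='Dict'), whose exact dict layout is not specified here.
tokens = ["We", "train", "a", "neural", "network", "on", "ImageNet", "."]
labels = ["O", "O", "O", "B-Method", "I-Method", "O", "B-Material", "O"]

for start, end, entity_type in iob_to_spans(labels):
    print(entity_type, "->", " ".join(tokens[start:end]))
```

Returning spans with an exclusive end index keeps them directly usable as Python slices over the token list.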