.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "nlp/auto_tutorials/quickstarts/plot_token_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_nlp_auto_tutorials_quickstarts_plot_token_classification.py:

.. _nlp__token_classification_quickstart:

Token Classification Quickstart
*******************************

Deepchecks NLP tests your models during model development/research and before deploying to production. Using our
testing package reduces model failures and saves test development time. In this quickstart guide, you will learn how
to use the deepchecks NLP package to analyze and evaluate token classification tasks. A token classification task is
one in which we wish to assign a specific label to each token (usually a word or a part of a word), rather than
assigning a class or classes to the text as a whole. For a more complete example showcasing the range of checks and
capabilities of the NLP package, refer to our :ref:`Multiclass Quickstart `.

We will cover the following steps:

1. `Creating a TextData object and auto calculating properties <#setting-up>`__
2. `Running checks <#running-checks>`__

To run deepchecks for token classification, you need the following for both your train and test data:

1. Your tokenized text dataset - a list containing lists of strings, where each string is a single token within the
   sample, and a sample can be a sentence, a paragraph, a document, and so on.
2. Your labels - a :ref:`Token Classification ` label. These are not needed for checks that don't require labels
   (such as the Embeddings Drift check or most data integrity checks), but are needed for many other checks.
3. Your model's predictions (see :ref:`nlp__supported_tasks` for info on supported formats). These are needed only
   for the model-related checks, demonstrated in the `Running Checks <#running-checks>`__ section of this guide.

If you don't have deepchecks installed yet:

.. code:: python

    import sys
    !{sys.executable} -m pip install 'deepchecks[nlp]' -U --quiet #--user

Some properties calculated by ``deepchecks.nlp`` require additional packages to be installed. You can
install them by running:

.. code:: python

    import sys
    !{sys.executable} -m pip install 'deepchecks[nlp-properties]' -U --quiet #--user

Setting Up
==========

Load Data
---------

For the purpose of this guide, we'll use a small subset of the `SCIERC `__ dataset:

.. GENERATED FROM PYTHON SOURCE LINES 53-61

.. code-block:: default

    from pprint import pprint

    from deepchecks.nlp import TextData
    from deepchecks.nlp.datasets.token_classification import scierc_ner

    train, test = scierc_ner.load_data(data_format='Dict')
    pprint(train['text'][0][:10])
    pprint(train['label'][0][:10])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    include_properties and include_embeddings are incompatible with data_format="Dict". loading only original text data
    ['English', 'is', 'shown', 'to', 'be', 'trans-context-free', 'on', 'the', 'basis', 'of']
    ['B-Material', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']

.. GENERATED FROM PYTHON SOURCE LINES 62-78

The SCIERC dataset is a dataset of scientific articles with annotations for named entities, relations and
coreferences. In this example we'll use only the named entity annotations, which are the labels for our token
classification task. We can see that we have the article text itself, and a label for each token in the text, in the
:ref:`IOB format `.
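
To make the expected input format concrete, here is a minimal, hand-built sketch of tokenized text with aligned IOB
labels. The sentence and entity spans below are invented purely for illustration:

.. code-block:: python

    # Each sample is a list of tokens; each token gets exactly one IOB tag.
    # 'B-' marks the beginning of an entity, 'I-' a continuation of it, and
    # 'O' marks tokens that belong to no entity.
    tokenized_text = [
        ['Our', 'parser', 'is', 'evaluated', 'on', 'newswire', 'text', '.'],
    ]
    labels = [
        ['O', 'B-Method', 'O', 'O', 'O', 'B-Material', 'I-Material', 'O'],
    ]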

Create a TextData Object
------------------------

We can now create a :ref:`TextData ` object for the train and test datasets. This object is used to pass your data to
the deepchecks checks. To create a TextData object, the only required argument is the tokenized text itself. In most
cases we'll want to pass labels as well, since they are needed by many checks. In this example we'll pass the labels
and define the task type.

.. GENERATED FROM PYTHON SOURCE LINES 79-84

.. code-block:: default

    train = TextData(tokenized_text=train['text'], label=train['label'], task_type='token_classification')
    test = TextData(tokenized_text=test['text'], label=test['label'], task_type='token_classification')

.. GENERATED FROM PYTHON SOURCE LINES 85-93

Calculating Properties
----------------------

Some of deepchecks' checks use properties of the text samples for various calculations. Deepchecks has a wide
variety of such properties, some simple and some that rely on external models and are heavier to run. In order for
deepchecks' checks to be able to use the properties, they must be added to the :ref:`TextData ` object, usually by
calculating them. You can read more about properties in the :ref:`Property Guide `.

.. GENERATED FROM PYTHON SOURCE LINES 93-105

.. code-block:: default

    # Properties can be either calculated directly by deepchecks
    # or imported from other sources in an appropriate format:

    # import torch
    # device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # train.calculate_builtin_properties(
    #     include_long_calculation_properties=True, device=device
    # )
    # test.calculate_builtin_properties(
    #     include_long_calculation_properties=True, device=device
    # )

.. GENERATED FROM PYTHON SOURCE LINES 106-107

In this example, though, we'll use pre-calculated properties:

.. GENERATED FROM PYTHON SOURCE LINES 107-115

.. code-block:: default

    train_properties, test_properties = scierc_ner.load_properties()

    train.set_properties(train_properties, categorical_properties=['Language'])
    test.set_properties(test_properties, categorical_properties=['Language'])

    train.properties.head(2)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

      Language  Count URLs  Count Email Address  Count Unique URLs  Count Unique Email Address  ...  Formality  Lexical Density  Unique Noun Count  Readability Score  Average Sentence Length
    0       en           0                    0                  0                           0  ...   0.997133            68.38               30.0             34.850                     34.0
    1       en           0                    0                  0                           0  ...   0.997115            60.47               32.0             54.669                     22.0

    [2 rows x 22 columns]
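
Once set, the properties are available as a regular pandas DataFrame on the ``TextData`` object, so you can explore
them directly. A minimal sketch, using the property columns shown in the table above:

.. code-block:: python

    # One row per text sample, one column per property.
    print(train.properties.shape)

    # Summary statistics for a single property column:
    print(train.properties['Readability Score'].describe())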



.. GENERATED FROM PYTHON SOURCE LINES 116-130

Running Checks
==============

Train Test Performance
----------------------

Once the :ref:`TextData ` object is ready, we can run the checks. We'll start by running the
:ref:`TrainTestPerformance ` check, which compares the performance of the model on the train and test sets.

For this check, we'll need to pass the model's predictions on the train and test sets, provided in the same format
as the labels - an IOB annotation per token in the tokenized text.

We'll also define a condition for the check with the default threshold value. You can learn more about customizing
checks and conditions, as well as defining suites of checks, in our :ref:`Customizations Guide `.

.. GENERATED FROM PYTHON SOURCE LINES 130-138

.. code-block:: default

    train_preds, test_preds = scierc_ner.load_precalculated_predictions()

    from deepchecks.nlp.checks import TrainTestPerformance

    check = TrainTestPerformance().add_condition_train_test_relative_degradation_less_than()
    result = check.run(train, test, train_predictions=train_preds, test_predictions=test_preds)
    result
*Check display: Train Test Performance (interactive output, rendered in the HTML documentation).*
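
Beyond the rendered display, the check result can also be inspected programmatically. A minimal sketch, assuming the
standard ``CheckResult`` API (``value`` holds the computed scores, ``passed_conditions`` reports the condition
status, and ``save_as_html`` writes the interactive report to a file):

.. code-block:: python

    # The raw scores computed by the check:
    print(result.value)

    # True only if all conditions added to the check passed:
    print(result.passed_conditions())

    # Save the full interactive report as a standalone HTML file:
    result.save_as_html('train_test_performance.html')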


.. GENERATED FROM PYTHON SOURCE LINES 139-150

We can see that the model performs better on the train set than on the test set, which is expected. We can also note
specifically that the recall for the class "OtherScientificTerm" has declined significantly on the test set, which is
something we might want to investigate further.

Embeddings Drift
----------------

The :ref:`EmbeddingsDrift ` check compares the embeddings of the train and test sets. In order to run this check you
must have text embeddings loaded into both datasets. You can read more about using embeddings in deepchecks NLP in
our :ref:`Embeddings Guide `.

In this example, we have the embeddings already pre-calculated:

.. GENERATED FROM PYTHON SOURCE LINES 150-157

.. code-block:: default

    train_embeddings, test_embeddings = scierc_ner.load_embeddings()

    train.set_embeddings(train_embeddings)
    test.set_embeddings(test_embeddings)

.. GENERATED FROM PYTHON SOURCE LINES 158-160

You can also calculate the embeddings using deepchecks, either with an open-source sentence-transformer or with
OpenAI's embedding API.

.. GENERATED FROM PYTHON SOURCE LINES 160-164

.. code-block:: default

    # train.calculate_builtin_embeddings()
    # test.calculate_builtin_embeddings()

.. GENERATED FROM PYTHON SOURCE LINES 165-172

.. code-block:: default

    from deepchecks.nlp.checks import TextEmbeddingsDrift

    check = TextEmbeddingsDrift()
    res = check.run(train, test)
    res.show()

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    n_jobs value -1 overridden to 1 by setting random_state. Use no seed for parallelism.
    n_jobs value -1 overridden to 1 by setting random_state. Use no seed for parallelism.
*Check display: Embeddings Drift (interactive output, rendered in the HTML documentation).*
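
The numeric drift measurement is also available on the result object. A minimal sketch - note that the exact
contents of ``res.value`` (e.g. a domain-classifier AUC) may vary between deepchecks versions:

.. code-block:: python

    # The check's computed value holds the drift metrics it measured,
    # which can be logged or thresholded in a CI pipeline:
    print(res.value)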


.. GENERATED FROM PYTHON SOURCE LINES 173-184

The check shows the samples from the train and test datasets as points in the 2-dimensional reduced embedding space.
We can see some distinct segments - in the upper left corner we can notice (by hovering over the samples and reading
the abstracts) that these are papers about computer vision, while the bottom right corner is mostly about Natural
Language Processing. We can also see that although there isn't significant drift between the train and test sets, the
training dataset has a few more samples from the NLP domain, while the test set has more samples from the computer
vision domain.

.. note::

    You can find the full list of available NLP checks in the :mod:`nlp.checks api documentation `.

.. rst-class:: sphx-glr-timing

**Total running time of the script:** (0 minutes 9.089 seconds)

.. _sphx_glr_download_nlp_auto_tutorials_quickstarts_plot_token_classification.py:

.. only:: html

    .. container:: sphx-glr-footer sphx-glr-footer-example

        .. container:: sphx-glr-download sphx-glr-download-python

            :download:`Download Python source code: plot_token_classification.py `

        .. container:: sphx-glr-download sphx-glr-download-jupyter

            :download:`Download Jupyter notebook: plot_token_classification.ipynb `

.. only:: html

    .. rst-class:: sphx-glr-signature

        `Gallery generated by Sphinx-Gallery `_