Prediction Functions Demo

This Jupyter notebook demonstrates how to use the prediction functions included in ChemREL.

Tip

To run a copy of this notebook yourself, download the corresponding predict_demo.ipynb file.

Import ChemREL

To begin, import the predict submodule from the functions package in ChemREL, as follows.

from chemrel.functions import predict

Predict Submodule

The predict submodule provides two functions, predict_span() and predict_rel(). Their usage is demonstrated below. For the full documentation of each function, see the ChemREL Functions Reference.

We first store the text to run predictions on in a variable, example_text.

example_text = '''Two physical properties of the fluorinated carboxylic acid,
2H-hexadecafluoro-2-decenoic acid (C10F16H2O2), were investigated in this study: 
melting point and solubility. Melting point data were ascertained on a traditional melting point
apparatus and a differential scanning calorimeter (DSC), while solubility data were obtained using an LC−MS/MS system.
A melting point of 105 ± 1 °C and an aqueous solubility of 64 ± 5 ng/μL at ambient temperatures were observed'''

Span Categorization

We demonstrate how predict_span() can be used to predict spans, as follows. This assumes that the span categorization model provided by ChemREL has been initialized at chemrelmodels/sc/tok2vec.

spans_dict = predict.predict_span("chemrelmodels/sc/tok2vec", example_text)

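We can now print the contents of spans_dict to view the predicted spans. As the output shows, each key is a span label and each value is a list of (span text, score) pairs.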

print(spans_dict)
{'CHEMICAL': [('C10F16H2O2', 0.9998276), ('fluorinated carboxylic acid', 0.99959916), ('2H-hexadecafluoro-2-decenoic acid', 0.9974232)], 'MP': [('105 ± 1 °C', 0.99998033)]}
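Because the result is a plain dictionary, it can be post-processed with ordinary Python. The following is a minimal sketch, not part of ChemREL, that keeps only spans whose score meets a chosen threshold. The helper name filter_spans and the 0.999 threshold are assumptions made for illustration, and the dictionary is copied from the printed output above so that the example runs without a trained model.

```python
# Copied from the printed spans_dict above: label -> list of (span text, score).
spans_dict = {
    'CHEMICAL': [
        ('C10F16H2O2', 0.9998276),
        ('fluorinated carboxylic acid', 0.99959916),
        ('2H-hexadecafluoro-2-decenoic acid', 0.9974232),
    ],
    'MP': [('105 ± 1 °C', 0.99998033)],
}


def filter_spans(spans, min_score):
    """Hypothetical helper (not a ChemREL function).

    Returns a new dictionary with the same labels, keeping only the
    (text, score) pairs whose score is at least min_score.
    """
    return {
        label: [(text, score) for text, score in items if score >= min_score]
        for label, items in spans.items()
    }


# The threshold is an arbitrary illustrative value, not a ChemREL default.
confident_spans = filter_spans(spans_dict, 0.999)
for label, items in confident_spans.items():
    for text, score in items:
        print(label, text, score)
```

With this threshold, the '2H-hexadecafluoro-2-decenoic acid' span (score 0.9974232) is dropped and the other three spans are kept.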

Relation Prediction

We demonstrate how predict_rel() can be used to predict relations, as follows. This assumes that the span categorization and relation extraction models provided by ChemREL have been initialized at chemrelmodels/sc/tok2vec and chemrelmodels/rel/tok2vec, respectively.

rels_dict = predict.predict_rel("chemrelmodels/sc/tok2vec", "chemrelmodels/rel/tok2vec", example_text)

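We can now print the contents of rels_dict to view the predicted relations. As the output shows, each key is a pair of span labels and each value is a list of ((first span text, second span text), score) entries.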

print(rels_dict)
{('CHEMICAL', 'MP'): [(('fluorinated carboxylic acid', '105 ± 1 °C'), 0.999992), (('2H-hexadecafluoro-2-decenoic acid', '105 ± 1 °C'), 0.9999999), (('C10F16H2O2', '105 ± 1 °C'), 0.9999999)]}
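The nested structure of the relation output can be awkward to inspect directly. The following is a minimal sketch, not part of ChemREL, that flattens it into one record per predicted relation. The helper name flatten_relations and the record field names are assumptions made for illustration, and the dictionary is copied from the printed output above so that the example runs without a trained model.

```python
# Copied from the printed rels_dict above:
# (label, label) -> list of ((span text, span text), score).
rels_dict = {
    ('CHEMICAL', 'MP'): [
        (('fluorinated carboxylic acid', '105 ± 1 °C'), 0.999992),
        (('2H-hexadecafluoro-2-decenoic acid', '105 ± 1 °C'), 0.9999999),
        (('C10F16H2O2', '105 ± 1 °C'), 0.9999999),
    ]
}


def flatten_relations(rels):
    """Hypothetical helper (not a ChemREL function).

    Turns the nested relation dictionary into a flat list of records,
    one per predicted relation, preserving the original order.
    """
    rows = []
    for (first_label, second_label), pairs in rels.items():
        for (first_text, second_text), score in pairs:
            rows.append({
                'first_label': first_label,
                'first_text': first_text,
                'second_label': second_label,
                'second_text': second_text,
                'score': score,
            })
    return rows


rows = flatten_relations(rels_dict)
for row in rows:
    print(row['first_text'], '->', row['second_text'], row['score'])
```

A flat list of records like this is straightforward to write to CSV or load into a table for review.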