Session: Using transformer models for your own NLP task - building an NLP model end to end
Transformer models have revolutionized the NLP field and are currently state-of-the-art on a variety of tasks, such as named entity recognition, natural language inference, and question answering. With new, more performant models being continuously developed (BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, etc.), these models are ubiquitous in virtually all domains that make use of natural language processing.
So how can you apply these models to your own task? In this talk, we will go over the process of using state-of-the-art transformer models for your own NLP task. We will discuss the entire pipeline, from building a training corpus to developing an NLP model and evaluating it. We will walk through an example of building a model to extract mentions of Experimental Methods and Datasets from full-text biomedical papers. Even though our example focuses on an NLP task for biomedical text, the framework can be applied to any domain.
- Building or using an available training corpus for an NLP task
- Developing an NLP model for a specific task using state-of-the-art technology
- Evaluating an NLP model
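As a taste of the evaluation step above, here is a minimal sketch of span-level evaluation for an entity-extraction model, assuming predictions are encoded in the common BIO tagging scheme. The tag names and example sentences are illustrative assumptions, not taken from the talk itself.

```python
# Minimal sketch: span-level evaluation of an NER model's predictions,
# assuming BIO-tagged token sequences. The METHOD/DATASET labels and
# the example data below are hypothetical, for illustration only.

def extract_spans(tags):
    """Collect (start, end, label) entity spans from a BIO tag sequence."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:          # close any open span
                spans.append((start, i, label))
            start, label = i, tag[2:]      # open a new span
        elif tag.startswith("I-") and label == tag[2:]:
            continue                       # span continues
        else:
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:                  # span running to the end
        spans.append((start, len(tags), label))
    return spans

def span_f1(gold_tags, pred_tags):
    """Precision, recall, and F1 over exact entity-span matches."""
    gold = set(extract_spans(gold_tags))
    pred = set(extract_spans(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical gold vs. predicted tags for one sentence:
# the model finds the METHOD mention but misses the DATASET one.
gold = ["O", "B-METHOD", "I-METHOD", "O", "B-DATASET", "O"]
pred = ["O", "B-METHOD", "I-METHOD", "O", "O", "O"]
p, r, f = span_f1(gold, pred)
print(round(p, 2), round(r, 2), round(f, 2))  # 1.0 0.5 0.67
```

In practice one would use an established library such as seqeval for this, but the sketch shows why entity-level scores differ from token-level accuracy: a partially correct span counts as a miss.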
Ana-Maria Istrate is a Research Scientist at the Chan Zuckerberg Initiative, where she has primarily worked on recommendation, ranking, and text mining algorithms that support retrieving knowledge from biomedical journal articles. She graduated from Stanford with a Bachelor's Degree in Applied Math and a Master's Degree in Computer Science on the Artificial Intelligence track. She is passionate about using machine learning to unlock hidden knowledge from seemingly random, messy data. Her interests lie at the intersection of Natural Language Processing, text mining techniques, and applications that serve the broader scientific community.