Text summarization

with Hugging Face Pipelines

Jan Kirenz

Setup

from transformers import pipeline

Intuition

  • Summarization produces a shorter version of a longer text while trying to preserve most of the original document's meaning.

  • Summarization is a sequence-to-sequence task: the model maps an input sequence (the document) to an output sequence (the summary).

Use Cases

  • Help readers quickly grasp the main points of a long document.

  • Typical inputs: legislative bills, legal and financial documents, patents, and scientific papers …

Two types of summarization:

  • Extractive: identify and extract the most important sentences from the original text.

  • Abstractive: generate a new summary of the original text, which may include words that do not appear in the input document.
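
To make the extractive idea concrete, here is a minimal sketch that scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order. It is a toy illustration only, not what the Hugging Face pipeline does (the pipeline is abstractive); the function names, the stop-word list, and the naive sentence splitter are all assumptions made for this example.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list used only for this illustration.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
              "that", "it", "on", "for", "with", "as", "by", "this"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(text):
    """Lowercased words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Return the num_sentences highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        if not tokens:
            return 0.0
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)
```

Because every output sentence is copied verbatim from the input, this approach can never introduce new words, which is exactly the difference from abstractive summarization.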

Pipeline

Create pipeline

summarizer = pipeline(task="summarization")
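
When no model is given, the library picks a default checkpoint and warns that this is not recommended for production use. A minimal sketch of pinning the checkpoint explicitly is shown below. The helper name `build_summarizer` is hypothetical, and the checkpoint name is an assumption: it is a summarization model published on the Hugging Face Hub, but it is not confirmed to be the one used in this tutorial.

```python
# Assumption: this Hub checkpoint is used purely as an example of an explicit choice.
DEFAULT_MODEL = "sshleifer/distilbart-cnn-12-6"


def build_summarizer(model_name=DEFAULT_MODEL):
    """Create a summarization pipeline for an explicitly named checkpoint."""
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    return pipeline(task="summarization", model=model_name)
```

Calling `build_summarizer()` downloads the checkpoint on first use, so it needs network access once.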

Provide text

my_text = "When researchers want to evaluate the effect of particular traits, treatments, \
or conditions, they conduct an experiment. For instance, we may suspect drinking a \
high-calorie energy drink will improve performance in a race. To check if there \
really is a causal relationship between the explanatory variable (whether the \
runner drank an energy drink or not) and the response variable (the race time), \
researchers identify a sample of individuals and split them into groups. \
The individuals in each group are assigned a treatment. When individuals are \
randomly assigned to a group, the experiment is called a randomized experiment. \
Random assignment organizes the participants in a study into groups that are roughly \
equal on all aspects, thus allowing us to control for any confounding variables that \
might affect the outcome (e.g., fitness level, racing experience, etc.). For example, \
each runner in the experiment could be randomly assigned, perhaps by flipping a coin, \
into one of two groups: the first group receives a placebo (fake treatment, in this case \
a no-calorie drink) and the second group receives the high-calorie energy drink. \
See the case study in Section 1.1 for another example of an experiment, though that \
study did not employ a placebo. Researchers perform an observational study when they \
collect data in a way that does not directly interfere with how the data arise. \
For instance, researchers may collect information via surveys, review medical or company \
records, or follow a cohort of many similar individuals to form hypotheses about why \
certain diseases might develop. In each of these situations, researchers merely observe \
the data that arise. In general, observational studies can provide evidence of a naturally \
occurring association between variables, but they cannot by themselves show a causal connection \
as they do not offer a mechanism for controlling for confounding variables."

Summarizer

summarizer(my_text, min_length=20, max_length=80)
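
Note that `min_length` and `max_length` are counted in model tokens, not words. The call above can be wrapped in a small helper that checks the two bounds and returns only the summary string. This is a sketch: the helper name `summarize` is hypothetical, and passing `do_sample=False` (to request deterministic decoding) is an addition that the original call does not make.

```python
def summarize(summarizer, text, min_length=20, max_length=80):
    """Run a summarization pipeline on text and return the summary as a string.

    summarizer is any callable that behaves like the pipeline object above:
    it returns a list with one dict per input, each holding a 'summary_text' key.
    """
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")

    results = summarizer(
        text,
        min_length=min_length,  # lower bound on generated tokens
        max_length=max_length,  # upper bound on generated tokens
        do_sample=False,        # assumption: ask for deterministic decoding
    )
    return results[0]["summary_text"].strip()
```

With the objects defined earlier, usage would be `summarize(summarizer, my_text)`.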

Output

[{'summary_text': ' When researchers want to evaluate the effect of particular  \
traits, treatments, or conditions, they conduct an experiment . For example, we  \
may suspect drinking a high-calorie energy drink will improve performance in a race .  \
Researchers perform an observational study when they collect data in a way that does  \
not directly interfere with how the data arise .'}]
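
The result is a list with one dictionary per input, and the summary string contains a leading space and spaces before periods. A minimal post-processing sketch follows; the helper names `tidy_summary` and `compression_ratio` are hypothetical, and measuring length in whitespace-separated words is a simplifying assumption.

```python
import re


def tidy_summary(summary_text):
    """Remove spaces before punctuation and collapse repeated whitespace."""
    cleaned = re.sub(r"\s+([.,;:!?])", r"\1", summary_text)
    return re.sub(r"\s+", " ", cleaned).strip()


def compression_ratio(original, summary):
    """Summary length divided by original length, in whitespace-separated words."""
    n_original = len(original.split())
    if n_original == 0:
        raise ValueError("original text is empty")
    return len(summary.split()) / n_original


# Excerpt of the output shown above, in the same list-of-dicts shape.
results = [{"summary_text": " When researchers want to evaluate the effect of "
                            "particular traits, treatments, or conditions, they "
                            "conduct an experiment ."}]
print(tidy_summary(results[0]["summary_text"]))
```

A ratio well below 1 from `compression_ratio(my_text, summary)` indicates that the summary is substantially shorter than the input.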
