Simple and fast Question Answering system using HuggingFace DistilBERT

Category: IT Technology · Published: 5 years ago

Simple and fast Question Answering system using HuggingFace DistilBERT — single & batch inference examples provided.

Image from Pixabay and Stylized by AiArtist Chrome Plugin (Built by me)

Question answering systems have many use cases, such as automatically responding to a customer’s query by reading through the company’s documents and finding the passage that answers it.

In this blog post, we will see how to implement a fast and lightweight question answering system using DistilBERT, a distilled version of BERT, from the Hugging Face transformers library.

Input

The input will be a short paragraph, which we call the context, and a question:

Context:
The US has passed the peak on new coronavirus cases, President Donald Trump said and predicted that some states would reopen this month. The US has over 637,000 confirmed Covid-19 cases and over 30,826 deaths, the highest for any country in the world.
Question:
What was President Donald Trump's prediction?

Output

The output from our question answering system will be an answer extracted from the context paragraph, as shown below:

Answer: 
some states would reopen this month.
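Because the answer is always a piece of the context, a model of this kind scores every token as a possible answer start and as a possible answer end, and the answer is the span between the two chosen positions. The snippet below is a minimal sketch of that selection step in plain Python. The scores are made up for illustration, and `best_span` and its `max_len` cap are hypothetical helpers, not something taken from the original code.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) pair with the highest combined score.

    Only spans with start <= end and at most max_len tokens are considered,
    so the result is always a valid slice of the token list.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, s_score in enumerate(start_scores):
        for end in range(start, min(start + max_len, len(end_scores))):
            score = s_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Made-up scores for a fragment of the context (illustration only).
tokens = ["predicted", "that", "some", "states", "would",
          "reopen", "this", "month", "."]
start_scores = [0.1, 0.2, 4.0, 0.3, 0.1, 0.2, 0.1, 0.5, 0.0]
end_scores = [0.0, 0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 3.5, 1.0]

start, end = best_span(start_scores, end_scores)
print(" ".join(tokens[start:end + 1]))  # some states would reopen this month
```

Taking the two argmax positions independently is simpler, but it can return an end position that precedes the start. Searching over valid pairs avoids that at the cost of a small nested loop.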

Let’s get started :

First, let’s see how we can answer a single question as shown above. Then we will see how we can leverage batch processing to answer multiple questions at once from the context.

Single Inference:

Here is the code to do single inference with DistilBERT:
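The embedded code did not survive in this copy of the post, so what follows is a minimal sketch of single inference rather than the original listing. It assumes the `transformers` and `torch` packages are installed and that the `distilbert-base-uncased-distilled-squad` checkpoint is the one intended. The function name `answer_question` is hypothetical.

```python
from typing import List, Tuple

# Assumed checkpoint: DistilBERT fine-tuned on SQuAD.
MODEL_NAME = "distilbert-base-uncased-distilled-squad"


def answer_question(question: str, context: str,
                    model_name: str = MODEL_NAME) -> Tuple[List[str], str]:
    """Return (answer_tokens, answer_text) for one question/context pair."""
    # Imported here so the heavy dependencies load only when the function runs.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    model.eval()

    # Encode the pair as [CLS] question [SEP] context [SEP].
    inputs = tokenizer(question, context, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Pick the highest-scoring start and end positions (batch size is 1).
    start = int(torch.argmax(outputs.start_logits))
    end = int(torch.argmax(outputs.end_logits))

    # Slice the answer ids out of the input and map them back to text.
    answer_ids = inputs["input_ids"][0, start:end + 1].tolist()
    answer_tokens = tokenizer.convert_ids_to_tokens(
        answer_ids, skip_special_tokens=True)
    answer_text = tokenizer.convert_tokens_to_string(answer_tokens)
    return answer_tokens, answer_text


# Example call (downloads the checkpoint on first use):
# tokens, answer = answer_question(
#     "What was President Donald Trump's prediction?",
#     "The US has passed the peak on new coronavirus cases, ...")
# print("Answer Tokens:", tokens)
# print("Answer :", answer)
```

If the predicted end falls before the predicted start, the slice is empty and the function returns an empty answer. The output reported in the post comes from the original code, not from this sketch.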

The output will be :

Question What was President Donald Trump's prediction?
Answer Tokens:
['some', 'states', 'would', 're', '##open', 'this', 'month']
Answer : some states would reopen this month
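The answer tokens above are WordPiece tokens: a leading `##` marks a piece that continues the previous word, which is why `re` and `##open` become `reopen` in the final answer. The tokenizer performs this merge itself. The hypothetical helper below only illustrates the rule in plain Python.

```python
from typing import List


def merge_wordpieces(tokens: List[str]) -> str:
    """Join WordPiece tokens, gluing '##' continuations onto the previous word."""
    words: List[str] = []
    for token in tokens:
        if token.startswith("##") and words:
            # Continuation piece: drop the marker and append to the last word.
            words[-1] += token[2:]
        else:
            words.append(token)
    return " ".join(words)


answer_tokens = ["some", "states", "would", "re", "##open", "this", "month"]
print(merge_wordpieces(answer_tokens))  # some states would reopen this month
```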

Batch Inference:

Now let’s do the same with batch inference, passing three questions and getting answers for all of them in one batch.

The three questions are:

1. What was President Donald Trump's prediction?
2. How many deaths have been reported from the virus?
3. How many cases have been reported in the United States?

Here is the code to do batch inference with DistilBERT:
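As with single inference, the original listing is missing from this copy, so the following is a sketch under the same assumptions (`transformers` and `torch` installed, `distilbert-base-uncased-distilled-squad` as the checkpoint, hypothetical function name `answer_questions`). The questions are paired with the same context and padded to a common length so they can go through the model in one forward pass.

```python
from typing import List

# Assumed checkpoint: DistilBERT fine-tuned on SQuAD.
MODEL_NAME = "distilbert-base-uncased-distilled-squad"


def answer_questions(questions: List[str], context: str,
                     model_name: str = MODEL_NAME) -> List[str]:
    """Answer several questions about one context in a single batch."""
    # Imported here so the heavy dependencies load only when the function runs.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    model.eval()

    # One (question, context) pair per row, padded to the longest row.
    inputs = tokenizer(questions, [context] * len(questions),
                       padding=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Exclude padding positions so they can never be chosen as start or end.
    is_padding = inputs["attention_mask"] == 0
    start_logits = outputs.start_logits.masked_fill(is_padding, float("-inf"))
    end_logits = outputs.end_logits.masked_fill(is_padding, float("-inf"))
    starts = start_logits.argmax(dim=1).tolist()
    ends = end_logits.argmax(dim=1).tolist()

    answers = []
    for row, (start, end) in enumerate(zip(starts, ends)):
        answer_ids = inputs["input_ids"][row, start:end + 1].tolist()
        tokens = tokenizer.convert_ids_to_tokens(
            answer_ids, skip_special_tokens=True)
        answers.append(tokenizer.convert_tokens_to_string(tokens))
    return answers


# Example call (downloads the checkpoint on first use):
# for q, a in zip(questions, answer_questions(questions, context)):
#     print("Question:", q)
#     print("Answer:", a)
```

Masking the padded positions is a precaution added in this sketch. Whether the original code did the same is not known.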

The output will be:

Context : The US has passed the peak on new coronavirus cases, President Donald Trump said and predicted that some states would reopen this month. The US has over 637,000 confirmed Covid-19 cases and over 30,826 deaths, the highest for any country in the world.
Question: What was President Donald Trump's prediction?
Answer: some states would reopen this month
Question: How many deaths have been reported from the virus?
Answer: 30 , 826
Question: How many cases have been reported in the United States?
Answer: over 637 , 000
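The numeric answers come back as `30 , 826` and `over 637 , 000` because the comma is its own token and the tokens are joined with spaces. One possible cleanup, sketched below with a hypothetical helper, removes the spaces around a comma or period that sits between two digits. It is a post-processing assumption, not part of the original code.

```python
import re


def tidy_number_spacing(text: str) -> str:
    """Remove spaces around a comma or period that sits between two digits."""
    return re.sub(r"(?<=\d)\s*([,.])\s*(?=\d)", r"\1", text)


print(tidy_number_spacing("30 , 826"))        # 30,826
print(tidy_number_spacing("over 637 , 000"))  # over 637,000
```

The rule also collapses a list such as `1, 2` into `1,2`, so it is only suitable when the answers are expected to be single numbers.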

If you are interested in automatic question generation from context rather than question answering, you can check out my various algorithms and open-sourced code in the links below:

  1. True or False Question Generation
  2. Multiple Choice Question Generation
  3. Generate pronoun questions for English language learning
  4. Grammar MCQ generation

Happy coding! If you have any questions or just want to say hi, you can reach me at ramsrigouthamg@gmail.com.

