Sentiment Analyzer with BERT (build, tune, deploy)
A brief description of how I developed a sentiment analyzer. It covers text preprocessing, model building, tuning, API and frontend creation, and containerization.
Jul 24 · 4 min read
Dataset
I used the dataset published by the Stanford NLP Group. I merged two files: ‘dictionary.txt’, containing 239,232 text fragments, and ‘sentiment_labels.txt’, containing the sentiment scores assigned to those fragments.
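In pandas, the merge is straightforward. A possible sketch, assuming the ‘|’-separated format the files are published in (the column names are my own):

```python
import pandas as pd

# dictionary.txt: one "phrase|id" pair per line;
# sentiment_labels.txt: "phrase ids|sentiment values" with a header row.
phrases = pd.read_csv('dictionary.txt', sep='|', names=['phrase', 'phrase_id'])
labels = pd.read_csv('sentiment_labels.txt', sep='|', header=0,
                     names=['phrase_id', 'sentiment'])

# Join the fragments with their scores on the shared id.
data = phrases.merge(labels, on='phrase_id')
```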
Text preprocessing with regular expressions
To clean the text, I usually use a handful of functions built on regular expressions; you can find all of them in common.py. One example is remove_nonwords, described below.
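A minimal sketch of what such a function might look like (the original implementation lives in common.py):

```python
import re

def remove_nonwords(text):
    # Replace everything that is not a letter or whitespace with a space,
    # then collapse repeated whitespace into single spaces.
    text = re.sub(r'[^a-zA-Z\s]', ' ', text)
    return re.sub(r'\s+', ' ', text).strip()
```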
Similar functions were used to remove empty rows, special characters, numbers and HTML code.
After text cleaning, it’s time to create the BERT embeddings. For that purpose, I used bert-as-service. It is very simple and consists of only 3 steps: download a pre-trained model, start the BERT service, and use the client to obtain sentence encodings of a specified length.
There are multiple parameters that can be set when running the service. For example, to define max_seq_len, I calculated the 0.9 quantile of the training texts’ length.
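Putting the three steps together (the model directory, sequence length and example sentence below are illustrative):

```python
from bert_serving.client import BertClient

# 1. Download a pre-trained model, e.g. BERT-Base uncased.
# 2. Start the service (model path and length are illustrative):
#      bert-serving-start -model_dir ./uncased_L-12_H-768_A-12 -max_seq_len 25
#    One way to derive max_seq_len as the 0.9 quantile of text length:
#      max_seq_len = int(train['text'].str.split().str.len().quantile(0.9))

# 3. Use the client to encode sentences into 768-dimensional vectors.
bc = BertClient()
embeddings = bc.encode(['the movie was great'])   # shape: (1, 768)
```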
The preprocessed data takes the form of a data frame containing 768 features. For the full code, please see nlp_preprocess.py.
Model building with Keras
In this part, we build and train the model with different parameters. Let’s assume we want a 5-layer neural network, as below. We will parametrize the batch size, the number of epochs, the number of nodes in the first 4 dense layers, and the rates of the 5 dropout layers.
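A sketch of one plausible arrangement of the dense and dropout layers (layer sizes, dropout rates, and the loss setup are illustrative, not the exact original configuration):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def build_model(nodes, dropouts):
    """Build the network; `nodes` holds the sizes of the 4 hidden dense
    layers, `dropouts` the 5 dropout rates -- all of them tunable."""
    model = Sequential()
    model.add(Dropout(dropouts[0], input_shape=(768,)))  # BERT embedding input
    for n, rate in zip(nodes, dropouts[1:]):
        model.add(Dense(n, activation='relu'))
        model.add(Dropout(rate))
    model.add(Dense(1))  # linear output: the normalized sentiment score
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model

model = build_model(nodes=[512, 256, 128, 64], dropouts=[0.3] * 5)
```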
Model tuning with Sacred
Now we can tune the parameters. We will use the sacred module. The key points are:
1. Create an Experiment and add Observer
First we need to create an experiment and an observer that logs all kinds of information. It’s very simple!
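A minimal sketch, assuming a local MongoDB instance (on older Sacred versions the observer is created with MongoObserver.create instead):

```python
from sacred import Experiment
from sacred.observers import MongoObserver

# Create the experiment and attach an observer that writes the config,
# metrics and run metadata to MongoDB; connection details are illustrative.
ex = Experiment('sentiment_bert')
ex.observers.append(MongoObserver(url='localhost:27017', db_name='sacred'))
```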
2. Define the main function
The @ex.automain decorator defines and runs the main function of the experiment when we run the Python script.
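In model_experiment.py this might look like the fragment below (parameter names are illustrative; ex is the experiment from step 1):

```python
# model_experiment.py (fragment) -- `ex` is the experiment created above.
@ex.automain
def run(batch_size, epochs, nodes, dropouts):
    # Sacred injects the values by name from the config scope (step 3);
    # the Keras training code from the previous section goes here.
    ...
```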
3. Add the Configuration parameters
We will define them through a Config Scope.
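A sketch with illustrative defaults:

```python
@ex.config
def cfg():
    # Default hyperparameters; run_sacred.py overrides them for each run.
    batch_size = 64
    epochs = 20
    nodes = [512, 256, 128, 64]  # sizes of the first 4 dense layers
    dropouts = [0.3] * 5         # rates of the 5 dropout layers
```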
4. Add metrics
In our case, I want to know the MAE and MSE. We can use Sacred’s Metrics API for that.
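Logging is then one call per metric; a sketch, assuming mae and mse have been computed after training:

```python
# Inside the main function, after training; `_run` is injected by Sacred
# when it appears in the argument list of the captured function.
_run.log_scalar('mae', float(mae))
_run.log_scalar('mse', float(mse))
```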
5. Run the experiment
The functions from the previous steps are stored in the model_experiment.py script. To run our experiment for a bunch of parameter combinations, we create and run run_sacred.py.
For all possible permutations, the MAE and MSE will be saved in MongoDB.
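A possible sketch of run_sacred.py, with an illustrative parameter grid:

```python
# run_sacred.py -- run the experiment once per parameter combination.
from itertools import product
from model_experiment import ex

grid = {'batch_size': [32, 64], 'epochs': [10, 20]}

for values in product(*grid.values()):
    config_updates = dict(zip(grid.keys(), values))
    ex.run(config_updates=config_updates)  # each run is logged to MongoDB
```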
The best result I got was an MAE of 9%. That means our sentiment analyzer works pretty well. We can check it with the model_inference function.
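A sketch of what such a function might look like (the model file name is an assumption):

```python
from bert_serving.client import BertClient
from tensorflow.keras.models import load_model

bc = BertClient()                          # client for the running BERT service
model = load_model('sentiment_model.h5')   # the saved model; path is illustrative

def model_inference(sentence):
    """Encode one sentence and return its predicted sentiment score."""
    embedding = bc.encode([sentence])
    return float(model.predict(embedding)[0][0])

model_inference('What a wonderful movie!')   # e.g. a score close to 1
```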
Please note that the score is normalized, so outlier values can also be obtained. Once the model is saved, we can build a Web API!
Web API creation with Flask
Now we want to create an API that runs the scoring function and returns the result to the browser.
The syntax @app.route('/score', methods=['PUT']) lets Flask know that the function score should be mapped to the endpoint /score. The methods keyword argument tells Flask which kinds of HTTP requests are allowed; we’ll be using PUT requests to receive sentences from a user. In the score function, we return the score in dictionary form, since it can easily be converted to a JSON string. The full code is available in api.py.
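A minimal sketch of the endpoint, reusing the hypothetical model_inference helper from above (not the exact original api.py):

```python
from flask import Flask, request, jsonify

from inference import model_inference  # hypothetical module holding the helper

app = Flask(__name__)

@app.route('/score', methods=['PUT'])
def score():
    # The request body carries the raw sentence sent by the user.
    sentence = request.get_data(as_text=True)
    return jsonify({'score': model_inference(sentence)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```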
Frontend
For the web interface, three files were created: index.html, style.css and index.js.
For the gradient, the HSV colour model was used. Saturation and value are constant; hue corresponds to the score. Changing hue over the range [0, 120] yields a smooth colour change from red through yellow to green.
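The actual mapping lives in index.js; a Python illustration of the same idea:

```python
import colorsys

def score_to_rgb(score):
    """Map a score in [0, 1] to a colour from red (score 0) to green (score 1)."""
    hue = 120 * score / 360                         # colorsys expects hue in [0, 1]
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)    # saturation and value fixed
    return tuple(int(255 * c) for c in (r, g, b))

print(score_to_rgb(0.0))   # (255, 0, 0)   -- red
print(score_to_rgb(0.5))   # (255, 255, 0) -- yellow
print(score_to_rgb(1.0))   # (0, 255, 0)   -- green
```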
Docker containerization
The brilliance of Docker is that once you package an application and all its dependencies into a container, you ensure it will run in any environment. It is generally recommended to separate areas of concern by using one service per container. In my small app there are 3 parts that need to be combined: bert-as-service, the application and the frontend. The tool that helps you build Docker images and run multi-container applications is Docker Compose.
The steps we need to take to dockerize the code:
- Create separate folders for bert-as-service, the API and the frontend,
- Put the relevant files in each of them,
- Add requirements.txt and a Dockerfile to each folder. The first file should list all the libraries to be installed via a command in the second; the Dockerfile format is described in the Docker documentation,
- Create docker-compose.yaml in the directory containing the 3 folders. Define the 3 services that make up the app in this file, so they can be run together in an isolated environment (a possible sketch follows this list).
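A possible sketch of such a compose file (service names, ports and build paths are assumptions, not the exact original configuration):

```yaml
# docker-compose.yaml -- one service per container, built from each folder.
version: '3'
services:
  bert:
    build: ./bert-as-service
    ports:
      - "5555:5555"   # bert-as-service push port
      - "5556:5556"   # bert-as-service publish port
  api:
    build: ./api
    ports:
      - "5000:5000"
    depends_on:
      - bert
  frontend:
    build: ./frontend
    ports:
      - "8080:80"
    depends_on:
      - api
```

With this in place, docker-compose up --build builds the images and starts all three services together.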
Now we are ready to build and run our application! Please see the sample outputs below.
As usual, please feel free to view the full code on my GitLab.
Projects · Zuzanna / Sentiment Analysis with BERT · GitLab.com