Rank your things… An AES story




Automated essay scoring (AES) is the use of specialized computer programs to assign grades to essays written in an educational setting. Its objective is to classify a large set of textual entities into a small number of discrete categories, corresponding to the possible grades. But here, let’s look at it in an industrial problem setting.


You are a basketball coach. You have 40 kids under you who want to play ball, and you are asked to select a team of the 5 tallest kids. Pretty easy! You ask all of them to line up in decreasing order of height and you pick the first 5.

Now say you are a comic book writer. Not really a writer, but you are the guy who gets to decide the fancy villain names. You have plenty of interns under you who create villain descriptions and also name the villains for you. They create hundreds of such potential villain characters for your comic, and now you have to choose which of them you should even consider for your comics. The tricky bit is that you might want to base the selection on what your readers like too.

Technically, you want to score each potential villain and rank them in decreasing order of reader affinity score (or rating).

(Ignore the detail of how you have the reader affinity. Assume that the comic God gave you those.)

So, you have all your villains (that eventually got into the comics) and their respective reader affinity scores. Now your task is to use that information (somehow, duh) and rank or score the future villains that your interns create.

In literature, this is called the Automatic Essay Scoring or Automatic Text Scoring problem.

The Approach

This is a domain that is continuously progressing in the research world, so you'll be able to find a lot of solutions. Let's focus on one solution that gets the job done.

One way to frame this is as a prediction problem: try to predict the reader affinity scores directly. But there is a small issue with that, and it may not actually solve our problem. The reader affinity score is for the whole comic, not just for the villain. A person can still give a good score if she likes the plot but hates the villain. If we are trying to predict this score, we'll need a lot more information (like the comic category, month of release, targeted age group, etc.) and not just the villain information (like the name, description, etc.).

Let’s also note the fact that the predicted score of one villain is not really of use to us because our job is to find the best villains from a pool of villains. Individually, the scores may not make as much sense as they would if they were considered relatively. If we have 100 scores, we can easily know which villain is likely to perform better than the others.

Therefore, we can still proceed with our prediction logic, but instead of looking at the predicted scores in absolute terms, we just need to make sure they correlate with the actual scores. This means that if our model scores villain X higher than villain Y, and the actual scores follow the same order, then it's a win, irrespective of how far off the absolute predictions are.

The Solution


To get straight to one solution (out of many; like I said, the literature is pretty lit :fire:), we use two specific types of models. Since the task is scoring text, we need some sort of text-to-embedding technique to represent our text as vectors. Any text-to-embedding technique could be picked, but I've chosen the Universal Sentence Encoder.

import math
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

BATCH_SIZE = 128
use_model = hub.Module('models/sentence_encoder')

def get_use_vectors(list_text):
    '''
    Compute the USE vector representation of a list of sentences.
    @param list_text: list of sentences
    '''
    messages = list_text
    num_batches = math.ceil(len(messages) / BATCH_SIZE)
    message_embeddings = []
    with tf.Session() as session:
        session.run([tf.global_variables_initializer(),
                     tf.tables_initializer()])
        for batch in range(num_batches):
            batch_msgs = messages[batch * BATCH_SIZE: (batch + 1) * BATCH_SIZE]
            batch_embeddings = session.run(use_model(batch_msgs))
            message_embeddings.append(batch_embeddings)
    return np.concatenate(message_embeddings)

This model is used to convert the villain names and their descriptions into vectors, and we use these as features (along with other features) in a prediction model. The other features could be categorical, such as the category of the comic or the name of the author, or numeric, such as the number of purchases or the price.

The categorical features can be one-hot encoded and appended to our feature list.
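As a sketch of how these pieces might be stitched together (the feature values below are hypothetical), the one-hot encoded categoricals and the numeric columns can simply be stacked next to the text embeddings:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical 512-dim USE embeddings for 3 villains
embeddings = np.random.rand(3, 512)

# Hypothetical categorical feature: category of the comic
categories = np.array([['horror'], ['sci-fi'], ['horror']])
one_hot = OneHotEncoder().fit_transform(categories).toarray()  # shape (3, 2)

# Hypothetical numeric features: number of purchases, price
numerics = np.array([[120, 4.99], [80, 3.49], [200, 5.99]])

# Final feature matrix passed to the regressor
X = np.hstack([embeddings, one_hot, numerics])
print(X.shape)
```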

The prediction model is a simple Random Forest Regressor model taken straight out of the sklearn tutorial section.

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Hyperparameter grid for the random forest
params = {'n_estimators': [20, 50, 100],
          'max_depth': [2, 4, 6, 8, None],
          'min_samples_split': [2, 4, 6, 8]}

rf = RandomForestRegressor(random_state=42, n_jobs=10)

grid = GridSearchCV(rf, params)
grid.fit(X_train, y_train)

predictions = grid.predict(X_test)
errors = abs(predictions - y_test)

print(grid.best_score_)
print(grid.best_estimator_)

This gives us a model, trained on our past historical data, that predicts how well a villain name/description would perform. Technically, it's a reader affinity score predictor. But, as we discussed, since we aren't using all possible and available features, the absolute predictions will be inaccurate. That's fine: as long as the scores give us a relative indication of how two or more villains compare, they'll help us pick the top villains.

The Metrics

Cohen’s Kappa Score is usually used as a metric to identify how close our ranking or ordering of the predictions is when compared to the actual ordering. But, this metric assumes the predictions to be categories such as marks (0 to 5). We have a more continuous prediction and hence this metric wouldn’t work well for us.

Instead, we can use the simple Spearman and Pearson correlations.
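Both correlations are available in scipy. A minimal sketch on hypothetical actual/predicted score arrays:

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical actual and predicted reader-affinity scores
actual    = [3.2, 4.5, 2.1, 4.9, 3.8, 2.7]
predicted = [3.0, 4.1, 2.4, 4.6, 3.5, 3.1]

# Pearson measures linear correlation of the raw values;
# Spearman measures correlation of the ranks, which is what ranking cares about
pearson_r, pearson_p = pearsonr(actual, predicted)
spearman_r, spearman_p = spearmanr(actual, predicted)

print(f"Pearson: {pearson_r:.2f}, p-value = {pearson_p:.2e}")
print(f"Spearman: {spearman_r:.2f}, p-value = {spearman_p:.2e}")
```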

Plotting the actual vs the predicted scores plot gives a good idea if our predictions are following the right trend or not.

The correlation coefficients for the predictions in that plot were:

Pearson: 0.65 (p-value = 2.14e-92) | Spearman: 0.60 (p-value = 8.13e-123)

