MLonFHIR: Fusing Sklearn with the HL7 FHIR Standard

Category: IT Technology · Published: 5 years ago



A work-in-progress library that fuses the HL7 FHIR standard with scikit-learn.

Read more about internals and extensibility in the Read the Docs documentation.

Usage (taken from our demo notebook)

First, register the base URL of your database with a FHIRClient object:

from fhir_client import FHIRClient
import logging
import pandas as pd

logger = logging.getLogger(__name__)

client = FHIRClient(service_base_url='', logger=logger)

Querying Patients

There are two general ways of searching for patients with specific properties. The first one is to search by coding system:

# To receive a list of available procedures:
procedures = client.get_all_procedures()
pd.DataFrame([prod.code['coding'][0] for prod in procedures]).drop_duplicates().sort_values(by=['display']).head()

# Now retrieve patients
patients_by_procedure_code = client.get_patients_by_procedure_code("","73761001")

The second is to search by text. The fields searched are CodeableConcept.text, Coding.display, or Identifier.type.text:

conditions = client.get_all_conditions()
pd.DataFrame([cond.code['coding'][0] for cond in conditions]).drop_duplicates(subset=['display']).sort_values(by='display', ascending=True).head()

patients_by_condition_text = client.get_patients_by_condition_text("Abdominal pain")
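The two search styles correspond to two forms of FHIR token search: an exact `system|code` match, and the `:text` modifier, which matches against display text. The sketch below only builds the query URLs to show the difference. The base URL, the helper names, and the SNOMED CT system URI are assumptions made for illustration; this is not necessarily how FHIRClient constructs its requests.

```python
from urllib.parse import urlencode

# Hypothetical server address, used only for illustration.
BASE_URL = "https://example.org/fhir"


def search_by_code(resource, system, code):
    """Build a token search URL that matches an exact code in a coding system."""
    query = urlencode({"code": f"{system}|{code}"}, safe=":/|")
    return f"{BASE_URL}/{resource}?{query}"


def search_by_text(resource, text):
    """Build a token search URL that uses the :text modifier for a text match."""
    query = urlencode({"code:text": text}, safe=":")
    return f"{BASE_URL}/{resource}?{query}"


# The system URI below is an assumption; the original call leaves it empty.
code_url = search_by_code("Procedure", "http://snomed.info/sct", "73761001")
text_url = search_by_text("Condition", "Abdominal pain")
print(code_url)
print(text_url)
```

A code search is precise but requires knowing the coding system in advance, while a text search is more forgiving and can return broader matches.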

You can also load a control group for a specific cohort of patients. The control group is the same size as the case cohort (minimum size: 10) and consists of randomly sampled patients that do not match the original query. Whether a patient is a case or a control is stored in the .case property of the Patient object.

patients_by_condition_text_with_controls = client.get_patients_by_condition_text("Abdominal pain", controls=True)

# Split on the .case property described above (assumed truthy for cases)
print("{} are cases and {} are controls".format(len([d for d in patients_by_condition_text_with_controls if d.case]), 
                                                len([d for d in patients_by_condition_text_with_controls if not d.case])))
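The control sampling described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name and its parameters are hypothetical, and "equal size (min size: 10)" is interpreted here as the larger of the case count and 10.

```python
import random


def sample_controls(all_patient_ids, case_ids, min_size=10, seed=None):
    """Randomly pick controls from the patients that did not match the query."""
    case_set = set(case_ids)
    candidates = [pid for pid in all_patient_ids if pid not in case_set]
    # Assumed interpretation: as many controls as cases, but at least min_size.
    target = max(len(case_set), min_size)
    # Never ask for more controls than there are candidates.
    k = min(target, len(candidates))
    return random.Random(seed).sample(candidates, k)


all_ids = [f"patient-{i}" for i in range(100)]
cases = all_ids[:4]
controls = sample_controls(all_ids, cases, seed=0)
print(len(cases), "cases and", len(controls), "controls")
```

Sampling without replacement from the non-matching patients keeps the two groups disjoint, which is what makes the .case label usable as a classification target.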

Machine Learning

To train a classifier, we first tell the MLOnFHIRClassifier which type of object we would like to classify. We then define the features (feature_attrs) and labels (label_attrs) for the classification task and pass the preprocessor of our current client, which determines how the features and labels of a patient are preprocessed. Finally, we call .fit on the MLOnFHIRClassifier instance together with our classifier of choice.

from ml_on_fhir import MLOnFHIRClassifier
from fhir_objects.patient import Patient
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_curve, auc

ml_fhir = MLOnFHIRClassifier(Patient, feature_attrs=['birthDate', 'gender'],
                             label_attrs=['case'], preprocessor=client.preprocessor)
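# Assumed call: fit on the cohort with controls, together with the classifier of choice
X, y, trained_clf = ml_fhir.fit(patients_by_condition_text_with_controls, DecisionTreeClassifier())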
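To make the role of the preprocessor more concrete, here is a minimal sketch of how attributes such as birthDate and gender might be turned into numeric features. The encoding, the function name, and the gender mapping are assumptions for illustration and are not taken from the library's preprocessor; the sketch also assumes a full YYYY-MM-DD birth date.

```python
from datetime import date

# Hypothetical integer codes for the FHIR administrative gender values.
GENDER_CODES = {"male": 0, "female": 1, "other": 2, "unknown": 3}


def encode_patient(birth_date, gender, today):
    """Turn (birthDate, gender) into a numeric feature vector [age, gender_code]."""
    born = date.fromisoformat(birth_date)
    # Subtract one year if this year's birthday has not happened yet.
    had_no_birthday_yet = (today.month, today.day) < (born.month, born.day)
    age = today.year - born.year - had_no_birthday_yet
    # Unrecognized values fall back to the "unknown" code.
    return [age, GENDER_CODES.get(gender, GENDER_CODES["unknown"])]


features = encode_patient("1980-06-15", "female", today=date(2020, 1, 1))
print(features)
```

Passing the reference date explicitly keeps the encoding reproducible, which matters when the same preprocessing has to be applied at training and prediction time.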

# roc_curve and auc are already imported above
fpr, tpr, _ = roc_curve(y, trained_clf.predict(X))
print("ROC AUC {}".format( auc(fpr, tpr) ) )
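Note that the value obtained from roc_curve and auc is the area under the ROC curve, which is not the same quantity as classification accuracy. The dependency-free sketch below computes both on a small made-up example to show that they can differ. The helper functions and the data are illustrative only; the AUC helper uses the rank-based definition (the probability that a random positive is scored above a random negative, with ties counted as one half).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and hard predictions for an imbalanced toy problem.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, y_pred))
```

Here the classifier misses two of the three positives, which the accuracy hides because most samples are negative. Feeding predicted probabilities instead of hard labels into the AUC computation generally gives a more informative curve.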
