HBase Thrift with Python (Kerberos)

Category: Python · Published: 6 years ago


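Before we begin

This article is based on CentOS 7, HDP 3.0.0, HBase 2.0.0, and Python 2.7. Readers on other environments should adapt the steps as needed.

Installing Thrift

Install the dependencies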

yum install -y automake libtool flex bison pkgconfig gcc-c++ libevent-devel zlib-devel python-devel ruby-devel openssl-devel

Install Boost (required on CentOS 7)

wget https://dl.bintray.com/boostorg/release/1.64.0/source/boost_1_64_0.tar.gz
tar zxvf boost_1_64_0.tar.gz
cd boost_1_64_0
./bootstrap.sh
./b2 install

Download and build Thrift

wget https://archive.apache.org/dist/thrift/0.10.0/thrift-0.10.0.tar.gz
tar zxvf thrift-0.10.0.tar.gz
cd thrift-0.10.0/
./configure   # or: ./configure --with-boost=/usr/local --without-java --without-php
make
make install
# Confirm that the installation succeeded
thrift -help

Thrift versions before 0.10.0 do not support Python 3.5.
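A small sketch of how that version rule could be checked in a script. The helper names parse_version and supports_python35 are hypothetical and not part of Thrift; the input is assumed to be a version string such as the one printed by thrift -version. Comparing integer tuples avoids the trap that "0.9.3" sorts after "0.10.0" as plain text.

```python
import re


def parse_version(text):
    """Turn a version string such as '0.10.0' into a tuple of ints: (0, 10, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def supports_python35(thrift_version):
    """Return True when the given Thrift version is at least 0.10.0."""
    return parse_version(thrift_version) >= (0, 10, 0)


print(supports_python35("0.9.3"))   # False
print(supports_python35("0.10.0"))  # True
```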

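Generating hbase.thrift

On HDP, the HBase installation directory already contains the hbase.thrift files, so there is no need to create one yourself.

Generating code for the target language

# Location of the hbase.thrift files on HDP
cd /usr/hdp/3.0.0.0-1634/hbase/include/thrift/
# Generate the Python bindings
# This directory contains both a thrift1 and a thrift2 IDL file; choose the one you need
thrift -gen py hbase1.thrift   # or: thrift -gen py hbase2.thrift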
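By default thrift -gen py writes the generated package into a gen-py directory under the current working directory, and that directory has to be importable before "from hbase import ..." works. The following is a minimal sketch of one way to do that; the helper name add_gen_py_to_path is an assumption made for illustration, and the path is the HDP directory used above.

```python
import os
import sys


def add_gen_py_to_path(idl_dir):
    """Put <idl_dir>/gen-py at the front of sys.path and return that path."""
    gen_py = os.path.join(os.path.abspath(idl_dir), "gen-py")
    if gen_py not in sys.path:
        sys.path.insert(0, gen_py)
    return gen_py


# Directory where `thrift -gen py` was run in the step above
gen_py_dir = add_gen_py_to_path("/usr/hdp/3.0.0.0-1634/hbase/include/thrift")
print(gen_py_dir)
```

Copying the generated hbase package next to your script, or setting PYTHONPATH, would work just as well; this variant only keeps the choice visible in the code.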

Starting the Thrift server

cd /usr/hdp/3.0.0.0-1634/hbase/bin/
./hbase-daemon.sh start thrift -p 9090 --infoport 8086

Logs are written to /var/log/hbase/

Using Thrift 2 mode

# Stop the Thrift 1 server first if it is still running on the same port
./hbase-daemon.sh stop thrift
./hbase-daemon.sh start thrift2 -p 9090 --infoport 8086
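Before writing a client it can help to confirm that something is actually listening on the Thrift port. A minimal sketch using only the standard library follows; is_port_open is a hypothetical helper name, and the check only proves that the TCP port accepts connections, not that the process behind it is a healthy Thrift server.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# Example: is_port_open('10.200.168.18', 9090)
```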

Connecting from Python

Thrift 1 mode

from thrift.transport.TSocket import TSocket
from thrift.transport.TTransport import TBufferedTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase

if __name__ == '__main__':
    # Connect to the Thrift 1 server started above
    transport = TBufferedTransport(TSocket('10.200.168.18', 9090))
    transport.open()
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    # List all tables to confirm that the connection works
    print(client.getTableNames())
    transport.close()

Thrift 2 mode

from thrift.transport.TSocket import TSocket
from thrift.transport.TTransport import TBufferedTransport
from thrift.protocol import TBinaryProtocol
from hbase import THBaseService
from hbase.ttypes import TGet
import logging

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    transport = TBufferedTransport(TSocket('10.200.168.18', 9090))
    transport.open()
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = THBaseService.Client(protocol)
    # Fetch row '321ahah' from table 'shop' and print every returned cell
    tget = TGet(row='321ahah')
    tresult = client.get('shop', tget)
    for col in tresult.columnValues:
        print('%s = %s' % (col.qualifier, col.value))
    transport.close()
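Both clients above open a transport and close it at the end, which leaves the socket open if a call in between raises. One possible way to tighten this is a small context manager; the name opened is hypothetical, and the sketch only relies on the transport having open() and close() methods, as the Thrift transports used above do.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a Thrift transport and always close it, even if the body raises."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Usage with the Thrift 2 client from above:
# with opened(TBufferedTransport(TSocket('10.200.168.18', 9090))) as transport:
#     client = THBaseService.Client(TBinaryProtocol.TBinaryProtocol(transport))
#     tresult = client.get('shop', TGet(row='321ahah'))
```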

Kerberos On Thrift

Server configuration

core-site.xml

hadoop.proxyuser.hbase.groups=*
hadoop.proxyuser.hbase.hosts=*

hbase-site.xml

hbase.thrift.security.qop=auth
hbase.thrift.support.proxyuser=true
hbase.regionserver.thrift.http=false  # set to true for HTTP mode, false for binary mode
hbase.thrift.keytab.file=/etc/security/keytabs/hbase.service.keytab 
hbase.thrift.kerberos.principal=hbase/_HOST@DEVDIP.COM 
hbase.security.authentication.spnego.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab 
hbase.security.authentication.spnego.kerberos.principal=HTTP/_HOST@DEVDIP.COM
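The settings above are written as key=value pairs, while core-site.xml and hbase-site.xml store them as property elements. If the files are edited by hand rather than through a management UI, a sketch like the following could render the pairs in that XML form. The helper name to_hadoop_xml is an assumption, and only three of the properties are shown to keep the example short.

```python
import xml.etree.ElementTree as ET


def to_hadoop_xml(props):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name in sorted(props):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = props[name]
    return ET.tostring(root).decode("utf-8")


thrift_props = {
    "hbase.thrift.security.qop": "auth",
    "hbase.thrift.support.proxyuser": "true",
    "hbase.regionserver.thrift.http": "false",
}
print(to_hadoop_xml(thrift_props))
```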

Restart HDFS and HBase

Restart the Thrift server

# Stop
./hbase-daemon.sh stop thrift
# Start: obtain a Kerberos ticket first, and launch the server only if kinit succeeded
kinit -kt /etc/security/keytabs/hbase.headless.keytab hbase-dev_dmp && /usr/hdp/3.0.0.0-1634/hbase/bin/hbase-daemon.sh start thrift -p 9090 --infoport 8086
# To test the thrift server in http mode the syntax is:
hbase org.apache.hadoop.hbase.thrift.HttpDoAsClient DEVDIP.ORG 9090 hbase true
# to test in binary mode the syntax is:
hbase org.apache.hadoop.hbase.thrift.DemoClient DEVDIP.ORG 9090 true
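A Python client on a Kerberized cluster also needs a valid ticket in the credential cache before it connects. The following sketch checks for one by running klist -s, which produces no output and reports the result through its exit status. The function name has_valid_ticket and the injectable run parameter are assumptions made so the check can be exercised without a KDC.

```python
import subprocess


def has_valid_ticket(run=subprocess.call):
    """Return True when `klist -s` exits with status 0 (a usable ticket cache exists)."""
    try:
        return run(["klist", "-s"]) == 0
    except OSError:
        # klist is not installed or not on PATH
        return False


# Example: call has_valid_ticket() before opening the SASL transport,
# and run kinit first if it returns False.
```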

Example

/var/log/hbase


Connecting from Python with Kerberos

Thrift 1 mode

#!/usr/bin/env python
from thrift.transport import TSocket
from thrift.protocol import TBinaryProtocol
from thrift.transport import TTransport
from hbase import Hbase

# Apache HBase Thrift server coordinates (network location)
thriftServer = "dev-dmp5.fengdai.org"
thriftPort = 9090
# The service name is the "primary" component of the Kerberos principal the
# Thrift server uses.
# See: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html
# e.g. For a server principal of 'hbase/localhost@EXAMPLE.COM', the primary is "hbase"
saslServiceName = "hbase"

# HBase table and data information
tableName = 'demo_table'
row = 'test2'
colName = "cf:name"

if __name__ == '__main__':
    # Open a socket to the server
    sock = TSocket.TSocket(thriftServer, thriftPort)
    # Set up a SASL transport.
    transport = TTransport.TSaslClientTransport(sock, thriftServer, saslServiceName)
    transport.open()
    # Use the Binary protocol (must match your Thrift server's expected protocol)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    # Pass the protocol to the generated HBase client
    client = Hbase.Client(protocol)

    # Fetch a row from HBase
    print("Row=>%s" % (client.getRow(tableName, row, {})))

    # Cleanup
    transport.close()
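The getRow call above returns a list of row results, each carrying a row key and a columns map from "family:qualifier" to a cell with value and timestamp attributes. A sketch of flattening that into plain dictionaries follows. FakeCell and FakeRowResult are stand-ins that only mimic those attribute names so the example runs without a cluster, and the sample value is made up.

```python
from collections import namedtuple

# Stand-ins mimicking the attribute names of the generated Thrift 1 result classes
FakeCell = namedtuple("FakeCell", ["value", "timestamp"])
FakeRowResult = namedtuple("FakeRowResult", ["row", "columns"])


def rows_to_dicts(row_results):
    """Map each row key to a {'family:qualifier': value} dict."""
    flattened = {}
    for result in row_results:
        flattened[result.row] = dict(
            (name, cell.value) for name, cell in result.columns.items()
        )
    return flattened


sample = [FakeRowResult("test2", {"cf:name": FakeCell("alice", 1557318970000)})]
print(rows_to_dicts(sample))  # {'test2': {'cf:name': 'alice'}}
```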

Thrift 2 mode

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from thrift.protocol import TCompactProtocol
from hbase import THBaseService
from hbase.ttypes import  *
import os
# Apache HBase Thrift server coordinates (network location)
thriftServer = "dev-dmp5.fengdai.org"
thriftPort = 9090
# The service name is the "primary" component of the Kerberos principal the
# Thrift server uses.
# See: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html
# e.g. For a server principal of 'hbase/localhost@EXAMPLE.COM', the primary is "hbase"
saslServiceName = 'hbase'

# HBase table and data information
tableName = 'demo_table'
row = 'test2'
# Columns to fetch, given as (column family, qualifier)
columns = [TColumn('cf', 'title'), TColumn('cf', 'content')]

if __name__ == '__main__':
    socket = TSocket.TSocket(thriftServer, thriftPort)
    # On a cluster without Kerberos, use TTransport.TBufferedTransport(socket) instead
    transport = TTransport.TSaslClientTransport(socket, host=thriftServer, service=saslServiceName, mechanism='GSSAPI')
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = THBaseService.Client(protocol)
    transport.open()
    # Fetch the selected columns of one row
    get = TGet(row=row, columns=columns)
    result = client.get(tableName, get)
    print(result)
    # Cleanup
    transport.close()

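Problems

1. libboost_unit_test_framework.a cannot be found

Boost was built and installed locally from source. Because the build assumes a 32-bit library layout by default, the library does not end up at /usr/lib64/libboost_unit_test_framework.a. Use find to locate the file's real path, then create a symbolic link to it.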

find / -name libboost_unit_test_framework.a
ln -s /usr/local/lib/libboost_unit_test_framework.a /usr/lib64/libboost_unit_test_framework.a

2. kerberos.GSSError: ((' Miscellaneous failure (see text)', 851968), ('Error from KDC: UNKNOWN_SERVER', -1765328377))

# Log output
2019-05-08 20:36:10,529 WARN  [qtp176041373-47] http.HttpParser: Illegal character 0x1 in state=START for buffer HeapByteBuffer@28800d32[p=1,l=11,c=8192,r=10]={\x01<<<\x00\x00\x00\x06GSSAPI>>>\x02\x00\x00\x03P`\x82\x03L\x06\t*\x86H\x86\xF7\x12...\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00}
2019-05-08 20:36:10,530 WARN  [qtp176041373-47] http.HttpParser: bad HTTP parsed: 400 Illegal character 0x1 for HttpChannelOverHttp@24c343ea{r=0,c=false,a=IDLE,uri=null}
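Change
thriftServer = "10.200.168.7"
to
thriftServer = "dev-dmp5.fengdai.org"
The host name given to the client should match the host part of the server's Kerberos principal, following the pattern hbase/localhost@EXAMPLE.COM => hbase/dev-dmp5.fengdai.org@DEVDIP.ORG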
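The likely reason is that the SASL client asks the KDC for a ticket for service/host, so an IP address yields a principal the KDC has never heard of. A minimal sketch of a pre-flight check follows. Both helper names are hypothetical, and the _HOST substitution mirrors the pattern used in hbase-site.xml, where the server replaces _HOST with its own fully qualified host name.

```python
import socket


def is_ipv4_literal(host):
    """Return True if host is a dotted-quad IPv4 address rather than a name."""
    try:
        socket.inet_pton(socket.AF_INET, host)
        return True
    except (socket.error, ValueError):
        return False


def expected_principal(pattern, host):
    """Substitute _HOST in a principal pattern with the lower-cased host name."""
    return pattern.replace("_HOST", host.lower())


print(is_ipv4_literal("10.200.168.7"))          # True
print(is_ipv4_literal("dev-dmp5.fengdai.org"))  # False
print(expected_principal("hbase/_HOST@DEVDIP.ORG", "dev-dmp5.fengdai.org"))
# hbase/dev-dmp5.fengdai.org@DEVDIP.ORG
```

If is_ipv4_literal returns True for the configured thriftServer, switching to the fully qualified host name is the first thing to try.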

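3. thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

The connection itself was in fact already working; I had assumed all along that the client was at fault. Because the code connects in binary mode, hbase.regionserver.thrift.http must be set to false.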
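To rule this mismatch out quickly, the flag can be read straight from hbase-site.xml before choosing a transport. The sketch below parses the XML text with the standard library. The function name thrift_http_enabled is an assumption, and treating a missing property as false reflects my understanding of the HBase default rather than anything stated in this article.

```python
import xml.etree.ElementTree as ET


def thrift_http_enabled(hbase_site_xml):
    """Return True if hbase.regionserver.thrift.http is set to true in the XML text."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hbase.regionserver.thrift.http":
            return (prop.findtext("value") or "").strip().lower() == "true"
    # Property absent: assumed to mean binary mode
    return False


sample_site = """
<configuration>
  <property>
    <name>hbase.regionserver.thrift.http</name>
    <value>false</value>
  </property>
</configuration>
"""
print(thrift_http_enabled(sample_site))  # False
```

A result of True means the server expects HTTP, so a client built on TSocket with TBinaryProtocol directly over the socket, as in the examples here, will not be able to talk to it.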

