Using HBase on the Microsoft Azure cloud platform
In this article
- What is HBase?
- Prerequisites
- Provision HBase clusters using Azure Management portal
- Manage HBase tables using HBase shell
- Use HiveQL to query HBase tables
- Use the Microsoft HBase REST client library to manage HBase tables
- See also
What is HBase?
HBase is a low-latency NoSQL database that allows online transactional processing of big data. HBase is offered as a managed cluster integrated into the Azure environment. The clusters are configured to store data directly in Azure Blob storage, which provides low latency and increased elasticity in performance/cost choices. This enables customers to build interactive websites that work with large datasets, to build services that store sensor and telemetry data from millions of end points, and to analyze this data with Hadoop jobs. For more information on HBase and the scenarios it can be used for, see HDInsight HBase overview.
NOTE:
HBase (version 0.98.0) is available only with HDInsight 3.1 clusters (based on Apache Hadoop and YARN 2.4.0). For version information, see What's new in the Hadoop cluster versions provided by HDInsight?
Prerequisites
Before you begin this tutorial, you must have the following:
- An Azure subscription. For more information about obtaining a subscription, see Purchase Options, Member Offers, or Free Trial.
- An Azure storage account. For instructions, see How To Create a Storage Account.
- A workstation with Visual Studio 2013 installed. For instructions, see Installing Visual Studio.
Provision an HBase cluster on the Azure portal
This section describes how to provision an HBase cluster using the Azure Management portal.
NOTE:
The steps in this article create an HDInsight cluster using basic configuration settings. For information on other cluster configuration settings, such as using Azure Virtual Network or a metastore for Hive and Oozie, see Provision an HDInsight cluster.
To provision an HDInsight cluster in the Azure Management portal
Create an HBase sample table from the HBase shell
This section describes how to enable and use the Remote Desktop Protocol (RDP) to access the HBase shell and then use it to create an HBase sample table, add rows, and then list the rows in the table.
It assumes you have completed the procedure outlined in the first section, and so have already successfully created an HBase cluster.
To enable the RDP connection to the HBase cluster
To open the HBase Shell
To create a sample table, add data and retrieve the data
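The three operations named in this section (create a table, add rows, list the rows) map onto the HBase shell's `create`, `put`, and `scan` commands. As a minimal sketch, the Python helper below only formats the command text you would type at the shell prompt; it does not connect to a cluster. The table name `sampletable`, the column family `cf1`, and the row keys and cell values are illustrative assumptions, not values taken from this article.

```python
def quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes.

    The HBase shell is JRuby-based, so single-quoted Ruby string rules apply.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def create_table(table: str, *families: str) -> str:
    """Format a `create` command; a table needs at least one column family."""
    if not families:
        raise ValueError("at least one column family is required")
    return "create " + ", ".join(quote(v) for v in (table, *families))


def put_cell(table: str, row: str, column: str, value: str) -> str:
    """Format a `put` command; `column` has the form 'family:qualifier'."""
    return "put " + ", ".join(quote(v) for v in (table, row, column, value))


def scan_table(table: str) -> str:
    """Format a `scan` command, which lists the rows of a table."""
    return "scan " + quote(table)


def sample_session() -> list:
    """Commands for a hypothetical sample table (all names are assumptions)."""
    table = "sampletable"
    return [
        create_table(table, "cf1"),
        put_cell(table, "row1", "cf1:col1", "value1"),
        put_cell(table, "row1", "cf1:col2", "value2"),
        put_cell(table, "row2", "cf1:col1", "value3"),
        scan_table(table),
    ]


if __name__ == "__main__":
    print("\n".join(sample_session()))
```

Running the script prints one command per line, ready to paste into the HBase shell opened in the previous step.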
Check cluster status in the HBase WebUI
HBase also ships with a WebUI that helps you monitor your cluster, for example by providing request statistics or information about regions. From within the HBase cluster, the WebUI is available at the address of the zookeepernode:
http://zookeepernode:60010/master-status
In a high-availability (HA) cluster, you will find a link to the currently active HBase master node, which hosts the WebUI.
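If you want to check the status page from a script rather than a browser, the address above can be assembled and probed as in the sketch below. It assumes the script runs on a machine that can resolve the cluster-internal host name (for example, inside the RDP session); the function names and the timeout value are hypothetical choices for illustration.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen

# Port taken from the WebUI address shown above.
DEFAULT_MASTER_UI_PORT = 60010


def master_status_url(host: str, port: int = DEFAULT_MASTER_UI_PORT) -> str:
    """Build the master-status URL for a given host and port."""
    return urlunsplit(("http", f"{host}:{port}", "/master-status", "", ""))


def fetch_master_status(host: str, timeout: float = 5.0) -> int:
    """Request the status page and return the HTTP status code.

    Raises urllib.error.URLError if the host cannot be reached.
    """
    with urlopen(master_status_url(host), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # "zookeepernode" is the host name used in the address above.
    print(master_status_url("zookeepernode"))
```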
Bulk load a sample table
Use Hive to query an HBase table
Now that you have provisioned an HBase cluster and created an HBase table, you can query it using Hive. This section creates a Hive table that maps to the HBase table and uses it to query the data in your HBase table.
To open cluster dashboard
To run Hive queries
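A Hive table is mapped to an HBase table through Hive's HBase storage handler, with the `hbase.columns.mapping` property listing the HBase columns in the same order as the Hive columns. As a rough sketch, the Python helper below generates such a HiveQL statement as text. The Hive table name `hbasesampletable`, the HBase table name `sampletable`, and the column names are assumptions for illustration, and every column is typed as `STRING` for simplicity.

```python
def hive_mapping_ddl(hive_table: str, hbase_table: str, columns: list) -> str:
    """Generate HiveQL that maps a Hive external table onto an HBase table.

    `columns` is a list of (hive_column, hbase_column) pairs, where
    hbase_column has the form 'family:qualifier'. The HBase row key is
    always exposed as the first Hive column, named `rowkey`.
    """
    column_defs = ", ".join(
        ["rowkey STRING"] + [f"{hive_col} STRING" for hive_col, _ in columns]
    )
    # The mapping string must not contain whitespace between entries.
    mapping = ",".join([":key"] + [hbase_col for _, hbase_col in columns])
    return (
        f"CREATE EXTERNAL TABLE {hive_table}({column_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}');"
    )


if __name__ == "__main__":
    # Hypothetical table and column names.
    print(hive_mapping_ddl(
        "hbasesampletable",
        "sampletable",
        [("col1", "cf1:col1"), ("col2", "cf1:col2")],
    ))
    print("SELECT count(*) FROM hbasesampletable;")
```

The generated statement can be pasted into the Hive editor; once the table exists, ordinary queries such as the `SELECT` printed at the end read directly from the HBase table.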
To browse the output file
Use the HBase REST Client Library for .NET C# APIs to create an HBase table and retrieve data from the table
The Microsoft HBase REST Client Library for .NET project must be downloaded from GitHub and built before you can use the HBase .NET SDK. The following procedure includes the instructions for this task.
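The sketch below does not use the .NET library. It is a Python illustration of the JSON cell encoding defined by the HBase REST interface, in which row keys, column names, and values are base64-encoded. The row key, column, and value used here are hypothetical, and the helper names are chosen for illustration only.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string, as the HBase REST JSON format requires."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def unb64(data: str) -> str:
    """Decode a base64 string back to UTF-8 text."""
    return base64.b64decode(data).decode("utf-8")


def build_cell_set(row_key: str, cells: dict) -> str:
    """Build a JSON body that stores `cells` ({'family:qualifier': value})
    under one row key."""
    row = {
        "key": b64(row_key),
        "Cell": [
            {"column": b64(column), "$": b64(value)}
            for column, value in cells.items()
        ],
    }
    return json.dumps({"Row": [row]})


def parse_cell_set(payload: str) -> dict:
    """Decode a JSON cell set into {row_key: {column: value}}."""
    document = json.loads(payload)
    return {
        unb64(row["key"]): {
            unb64(cell["column"]): unb64(cell["$"]) for cell in row["Cell"]
        }
        for row in document["Row"]
    }


if __name__ == "__main__":
    # Hypothetical row and cell values.
    body = build_cell_set("row1", {"cf1:col1": "value1"})
    print(body)
    print(parse_cell_set(body))
```

A client library hides this encoding, but knowing the wire format can help when inspecting requests or debugging unexpected values.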
What's Next?
In this tutorial, you have learned how to provision an HBase cluster, how to create tables, and how to view the data in those tables from the HBase shell. You also learned how to use Hive to query the data in HBase tables and how to use the HBase C# APIs to create an HBase table and retrieve data from the table.
To learn more, see:
- HDInsight HBase overview: HBase is an Apache open source NoSQL database built on Hadoop that provides random access and strong consistency for large amounts of unstructured and semi-structured data.
- Provision HBase clusters on Azure Virtual Network: With the virtual network integration, HBase clusters can be deployed to the same virtual network as your applications so that applications can communicate with HBase directly.
- Analyze Twitter sentiment with HBase in HDInsight: Learn how to do real-time sentiment analysis of big data using HBase in a Hadoop cluster in HDInsight.