Why does HBase use Phoenix? Why doesn't NoSQL support SQL queries?
By default, NoSQL databases don't support SQL queries, but with the help of tools it's possible to run SQL commands on top of them. For example, Phoenix is a SQL layer that runs on top of HBase. It is used through a JDBC driver and converts user queries into a form HBase understands.


Big data:

CRUD is not supported directly on big data.

In-place updates are not possible in big data.

Checking whether a particular keyword is present is not supported efficiently, because there is no indexing.


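With a NoSQL store, by contrast, query results come back within a fraction of a second, and CRUD operations are supported.

HBase and Cassandra were developed as part of big data implementations.

StumbleUpon was an early adopter of HBase. HBase's own query language is not 100% SQL, but with Phoenix on top you get a SQL interface.

Cassandra originated at Facebook, drew on Amazon's Dynamo design, and is now Apache Cassandra, with DataStax as the main commercial backer. Its query language, CQL, is almost 100% SQL-like, but joins are not supported.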


HBase is built to address the scalability problem, which is why it is a good choice. Other NoSQL stores do not run on top of HDFS.
Any application can use HBase. Most people use it through code rather than shell commands. At the time of writing, HBase is still at version 0.x.

A column family is a collection of columns. There is no concept of a database; everything in HBase is a table.

A collection of column families is called a table.

HBase has only one data type: bytes.
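
Because everything is stored as bytes, the client has to encode and decode values itself. Here is a minimal Python sketch of that idea. The helper names are hypothetical (not an HBase API), and the 8-byte big-endian integer layout and UTF-8 strings are assumptions chosen for illustration:

```python
import struct

def encode_int(n: int) -> bytes:
    """Pack a signed integer into 8 big-endian bytes (assumed layout)."""
    return struct.pack(">q", n)

def decode_int(raw: bytes) -> int:
    """Reverse of encode_int."""
    return struct.unpack(">q", raw)[0]

def encode_str(s: str) -> bytes:
    """Strings become UTF-8 bytes (assumed encoding)."""
    return s.encode("utf-8")

def decode_str(raw: bytes) -> str:
    """Reverse of encode_str."""
    return raw.decode("utf-8")

# The store only ever sees the bytes; the reader must know which decoder to use.
cell = encode_int(42)
print(cell, decode_int(cell))
print(encode_str("row1"), decode_str(encode_str("row1")))
```

The point is that the type lives in the application, not in the table: the same stored bytes can be read back as a number or as text depending on which decoder you apply.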

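HBase can run in three modes:

Local (standalone) mode: all daemons run inside a single JVM, so only the HMaster process is visible.

Pseudo-distributed mode: the daemons run as separate processes on a single machine.

Cluster (fully distributed) mode: uses either an internal or an external ZooKeeper. For protection, a cluster should use an external ZooKeeper.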

ZooKeeper does monitoring, somewhat like Nagios. If the system must be highly available, you must use ZooKeeper. It runs as Java threads: it simply launches a number of threads and watches them. Ganglia and Nagios are network-level monitoring tools, whereas ZooKeeper monitors at the process level.



After starting HBase, how do you verify whether it is running in distributed mode?

Check hbase-site.xml: if hbase.cluster.distributed is true, you are in cluster (distributed) mode; if it is false, you are not.
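
The same check can be scripted. The sketch below reads the hbase.cluster.distributed property from the XML. The sample file content (including the hdfs://localhost:9000/hbase value) and the function name is_distributed are assumptions for illustration; in practice you would read your own hbase-site.xml from the conf directory:

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml content, inlined so the example is self-contained.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
"""

def is_distributed(site_xml: str) -> bool:
    """Return True if hbase.cluster.distributed is set to true."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "hbase.cluster.distributed":
            return prop.findtext("value", "").strip().lower() == "true"
    return False  # property absent: HBase treats this as non-distributed

print(is_distributed(SAMPLE_SITE_XML))
```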


The last line of hbase-env.sh, export HBASE_MANAGES_ZK=true, means HBase uses its own local, internal ZooKeeper.

Start ZooKeeper first, then the master, then the region servers.

hbase shell: starts the HBase prompt.

list: lists the tables.
hadoop fs -rmr /file: deletes a file from HDFS (newer Hadoop releases use hadoop fs -rm -r).

create 'table name', 'column family' // you can add multiple column families, but each create command makes only one table

create 't', 'cf', 'cf1', 'cf2', 'cf3'

No semicolon is needed; pressing Enter ends the line.


Within a column family you can add millions of columns dynamically, which is not possible in an RDBMS.

A column is not the same as a column family.
Wildcard (regular-expression) patterns are also supported.

For example, list 'r.*' shows all tables whose names start with r. The argument is a regular expression, so the explicit form 'r.*' is safer than a bare 'r'.
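
To make the pattern matching concrete, here is a small Python sketch of filtering table names by a regular expression. The function list_tables and the table names are hypothetical, and the sketch assumes the pattern must match the whole name; how the real shell anchors the pattern may differ between HBase versions:

```python
import re

def list_tables(table_names, pattern=".*"):
    """Return the sorted table names that fully match the regular expression."""
    regex = re.compile(pattern)
    return sorted(name for name in table_names if regex.fullmatch(name))

tables = ["reports", "roles", "test", "users"]  # made-up table names
print(list_tables(tables))          # no pattern: every table
print(list_tables(tables, "r.*"))   # only names starting with r
```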


The CRUD commands are create, get, scan, put, delete and drop.

put 'test', 'row1', 'cf:a', 'value'

Here a is the column, and 'value' is the value assigned to it.

HBase and Cassandra store data as a sparse matrix, whereas an RDBMS stores it as a dense matrix.

If you assign a new value to an existing cell, it is stored as a new revision (version) instead of overwriting the old one.
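
The sparse, versioned layout can be modelled in a few lines of Python. This is a toy in-memory sketch, not how HBase is implemented: the class SparseTable, its methods, and the logical counter used as a timestamp are all assumptions made for illustration (real HBase also limits how many versions it retains per column family):

```python
import itertools
from collections import defaultdict

class SparseTable:
    """Toy model of a table: only cells that were written take up space."""

    def __init__(self, *column_families):
        self.column_families = set(column_families)
        # row key -> "cf:qualifier" -> list of (timestamp, value), newest first
        self.rows = defaultdict(lambda: defaultdict(list))
        self._clock = itertools.count(1)  # logical timestamps (assumption)

    def put(self, row, column, value):
        """Write a value; an existing cell gains a new version."""
        family = column.split(":", 1)[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][column].insert(0, (next(self._clock), value))

    def get(self, row, column, versions=1):
        """Return up to `versions` most recent values, newest first."""
        cells = self.rows.get(row, {}).get(column, [])
        return [value for _, value in cells[:versions]]

t = SparseTable("cf")
t.put("row1", "cf:a", "value")
t.put("row1", "cf:a", "value2")           # second write becomes a new version
print(t.get("row1", "cf:a"))              # latest version only
print(t.get("row1", "cf:a", versions=2))  # both versions, newest first
print(t.get("row2", "cf:a"))              # never written: nothing is stored
```

Columns are created simply by writing to them, and a row that never wrote a given column stores nothing for it, which is what "sparse" means here.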


MR integration



MR integration: why use it?

HBase on its own processes a scan sequentially, whereas MapReduce processes data in parallel. So HBase is sometimes integrated with MapReduce for better performance.
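
The difference in structure can be sketched in plain Python. This is not MapReduce or an HBase API, only an illustration of the split, process and combine shape; the function names and row data are made up, and because it uses threads in a single Python process it shows the pattern rather than a real speed-up:

```python
from concurrent.futures import ThreadPoolExecutor

def count_matching(rows, predicate):
    """Sequential pass over one slice of rows (the 'map' step)."""
    return sum(1 for row in rows if predicate(row))

def split(rows, parts):
    """Cut the row list into roughly equal contiguous slices."""
    size = max(1, -(-len(rows) // parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def parallel_count(rows, predicate, workers=4):
    """Count each slice in its own worker, then add the partial counts ('reduce')."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: count_matching(chunk, predicate),
                            split(rows, workers))
        return sum(partials)

rows = [f"row{i}" for i in range(1000)]        # made-up row keys
ends_with_7 = lambda key: key.endswith("7")

print(count_matching(rows, ends_with_7))   # one sequential pass
print(parallel_count(rows, ends_with_7))   # same answer from four slices
```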