Hue Error – DatabaseError: database is locked

You may see this error in Hue while using Impala or Hive. It is caused by lock contention in Hue's backend database, which stores Hue's own metadata and query history. By default Hue uses SQLite, which is not suitable for multi-user environments: SQLite allows only one writer at a time, so concurrent requests can fail with this error.
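
The locking behaviour can be reproduced outside Hue with a minimal sketch using Python's built-in sqlite3 module. This is not Hue's code; the table name `history`, the function name `demonstrate_lock` and the temporary file are assumptions made only for illustration. One connection holds a write transaction open while a second connection tries to write to the same file.

```python
import os
import sqlite3
import tempfile


def demonstrate_lock() -> str:
    """Return the error a second writer gets while a first writer holds the lock."""
    # Hypothetical throwaway database file, not Hue's real database.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
    # timeout=0: a blocked writer fails immediately instead of waiting.
    writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
    writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
    writer_a.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, query TEXT)")

    # The first writer opens a write transaction and keeps it open.
    writer_a.execute("BEGIN IMMEDIATE")
    writer_a.execute("INSERT INTO history (query) VALUES ('SELECT 1')")

    try:
        # The second writer cannot obtain the write lock on the same file.
        writer_b.execute("INSERT INTO history (query) VALUES ('SELECT 2')")
        message = "no error"
    except sqlite3.OperationalError as exc:
        message = str(exc)
    finally:
        writer_a.execute("COMMIT")
        writer_a.close()
        writer_b.close()
    return message


if __name__ == "__main__":
    print(demonstrate_lock())
```

With several Hue users running queries at the same time, the same kind of contention can occur on Hue's SQLite file, which is why a client/server database is the better choice for shared installations.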

We can resolve this by configuring Hue to use MySQL, PostgreSQL or Oracle as its backend database instead of SQLite.

ORA-01045: user name lacks CREATE SESSION privilege; logon denied

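After creating a user in an Oracle database, I tried to log in using SQL Developer and got the error “ORA-01045: user name lacks CREATE SESSION privilege; logon denied”.

The reason for this error was insufficient privileges: a newly created Oracle user has no privileges by default, not even the one needed to connect.

I solved this issue by granting the following privilege from an account that is allowed to grant it.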

grant create session to "<user-name>";

Hadoop Distributions

Below are companies and projects that offer commercial implementations of, or provide support for, Apache Hadoop, which is the base for all of them.

  • Cloudera offers CDH (Cloudera’s Distribution including Apache Hadoop) and Cloudera Enterprise.
  • Hortonworks (formed by Yahoo and Benchmark Capital), whose focus is on making Hadoop more robust and easier to install, manage and use for enterprise users. Hortonworks provides Hortonworks Data Platform (HDP).
  • MapR Technologies offers its own distributed filesystem and MapReduce engine as the MapR Distribution for Apache Hadoop.
  • Oracle announced the Big Data Appliance, which integrates Cloudera’s Distribution Including Apache Hadoop (CDH).
  • IBM offers InfoSphere BigInsights based on Hadoop in both a basic and enterprise edition.
  • Greenplum, a division of EMC, offers Hadoop in Community and Enterprise editions.
  • Intel – the Intel Distribution for Apache Hadoop includes the Intel Manager for Apache Hadoop for managing a cluster.
  • Amazon Web Services – Amazon offers a version of Apache Hadoop on its EC2 infrastructure, sold as Amazon Elastic MapReduce.
  • VMware – initiated an open source project and product that make it easy and efficient to deploy and use Hadoop on virtual infrastructure.
  • Bigtop – project for the development of packaging and tests of the Apache Hadoop ecosystem.
  • DataStax – provides a Hadoop product that fully integrates Apache Hadoop with Apache Cassandra and Apache Solr in its DataStax Enterprise platform.
  • Cascading – a popular, feature-rich API for defining and executing complex and fault-tolerant data processing workflows on an Apache Hadoop cluster.
  • Mahout – Apache project using Hadoop to build scalable machine learning algorithms like canopy clustering, k-means and many more.
  • Cloudspace – uses Apache Hadoop to scale client and internal projects on Amazon’s EC2 and bare metal architectures.
  • Datameer – Datameer Analytics Solution (DAS) is a Hadoop-based solution for big data analytics that includes data source integration, storage, an analytics engine and visualization.
  • Data Mine Lab – Developing solutions based on Hadoop, Mahout, HBase and Amazon Web Services.
  • BigDataEdge (Infosys) – an insight-creation product that contains hundreds of components for deriving insights from data.
  • Debian – A Debian package of Apache Hadoop is available.
  • HStreaming – offers real-time stream processing and continuous advanced analytics built into Hadoop, available as free community edition, enterprise edition, and cloud service.
  • Impetus
  • Karmasphere – Distributes Karmasphere Studio for Hadoop, which allows cross-version development and management of Apache Hadoop jobs.
  • Nutch – Apache Nutch, flexible web search engine software.
  • NGDATA – Makes available Lily Open Source that builds upon Hadoop, HBase and SOLR. Distributes Lily Enterprise.
  • Pentaho – Pentaho provides a complete, end-to-end open-source BI and offers an easy-to-use, graphical ETL tool that is integrated with Apache Hadoop for managing data and coordinating Hadoop related tasks in the broader context of ETL and Business Intelligence workflow.
  • Pervasive Software – Provides Pervasive DataRush, a parallel dataflow framework which improves performance of Apache Hadoop and MapReduce jobs by exploiting fine-grained parallelism on multicore servers.
  • Platform Computing – Provides an Enterprise Class MapReduce solution for Big Data Analytics with high scalability and fault tolerance. Platform MapReduce provides unique scheduling capabilities and its architecture is based on almost two decades of distributed computing research and development.
  • Sematext International – Provides consulting services around Apache Hadoop and Apache HBase, along with large-scale search using Apache Lucene, Apache Solr, and Elasticsearch.
  • Talend – Talend Platform for Big Data includes support and management tools for all the major Apache Hadoop distributions. Talend Open Studio for Big Data is an Apache License Eclipse IDE, which provides a set of graphical components for HDFS, HBase, Pig, Sqoop and Hive.
  • Think Big Analytics – Offers expert consulting services specializing in Apache Hadoop, MapReduce and related data processing architectures.
  • Tresata – The financial industry’s first software platform architected from the ground up on Hadoop. Data storage, processing, analytics and visualization are all done on Hadoop.
  • WANdisco is a committed member & sponsor of the Apache Software community and has active committers on several projects including Apache Hadoop.