

This article describes troubleshooting steps and possible resolutions for issues related to ZooKeeper in Azure HDInsight clusters.

The symptoms are as follows. Both ResourceManagers go to standby mode. Spark, Hive, and YARN jobs or Hive queries fail because of ZooKeeper connection failures. LLAP daemons fail to start on secure Spark or secure interactive Hive clusters.

You may see error messages similar to the following in the YARN logs (/var/log/hadoop-yarn/yarn/yarn-yarn*.log on the headnodes):

03:17:07.7924490|Received RMFatalEvent of type STATE_STORE_FENCED, caused by $NoAuthException: KeeperErrorCode = NoAuth
03:17:18.3916720|Lost contact with Zookeeper. Transitioning to standby in 10000 ms if connection is not reestablished.

High availability services like YARN, NameNode, and Livy can go down for many reasons, so confirm from the logs that the failure is related to ZooKeeper connections. Jobs can also fail temporarily due to ZooKeeper connection issues, so make sure that the issue happens repeatedly. Do not use these solutions for one-off cases.

Two conditions commonly accompany these failures. The first is high CPU usage on the ZooKeeper servers: in the Ambari UI, if you see sustained CPU usage near 100% on the ZooKeeper servers, the ZooKeeper sessions open during that time can expire and time out. The second is ZooKeeper clients reporting frequent timeouts: in the logs for ResourceManager, NameNode, and other services, you will see frequent client connection timeouts. This can result in quorum loss, frequent failovers, and other issues.

To check ZooKeeper status, find the ZooKeeper servers from the /etc/hosts file or from the Ambari UI, then run a status command against each server on its client port. Port 2181 is the Apache ZooKeeper instance. Port 2182 is used by the HDInsight ZooKeeper, which provides HA for services that are not natively HA. If the command shows no output, the ZooKeeper servers are not running.
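As an illustration of the status check on ports 2181 and 2182, the following minimal Python sketch sends a ZooKeeper four-letter command (`stat` by default) to a server and treats an empty reply as "not running". The function names `zk_four_letter` and `check_servers` and the host name in the usage comment are hypothetical, introduced only for this example. It also assumes that four-letter commands are enabled on the server; on newer ZooKeeper versions a command may need to be whitelisted, in which case the server answers with a short refusal message rather than statistics.

```python
import socket


def zk_four_letter(host, port, command="stat", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply.

    Returns an empty string when the connection fails or the server
    sends nothing back, which corresponds to the "no output" case.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(command.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection: reply complete
                    break
                chunks.append(data)
    except OSError:  # refused, unreachable, DNS failure, or timeout
        return ""
    return b"".join(chunks).decode("utf-8", errors="replace")


def check_servers(hosts, ports=(2181, 2182)):
    """Map each (host, port) pair to True if the server answered at all."""
    return {
        (host, port): bool(zk_four_letter(host, port).strip())
        for host in hosts
        for port in ports
    }


# Example usage (the host name is a placeholder; take real ones from
# /etc/hosts or the Ambari UI):
#   for target, alive in check_servers(["zk0-example"]).items():
#       print(target, "responding" if alive else "no output")
```

A `False` entry for a port only says that nothing answered there; it does not distinguish a stopped ZooKeeper process from a network problem between the two hosts.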

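The advice to confirm from the logs that a failure is ZooKeeper-related and that it happens repeatedly can be sketched as a small log scan. This is a rough illustration, not a diagnostic tool: the patterns are taken from the sample log lines above, while the names `count_zk_errors` and `is_recurring` and the `min_hits` threshold are assumptions made for the example.

```python
import re
from collections import Counter

# Message fragments taken from the sample YARN log lines in this article.
ZK_PATTERNS = {
    "lost_contact": re.compile(r"Lost contact with Zookeeper"),
    "state_store_fenced": re.compile(r"STATE_STORE_FENCED"),
    "keeper_error": re.compile(r"KeeperErrorCode\s*=\s*\w+"),
}


def count_zk_errors(lines):
    """Count how many log lines match each ZooKeeper-related pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in ZK_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def is_recurring(counts, min_hits=3):
    """Return True if any single pattern occurs at least min_hits times.

    min_hits is an arbitrary illustrative threshold, not a documented value.
    """
    return any(n >= min_hits for n in counts.values())


# Example usage on a headnode (path pattern as given in this article):
#   import glob
#   for path in glob.glob("/var/log/hadoop-yarn/yarn/yarn-yarn*.log"):
#       with open(path, errors="replace") as fh:
#           counts = count_zk_errors(fh)
#       print(path, dict(counts), is_recurring(counts))
```

Counting per pattern rather than per line avoids double-counting a single line that matches two patterns, such as the `STATE_STORE_FENCED` message that also carries a `KeeperErrorCode`.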