Skip to content

HuyNguyenDinh/hadoop-cluster-ansible

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Add id_rsa.pub to roles/ansible_user/files

Before running playbook:
- I choose to setup postgres 14 on slave01 (if changed to another please change in roles/spark/vars/main.yml)

File hosts following by
[master]
master-node ansible_host=172.18.0.4

[slaves]
slave01 ansible_host=172.18.0.3
slave02 ansible_host=172.18.0.5
slave03 ansible_host=172.18.0.2

[all:vars]
ansible_python_interpreter=/usr/bin/python3

After run playbook follow below steps to start services

-------------------------
Hadoop manual config
Step 1: Format namenode

Step 2: sudo su -> cd hadoop_dir/sbin -> run ./start-all.sh

--------------------------
Spark manual config
Step 1: cd to hadoop_dir/sbin

Step 2: sudo -u hdfs ./hdfs dfs -mkdir -p /user/hive/warehouse (in roles/spark/vars/main.yml - hive_warehouse_in_hdfs_dir)

Step 4: sudo -u hdfs ./hdfs dfs -chown -R hive:hadoop /user/hive (in roles/spark/vars/main.yml - hive_metastore_warehouse_dir)

Step 3: sudo -u hdfs ./hdfs dfs -chmod -R 770 /user/hive

Step 4: cd to spark_dir/sbin

Step 5: su hive -> run start-hive.sh

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published