hdfs: add ha to Httpfs ha #859

Open

wants to merge 7 commits into base: master

Changes from 1 commit
feat(spark2|3): add ha support for spark-hs
mehdibn committed Jul 17, 2024
commit 8607c1d65e2bac814335f9c98f239f6fc9ebbe88
1 change: 0 additions & 1 deletion playbooks/spark3_kerberos_install.yml
@@ -13,7 +13,6 @@
name: tosit.tdp.spark.historyserver
tasks_from: kerberos
- ansible.builtin.meta: clear_facts # noqa unnamed-task

- name: Spark3 Kerberos Client install
hosts: spark3_client
strategy: linear
6 changes: 4 additions & 2 deletions tdp_vars_defaults/knox/knox.yml
@@ -157,11 +157,13 @@ tdpldap_services:
     location: /ws
     port: "{{ yarn_rm_https_port }}"
   SPARKHISTORYUI:
-    hosts: "{{ groups['spark_hs'] | default([]) | map('tosit.tdp.access_fqdn', hostvars) | list }}"
+    hosts: "{% if spark2_hs_ha_address is defined %}{{ spark2_hs_ha_address | urlsplit('hostname') | split(' ') | list }}{% else %}{{ groups['spark_hs'] | default([]) | map('tosit.tdp.access_fqdn', hostvars) | list }}{% endif %}"
     port: "{{ spark_hs_https_port }}"
+    scheme: "{% if spark2_hs_ha_address is defined %}{{ spark2_hs_ha_address | urlsplit('scheme') }}://{% endif %}"
   SPARK3HISTORYUI:
-    hosts: "{{ groups['spark3_hs'] | default([]) | map('tosit.tdp.access_fqdn', hostvars) | list }}"
+    hosts: "{% if spark3_hs_ha_address is defined %}{{ spark3_hs_ha_address | urlsplit('hostname') | split(' ') | list }}{% else %}{{ groups['spark3_hs'] | default([]) | map('tosit.tdp.access_fqdn', hostvars) | list }}{% endif %}"
     port: "{{ spark3_hs_https_port}}"
+    scheme: "{% if spark3_hs_ha_address is defined %}{{ spark3_hs_ha_address | urlsplit('scheme') }}://{% endif %}"
   WEBHBASE:
     hosts: "{{ groups['hbase_rest'] | default([]) | map('tosit.tdp.access_fqdn', hostvars) | list }}"
     port: "{{ hbase_rest_client_port }}"
2 changes: 1 addition & 1 deletion tdp_vars_defaults/spark/spark.yml
@@ -70,7 +70,7 @@ spark_truststore_location: /etc/ssl/certs/truststore.jks
 spark_truststore_password: Truststore123!
 
 # Spark History Server kerberos
-spark_ui_spnego_principal: "HTTP/{{ ansible_fqdn }}@{{ realm }}"
+spark_ui_spnego_principal: "*"
 spark_ui_spnego_keytab: /etc/security/keytabs/spnego.service.keytab
 
 # spark-defaults.conf - common
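The wildcard follows the Hadoop SPNEGO convention, where a kerberos principal of "*" makes the authentication handler load every HTTP/<fqdn> key found in the keytab, so the same value works on whichever host currently answers for the HA alias. A minimal before/after sketch, assuming the variable keeps feeding the history server's SPNEGO settings as it did before this change (FQDN and realm below are illustrative):

  # Before: the principal was pinned to the host rendering the template
  #   spark_ui_spnego_principal: "HTTP/master2.example.com@EXAMPLE.COM"   # illustrative rendering
  # After: any HTTP/* entry present in spnego.service.keytab can be used
  spark_ui_spnego_principal: "*"
  spark_ui_spnego_keytab: /etc/security/keytabs/spnego.service.keytab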
2 changes: 1 addition & 1 deletion tdp_vars_defaults/spark3/spark3.yml
@@ -71,7 +71,7 @@ hadoop_credentials_properties:
     value: '{{ spark_keystore_password }}'
 
 # Spark History Server kerberos
-spark_ui_spnego_principal: "HTTP/{{ ansible_fqdn }}@{{ realm }}"
+spark_ui_spnego_principal: "*"
 spark_ui_spnego_keytab: /etc/security/keytabs/spnego.service.keytab
 
 # spark-defaults.conf - common
2 changes: 2 additions & 0 deletions tdp_vars_defaults/tdp-cluster/tdp-cluster.yml
@@ -252,3 +252,5 @@ ldap:
 #############################
 
 # ranger_ha_address: "http[s]://dns_alias:port"
+# spark2_hs_ha_address: "http[s]://dns_alias:port"
+# spark3_hs_ha_address: "http[s]://dns_alias:port"
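In a cluster's own tdp_vars these toggles would simply be uncommented and pointed at the alias in front of the history servers, for example (hostnames and ports are placeholders, not part of the PR):

  spark2_hs_ha_address: "https://spark-hs.cluster.example:18081"
  spark3_hs_ha_address: "https://spark3-hs.cluster.example:18081"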
5 changes: 2 additions & 3 deletions topology.ini
@@ -97,12 +97,14 @@ master3
 edge
 
 [spark_hs:children]
 master2
+master3
 
 [spark_client:children]
 edge
 
 [spark3_hs:children]
 master2
+master3
 
 [spark3_client:children]
@@ -111,9 +113,6 @@ edge
 [knox:children]
 edge
 
-[spnego_ha:children]
-ranger_admin
-
 # Section Postgresql_client from tdp_prerequisites
 [postgresql_client:children]
 ranger_admin

Unchanged files with check annotations

Check warning on line 11 in playbooks/utils/hdfs_user_homes.yml (GitHub Actions / ansible-lint): name[missing]: All tasks should be named.

  tasks:
    - tosit.tdp.resolve: # noqa unnamed-task
        node_name: hdfs_client
    - ansible.builtin.import_role:
        name: tosit.tdp.utils.hdfs_user_homes
    - ansible.builtin.meta: clear_facts # noqa unnamed-task
Check warning on line 11 in playbooks/utils/ranger_policies.yml (GitHub Actions / ansible-lint): name[missing]: All tasks should be named.

  tasks:
    - tosit.tdp.resolve: # noqa unnamed-task
        node_name: ranger_admin
    - ansible.builtin.import_role:
        name: tosit.tdp.utils.ranger_policies
    - ansible.builtin.meta: clear_facts # noqa unnamed-task
Check warning on line 10 in playbooks/utils/yarn_capacity_scheduler.yml (GitHub Actions / ansible-lint): name[missing]: All tasks should be named.

  tasks:
    - tosit.tdp.resolve: # noqa unnamed-task
        node_name: yarn_resourcemanager
    - ansible.builtin.import_role:
        name: tosit.tdp.utils.yarn_capacity_scheduler
    - ansible.builtin.meta: clear_facts # noqa unnamed-task
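These name[missing] findings predate this PR; a minimal sketch of how such tasks are usually named to satisfy the rule (the task names below are hypothetical):

  tasks:
    - name: Resolve yarn_resourcemanager variables   # hypothetical name; any descriptive string silences name[missing]
      tosit.tdp.resolve:
        node_name: yarn_resourcemanager
    - name: Apply the yarn_capacity_scheduler utility role
      ansible.builtin.import_role:
        name: tosit.tdp.utils.yarn_capacity_scheduler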
Check warning on line 39 in roles/hbase/ranger/tasks/config.yml (GitHub Actions / ansible-lint): no-changed-when: Commands should not change things if nothing needs doing.

      when: enable_ranger_audit_log4j
    - name: Run enable-hbase-plugin.sh
      ansible.builtin.shell: |
        export JAVA_HOME={{ java_home }}
        ./enable-hbase-plugin.sh
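This finding also predates the PR; no-changed-when is typically satisfied by telling Ansible when the command actually changes something, for example with a creates guard (the marker path below is a hypothetical file produced by the script) or a changed_when condition on a registered result:

    - name: Run enable-hbase-plugin.sh
      ansible.builtin.shell: |
        export JAVA_HOME={{ java_home }}
        ./enable-hbase-plugin.sh
      args:
        creates: /etc/hbase/conf/ranger-hbase-security.xml  # hypothetical marker file; skip the script once it exists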
Check warning on line 15 in roles/hdfs/check/tasks/main.yml (GitHub Actions / ansible-lint): risky-shell-pipe: Shells that use pipes should set the pipefail option.

  run_once: true
  become_user: "{{ hdfs_user }}"
  block:
    - name: HDFS service check - RPC check put file
      ansible.builtin.shell: echo "HDFS Service Check" | hdfs dfs -put - {{ hdfs_check_path_file }}
      register: hdfs_put_file
      changed_when: false
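Also pre-existing; the usual risky-shell-pipe remediation is to enable pipefail and run the snippet under bash, sketched here on the task above:

    - name: HDFS service check - RPC check put file
      ansible.builtin.shell: |
        set -o pipefail
        echo "HDFS Service Check" | hdfs dfs -put - {{ hdfs_check_path_file }}
      args:
        executable: /bin/bash
      register: hdfs_put_file
      changed_when: false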
Check warning on line 53 in roles/hdfs/check/tasks/main.yml (GitHub Actions / ansible-lint): risky-shell-pipe: Shells that use pipes should set the pipefail option.

  run_once: true
  become_user: "{{ hdfs_user }}"
  block:
    - name: HDFS service check - Get active namenode host
      ansible.builtin.shell: hdfs haadmin -getAllServiceState | grep 'active' | cut -d':' -f 1
      register: webhdfs_nn_host
      changed_when: false
Check warning on line 51 in roles/hdfs/namenode/tasks/check.yml (GitHub Actions / ansible-lint): risky-shell-pipe: Shells that use pipes should set the pipefail option.

  - name: HDFS namenode component check - Check namenode safemode & state
    become_user: "{{ hdfs_user }}"
    block:
      - name: HDFS namenode component check - Check nn safemode
        ansible.builtin.shell: hdfs dfsadmin -safemode get | grep "{{ ansible_hostname }}"
        register: nn_safemode_res
        retries: "{{ hdfs_check_retries }}"
Check warning on line 5 in roles/hdfs/namenode/tasks/formatzk.yml (GitHub Actions / ansible-lint): risky-shell-pipe: Shells that use pipes should set the pipefail option.

  # SPDX-License-Identifier: Apache-2.0
  ---
  - name: Format Zookeeper
    run_once: true
    become: true
    register: format_zk
Check warning on line 48 in roles/hdfs/ranger/tasks/config.yml (GitHub Actions / ansible-lint): no-changed-when: Commands should not change things if nothing needs doing.

      when: enable_ranger_audit_log4j
    - name: Run enable-hdfs-plugin.sh
      ansible.builtin.shell: |
        export JAVA_HOME={{ java_home }}
        ./enable-hdfs-plugin.sh
Check warning on line 22 in roles/hive/common/tasks/init_schema.yml (GitHub Actions / ansible-lint): no-changed-when: Commands should not change things if nothing needs doing.

        hive_validate.rc > 1 or
        (hive_validate.rc == 1 and 'Failed to get schema version' not in hive_validate.stderr)
    - name: Hive Metastore initSchema
      ansible.builtin.command: >-
        {{ hive_install_dir }}/bin/hive \
        --config {{ hive_ms_conf_dir }} \
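Pre-existing as well; since this command only runs when the preceding schema validation fails, a common way to satisfy no-changed-when here would be an explicit changed_when on the task (the command body is truncated in this excerpt and left as-is):

    - name: Hive Metastore initSchema
      ansible.builtin.command: >-
        {{ hive_install_dir }}/bin/hive \
        --config {{ hive_ms_conf_dir }} \
      # remaining arguments unchanged from the existing task (truncated above)
      changed_when: true  # initSchema modifies the metastore database whenever it is allowed to run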