[Bug]: certificate-based ssh authentication doesn't work #972

Open
rnwolfe opened this issue Jan 30, 2025 · 0 comments · May be fixed by #973
Labels
bug Something isn't working

rnwolfe commented Jan 30, 2025

Suzieq version

0.23.0

Install Type

container

Python version

3.8

Impacted component

sq-poller

Steps to Reproduce

My environment has Linux nodes that only accept certificate-based authentication for SSH login. The sq-poller fails to log in to them.

Use a certificate keypair for certificate-based authentication to a linux node:

root@suzieq-7fcbfbf9cf-7jnq5:/home/suzieq# ls -la /suzieq/tls/gdc1/
total 0
drwxrwxrwt. 3 root root 160 Jan 30 19:03 .
drwxrwxrwt. 6 root root 200 Jan 30 19:03 ..
drwxr-xr-x. 2 root root 120 Jan 30 19:03 ..2025_01_30_19_03_22.2683186022
lrwxrwxrwx. 1 root root  32 Jan 30 19:03 ..data -> ..2025_01_30_19_03_22.2683186022
lrwxrwxrwx. 1 root root  13 Jan 30 19:03 id_rsa -> ..data/id_rsa
lrwxrwxrwx. 1 root root  22 Jan 30 19:03 id_rsa-cert.pub -> ..data/id_rsa-cert.pub
lrwxrwxrwx. 1 root root  17 Jan 30 19:03 id_rsa.pub -> ..data/id_rsa.pub
lrwxrwxrwx. 1 root root  18 Jan 30 19:03 known_hosts -> ..data/known_hosts

and a basic inventory:

sources:
  - name: test-linux-node
    hosts:
      - url: ssh://[email protected]

devices:
  - name: ignore-known-hosts
    ignore-known-hosts: true
    port: 22
    transport: ssh

auths:
  - name: test-linux-root
    keyfile: /home/suzieq/gdc1/id_rsa
namespaces:
  - name: test-linux-node
    device: ignore-known-hosts
    auth: test-linux-root
    source: test-linux-node

Expected Behavior

I expected the keyfile given in the auths block to automatically pick up the associated -cert.pub file (OpenSSH does this by default when you run ssh -i /suzieq/tls/gdc1/id_rsa [email protected]).
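As a minimal sketch of the lookup described above: OpenSSH pairs a private key with a certificate stored under the same name plus "-cert.pub". The helper name find_companion_cert is hypothetical and is not part of suzieq or asyncssh.

```python
from pathlib import Path
from typing import Optional


def find_companion_cert(keyfile: str) -> Optional[Path]:
    """Return the certificate that sits next to keyfile, or None.

    Hypothetical helper: it only mirrors the OpenSSH naming convention
    of appending "-cert.pub" to the private key path.
    """
    cert = Path(keyfile + "-cert.pub")
    # is_file() follows symlinks, so mounted secrets that are symlinked
    # into place are still detected.
    return cert if cert.is_file() else None
```

For the directory listing above, find_companion_cert("/suzieq/tls/gdc1/id_rsa") would point at /suzieq/tls/gdc1/id_rsa-cert.pub.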

Working output from sq-poller:

[WORKER 0]: 2025-01-30 20:36:53,988 - asyncssh - INFO - Opening SSH connection to 172.28.184.3, port 22
[WORKER 0]: 2025-01-30 20:36:53,990 - asyncssh - INFO - [conn=0] Connected to SSH server at 172.28.184.3, port 22
[WORKER 0]: 2025-01-30 20:36:53,991 - asyncssh - INFO - [conn=0]   Local address: 10.56.21.193, port 45272
[WORKER 0]: 2025-01-30 20:36:53,991 - asyncssh - INFO - [conn=0]   Peer address: 172.28.184.3, port 22
[WORKER 0]: 2025-01-30 20:36:53,991 - asyncssh - DEBUG - [conn=0] Sending version SSH-2.0-AsyncSSH_2.14.2
[WORKER 0]: 2025-01-30 20:36:54,087 - asyncssh - DEBUG - [conn=0] Received version SSH-2.0-OpenSSH_8.0
[WORKER 0]: 2025-01-30 20:36:54,088 - asyncssh - DEBUG - [conn=0] Requesting key exchange
[WORKER 0]: 2025-01-30 20:36:54,103 - asyncssh - DEBUG - [conn=0] Received key exchange request
[WORKER 0]: 2025-01-30 20:36:54,104 - asyncssh - DEBUG - [conn=0] Beginning key exchange
[WORKER 0]: 2025-01-30 20:36:54,114 - asyncssh - DEBUG - [conn=0] Completed key exchange
[WORKER 0]: 2025-01-30 20:36:54,115 - asyncssh - INFO - [conn=0] Beginning auth for user root
[WORKER 0]: 2025-01-30 20:36:54,123 - asyncssh - DEBUG - [conn=0] Trying public key auth with [email protected] key
[WORKER 0]: 2025-01-30 20:36:54,131 - asyncssh - DEBUG - [conn=0] Trying public key auth with [email protected] key
[WORKER 0]: 2025-01-30 20:36:54,135 - asyncssh - DEBUG - [conn=0] Signing request with [email protected] key
[WORKER 0]: 2025-01-30 20:36:54,153 - asyncssh - INFO - [conn=0] Auth for user root succeeded
[WORKER 0]: 2025-01-30 20:36:54,153 - suzieq.poller.worker.nodes.node - INFO - Connected to 172.28.184.3:22 at 1738269414.1535795
[WORKER 0]: 2025-01-30 20:36:54,153 - suzieq.poller.worker.nodes.node - INFO - Connection succeeded via SSH for 172.28.184.3
[WORKER 0]: 2025-01-30 20:36:54,154 - asyncssh - DEBUG - [conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
[WORKER 0]: 2025-01-30 20:36:54,154 - asyncssh - INFO - [conn=0, chan=0] Requesting new SSH session

Observed Behavior

When running the poller, I see:

[WORKER 0]: 2025-01-29 22:10:46,792 - asyncssh - DEBUG - [conn=1] Sending version SSH-2.0-AsyncSSH_2.14.2
[WORKER 0]: 2025-01-29 22:10:46,947 - asyncssh - DEBUG - [conn=1] Received version SSH-2.0-OpenSSH_8.0
[WORKER 0]: 2025-01-29 22:10:46,948 - asyncssh - DEBUG - [conn=1] Requesting key exchange
[WORKER 0]: 2025-01-29 22:10:46,951 - asyncssh - DEBUG - [conn=1] Received key exchange request
[WORKER 0]: 2025-01-29 22:10:46,951 - asyncssh - DEBUG - [conn=1] Beginning key exchange
[WORKER 0]: 2025-01-29 22:10:46,961 - asyncssh - DEBUG - [conn=1] Completed key exchange
[WORKER 0]: 2025-01-29 22:10:46,962 - asyncssh - INFO - [conn=1] Beginning auth for user root
[WORKER 0]: 2025-01-29 22:10:46,969 - asyncssh - DEBUG - [conn=1] Trying public key auth with rsa-sha2-256 key
[WORKER 0]: 2025-01-29 22:10:47,704 - asyncssh - INFO - [conn=1] Auth failed for user root
[WORKER 0]: 2025-01-29 22:10:47,705 - asyncssh - INFO - [conn=1] Connection failure: Permission denied for user root on host 72.28.24.67
[WORKER 0]: 2025-01-29 22:10:47,705 - asyncssh - INFO - [conn=1] Aborting connection
[WORKER 0]: 2025-01-29 22:10:47,705 - suzieq.poller.worker.nodes.node - ERROR - Authentication failed to 172.28.184.3 Not retrying to avoid locking out user. Please restart poller with proper authentication.: Permission denied for user root on host 172.28.184.3

The line "Trying public key auth with rsa-sha2-256 key" indicates that the poller is offering the plain id_rsa key and never offers the id_rsa-cert.pub certificate.

What we should see is this:

Opening SSH connection to 172.28.24.67, port 22
[conn=0] Connected to SSH server at 172.28.24.67, port 22
[conn=0]   Local address: 10.51.16.156, port 50330
[conn=0]   Peer address: 172.28.24.67, port 22
[conn=0] Sending version SSH-2.0-AsyncSSH_2.14.2
[conn=0] Received version SSH-2.0-OpenSSH_8.0
[conn=0] Requesting key exchange
[conn=0] Received key exchange request
[conn=0] Beginning key exchange
[conn=0] Completed key exchange
[conn=0] Beginning auth for user root
[conn=0] Trying public key auth with [email protected] key
[conn=0] Trying public key auth with [email protected] key
[conn=0] Signing request with [email protected] key
[conn=0] Auth for user root succeeded
[conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
[conn=0, chan=0] Requesting new SSH session
[conn=0] Received unknown global request: [email protected]
[conn=0] Received debug message: cert: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
[conn=0] Received debug message: cert: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
[conn=0, chan=0]   Command: echo "Hello, world!"
[conn=0, chan=0] Received exit status 0
[conn=0, chan=0] Received channel close
[conn=0, chan=0] Channel closed
[conn=0] Closing connection
[conn=0] Sending disconnect: Disconnected by application (11)
[conn=0] Connection closed

This output was produced by the following test script, run from the same container:

import asyncssh, asyncio
import logging

logging.basicConfig(level=logging.DEBUG)

pvtkey = '/suzieq/tls/gdc1/id_rsa'
async def connect():
    options = asyncssh.SSHClientConnectionOptions(
        connect_timeout=10,
        username='root',
        agent_identities=pvtkey,
        client_keys=pvtkey,
        client_key_passphrase=None,
        password=None,
        known_hosts=None,
        kex_algs='+diffie-hellman-group1-sha1',  # for older boxes
        encryption_algs='+aes256-cbc',           # for older boxes
    )

    try:
        async with asyncssh.connect('172.28.184.3', options=options) as conn:
            # Perform operations with the connection
            result = await conn.run('echo "Hello, world!"', check=True)
            print(result.stdout, end='')
    except (asyncssh.Error, OSError) as exc:
        logging.error('SSH connection failed: %s', exc)

asyncio.run(connect())

Note that setting keyfile to id_rsa-cert.pub doesn't work either, because the _decrypt_pvtkey() function expects a private key. If the cert is provided, loading it as a private key results in an error:

def _decrypt_pvtkey(self, pvtkey_file: str, passphrase: str) -> str:
    """Decrypt private key file"""
    keydata: str = None
    if pvtkey_file:
        try:
            keydata = asyncssh.public_key.read_private_key(pvtkey_file,
                                                           passphrase)
        except Exception as e:  # pylint: disable=broad-except
            self.logger.error(
                f"ERROR: Unable to read private key file {pvtkey_file}"
                f"for jump host due to {e}")
    return keydata

Which is called from here:

if pvtkey_file:
    self.jump_host_key = self._decrypt_pvtkey(pvtkey_file,
                                              passphrase)
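Since the keyfile is always read as a private key, a certificate path ends up in the generic error branch. A possible pre-check, shown here only as a sketch, could tell the two apart by the type field that OpenSSH writes at the start of a certificate file. The function name looks_like_ssh_certificate is hypothetical.

```python
def looks_like_ssh_certificate(path: str) -> bool:
    """Heuristically detect an OpenSSH certificate file (hypothetical helper)."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline()
    fields = first_line.split()
    # Certificate files start with a type such as
    # "ssh-rsa-cert-v01@openssh.com"; PEM/OpenSSH private keys start
    # with a "-----BEGIN ..." header instead.
    return bool(fields) and fields[0].endswith("-cert-v01@openssh.com")
```

With a check like this, the poller could report that keyfile points at a certificate instead of failing with a private key parse error.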

As a direct test, I adjusted

        options = asyncssh.SSHClientConnectionOptions(
            connect_timeout=self.connect_timeout,
            username=self.username,
            agent_identities=self.pvtkey if self.pvtkey else None,
            client_keys=self.pvtkey if self.pvtkey else None,
            password=self.password if not self.pvtkey else None,
            kex_algs='+diffie-hellman-group1-sha1',  # for older boxes
            encryption_algs='+aes256-cbc',           # for older boxes
        )
to

        options = asyncssh.SSHClientConnectionOptions(
            connect_timeout=self.connect_timeout,
            username=self.username,
            agent_identities=self.pvtkey if self.pvtkey else None,
            # previous: client_keys=self.pvtkey if self.pvtkey else None,
            client_keys=["/suzieq/tls/gdc1/id_rsa"], # could also be: client_keys="/suzieq/tls/gdc1/id_rsa",
            password=self.password if not self.pvtkey else None,
            kex_algs='+diffie-hellman-group1-sha1',  # for older boxes
            encryption_algs='+aes256-cbc',           # for older boxes
        )

With this change, the connection succeeds:

root>devices show
1   test-linux-node    ah-aa-base02                               server  Linux 8.6 (Green Obsidian)  Rocky                   alive    172.28.184.3 2025-01-16 15:01:24+00:00
root>

This appears to work because asyncssh automatically looks for a certificate at keypath + "-cert.pub" when client_keys is given a file path instead of a key object.

So it seems suzieq passes an asyncssh.SSHKey object in self.pvtkey. Since I am using cert-based auth, this breaks the auth flow. When client_keys is given a file path, asyncssh natively looks for an associated certificate at filepath + "-cert.pub", which is standard ssh behavior. So when I hardcode the file path as a string, it finds the cert and authenticates.
Because suzieq calls _decrypt_pvtkey() before this, self.pvtkey becomes an SSHKey and certs no longer work. Some handling needs to change to support this.
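One possible direction, sketched under assumptions and not necessarily what the linked pull request does: keep decrypting the private key up front, but also load the neighbouring certificate and hand asyncssh a (key, certificate) pair in client_keys. The function build_client_keys and its read_key / read_cert parameters are hypothetical; the parameters exist only so the logic can be exercised without a real key, and they default to asyncssh.read_private_key and asyncssh.read_certificate.

```python
from pathlib import Path
from typing import Any, Callable, List, Optional


def build_client_keys(
    keyfile: str,
    passphrase: Optional[str] = None,
    read_key: Optional[Callable[..., Any]] = None,
    read_cert: Optional[Callable[[str], Any]] = None,
) -> List[Any]:
    """Build a client_keys value that keeps the certificate attached.

    Hypothetical helper, not suzieq code.
    """
    if read_key is None or read_cert is None:
        # Imported lazily so the helper can be tested with stand-in loaders.
        import asyncssh
        read_key = read_key or asyncssh.read_private_key
        read_cert = read_cert or asyncssh.read_certificate

    key = read_key(keyfile, passphrase)
    certfile = Path(keyfile + "-cert.pub")
    if certfile.is_file():
        # A (key, certificate) tuple lets asyncssh offer the certificate
        # even though the key was loaded as an object, not a path.
        return [(key, read_cert(str(certfile)))]
    # No certificate next to the key: plain public key auth as before.
    return [key]
```

The result would then be passed as client_keys=build_client_keys(pvtkey_file, passphrase) when constructing SSHClientConnectionOptions, which keeps passphrase-protected keys working while restoring the certificate lookup.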

Screenshots

Additional Context

Relevant thread in slack: https://netenglabs.slack.com/archives/C015TD9DR8U/p1738176065631799

@rnwolfe rnwolfe added the bug Something isn't working label Jan 30, 2025
@rnwolfe rnwolfe linked a pull request Jan 30, 2025 that will close this issue