[JENKINS-75011] Use Apache Mina as ssh transport layer, remove trilead #1022

Merged

Contributor

@jmdesprez jmdesprez commented Dec 17, 2024

PR for JENKINS-75011

The main changes are:

  • Create an SshClient wherever a connection is needed
  • Replace Connection with ClientSession
  • Replace SCPClient with CloseableScpClient
  • Use try-with-resources for Closeable resources
  • Add ProxyCONNECTListener for the HTTP proxy CONNECT
  • Add PEMParser because Mina requires a KeyPair instead of a char[]

Implementation note

I focused on migrating from trilead to mina. I didn't refactor because I didn't want to mix too many things up.

Testing done

I created this in draft mode because I am still testing manually.
Manual testing has been done with the help of @Priya-CB and @mikecirioli.

What is tested:

  • Connect to a Unix AMI
    • Using external SSH process
    • Using Mina
      • With a proxy
      • Without a proxy

What is not tested:

  • Connect to a Mac AMI
  • Connect to a Windows AMI

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

public final String targetHost;
public final int targetPort;
public final String proxyUser;
public final String proxyPass;

Check warning

Code scanning / Jenkins Security Scan

Jenkins: Plaintext password storage Warning

Field should be reviewed whether it stores a password and is serialized to disk: proxyPass
@@ -39,6 +49,8 @@
public abstract class EC2ComputerLauncher extends ComputerLauncher {
private static final Logger LOGGER = Logger.getLogger(EC2ComputerLauncher.class.getName());

private static final long timeout = Duration.ofSeconds(10).toMillis();

Nit: make the constant all-caps

I didn't want to publish the comment until the PR was marked as ready for review; I am still learning GitHub.

Contributor Author

No problem 🙂
This should use the timeout from the node anyway: 7680d4e

computer,
listener,
"SSH service responded. Waiting " + bootDelay + "ms for service to stabilize");
Thread.sleep(bootDelay);

Do we need to check the max delay time so that it is always a realistic value?

Contributor Author

Although this suggestion may be worth considering (I leave it to the discretion of the maintainer), I have kept the previous implementation: https://github.com/jmdesprez/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/ssh/EC2MacLauncher.java#L186
If requested, I can add this to this PR.
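
For reference, the bound the reviewer asks about could look roughly like the sketch below. It assumes `bootDelay` is in milliseconds (as the log message suggests) and uses a hypothetical cap, `MAX_BOOT_DELAY_MS`, which is not a constant from the plugin.

```java
import java.time.Duration;

final class BootDelay {
    // Hypothetical upper bound; the plugin does not define this constant.
    static final long MAX_BOOT_DELAY_MS = Duration.ofMinutes(10).toMillis();

    /** Clamps a configured delay into the range [0, MAX_BOOT_DELAY_MS]. */
    static long clamp(long configuredMs) {
        return Math.max(0L, Math.min(configuredMs, MAX_BOOT_DELAY_MS));
    }
}
```

The caller would then sleep for `BootDelay.clamp(bootDelay)` instead of the raw value, so a misconfigured delay cannot stall the launch for an unrealistic time.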

Comment on lines 240 to 241
if (initScript != null
&& !initScript.trim().isEmpty()
@pankajy-dev pankajy-dev Jan 20, 2025

Suggested change
- if (initScript != null
- && !initScript.trim().isEmpty()
+ if (StringUtils.isEmpty(initScript)

Since the project already uses Apache Commons, we can utilize the utility method.

I guess you mean

Suggested change
- if (initScript != null
- && !initScript.trim().isEmpty()
+ if (StringUtils.isNotEmpty(initScript)

Contributor Author

This is not related to the trilead migration, but I have included it as it is a minor change: fe9a901
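
One detail worth checking in this thread: the original condition trims before testing, so a whitespace-only script counts as absent. The sketch below compares it with local stand-ins written from the documented semantics of `StringUtils.isNotEmpty` and `StringUtils.isNotBlank`. The stand-ins are illustrative helpers, not the Commons Lang calls themselves, and nothing here says which variant the linked commit uses.

```java
final class InitScriptCheck {
    /** The check as written in the PR: non-null and not whitespace-only. */
    static boolean original(String s) {
        return s != null && !s.trim().isEmpty();
    }

    /** Stand-in for StringUtils.isNotEmpty: non-null and length > 0, without trimming. */
    static boolean likeIsNotEmpty(String s) {
        return s != null && !s.isEmpty();
    }

    /** Stand-in for StringUtils.isNotBlank: non-null and at least one non-whitespace character. */
    static boolean likeIsNotBlank(String s) {
        return s != null && s.chars().anyMatch(c -> !Character.isWhitespace(c));
    }
}
```

For a script consisting only of spaces, `original` and `likeIsNotBlank` return false while `likeIsNotEmpty` returns true, so the not-blank form is the closer match to the existing behavior.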

computer,
clientSession,
javaPath + " -fullversion",
"curl -L -O https://corretto.aws/downloads/latest/amazon-corretto-11-aarch64-macos-jdk.pkg; sudo installer -pkg amazon-corretto-11-aarch64-macos-jdk.pkg -target /",

Can we move https://corretto.aws/downloads/latest/ into a constant, as it is used in two places? Also, would allowing configuration using system properties be a good idea?

Contributor Author

This is not related to the trilead migration, but I have included it as it is a minor change: 9a4fc3b

Contributor Author

allowing configuration using system properties be a good idea

Although this suggestion may be worth considering (I leave it to the discretion of the maintainer), I have kept the previous implementation.
If requested, I can add this to this PR.
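
As a rough illustration of the two ideas in this thread (a shared constant plus an optional system-property override), consider the sketch below. The property name `ec2.corretto.baseUrl` and the class name are hypothetical and are not taken from the plugin; only the base URL and the command shape come from the diff quoted above.

```java
final class CorrettoDownload {
    // Base URL taken from the commands quoted in the review.
    static final String DEFAULT_BASE_URL = "https://corretto.aws/downloads/latest/";
    // Hypothetical property name, for illustration only.
    static final String BASE_URL_PROPERTY = "ec2.corretto.baseUrl";

    /** Returns the override from the system property, or the default when it is unset. */
    static String baseUrl() {
        return System.getProperty(BASE_URL_PROPERTY, DEFAULT_BASE_URL);
    }

    /** Builds the install command for one architecture, e.g. "aarch64" or "x64". */
    static String installCommand(String arch) {
        String pkg = "amazon-corretto-11-" + arch + "-macos-jdk.pkg";
        return "curl -L -O " + baseUrl() + pkg + "; sudo installer -pkg " + pkg + " -target /";
    }
}
```

With the property unset, `installCommand("x64")` reproduces the x64 command string from the diff, so the default behavior would stay unchanged.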

computer,
clientSession,
javaPath + " -fullversion",
"curl -L -O https://corretto.aws/downloads/latest/amazon-corretto-11-x64-macos-jdk.pkg; sudo installer -pkg amazon-corretto-11-x64-macos-jdk.pkg -target /",

Ditto

try {
session.executeRemoteCommand(command, logger, logger, null);
return true;
} catch (IOException e) {

Should we log an exception?

Contributor Author

With trilead, nothing was logged when the command failed, so I used the FINE level: 7ceaeea

This function executes remote commands such as directory creation. Should we consider the WARNING level?

Contributor Author

As that will change the plugin behavior, I would prefer to change that in another PR. Again, if requested, I can add this to this PR.
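
To make the trade-off in this thread concrete, here is a minimal sketch using `java.util.logging`, with the failure level passed in as a parameter. The `RemoteCommand` interface is a hypothetical stand-in for the session call in the diff, not the plugin's API.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class RemoteCommandLogging {
    private static final Logger LOGGER = Logger.getLogger(RemoteCommandLogging.class.getName());

    /** Hypothetical stand-in for the remote call; not the plugin's API. */
    interface RemoteCommand {
        void run() throws IOException;
    }

    /** Returns true on success; on IOException logs at the given level and returns false. */
    static boolean execute(RemoteCommand command, Level failureLevel) {
        try {
            command.run();
            return true;
        } catch (IOException e) {
            // The exception is attached so the stack trace is available at that level.
            LOGGER.log(failureLevel, "Remote command failed", e);
            return false;
        }
    }
}
```

With `Level.FINE` the failure is hidden under the default logging configuration, which matches the previous silent behavior; `Level.WARNING` would surface it in the default Jenkins log.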


final SshClient remotingClient = SshClient.setUpDefaultClient();
final ClientSession remotingSession = connectToSsh(remotingClient, computer, listener, template);
KeyPair key = computer.getCloud().getKeyPair();

Do we need to check the key length? I assume not, but it is worth noting.

Not as part of the Apache Mina migration; that could be part of a FIPS compatibility ticket.

import java.security.MessageDigest;
import java.util.logging.Logger;

- public class HostKeyVerifierImpl implements ServerHostKeyVerifier {
+ public class HostKeyVerifierImpl {

Should we rename the class, since it no longer implements the interface?

Contributor Author

I kept the name as this is a public class.

computer,
listener,
"SSH service responded. Waiting " + bootDelay + "ms for service to stabilize");
Thread.sleep(bootDelay);

Same here. Should we add a check so that it doesn't wait indefinitely?

Contributor Author

Although this suggestion may be worth considering (I leave it to the discretion of the maintainer), I have kept the previous implementation: https://github.com/jmdesprez/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/ssh/EC2UnixLauncher.java#L188
If requested, I can add this to this PR.

} else if ("EC".equalsIgnoreCase(privateKey.getAlgorithm())) {
ECPrivateKeySpec ecPrivateKeySpec = keyFactory.getKeySpec(privateKey, ECPrivateKeySpec.class);
return keyFactory.generatePublic(ecPrivateKeySpec);
} else {

Are we only supporting two algorithms?

Contributor Author

@jmdesprez jmdesprez Jan 20, 2025

I rewrote it in a more robust way. Anyway, I think that this is not used because we get a PrivateKeyInfo.
4e9d26d
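
For readers following the question about supported algorithms, the sketch below shows one standard JDK way to derive a public key from a private key for RSA only. It is not the code from the linked commit. It relies on the private key being an `RSAPrivateCrtKey`, which carries the public exponent. An EC private key spec holds only the private scalar and the curve parameters, so EC would need a different approach that is not shown here.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.interfaces.RSAPrivateCrtKey;
import java.security.spec.RSAPublicKeySpec;

final class PublicKeyDerivation {
    /** Derives the RSA public key from a CRT private key; other key types are rejected. */
    static PublicKey fromRsaPrivateKey(PrivateKey privateKey) throws GeneralSecurityException {
        if (!(privateKey instanceof RSAPrivateCrtKey)) {
            throw new GeneralSecurityException("Unsupported key type: " + privateKey.getAlgorithm());
        }
        RSAPrivateCrtKey crt = (RSAPrivateCrtKey) privateKey;
        // Modulus and public exponent are exactly the fields of an RSA public key.
        RSAPublicKeySpec spec = new RSAPublicKeySpec(crt.getModulus(), crt.getPublicExponent());
        return KeyFactory.getInstance("RSA").generatePublic(spec);
    }
}
```

Rejecting unknown key types explicitly, rather than falling through, makes an unsupported algorithm fail with a clear message.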

@jmdesprez jmdesprez marked this pull request as ready for review January 20, 2025 13:22
computer,
listener,
"SSH service responded. Waiting " + bootDelay + "ms for service to stabilize");
Thread.sleep(bootDelay);

Nit: how about using a non-blocking call such as

jenkins.util.Timer.get().schedule(() -> {}, bootDelay, TimeUnit.SECONDS);

instead?

Contributor Author

Although this suggestion may be worth considering (I leave it to the discretion of the maintainer), I have kept the previous implementation: https://github.com/jmdesprez/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/ssh/EC2MacLauncher.java#L186
If requested, I can add this to this PR.
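
A note on the non-blocking suggestion: scheduling an empty task does not delay the calling thread, so the work that should happen after the wait has to move into the scheduled task. The log message also suggests `bootDelay` is in milliseconds, so the time unit would need to match. The sketch below illustrates the pattern with a plain JDK `ScheduledExecutorService` as a stand-in for the Jenkins timer. The class and method names are hypothetical.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class DelayedContinuation {
    /** Schedules the continuation to run after delayMs without blocking the caller. */
    static ScheduledFuture<?> runAfter(ScheduledExecutorService scheduler, long delayMs, Runnable continuation) {
        return scheduler.schedule(continuation, delayMs, TimeUnit.MILLISECONDS);
    }
}
```

Adopting this in the launcher would mean restructuring the code that currently follows `Thread.sleep(bootDelay)` into a continuation, which is a larger change than swapping one call.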

if (key == null) {
isAuthenticated = false;
} else {
clientSession.addPublicKeyIdentity(KeyHelper.decodeKeyPair(key.getKeyMaterial(), ""));

Can we only have unencrypted key pairs? This won't work if we find a PEMEncryptedKeyPair.

Contributor Author

Why won't this work? If it's related to the empty password, I kept the previous implementation: https://github.com/jmdesprez/ec2-plugin/blob/master/src/main/java/hudson/plugins/ec2/ssh/EC2MacLauncher.java#L410-L411

@PereBueno PereBueno Jan 21, 2025

This comes from my lack of knowledge about how AWS keys work.

IIUC, computer.getCloud().getKeyPair() returns a KeyPair, and then we add the public key to the client session with KeyHelper.decodeKeyPair(key.getKeyMaterial(), "").
KeyHelper checks whether the key is encrypted (https://github.com/jenkinsci/ec2-plugin/pull/1022/files#diff-0a41b9e9dd813646b5762bea12e95c9aca2f70c8387366b1e6e9870add0d0c2aR58), but this method is always called with an empty or null password, so we will only handle unencrypted keys.

If I have followed the code correctly, these SSH credentials are configured for the cloud, and they should be a username plus an SSH private key (I guess obtained with aws sts get-access-key-info).

So, the question is, is this ssh key always unencrypted? AFAIK it is, but I wanted to confirm this.

Contributor Author

Lack of knowledge from my side about how AWS keys work.

Same here

So, the question is, is this ssh key always unencrypted?

Based on the previous implementation, the password is hard-coded to "".
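
If it were useful to fail early with a clear message instead of attempting decryption with an empty password, a heuristic check on the PEM text is one option. The sketch below looks for the two textual markers used by PKCS#8 encrypted keys and by traditional OpenSSL encrypted PEM. The class is hypothetical, and the check is only a heuristic: keys in the newer OpenSSH format carry their encryption inside the encoded body and would not be detected.

```java
final class PemInspection {
    /** Heuristic: true if the PEM text carries a marker of a passphrase-protected key. */
    static boolean looksEncrypted(String pem) {
        if (pem == null) {
            return false;
        }
        return pem.contains("-----BEGIN ENCRYPTED PRIVATE KEY-----") // PKCS#8 encrypted form
                || pem.contains("Proc-Type: 4,ENCRYPTED");           // traditional OpenSSL PEM header
    }
}
```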

final ClientSession remotingSession = connectToSsh(remotingClient, computer, listener, template);
KeyPair key = computer.getCloud().getKeyPair();
if (key != null) {
remotingSession.addPublicKeyIdentity(KeyHelper.decodeKeyPair(key.getKeyMaterial(), ""));

ditto



if (object instanceof PEMEncryptedKeyPair) {
PEMKeyPair decryptedKeyPair = ((PEMEncryptedKeyPair) object)
.decryptKeyPair(new JcePEMDecryptorProviderBuilder().build(password.toCharArray()));
@PereBueno PereBueno Jan 20, 2025

This might throw an NPE when password is null (e.g. https://github.com/jenkinsci/ec2-plugin/pull/1022/files#diff-1d5f872d1bd42e798ad1690b14df1b4a4f79926b27a7be424148228e9073ac5aR76). Let's add a @NonNull annotation or perform a null check.

Contributor Author

Added: 9abf818

Since the password can be null, would it be better to remove the @NonNull annotation and add a null check?
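
The null check under discussion could be as small as the sketch below. The helper name is hypothetical, and treating null as an empty password is an assumption based on the hard-coded "" seen elsewhere in this PR, not a confirmed requirement of the decryptor.

```java
final class Passwords {
    /** Maps a possibly-null password to a char array, treating null as empty. */
    static char[] toCharsOrEmpty(String password) {
        return password == null ? new char[0] : password.toCharArray();
    }
}
```

The call site would then pass `Passwords.toCharsOrEmpty(password)` where it currently calls `password.toCharArray()`, which removes the NPE without changing behavior for non-null input.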

PrintStream logger = listener.getLogger();
EC2AbstractSlave node = computer.getNode();
SlaveTemplate template = computer.getSlaveTemplate();
final long timeout = node == null ? 0L : node.getLaunchTimeoutInMillis();

It would be cleaner to move the initialization of timeout after the null check for node.

Contributor Author

Moved: abea87f

@PereBueno PereBueno left a comment

LGTM

@fcojfernandez fcojfernandez added the enhancement Feature additions or enhancements label Jan 21, 2025
@fcojfernandez fcojfernandez requested review from res0nance and removed request for pankajy-dev January 21, 2025 16:48
@fcojfernandez
Member

Hi @res0nance! We are working on being able to install and use the plugin in a FIPS environment, following https://github.com/jenkinsci/jep/blob/bdc28d03ed3202be3f343a927a247341a64b9156/jep/237/README.adoc

The problem is that trilead is not FIPS compliant and, since it is an external library, we are not confident that we can make it compliant. We are therefore proposing to move from trilead to apache-mina, as other plugins such as git-client did some time ago. See jenkinsci/git-client-plugin#1127 for reference.

It would be great if you could review this! Thanks in advance.

PS: There is another incoming PR that adds some minor validations, but this change is large and self-contained enough to treat the two as separate changes.

@fcojfernandez fcojfernandez merged commit 87175d2 into jenkinsci:master Jan 22, 2025
17 checks passed
@mwebber

mwebber commented Jan 22, 2025

@jmdesprez just a quick note before I open a ticket somewhere appropriate: updating to the release with this change resulted in Agents failing to start.
The error seen in the log was

$ ssh -o StrictHostKeyChecking=accept-new -o "UserKnownHostsFile=/tmp/ec2_3385777490670375639_known_hosts" -o "HostKeyAlgorithms=ecdsa-sha2-nistp256" -i /tmp/ec2_5246766630689733252.pem [email protected] -p 22 /opt/oup/shared/docker_clean.sh &>> /tmp/docker_clean && mv -f /tmp/docker_clean /tmp/docker_clean_old && tail -n 1000 /tmp/docker_clean_old > /tmp/docker_clean;  java  -jar /tmp/remoting.jar -workDir /var/jenkins
Failed to add the host to the list of known hosts (/tmp/ec2_3385777490670375639_known_hosts).
Error: Unable to access jarfile /tmp/remoting.jar
ERROR: Unable to launch the agent for EC2 (TS-DEV_v2) - Standard(x86_64) (i-007c9432e2c2e3b17)
java.io.EOFException: unexpected stream termination
	at hudson.remoting.ChannelBuilder.negotiate(ChannelBuilder.java:478)
	at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:422)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:440)
	at PluginClassLoader for command-launcher//hudson.slaves.CommandLauncher.launch(CommandLauncher.java:188)
	at PluginClassLoader for ec2//hudson.plugins.ec2.ssh.EC2UnixLauncher.launchScript(EC2UnixLauncher.java:392)
	at PluginClassLoader for ec2//hudson.plugins.ec2.EC2ComputerLauncher.launch(EC2ComputerLauncher.java:55)
	at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:297)
	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

For comparison, without this change the output looks like this:

$ ssh -o StrictHostKeyChecking=accept-new -o "UserKnownHostsFile=/tmp/ec2_529420978841221409_known_hosts" -o "HostKeyAlgorithms=ssh-ed25519" -i /tmp/ec2_12788628894650662551.pem [email protected] -p 22 /opt/oup/shared/docker_clean.sh &>> /tmp/docker_clean && mv -f /tmp/docker_clean /tmp/docker_clean_old && tail -n 1000 /tmp/docker_clean_old > /tmp/docker_clean;  java  -jar /tmp/remoting.jar -workDir /var/jenkins
Jan 22, 2025 12:34:44 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir

Note: in our Cloud Config, we specify "Connect by SSH Process" (I don't think that is the default).

@jmdesprez
Contributor Author

@mwebber Thanks. I'm investigating the issue.

Labels
enhancement Feature additions or enhancements
9 participants