Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"java.io.IOException: Error writing request body to server" error during create ephemeral certificate #528

Closed
vbonne opened this issue Jun 21, 2021 · 74 comments · Fixed by #561
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@vbonne
Copy link

vbonne commented Jun 21, 2021

Dear team,

I have been struggling for a month with the following issue.
I have a Java 11 springboot project running from a Docker image based on the OpenJdk11 slim image (already tried various other base images and always got the same error) in a Cloud run environment conecting to google postgres SQL instance.
The conection is done with the google socket factory and the datasource generated using hikari config with the SQL instance name (private acess, no public IP).
The project runs fine for a few hours, sometime almost a few days and then it stops beeing able to connect to the SQL instance The first 2 errors are the following :

Error 1
java.lang.RuntimeException: [<Name-Of-Google-SQL-Instance>] Failed to create ephemeral certificate for the Cloud SQL instance. at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext (CloudSqlInstance.java:574) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate (CloudSqlInstance.java:515) at com.google.cloud.sql.core.CloudSqlInstance.lambda$performRefresh$0 (CloudSqlInstance.java:330) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly (TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run (InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run (TrustedListenableFutureTask.java:78) at java.util.concurrent.Executors$RunnableAdapter.call (Unknown Source) at java.util.concurrent.FutureTask.run (Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker (Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run (Unknown Source) at java.lang.Thread.run (Unknown Source) Caused by: javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake at sun.security.ssl.SSLSocketImpl.handleEOF (Unknown Source) at sun.security.ssl.SSLSocketImpl.decode (Unknown Source) at sun.security.ssl.SSLSocketImpl.readHandshakeRecord (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.net.www.protocol.https.HttpsClient.afterConnect (Unknown Source) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect (Unknown Source) at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0 (Unknown Source) at sun.net.www.protocol.http.HttpURLConnection.getOutputStream (Unknown Source) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream (Unknown Source) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:113) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:84) at com.google.api.client.http.HttpRequest.execute (HttpRequest.java:1012) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:514) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:455) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute (AbstractGoogleClientRequest.java:565) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate (CloudSqlInstance.java:513)
Error 2:
java.io.EOFException: SSL peer shut down incorrectly at sun.security.ssl.SSLSocketInputRecord.read (Unknown Source) at sun.security.ssl.SSLSocketInputRecord.readHeader (Unknown Source) at sun.security.ssl.SSLSocketInputRecord.decode (Unknown Source) at sun.security.ssl.SSLTransport.decode (Unknown Source) at sun.security.ssl.SSLSocketImpl.decode (Unknown Source) at sun.security.ssl.SSLSocketImpl.readHandshakeRecord (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.net.www.protocol.https.HttpsClient.afterConnect (Unknown Source) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect (Unknown Source) at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect (Unknown Source) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:148) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:84) at com.google.api.client.http.HttpRequest.execute (HttpRequest.java:1012) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:514) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:455) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute (AbstractGoogleClientRequest.java:565) at com.google.cloud.sql.core.CloudSqlInstance.fetchMetadata (CloudSqlInstance.java:438) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly (TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run (InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run (TrustedListenableFutureTask.java:78) at java.util.concurrent.Executors$RunnableAdapter.call (Unknown Source) at java.util.concurrent.FutureTask.run (Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker (Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run (Unknown Source) at java.lang.Thread.run (Unknown Source)

then at every attempt to connect to the project or when sheduled tasks run the following error :

java.io.EOFException: SSL peer shut down incorrectly at sun.security.ssl.SSLSocketInputRecord.read (Unknown Source) at sun.security.ssl.SSLSocketInputRecord.readHeader (Unknown Source) at sun.security.ssl.SSLSocketInputRecord.decode (Unknown Source) at sun.security.ssl.SSLTransport.decode (Unknown Source) at sun.security.ssl.SSLSocketImpl.decode (Unknown Source) at sun.security.ssl.SSLSocketImpl.readHandshakeRecord (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.security.ssl.SSLSocketImpl.startHandshake (Unknown Source) at sun.net.www.protocol.https.HttpsClient.afterConnect (Unknown Source) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect (Unknown Source) at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect (Unknown Source) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:148) at com.google.api.client.http.javanet.NetHttpRequest.execute (NetHttpRequest.java:84) at com.google.api.client.http.HttpRequest.execute (HttpRequest.java:1012) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:514) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed (AbstractGoogleClientRequest.java:455) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute (AbstractGoogleClientRequest.java:565) at com.google.cloud.sql.core.CloudSqlInstance.fetchMetadata (CloudSqlInstance.java:438) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly (TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run (InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run (TrustedListenableFutureTask.java:78) at java.util.concurrent.Executors$RunnableAdapter.call (Unknown Source) at java.util.concurrent.FutureTask.run (Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker (Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run (Unknown Source) at java.lang.Thread.run (Unknown Source)

From what I understood it seems that this error appear when the ssl is renewd on the sql instance, and also understood that this is normal as far the conection is then retried and ends succesfull. (issue 310) In my case the conection is never re-established and the whole project stops working.

For the moment, as i am mostly doing trials to see how we could migrate our project to the Google cloud plateform, my current solution is to restart the google cloud run instance when this error occurs, but this is not a viable for a production solution...
And idea where the error could be ?
Thanks for your help to put me on right track !

@kurtisvg kurtisvg added priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. labels Jun 21, 2021
@kurtisvg
Copy link
Contributor

Hi @vbonne,

It would be helpful if you can provide us with as small of an example as possible to reproduce the error. Make sure to include a dockerfile, pom/gradle, and sample code so we can try to reproduce on our end as well.

If you haven't already, make sure you are on the latest version of the library and that you don't have any dependency conflicts.

Looking around online, there are a few SO posts with this error. It looks like they are caused by the SSL handshake failing for some reason, but to find the exact reason you may need to run with -Djavax.net.debug=all while the issue is occurring.

@vbonne
Copy link
Author

vbonne commented Jun 21, 2021

Thanks @kurtisvg
1 - Yes i am working with the latest Google socket factory dependency
3 - Regarding dependency conflict, I thought about this and last week, I resolved most of them, but the project is large (for me) and created back in 2008 as non spring Java 8, I do have some dependency conflict that i have a hard time resolving in the sense that some packages are present with a few differents versions, but they never were a problem before. I will try to look further on this then.
2 - yes I did see those SO posts. and tried all possible "System.setProperty("https.protocols", "TLSv1,TLSv1.1,TLSv1.2");" combinations, without success, and with regards to solutions recomending adding certificates, the google socketfactory does not require the project to manage any client certificates for sql conection as far as I understood...

Ok, restarted with -Djavax.net.debug=all, on the next error I will update this ticket, with detailed log, and post also dockerfile + pom + datasource bean once the error happens again.

In the mean time I will try resolving the remaining dependency conflicts and try to create a small project to reproduce this error.

@vbonne
Copy link
Author

vbonne commented Jun 24, 2021

Ok all dependency conflicts resolved. Just updated Google cloud run project with latest image. Waiting for the error to happen...

Dockerfile :

FROM maven:3.6.3-openjdk-11-slim as BUILD
WORKDIR /usr/src/app
COPY checkstyle-project1.xml .
COPY checkstyle-supressions.xml .
COPY licenseheader.txt .
COPY pom.xml .
RUN mvn -B dependency:resolve dependency:resolve-plugins
COPY src src
RUN mvn -B -e -C -T 1C package  -Dmaven.test.skip=true
FROM openjdk:11-jre-slim
ARG TAG_FOR_GIT
EXPOSE 8081
EXPOSE 8031
EXPOSE 2083
ENV JAVA_TOOL_OPTIONS=""
COPY --from=BUILD /usr/src/app/target/*.jar /opt/target/hov2.jar
WORKDIR /opt/target
ENV TZ="America/Bogota"
ENV SPRING_PROFILES_ACTIVE="gcr"
ENV APP_HTTP_THREADS="4"
ENV DB_MAX_POOL="4"
ENV DB_CONN_TIMEOUT="20000"
ENV DB_HOST="<for local deployment>"
ENV DB_PORT="5432"
ENV DB_NAME=""
ENV DB_USER=""
ENV DB_PASSWORD=""
ENV SERVER_PORT="8031"
ENV IDLE_TIMEOUT="600000"
ENV MAX_LIFE_TIME="1800000"
ENV GIT_TAG=$TAG_FOR_GIT
ENV LOG_LEVEL="DEBUG"
ENV MAIL_LOG_LEVEL="ERROR"
ENV CRON_ENABLED=""
ENV DIRECT_JDBC_URL="true"
RUN apt-get -y update && apt-get -y install apt-utils tzdata ttf-dejavu ttf-dejavu-extra libfreetype6  libfontconfig1  && apt-get -y clean
ENTRYPOINT exec java -Djava.security.egd=file:/dev/./urandom -jar hov2.jar $JAR_OPTS $JAVA_TOOL_OPTIONS

Pom file with dependency conflicts resolved

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.vyv.hov2</groupId>
	<artifactId>HOV2-SpringBoot</artifactId>
	<packaging>jar</packaging>
	<version>2</version>
	<name>Sistema de reservas</name>
	<repositories>
		<repository>
			<id>ZK CE</id>
			<name>ZK CE Repository</name>
			<url>http://mavensync.zkoss.org/maven2</url>
		</repository>
		<!--        -->
	</repositories>
	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<maven.compiler.source>11</maven.compiler.source>
		<maven.compiler.target>11</maven.compiler.target>
		<java.version>11</java.version>
		<springfox-version>3.0.0</springfox-version>
		<jacoco.percentage.instruction>0.0</jacoco.percentage.instruction>
		<zk.version>9.5.0.3</zk.version>
		<jr.version>6.17.0</jr.version>
		<passay.version>1.0</passay.version>
		<zkspringboot.version>2.3.1</zkspringboot.version>
		<liquibase-hibernate5.version>3.8</liquibase-hibernate5.version>
		<validation-api.version>2.0.1.Final</validation-api.version>
		<javassist.version>3.27.0-GA</javassist.version>
		<jaxb-api.version>2.4.0-b180830.0359</jaxb-api.version>
		<maven.build.timestamp.format>yyyy-MM-dd'T'HH.mm.ss</maven.build.timestamp.format>
	</properties>
	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>2.5.1</version>
	</parent>
	<dependencyManagement>
		<dependencies>
			<!-- Start of conflict resolution forcing latest version of packages present with different versions -->
			<!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
			<dependency>
				<groupId>commons-logging</groupId>
				<artifactId>commons-logging</artifactId>
				<version>1.2</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/commons-validator/commons-validator -->
			<dependency>
				<groupId>commons-validator</groupId>
				<artifactId>commons-validator</artifactId>
				<version>1.7</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/commons-io/commons-io -->
			<dependency>
				<groupId>commons-io</groupId>
				<artifactId>commons-io</artifactId>
				<version>2.4</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/org.javassist/javassist -->
			<dependency>
				<groupId>org.javassist</groupId>
				<artifactId>javassist</artifactId>
				<version>3.28.0-GA</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-collections4 -->
			<dependency>
				<groupId>org.apache.commons</groupId>
				<artifactId>commons-collections4</artifactId>
				<version>4.2</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
			<dependency>
				<groupId>com.google.guava</groupId>
				<artifactId>guava</artifactId>
				<version>30.1.1-jre</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/commons-beanutils/commons-beanutils -->
			<dependency>
				<groupId>commons-beanutils</groupId>
				<artifactId>commons-beanutils</artifactId>
				<version>1.9.4</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/com.google.api-client/google-api-client -->
			<dependency>
				<groupId>com.google.api-client</groupId>
				<artifactId>google-api-client</artifactId>
				<version>1.31.5</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on -->
			<dependency>
				<groupId>org.bouncycastle</groupId>
				<artifactId>bcprov-jdk15on</artifactId>
				<version>1.64</version>
			</dependency>
			<!-- https://mvnrepository.com/artifact/com.google.http-client/google-http-client -->
			<dependency>
				<groupId>com.google.http-client</groupId>
				<artifactId>google-http-client</artifactId>
				<version>1.39.2-sp.1</version>
			</dependency>
			<!-- End of conflict resolution -->
		</dependencies>
	</dependencyManagement>
	<dependencies>
		<dependency>
			<groupId>com.google.cloud.sql</groupId>
			<artifactId>postgres-socket-factory</artifactId>
			<version>1.3.0</version>
		</dependency>
		<dependency>
			<groupId>org.zkoss.zkspringboot</groupId>
			<artifactId>zkspringboot-autoconfig</artifactId>
			<version>${zkspringboot.version}</version>
		</dependency>
		<dependency>
			<groupId>org.zkoss.zk</groupId>
			<artifactId>zkbind</artifactId>
			<version>${zk.version}</version>
			<exclusions>
				<exclusion>
					<groupId>org.slf4j</groupId>
					<artifactId>slf4j-jdk14</artifactId>
				</exclusion>
			</exclusions>
		</dependency>
		<dependency>
			<groupId>org.zkoss.zk</groupId>
			<artifactId>zkplus</artifactId>
			<version>${zk.version}</version>
		</dependency>
		<dependency>
			<groupId>com.google.zxing</groupId>
			<artifactId>core</artifactId>
			<version>3.4.1</version>
		</dependency>
		<dependency>
			<groupId>com.google.zxing</groupId>
			<artifactId>javase</artifactId>
			<version>3.4.1</version>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-log4j2</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-security</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-mail</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-devtools</artifactId>
			<scope>runtime</scope>
		</dependency>
		<dependency>
			<groupId>org.projectlombok</groupId>
			<artifactId>lombok</artifactId>
			<optional>true</optional>
		</dependency>
		<!-- JPA + Postgresql -->
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-jpa</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-jdbc</artifactId>
		</dependency>
		<dependency>
			<groupId>org.postgresql</groupId>
			<artifactId>postgresql</artifactId>
			<scope>runtime</scope>
		</dependency>
		<dependency>
			<groupId>org.liquibase</groupId>
			<artifactId>liquibase-core</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
			<exclusions>
				<exclusion>
					<groupId>org.junit.vintage</groupId>
					<artifactId>junit-vintage-engine</artifactId>
				</exclusion>
			</exclusions>
		</dependency>
		<!-- Inicio SpringFox swagger dependencies -->
		<dependency>
			<groupId>io.springfox</groupId>
			<artifactId>springfox-swagger2</artifactId>
			<version>${springfox-version}</version>
		</dependency>
		<dependency>
			<groupId>io.springfox</groupId>
			<artifactId>springfox-swagger-ui</artifactId>
			<version>${springfox-version}</version>
		</dependency>
		<!-- Fin SpringFox swagger dependencies -->
		<dependency>
			<groupId>commons-net</groupId>
			<artifactId>commons-net</artifactId>
			<version>3.8.0</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/org.asteriskjava/asterisk-java -->
		<dependency>
			<groupId>org.asteriskjava</groupId>
			<artifactId>asterisk-java</artifactId>
			<version>3.12.0</version>
		</dependency>
		<dependency>
			<groupId>com.google.apis</groupId>
			<artifactId>google-api-services-calendar</artifactId>
			<version>v3-rev411-1.25.0</version>
		</dependency>
		<!-- Password Validation -->
		<dependency>
			<groupId>org.passay</groupId>
			<artifactId>passay</artifactId>
			<version>${passay.version}</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/org.cups4j/cups4j -->
		<dependency>
			<groupId>org.cups4j</groupId>
			<artifactId>cups4j</artifactId>
			<version>0.7.1</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/net.sf.jasperreports/jasperreports -->
		<dependency>
			<groupId>net.sf.jasperreports</groupId>
			<artifactId>jasperreports</artifactId>
			<version>${jr.version}</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/net.sf.jasperreports/jasperreports-fonts-->
		<dependency>
			<groupId>net.sf.jasperreports</groupId>
			<artifactId>jasperreports-fonts</artifactId>
			<version>${jr.version}</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/net.sf.jasperreports/jasperreports-functions-->
		<dependency>
			<groupId>net.sf.jasperreports</groupId>
			<artifactId>jasperreports-functions</artifactId>
			<version>${jr.version}</version>
		</dependency>
		<dependency>
			<groupId>org.zkoss.zkforge</groupId>
			<artifactId>gmapsz</artifactId>
			<version>3.0.5</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/net.sourceforge.javacsv/javacsv -->
		<dependency>
			<groupId>net.sourceforge.javacsv</groupId>
			<artifactId>javacsv</artifactId>
			<version>2.0</version>
		</dependency>
		<dependency>
			<groupId>org.zkoss.zkforge</groupId>
			<artifactId>ckez</artifactId>
			<version>4.14.0.1</version>
		</dependency>
		<dependency>
			<groupId>javax.validation</groupId>
			<artifactId>validation-api</artifactId>
			<version>${validation-api.version}</version>
		</dependency>
		<dependency>
			<groupId>org.hibernate</groupId>
			<artifactId>hibernate-validator</artifactId>
			<version>6.2.0.Final</version>
		</dependency>
		<dependency>
			<groupId>io.jsonwebtoken</groupId>
			<artifactId>jjwt-api</artifactId>
			<version>0.11.2</version>
		</dependency>
		<dependency>
			<groupId>io.jsonwebtoken</groupId>
			<artifactId>jjwt-impl</artifactId>
			<version>0.11.2</version>
			<scope>runtime</scope>
		</dependency>
		<dependency>
			<groupId>io.jsonwebtoken</groupId>
			<artifactId>jjwt-jackson</artifactId>
			<version>0.11.2</version>
			<scope>runtime</scope>
		</dependency>
	</dependencies>
	<profiles>
		<profile>
			<id>dev</id>
			<properties>
				<activatedProperties>dev</activatedProperties>
			</properties>
			<activation>
				<activeByDefault>true</activeByDefault>
			</activation>
		</profile>
		<profile>
			<id>prod</id>
			<properties>
				<activatedProperties>prod</activatedProperties>
			</properties>
		</profile>
		<profile>
			<id>gcr</id>
			<properties>
				<activatedProperties>gcr</activatedProperties>
			</properties>
		</profile>
	</profiles>
	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
				<executions>
					<execution>
						<goals>
							<goal>repackage</goal>
						</goals>
					</execution>
				</executions>
			</plugin>
			<!-- Checkstyle
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-checkstyle-plugin</artifactId>
				<version>3.1.0</version>
				<configuration>
					<configLocation>checkstyle-GlobalRemises.xml</configLocation>
					<encoding>UTF-8</encoding>
					<consoleOutput>true</consoleOutput>
					<failsOnError>true</failsOnError>
					<linkXRef>false</linkXRef>
				</configuration>
				<executions>
					<execution>
						<id>validate</id>
						<phase>validate</phase>
						<goals>
							<goal>check</goal>
						</goals>
					</execution>
				</executions>
			</plugin>-->
			<!-- PMD -->
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-pmd-plugin</artifactId>
				<version>3.12.0</version>
				<executions>
					<execution>
						<id>pmd-scan</id>
						<phase>verify</phase>
						<goals>
							<goal>check</goal>
						</goals>
					</execution>
				</executions>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-resources-plugin</artifactId>
				<version>3.2.0</version>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-enforcer-plugin</artifactId>
				<version>3.0.0-M3</version>
				<configuration>
					<rules><dependencyConvergence/></rules>
				</configuration>
			</plugin>
		</plugins>
	</build>
</project>

datasource bean, sorry that this bean is so large but the project runs on different infractructures that need various configuration options... but in this case of Cloud run the environment variable GOOGLE_SQL_PRIVATE_ACCESS set to true. And the DIRECT_JDBC_URL can be set to true or false in order to try the Google Sql conection with a manually created url or letting the socketfactory do its job to generate it. In both cases the same error happens after a few hours.

package com.vyv.hov2.config;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.sql.DataSource;

@Configuration
public class DataSourceConfig {
    private static final Log logger = LogFactory.getLog(DataSourceConfig.class);
    // Saving credentials in environment variables is convenient, but not secure - consider a more
    // secure solution such as https://cloud.google.com/kms/ to help keep secrets safe.
    //@Value("${spring.datasource.username}")
    private static String DB_USER = System.getenv("DB_USER");
    //@Value("${spring.datasource.password}")
    private static String DB_PASSWORD = System.getenv("DB_PASSWORD");
    //@Value("${spring.cloud.gcp.sql.database-name}")
    private static String DB_NAME  = System.getenv("DB_NAME");
    private static String DB_HOST  = System.getenv("DB_HOST");
    private static String DB_PORT  = System.getenv("DB_PORT");
    //@Value("${spring.datasource.hikari.maximum-pool-size}")
    private static String DB_MAX_POOL  = System.getenv("DB_MAX_POOL");
    //@Value("${spring.datasource.hikari.connection-timeout}")
    private static String DB_CONN_TIMEOUT  = System.getenv("DB_CONN_TIMEOUT");
    private static String IDLE_TIMEOUT  = System.getenv("IDLE_TIMEOUT");
    private static String MAX_LIFE_TIME  = System.getenv("MAX_LIFE_TIME");
    //@Value("${spring.cloud.gcp.sql.enabled}")
    private static String GOOGLE_SQL_PRIVATE_ACCESS  = System.getenv("GOOGLE_SQL_PRIVATE_ACCESS");
    //@Value("${spring.cloud.gcp.sql.instance-connection-name}")
    private static String CLOUD_SQL_CONNECTION_NAME = System.getenv("CLOUD_SQL_CONNECTION_NAME");
    private static String DIRECT_JDBC_URL = System.getenv("DIRECT_JDBC_URL");
    @Value("${spring.profiles.active}")
    private String currentActiveEnv;
    /*
    spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect
    spring.datasource.hikari.connection-test-query=SELECT 1
    spring.sql.init.continue-on-error=true
    spring.datasource.hikari.leak-detection-threshold= 60000
    spring.datasource.driver-class-name=org.postgresql.Driver
     */

    @Bean
    public DataSource dataSource() {
        logger.info("Configuring datasource for " + currentActiveEnv);
        HikariConfig config = new HikariConfig();
        // [START cloud_sql_mysql_servlet_limit]
        // maximumPoolSize limits the total number of concurrent connections this pool will keep. Ideal
        // values for this setting are highly variable on app design, infrastructure, and database.
        if ( DB_MAX_POOL == null || DB_MAX_POOL.isEmpty()) {
            logger.info("DB_MAX_POOL was absent, configuring with default value");
            DB_MAX_POOL = "4";
        }
        logger.debug("DB_MAX_POOL " + DB_MAX_POOL);
        config.setMaximumPoolSize(Integer.parseInt(DB_MAX_POOL));
        // minimumIdle is the minimum number of idle connections Hikari maintains in the pool.
        // Additional connections will be established to meet this value unless the pool is full.
        config.setMinimumIdle(5);
        // [END cloud_sql_mysql_servlet_limit]

        // [START cloud_sql_mysql_servlet_timeout]
        // setConnectionTimeout is the maximum number of milliseconds to wait for a connection checkout.
        // Any attempt to retrieve a connection from this pool that exceeds the set limit will throw an
        // SQLException.
        if ( DB_CONN_TIMEOUT == null || DB_CONN_TIMEOUT.isEmpty()) {
            logger.info("DB_CONN_TIMEOUT was absent, configuring with default value");
            DB_CONN_TIMEOUT = "20000";
        }
        logger.debug("DB_CONN_TIMEOUT " + DB_CONN_TIMEOUT);
        config.setConnectionTimeout(Integer.parseInt(DB_CONN_TIMEOUT));; // 10 seconds
        if ( DB_USER == null || DB_USER.isEmpty()) {
            logger.info("DB_USER was absent, configuring with default value");
            DB_USER = "postgres";
        }
        if ( DB_PASSWORD == null || DB_PASSWORD.isEmpty()) {
            logger.info("DB_PASS was absent, configuring with default value");
            DB_PASSWORD = "postgres";
        }
        if ( GOOGLE_SQL_PRIVATE_ACCESS == null || GOOGLE_SQL_PRIVATE_ACCESS.isEmpty()) {
            logger.info("GOOGLE_SQL_PRIVATE_ACCESS was absent, configuring with default value");
            GOOGLE_SQL_PRIVATE_ACCESS = "false";
        }
        logger.debug(DB_USER);
        config.setUsername(DB_USER);
        logger.debug(DB_PASSWORD);
        config.setPassword(DB_PASSWORD);
        logger.info("GOOGLE_SQL_PRIVATE_ACCESS " + GOOGLE_SQL_PRIVATE_ACCESS);
        if ( !"true".equals(GOOGLE_SQL_PRIVATE_ACCESS)) {
            if ( DB_HOST == null || DB_HOST.isEmpty()) {
                logger.info("DB_HOST was absent, configuring with default value");
                DB_HOST = "<localhot db ip>";
            }
            if ( DB_PORT == null || DB_PORT.isEmpty()) {
                logger.info("DB_PORT was absent, configuring with default value");
                DB_PORT = "<localhost db port>";
            }
            if ( DB_NAME == null || DB_NAME.isEmpty()) {
                logger.info("DB_NAME was absent, configuring with default value");
                DB_NAME = "vehiculosvip_dev3";
            }
            logger.info("Configuring datasource with direct url");
            logger.debug("jdbc:postgresql://"+DB_HOST+":"+DB_PORT+"/"+DB_NAME);
            config.setJdbcUrl("jdbc:postgresql://"+DB_HOST+":"+DB_PORT+"/"+DB_NAME);
            config.setDriverClassName("org.postgresql.Driver");
            HikariDataSource  datasource = new HikariDataSource(config);;
            logger.info("Datasource configured with direct url");
            return datasource;
        } else {
            logger.info("Configuring datasource with google sql");
            // *************
            // [START cloud_sql_mysql_servlet_create]
            // Note: For Java users, the Cloud SQL JDBC Socket Factory can provide authenticated connections
            // which is preferred to using the Cloud SQL Proxy with Unix sockets.
            // See https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory for details.


            // The following URL is equivalent to setting the config options below:
            // jdbc:mysql:///<DB_NAME>?cloudSqlInstance=<CLOUD_SQL_CONNECTION_NAME>&
            // socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<DB_USER>&password=<DB_PASS>
            // See the link below for more info on building a JDBC URL for the Cloud SQL JDBC Socket Factory
            // https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory#creating-the-jdbc-url

            // Configure which instance and what database user to connect with.
            if ( DB_NAME == null || DB_NAME.isEmpty()) {
                logger.info("DB_NAME was absent, configuring with default value");
                DB_NAME = "<dbname>";
            }
            if (DIRECT_JDBC_URL != null && "false".equals(DIRECT_JDBC_URL)) {
                config.setJdbcUrl(String.format("jdbc:postgresql:///%s", DB_NAME));
                config.addDataSourceProperty("socketFactory", "com.google.cloud.sql.postgres.SocketFactory");
                if (CLOUD_SQL_CONNECTION_NAME == null || CLOUD_SQL_CONNECTION_NAME.isEmpty()) {
                    logger.error("CLOUD_SQL_CONNECTION_NAME was absent, configuring with default value");
                    CLOUD_SQL_CONNECTION_NAME = "<CLOUD_SQL_CONNECTION_NAME >";
                }
                config.addDataSourceProperty("cloudSqlInstance", CLOUD_SQL_CONNECTION_NAME);

                // The ipTypes argument can be used to specify a comma delimited list of preferred IP types
                // for connecting to a Cloud SQL instance. The argument ipTypes=PRIVATE will force the
                // SocketFactory to connect with an instance's associated private IP.
                config.addDataSourceProperty("ipTypes", "PUBLIC,PRIVATE");
            } else {
                String url = "jdbc:postgresql:///"+DB_NAME+"?cloudSqlInstance="+CLOUD_SQL_CONNECTION_NAME
                             +"&socketFactory=com.google.cloud.sql.postgres.SocketFactory&user="+DB_USER+"&password="+DB_PASSWORD;
                logger.info("url " + url);
                config.setJdbcUrl(url);
                config.setDriverClassName("org.postgresql.Driver");
            }
            // ... Specify additional connection properties here.
            // [START_EXCLUDE]


            // idleTimeout is the maximum amount of time a connection can sit in the pool. Connections that
            // sit idle for this many milliseconds are retried if minimumIdle is exceeded.
            if ( IDLE_TIMEOUT == null || IDLE_TIMEOUT.isEmpty()) {
                logger.info("IDLE_TIMEOUT was absent, configuring with default value");
                IDLE_TIMEOUT = "600000";
            }
            logger.debug("IDLE_TIMEOUT " + IDLE_TIMEOUT);
            config.setIdleTimeout(Integer.parseInt(IDLE_TIMEOUT));; // 10 seconds
            // [END cloud_sql_mysql_servlet_timeout]

            // [START cloud_sql_mysql_servlet_backoff]
            // Hikari automatically delays between failed connection attempts, eventually reaching a
            // maximum delay of `connectionTimeout / 2` between attempts.
            // [END cloud_sql_mysql_servlet_backoff]

            // [START cloud_sql_mysql_servlet_lifetime]
            // maxLifetime is the maximum possible lifetime of a connection in the pool. Connections that
            // live longer than this many milliseconds will be closed and reestablished between uses. This
            // value should be several minutes shorter than the database's timeout value to avoid unexpected
            // terminations.
            if ( MAX_LIFE_TIME == null || MAX_LIFE_TIME.isEmpty()) {
                logger.info("MAX_LIFE_TIME was absent, configuring with default value");
                MAX_LIFE_TIME = "1800000";
            }
            logger.debug(MAX_LIFE_TIME );
            config.setMaxLifetime(Integer.parseInt(MAX_LIFE_TIME ));
            // [END cloud_sql_mysql_servlet_lifetime]

            // [END_EXCLUDE]
            //config.addDataSourceProperty("useSSL", true);
            //config.addDataSourceProperty("requireSSL", true);
            //config.addDataSourceProperty("verifyServerCertificate", true);
            // Initialize the connection pool using the configuration object.
            HikariDataSource pool = new HikariDataSource(config);
            logger.info(" jdbcURL " + pool.getJdbcUrl());
            // [END cloud_sql_mysql_servlet_create]
            logger.info("Datasource configured with google sql");
            return pool;
        }
    }
}

list of cloud run enviroment variables :

SPRING_PROFILES_ACTIVE  gcr
APP_HTTP_THREADS 20
DB_MAX_POOL 8
DB_CONN_TIMEOUT 20000
DB_NAME <dbname>
DB_USER postgres
DB_PASSWORD <dbpassword>
SERVER_PORT 8080
ENABLE_SSL false
TZ America/Bogota
CONTEXT_PATH /
GOOGLE_SQL_PRIVATE_ACCESS true
CLOUD_SQL_CONNECTION_NAME <google sql instance name>
GIT_TAG GCP-V257
DLL_HANDLING none
CRON_ENABLED false
JAR_OPTS  -Djavax.net.debug=all
JAVA_TOOL_OPTIONS
IDLE_TIMEOUT 60000
DIRECT_JDBC_URL true

Simple project public gitlab repo here

@vbonne
Copy link
Author

vbonne commented Jun 24, 2021

from the logs of the simple app created to reproduce the error, i can see this error in the logs :

2021-06-24T22:10:49.420102Zorg.apache.tomcat.jni.LibraryNotFoundError: Can't load library: /opt/target/bin/libtcnative-1.so, Can't load library: /opt/target/bin/liblibtcnative-1.so, no tcnative-1 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib], no libtcnative-1 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] at org.apache.tomcat.jni.Library.<init>(Library.java:102) at org.apache.tomcat.jni.Library.initialize(Library.java:206) at org.apache.catalina.core.AprLifecycleListener.init(AprLifecycleListener.java:193) at org.apache.catalina.core.AprLifecycleListener.isAprAvailable(AprLifecycleListener.java:102) at org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory.getDefaultLifecycleListeners(TomcatServletWebServerFactory.java:173) at org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory.<init>(TomcatServletWebServerFactory.java:120) at org.springframework.boot.autoconfigure.web.servlet.ServletWebServerFactoryConfiguration$EmbeddedTomcat.tomcatServletWebServerFactory(ServletWebServerFactoryConfiguration.java:76) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:653) at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:638) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1334) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1177) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:564) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:524) at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:213) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.getWebServerFactory(ServletWebServerApplicationContext.java:217) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.createWebServer(ServletWebServerApplicationContext.java:180) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.onRefresh(ServletWebServerApplicationContext.java:160) at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:577) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145) at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754) at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:434) at org.springframework.boot.SpringApplication.run(SpringApplication.java:338) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1332) at com.vyv.hov2.MinimalApplication.main(MinimalApplication.java:30) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88)

even if it does no seem to generate any fatal error at this point could it be that I need to add the Tomcat libtcnative-1 package to the docker OS for negociating the ssl renewal of the Google SQL instance ? I leave this post for history, I edited the docker file with this package added.

new Docker File :

FROM maven:3.6.3-openjdk-11-slim as BUILD
MAINTAINER [email protected]
WORKDIR /usr/src/app
COPY checkstyle-GlobalRemises.xml .
COPY checkstyle-supressions.xml .
COPY licenseheader.txt .
COPY pom.xml .
RUN mvn -B dependency:resolve dependency:resolve-plugins
COPY src src
RUN mvn -B -e -C -T 1C package  -Dmaven.test.skip=true
FROM openjdk:11-jre-slim
ARG TAG_FOR_GIT
EXPOSE 8081
EXPOSE 8031
EXPOSE 2083
ENV JAVA_TOOL_OPTIONS=""
COPY --from=BUILD /usr/src/app/target/*.jar /opt/target/hov2.jar
WORKDIR /opt/target
ENV TZ="America/Bogota"
ENV SPRING_PROFILES_ACTIVE="gcr"
ENV APP_HTTP_THREADS="4"
ENV DB_MAX_POOL="4"
ENV DB_CONN_TIMEOUT="20000"
ENV DB_HOST=""
ENV DB_PORT=""
ENV DB_NAME=""
ENV DB_USER=""
ENV DB_PASSWORD=""
ENV SERVER_PORT=""
ENV IDLE_TIMEOUT="600000"
ENV MAX_LIFE_TIME="1800000"
ENV GIT_TAG=$TAG_FOR_GIT
ENV LOG_LEVEL="DEBUG"
ENV MAIL_LOG_LEVEL="ERROR"
ENV CRON_ENABLED=""
ENV DLL_HANDLING="update"
RUN apt-get -y update && apt-get -y install apt-utils wget tzdata ttf-dejavu ttf-dejavu-extra libfreetype6  libfontconfig1 libtcnative-1 && apt-get -y clean && mkdir bin && ln -s /usr/lib/x86_64-linux-gnu/libtcnative-1.so /opt/target/bin/libtcnative-1.so
ENTRYPOINT exec java -Djava.security.egd=file:/dev/./urandom -jar hov2.jar $JAR_OPTS $JAVA_TOOL_OPTIONS

@vbonne
Copy link
Author

vbonne commented Jun 25, 2021

Here is the a sample of the failing conection error log with debug enabled on the simple project.
I am putting the log of the exact moment when the conection is lost, the first logs show connection ok, with hikari pool ok, then 5 seconds later: Closing connection org.postgresql.jdbc.PgConnection@d520509: (connection has passed maxLifetime)
and then the next logs show reconection errors.


2021-06-26 14:20:13.456 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Before cleanup stats (total=5, active=0, idle=5, waiting=0)
2021-06-26 14:20:13.456 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - After cleanup stats (total=5, active=0, idle=5, waiting=0)
2021-06-26 14:20:13.456 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Fill pool skipped, pool is at sufficient level.
2021-06-26 14:20:42.745 DEBUG 1 --- [alina-utility-2] org.apache.catalina.session.ManagerBase : Start expire sessions StandardManager at 1624735242745 sessioncount 10
2021-06-26 14:20:42.745 DEBUG 1 --- [alina-utility-2] org.apache.catalina.session.ManagerBase : End expire sessions StandardManager processingTime 0 expired sessions: 0
2021-06-26 14:20:43.457 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Before cleanup stats (total=5, active=0, idle=5, waiting=0)
2021-06-26 14:20:43.457 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - After cleanup stats (total=5, active=0, idle=5, waiting=0)
2021-06-26 14:20:43.457 DEBUG 1 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Fill pool skipped, pool is at sufficient level.
2021-06-26 14:20:48.037 DEBUG 1 --- [nnection closer] com.zaxxer.hikari.pool.PoolBase : HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection@d520509: (connection has passed maxLifetime)
2021-06-26 14:20:48.038 DEBUG 1 --- [onnection adder] org.postgresql.Driver : Connecting with URL: jdbc:postgresql:///dbName?cloudSqlInstance=instanceName&socketFactory=com.google.cloud.sql.postgres.SocketFactory&user=postgres&password=123
2021-06-26 14:20:48.043 DEBUG 1 --- [onnection adder] org.postgresql.jdbc.PgConnection : PostgreSQL JDBC Driver 42.2.20
2021-06-26 14:20:48.143 DEBUG 1 --- [onnection adder] org.postgresql.jdbc.PgConnection : setDefaultFetchSize = 0
2021-06-26 14:20:48.143 DEBUG 1 --- [onnection adder] org.postgresql.jdbc.PgConnection : setPrepareThreshold = 5
2021-06-26 14:20:48.243 DEBUG 1 --- [onnection adder] o.p.core.v3.ConnectionFactoryImpl : Trying to establish a protocol version 3 connection to :5432
2021-06-26 14:20:48.243 INFO 1 --- [onnection adder] c.g.cloud.sql.core.CoreSocketFactory : Connecting to Cloud SQL instance [instanceName] via SSL socket.
2021-06-26 14:20:48.943 DEBUG 1 --- [onnection adder] org.postgresql.Driver : Unexpected connection error:
java.lang.RuntimeException: [instanceName] Failed to create ephemeral certificate for the Cloud SQL instance. at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext(CloudSqlInstance.java:574) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:515) at com.google.cloud.sql.core.CloudSqlInstance.lambda$performRefresh$0(CloudSqlInstance.java:330) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) Caused by: java.io.IOException: Error writing request body to server at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source) at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source) at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source) at java.base/java.io.BufferedOutputStream.flush(Unknown Source) at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source) at java.base/java.io.FilterOutputStream.flush(Unknown Source) at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1193) at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46) at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77) at com.google.api.client.util.LoggingStreamingContent.writeTo(LoggingStreamingContent.java:63) at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:50) at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48) at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76) at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:171) at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117) at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84) at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:513) ... 10 common frames omitted
2021-06-26 14:20:51.343 DEBUG 1 --- [onnection adder] com.zaxxer.hikari.pool.PoolBase : HikariPool-1 - Failed to create/setup connection: Something unusual has occurred to cause the driver to fail. Please report this exception.
2021-06-26 14:20:51.643 DEBUG 1 --- [onnection adder] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Cannot acquire connection from data source
org.postgresql.util.PSQLException: Something unusual has occurred to cause the driver to fail. Please report this exception. at org.postgresql.Driver.connect(Driver.java:285) at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138) at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364) at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206) at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476) at com.zaxxer.hikari.pool.HikariPool.access$100(HikariPool.java:71) at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:726) at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:712) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) Caused by: java.lang.RuntimeException: [db-instance name] Failed to create ephemeral certificate for the Cloud SQL instance. at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext(CloudSqlInstance.java:574) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:515) at com.google.cloud.sql.core.CloudSqlInstance.lambda$performRefresh$0(CloudSqlInstance.java:330) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ... 3 common frames omitted Caused by: java.io.IOException: Error writing request body to server at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source) at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source) at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source) at java.base/java.io.BufferedOutputStream.flush(Unknown Source) at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source) at java.base/java.io.FilterOutputStream.flush(Unknown Source) at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1193) at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46) at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77) at com.google.api.client.util.LoggingStreamingContent.writeTo(LoggingStreamingContent.java:63) at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:50) at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48) at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76) at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:171) at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117) at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84) at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565) at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:513) ... 10 common frames omitted

and previous post error about libtcnative-1 corrected by creating a symlink to indicate to SpringBoot where the libtcnative-1.so is, docker file of previous post updated with correction.

@enocom
Copy link
Member

enocom commented Jun 25, 2021

@vbonne Is this most recent error what you see on startup?

Also, I assume you've configured a serverless VPC access connector?

@vbonne
Copy link
Author

vbonne commented Jun 25, 2021

@enocom at startup there is no error, service runs fine until ephemeral ssl error occurs and this can be from a few hours to a few days after startup.
On startup as such the errors are the following but the project runs fine even with those errors, that are probably normal, like http errors when a connection is closed abruptly.

Error
2021-06-25 10:17:08.208 PET
org.apache.tomcat.jni.LibraryNotFoundError: Can't load library: /opt/target/bin/libtcnative-1.so, Can't load library: /opt/target/bin/liblibtcnative-1.so, no tcnative-1 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib], no libtcnative-1 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]

Error
2021-06-25 10:17:08.772 PET
java.lang.ClassNotFoundException: java.net.UnixDomainSocketAddress
	at java.base/java.net.URLClassLoader.findClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151)

Error
2021-06-25 10:17:14.853 PET
java.lang.ClassNotFoundException: io.opencensus.impl.trace.TraceComponentImpl
	at java.base/java.net.URLClassLoader.findClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151)

Error
2021-06-25 10:17:14.893 PET
java.lang.ClassNotFoundException: io.grpc.override.ContextStorageOverride
	at java.base/java.net.URLClassLoader.findClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151)

No I did not configure any VPC access conector, I just have a cloud run instance comunicating to a postgres SQL instance by its instance name...

@vbonne
Copy link
Author

vbonne commented Jun 25, 2021

Steps to reproduce :
create GCP project
create cloudrun service
create cloud sql postgres instance with one database allow access to services of the project
download https://gitlab.com/vbonne/gcr-gsql-error project
create docker based on this project (docker build -t gcr.io/[GCP-project-name]/gcrerror )
upload docker image to GCP project (should have been done automatically at previous step)
start service with uploaded image on a 1cpu 1gb with variables set as in previous post and DLL_HANDLING="update" for database tables creation and with conection activated for the sql instance
create 1 job with cloud scheduler to a GET to https://[generated-url].a.run.app/login in order to test sql conection every 5 minutes or less to avoid the instances automatic stop/start that would delay the appearance of the error.

test login page manually https://.a.run.app/login you should get a login form
wait , within max 24h max the conection with the sql instance will be lost and the https://[generated-url].a.run.app/login will only display "ERROR".

@vbonne
Copy link
Author

vbonne commented Jun 26, 2021

I updated the error log post with a sample of the simple project log, at the right moment where the conection is lost, the first lines show hikari pool conection ok, then 5 seconds later the conection is lost forever until service restart...

@Maassmensch83
Copy link

Maassmensch83 commented Jun 28, 2021

Hi everyone,

we are probably seeing the same issue in our Spring Boot project:

2021-06-27 04:23:30.160 CEST "HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection@6c61fe57: (connection has passed maxLifetime)"

2021-06-27 04:23:30.161 CEST "Connecting to Cloud SQL instance [<PROJECT_ID>:europe-west1:<INSTANCE_NAME>] via SSL socket."

2021-06-27 04:23:30.888 CEST "HikariPool-1 - Failed to create/setup connection: Something unusual has occurred to cause the driver to fail. Please report this exception."

2021-06-27 04:23:30.889 CEST "HikariPool-1 - Cannot acquire connection from data source
org.postgresql.util.PSQLException: Something unusual has occurred to cause the driver to fail. Please report this exception.
	at org.postgresql.Driver.connect(Driver.java:285)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:358)
	at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
	at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:477)
	at com.zaxxer.hikari.pool.HikariPool.access$100(HikariPool.java:71)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:725)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:711)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: [<PROJECT_ID>:europe-west1:<INSTANCE_NAME>] Failed to create ephemeral certificate for the Cloud SQL instance.
	at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext(CloudSqlInstance.java:574)
	at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:515)
	at com.google.cloud.sql.core.CloudSqlInstance.lambda$performRefresh$0(CloudSqlInstance.java:330)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	... 3 common frames omitted
Caused by: java.io.IOException: Error writing request body to server
	at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3653)
	at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3636)
	at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
	at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
	at java.base/java.util.zip.DeflaterOutputStream.flush(DeflaterOutputStream.java:282)
	at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1178)
	at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
	at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
	at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:50)
	at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
	at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
	at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:171)
	at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
	at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
	at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:513)
	... 10 common frames omitted
" 

This leads to a restart of the service after 20 unsuccessful attempts. The error then disappears.

These are the dependencies we use:

  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.4.7</version>
    <relativePath />
  </parent>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-dependencies</artifactId>
        <version>2020.0.3</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
      <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>spring-cloud-gcp-dependencies</artifactId>
        <version>2.0.3</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
<dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>spring-cloud-gcp-starter-sql-postgresql</artifactId>
    </dependency>
</dependencies>

The database instance uses a public IP. The DataSource is created by Spring Boot.

@enocom enocom assigned shubha-rajan and unassigned enocom Jun 28, 2021
@vincenzo-mazzotta
Copy link

vincenzo-mazzotta commented Jun 28, 2021

Hi to all,
We've same problem.
We've SQL cloud with public IP without VPC.
We are using SQL Auth Proxy to connect from cloud run to cloud sql in a springboot application.

After my springboot app is deployed in GCP it works fine.
HikariCP is filled with 10 connection I can call API rest with access on DB.
After 1 hours HikariCP can't connect on SQL cloud using SQL auth proxy to refresh connections on HikariCP.
The errors are below:

2021-06-29 01:22:15.661 CEST2021-06-29 01:22:15.661 INFO 1 --- [onnection adder] c.g.cloud.sql.core.CoreSocketFactory : Connecting to Cloud SQL instance [lmit-link-utili-itlm:europe-west1:test] via SSL socket.
Default
2021-06-29 01:22:15.661 CEST2021-06-29 01:22:15.661 DEBUG 1 --- [onnection adder] com.zaxxer.hikari.pool.HikariPool : LinkUtili-Hikari-Connection-Pool - Cannot acquire connection from data source
Default
2021-06-29 01:22:15.661 CEST
Error
2021-06-29 01:22:15.662 CESTjava.sql.SQLNonTransientConnectionException: Could not create connection to database server. at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:1006) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:823) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:453) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:246) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:198) ~[mysql-connector-java-8.0.23.jar!/:8.0.23] at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138) ~[HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364) ~[HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206) ~[HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476) [HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.HikariPool.access$100(HikariPool.java:71) [HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:726) [HikariCP-4.0.3.jar!/:na] at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:712) [HikariCP-4.0.3.jar!/:na] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_212] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_212] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_212] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212] Caused by: java.lang.RuntimeException: [lmit-link-utili-itlm:europe-west1:test] Failed to update metadata for Cloud SQL instance. at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext(CloudSqlInstance.java:574) ~[jdbc-socket-factory-core-1.3.0.jar!/:na] at com.google.cloud.sql.core.CloudSqlInstance.fetchMetadata(CloudSqlInstance.java:483) ~[jdbc-socket-factory-core-1.3.0.jar!/:na] at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) ~[guava-30.1.1-android.jar!/:na] at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) ~[guava-30.1.1-android.jar!/:na] at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) ~[guava-30.1.1-android.jar!/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_212] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_212] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_212] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_212] ... 3 common frames omitted Caused by: java.net.SocketException: Broken pipe (Write failed) at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.8.0_212] at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[na:1.8.0_212] at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[na:1.8.0_212] at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431) ~[na:1.8.0_212] at sun.security.ssl.OutputRecord.write(OutputRecord.java:417) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:879) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:850) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:720) ~[na:1.8.0_212] at sun.security.ssl.Handshaker.sendChangeCipherSpec(Handshaker.java:1144) ~[na:1.8.0_212] at sun.security.ssl.ClientHandshaker.sendChangeCipherAndFinish(ClientHandshaker.java:1280) ~[na:1.8.0_212] at sun.security.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:1190) ~[na:1.8.0_212] at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:369) ~[na:1.8.0_212] at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037) ~[na:1.8.0_212] at sun.security.ssl.Handshaker.process_record(Handshaker.java:965) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395) ~[na:1.8.0_212] at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379) ~[na:1.8.0_212] at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559) ~[na:1.8.0_212] at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185) ~[na:1.8.0_212] at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:162) ~[na:1.8.0_212] at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:148) ~[google-http-client-1.39.2.jar!/:1.39.2] at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84) ~[google-http-client-1.39.2.jar!/:1.39.2] at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012) ~[google-http-client-1.39.2.jar!/:1.39.2] at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514) ~[google-api-client-1.32.1.jar!/:1.32.1] at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455) ~[google-api-client-1.32.1.jar!/:1.32.1] at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565) ~[google-api-client-1.32.1.jar!/:1.32.1] at com.google.cloud.sql.core.CloudSqlInstance.fetchMetadata(CloudSqlInstance.java:438) ~[jdbc-socket-factory-core-1.3.0.jar!/:na] ... 10 common frames omitted
Default
2021-06-29 01:22:15.662 CEST
Default
2021-06-29 01:22:24.248 CEST2021-06-29 01:22:24.248 DEBUG 1 --- [ool housekeeper] com.zaxxer.hikari.pool.HikariPool : LinkUtili-Hikari-Connection-Pool - Pool stats (total=0, active=0, idle=0, waiting=0)
Default
2021-06-29 01:22:24.248 CEST2021-06-29 01:22:24.248 DEBUG 1 --- [ool housekeeper] com.zaxxer.hikari.pool.HikariPool : LinkUtili-Hikari-Connection-Pool - Fill pool skipped, pool is at sufficient level.

We need to use this solution to allow to use VPC for intranet campany services.
Please fix it this is a serious problem, this can't allow to use SQL Auth proxy as solution of secure data from to DB.
In local (windows) the thing is different, it's working fine but only in my local machine.

Waiting GCP news about this fix.
If you need some details I can see actually application in GCP and log.

Kind regards

@enocom enocom added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. labels Jun 29, 2021
@enocom
Copy link
Member

enocom commented Jun 29, 2021

Thanks @vincenzo-mazzotta @Maassmensch83 and @vbonne. I'll reproduce the error and report back.

@shubha-rajan
Copy link
Contributor

Hi @vincenzo-mazzotta @Maassmensch83 and @vbonne, have these errors only started occurring when using the latest version, or do previous socket factory versions also cause the error to appear?

@vbonne
Copy link
Author

vbonne commented Jun 29, 2021

I can't say anything about appearing date of this error as my project is just been ported to GCP so I started with latest version.
To answer your question I have just started the service with the version 1.2.0 of the postgres-socket-factory.

same error happened only 1h after startup :

java.lang.RuntimeException: [instance name] Failed to create ephemeral certificate for the Cloud SQL instance.
	at com.google.cloud.sql.core.CloudSqlInstance.addExceptionContext(CloudSqlInstance.java:465)
	at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:349)
	at com.google.cloud.sql.core.CloudSqlInstance.lambda$performRefresh$0(CloudSqlInstance.java:207)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Error writing request body to server
	at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
	at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
	at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
	at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
	at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1193)
	at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:45)
	at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
	at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:50)
	at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
	at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
	at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:171)
	at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
	at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:541)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:474)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:591)
	at com.google.cloud.sql.core.CloudSqlInstance.fetchEphemeralCertificate(CloudSqlInstance.java:347)
	... 10 common frames omitted 

but in this case after those errors were "normal" as the attemps were correctly retried and none of the scheduled tasks did actually fail as the conections managed to be re-established fast enough.

@enocom
Copy link
Member

enocom commented Jun 29, 2021

Thanks @vbonne. I'm running a similar experiment and will report if the cert refresh works.

@vbonne
Copy link
Author

vbonne commented Jun 29, 2021

Thanks @vbonne. I'm running a similar experiment and will report if the cert refresh works.

I updated my previous post, with version 1.2.0 the error occurs, but is recovered fast enough not to generate actual failure of any REST request or scheduled task. With version 1.3.0 when the error occurs it does not get recovered. I will then proceed with my plateform tests with version 1.2.0. But it would be great if this error could not occur at all ;-)

One other question, in my depency error check, I found that the postgres-socket-factory package v 1.2.0 contains in itself inconsistencies that i had to correct at my project level. The google-auth-library-oauth2 and the google-http-client have version conflicts within the postgres-socket-factory package. Is that normal ? Should I force the latest version of those 2 packages at my project level, or leave this google package with its own dependence inconsistencies ?.

@vbonne
Copy link
Author

vbonne commented Jun 30, 2021

I guess it will not add much information because most of this post is already in the previous post but by comparing the 2 projects i have, one running 1.2.0 and 1.3.0 i could see that the period when the error occurs are identical within the day. Once the error stops, they stop on both projects and the 1.3.0 then also recovers normal oprations. I mean that by oposition to what i already stated, the v1.3.0 does not stay locked out the SQL connection for ever.
For instance, today an error period occured from 15:55pm to 4:45pm (EST) meaning than the 1.3.0 project just kept failing during those 50 minutes, during the exact same 50 minutes the 1.2.0 project loggued connections errors from which it managed to recover and perform the requested tasks.
The fact that those 2 services are independant and logged the same connections difficulties during the same 50 minutes tends to indicate that the issue comes from the SQL instance itself, probably when the the ssl certificate is refreshed (even if the v1.2.0 project recovered from them) but it is quite strange that for both project the duration of the incident is identical, like if the the error was generated by the SQL instance itself.

@Maassmensch83
Copy link

Hi @vincenzo-mazzotta @Maassmensch83 and @vbonne, have these errors only started occurring when using the latest version, or do previous socket factory versions also cause the error to appear?

I can't answer this question because we are currently moving our application from on-prem to GCP and haven't done any dependency updates yet. We import the com.google.cloud:spring-cloud-gcp-dependencies:2.0.3 BOM and this gives us com.google.cloud.sql:jdbc-socket-factory-core:1.3.0.

@enocom
Copy link
Member

enocom commented Jul 6, 2021

@Maassmensch83 The errors are perhaps alarming, but as long as your app has no problem connecting with the database, they can be ignore. See #310 for context.

Thanks for the additional report @shanawaspm. For now, the recommendation is to pin to v1.2.3 while we determine the issue in v1.3.0.

@shanawaspm
Copy link

Thanks for the additional report @shanawaspm. For now, the recommendation is to pin to v1.2.3 while we determine the issue in v1.3.0.

thanks, I have now downgraded both postgres-socket-factory and jdbc-socket-factory-core to 1.2.3. So far it is working fine eveenthough the log has some errors regarding refreshing metadata.

@shubha-rajan
Copy link
Contributor

shubha-rajan commented Jul 7, 2021

Update: I have opened a draft PR with a possible solution that adds an IOExceptionHandler to the API calls, allowing the SQLAdmin Client to retry the requests that fail with IOExceptions. I've deployed two versions of the tiny app with and without the change, which have been running for ~9 hours. Still waiting to see whether the change reduces or eliminates the errors.

Although it looks like there are multiple types of errors that we're dealing with. I'm not sure if the root cause for the SSLException and IOException are the same, but it's possible they might be related.

@kurtisvg
Copy link
Contributor

kurtisvg commented Jul 8, 2021

Adjusting this to P2 since we have a workaround for the shortterm.

@kurtisvg kurtisvg added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed 🚨 priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Jul 8, 2021
@kurtisvg kurtisvg changed the title Connection lost to Google SQL from Cloud run "java.io.IOException: Error writing request body to server" error during create ephemeral certificate Jul 8, 2021
@vincenzo-mazzotta
Copy link

Hi to all,
some news about this bug on cloud run and sql cloud auth proxy?
Our app is in Java 8 not Java 11.
I attach our error logs that are a lot (10000 entries):
downloaded-logs-20210712-111732.zip
And attach logs all (10000 entries):
downloaded-logs-20210712-112217.zip

@vincenzo-mazzotta
Copy link

vincenzo-mazzotta commented Jul 12, 2021

Pardon I've seen now to downgrade library mysql-socket-factory to 1.2.3 version.
I added -Djavax.net.debug=all on JVM option to see errors about ssl and connections.
I will inform you in next hour.
For now I will show log with javax.net.debug enabled where you can see TLSv1.2 used in SSL connection opened connection as normally happened in 1.3.0 latest version of library.
In 1 hour I will attach log if there are errors or I will send a message to reply about this version 1.2.3 to fix lost connections in cloud run.
Actual log all correct at some minutes to go up Springboot as in log file:
(UPDATED FILE I WAS UPLOAD OLD FILE PARDON)
Using mysql-socket-factory 1.2.3 version downloaded-logs-20210712-125937.zip

Waiting more of 1 hour to see if continue to works fine.

@vincenzo-mazzotta
Copy link

The bug isn't solved using mysql-socket-factory 1.2.3 version please see attachment log error JVM8 can't acquire connections to SQL Server by sql auth proxy when it need to refresh connections:
Using mysql-socket-factory ONLY ERROR CONNECTION ACQUIRE downloaded-logs-20210712-132339.zip

I attach log messages around errors from 2021-07-12 13:19:48.956 CEST to 2021-07-12 13:22:50.046 CEST to see the problem:
Using mysql-socket-factory v1_2_3 verbose logs around errors downloaded-logs-20210712-133028.zip

@kurtisvg
Copy link
Contributor

@vincenzo-mazzotta This issue is about the error "java.io.IOException: Error writing request body to server". Your issue looks unrelated "java.io.EOFException: SSL peer shut down incorrectly".

@enocom
Copy link
Member

enocom commented Jul 12, 2021

@vincenzo-mazzotta If you're seeing this error repeatedly (for example across a few hours) and it affects connectivity, would you mind opening a new issue? Otherwise, you might be seeing a transient error.

@vincenzo-mazzotta
Copy link

I opened a CASE on GCP support because this error is present, as explained by GCP support, because ephimeral certificates are security certificates that are valid around 1h and then they need to refresh.
This problem happen on cloud functions too.
If we will have a fix by GCP support team I will inform you.

@enocom
Copy link
Member

enocom commented Jul 13, 2021

Thank you @vincenzo-mazzotta. Meanwhile, we're working on getting to the root cause of this issue. There's a chance your issue is related.

@enocom
Copy link
Member

enocom commented Jul 13, 2021

We've just cut a new release. I've been running it for almost 24 hours now and don't see any of these errors. Would you like to see if the new version fixes this issue?

cc @vbonne and @vincenzo-mazzotta

@enocom
Copy link
Member

enocom commented Jul 15, 2021

After running the new release, I'm still seeing the IOException. We're working on fixing this.

Presently, the workaround suggested in #528 (comment) (setting the max lifetime to zero) is probably the way to go until we find and fix this bug.

@enocom
Copy link
Member

enocom commented Jul 22, 2021

After some digging and a number of conversations on our team, I've managed to reproduce this issue in a much smaller scale and have found the root of the problem.

In short, this is not a bug in the socket factory. Instead, it's a result of running the socket factory on Cloud Run.

From the Cloud Run docs in the section Avoiding background activities:

When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers.

Running background threads can result in unexpected behavior because any subsequent request to the same container instance resumes any suspended background activity.

On a first request, the socket factory will initiate a connection to the Cloud SQL instance. It will also schedule a refresh of the associated credentials for that connection ~55 minutes in the future. What we're seeing here is that refresh operation failing on account of being throttled by Cloud Run.

The best solution in my opinion is to run the socket factory using a Unix socket. Note: in this approach you will need to create a Cloud SQL Connector.

@enocom enocom closed this as completed Jul 22, 2021
@kurtisvg
Copy link
Contributor

Let's leave this open until we update our README to reflect this

@kurtisvg kurtisvg reopened this Jul 22, 2021
@enocom
Copy link
Member

enocom commented Jul 22, 2021

Sounds good. I'll make that change.

@vbonne
Copy link
Author

vbonne commented Jul 25, 2021

The best solution in my opinion is to run the socket factory using a Unix socket. Note: in this approach you will need to create a Cloud SQL Connector.

Hi @enocom, I confirm, not a single error in 2 days working with unix socket and version 1.3.1 . Thanks all for your investigation and solution.

@shubha-rajan
Copy link
Contributor

I'm going to close this issue for now. We just merged #561 which greatly improves the application's ability to recover from errors when using TCP sockets, and also recommend using Unix sockets on Cloud Run and Functions if you can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
8 participants