-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RSDK-8292] Create Single App Connection and Use for Net Appender and Restart Checker #4746
Changes from 7 commits
09186d8
e18211c
1306520
5e5a791
7543b72
b333e48
8827cd4
67d8e55
1394587
88e576b
c9d46a9
a974c16
1fd26c1
0511a68
b87a1b5
5c4f7d3
c4799cf
11aa86f
92ceb64
a8a5af3
1baabf6
eb11c9e
6f3f163
a3b9956
6ecf9ef
6303b44
c8c1b5d
43d718f
1963013
5e2b24b
59101de
e2a592e
c626935
51a9b3d
2e9033b
31dee57
f1615a2
f035a59
93b7314
48dc67b
f0c8f07
40e87cf
3ab3a53
ded9d43
ff6953e
3da6860
8f67114
725d4d9
24ba32f
728d692
78fa924
b477b5a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
package grpc | ||
|
||
import ( | ||
"context" | ||
"net/url" | ||
"time" | ||
|
||
"github.com/pkg/errors" | ||
|
||
"go.viam.com/rdk/config" | ||
"go.viam.com/rdk/logging" | ||
"go.viam.com/utils/rpc" | ||
) | ||
|
||
type AppConn struct { | ||
ReconfigurableClientConn | ||
|
||
// Err stores the most recent error returned by the serialized dial attempts running in the background. It can also be used to tell | ||
// whether dial attempts are currently happening; If err is a non-nil value, dial attempts have stopped. Accesses to Err should respect | ||
// ReconfigurableClientConn.connMu | ||
Err error | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we use this anywhere? If not, would prefer to not have it at all There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right now, no. I'll get rid of it. I only included it in this PR bc I felt bad throwing away the error every time but agree that it has no use atm. |
||
} | ||
|
||
func NewAppConn(ctx context.Context, cloud *config.Cloud, logger logging.Logger) (*AppConn, error) { | ||
grpcURL, err := url.Parse(cloud.AppAddress) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
dialOpts := dialOpts(cloud) | ||
|
||
if grpcURL.Scheme == "http" { | ||
dialOpts = append(dialOpts, rpc.WithInsecure()) | ||
} | ||
|
||
appConn := &AppConn{} | ||
|
||
// a lock is not necessary here because this call is blocking | ||
appConn.conn, err = rpc.DialDirectGRPC(ctx, grpcURL.Host, logger, dialOpts...) | ||
if err != nil { | ||
if errors.Is(err, context.DeadlineExceeded) { | ||
go func() { | ||
for { | ||
appConn.connMu.Lock() | ||
|
||
ctxWithTimeOut, ctxWithTimeOutCancel := context.WithTimeout(context.Background(), 5*time.Second) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can't use the context.Context passed to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. using the context passed in would be good because then a context cancellation stemming from killing the viam-server will propagate properly |
||
|
||
appConn.conn, err = rpc.DialDirectGRPC(ctxWithTimeOut, grpcURL.Host, logger, dialOpts...) | ||
if errors.Is(err, context.DeadlineExceeded) { | ||
appConn.connMu.Unlock() | ||
|
||
// only dial again if previous attempt timed out | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. context.DeadlineExceeded the only error you get if you're in low-connectivity environments? or if you were out and then in of wifi range? I would be ok with this routine running in the background for as long as viam-server is running no matter what the error since this connection is so important There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (the routine running in the background for the entire lifetime of the viam-server while we don't have a connection) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it. Should we log this error? Only log if it's not a timeout? Always log? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok to always log at debug level, may reassess if it's really annoying |
||
continue | ||
} | ||
|
||
ctxWithTimeOutCancel() | ||
|
||
appConn.Err = err | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For now, I'm handling errors returned by the background dials by simply storing them in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. see my other question around this, I'm not sure if we need this field (yet) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. eventually we'd want to expose whether the connection is connected or not, but that can come later (the connection being not nil does not tell us whether it is disconnected or not) |
||
|
||
break | ||
} | ||
|
||
appConn.connMu.Unlock() | ||
}() | ||
} else { | ||
return nil, err | ||
} | ||
} | ||
|
||
return appConn, nil | ||
} | ||
|
||
func dialOpts(cloud *config.Cloud) []rpc.DialOption { | ||
dialOpts := make([]rpc.DialOption, 0, 2) | ||
// Only add credentials when secret is set. | ||
if cloud.Secret != "" { | ||
dialOpts = append(dialOpts, rpc.WithEntityCredentials(cloud.ID, | ||
rpc.Credentials{ | ||
Type: "robot-secret", | ||
Payload: cloud.Secret, | ||
}, | ||
)) | ||
} | ||
return dialOpts | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,7 @@ import ( | |
"go.viam.com/utils/rpc" | ||
|
||
"go.viam.com/rdk/config" | ||
"go.viam.com/rdk/grpc" | ||
"go.viam.com/rdk/logging" | ||
"go.viam.com/rdk/resource" | ||
"go.viam.com/rdk/robot" | ||
|
@@ -191,13 +192,16 @@ func RunServer(ctx context.Context, args []string, _ logging.Logger) (err error) | |
// Start remote logging with config from disk. | ||
// This is to ensure we make our best effort to write logs for failures loading the remote config. | ||
if cfgFromDisk.Cloud != nil && (cfgFromDisk.Cloud.LogPath != "" || cfgFromDisk.Cloud.AppAddress != "") { | ||
ctxWithTimeout, ctxWithTimeoutCancel := config.GetTimeoutCtx(ctx, true, cfgFromDisk.Cloud.ID) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would actually create the ctx with timeout inside the NewAppConn function, since I would consider the lifetime of AppConn to be the lifetime of the viam-server. so it'd make more sense to just pass in the server context and then AppConn can branch off of it if needed |
||
appConn, err := grpc.NewAppConn(ctxWithTimeout, cfgFromDisk.Cloud, logger) // TODO(RSDK-8292): [q] what logger should I pass here? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [qu] There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. a sublogger, possibly |
||
ctxWithTimeoutCancel() | ||
netAppender, err := logging.NewNetAppender( | ||
&logging.CloudConfig{ | ||
AppAddress: cfgFromDisk.Cloud.AppAddress, | ||
ID: cfgFromDisk.Cloud.ID, | ||
Secret: cfgFromDisk.Cloud.Secret, | ||
}, | ||
nil, false, logger.Sublogger("networking").Sublogger("netlogger"), | ||
appConn, false, logger.Sublogger("networking").Sublogger("netlogger"), | ||
) | ||
if err != nil { | ||
return err | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that an
AppConn
instance attempts to establish a connection in a background Goroutine, all calls to its methods would require a nil check and the acquisition of a lock. This code/functionality is already written out in theReconfigurableClientConn
, so I figured it'd be best to just reuse that.