Support "unix:///" scheme in Dial #1741

Closed
menghanl opened this issue Dec 14, 2017 · 8 comments
Labels: P2, Type: Feature (New features or improvements in behavior)

Comments

@menghanl
Contributor

menghanl commented Dec 14, 2017

As defined in https://github.com./grpc/grpc/blob/master/doc/naming.md

@dfawley
Member

dfawley commented Feb 2, 2018

@menghanl can you provide an ETA or a work estimate for this feature request?

@aequitas

I think I am stumbling on this. I am trying to get an embedded Etcd server running over unix domain sockets in mgmt (https://github.com./purpleidea/mgmt/) but am running into this error:

WARNING: 2018/02/20 13:24:57 grpc: addrConn.createTransport failed to connect to {clients.sock:0 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: lookup clients.sock: no such host". Reconnecting...

I have boiled it down to:

return dialContext(ctx, "tcp", addr)

Here a TCP dialer is created if none exists yet, even though a correct dialer is passed in via: https://github.com./coreos/etcd/blob/3903385d1b50c26ec0c18e99fc55be33ee0d97e3/clientv3/client.go#L272 (as far as I can tell; my Go knowledge is still lacking in this area).

Changing this from "tcp" to "unix" makes it work, but obviously breaks TCP support. I have not yet found a way to properly inspect the opts passed into DialContext in order to make a logical switch here. Also, Etcd passes in the target without a scheme, but adding one makes no difference.
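The "logical switch" described above could look roughly like the following sketch. It assumes the "unix://" scheme is still present on the address when the dialer is called, which, per this thread, is not what Etcd currently passes. The names networkFor and dial are hypothetical and are not part of gRPC or Etcd.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
)

// networkFor is a hypothetical helper. It chooses the network type from the
// scheme prefix: "unix://" selects a unix domain socket, anything else is
// treated as a TCP address (an assumption made for this sketch).
func networkFor(target string) (network, addr string) {
	const unixPrefix = "unix://"
	if strings.HasPrefix(target, unixPrefix) {
		return "unix", strings.TrimPrefix(target, unixPrefix)
	}
	return "tcp", target
}

// dial connects using the network selected by networkFor instead of a
// hard-coded "tcp".
func dial(ctx context.Context, target string) (net.Conn, error) {
	network, addr := networkFor(target)
	var d net.Dialer
	return d.DialContext(ctx, network, addr)
}

func main() {
	// Only the network selection is shown; nothing is dialed here.
	for _, t := range []string{"unix://clients.sock:0", "localhost:2379"} {
		network, addr := networkFor(t)
		fmt.Printf("%s -> network=%s addr=%s\n", t, network, addr)
	}
}
```

This keeps TCP working for scheme-less addresses, but it only helps if the scheme reaches the dialer in the first place.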

A similar error message is generated when running Etcd directly using unix sockets:

# etcd --listen-peer-urls unix://etcd:0 --listen-client-urls unix://etcd:1 --advertise-client-urls unix://etcd:1

WARNING: 2018/02/20 13:34:20 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: lookup etcd on 10.5.8.1:53: no such host"; Reconnecting to {etcd:0 0  <nil>}

Is this due to a lack of support for unix domain sockets in grpc? Or is Etcd/mgmt invoking grpc incorrectly here?

@menghanl
Contributor Author

menghanl commented Feb 20, 2018

@aequitas What's the target you are dialing to?
I assume it is "unix://etcd-address" for unix domain socket and "http://etcd-address" for tcp, is that right?

If my assumption is correct, I think the problem is that the etcd dialer you pointed to assumes gRPC doesn't parse the target, so it expects to still receive the host as "unix://etcd-address".
But this is no longer true, because gRPC now parses the target and passes "etcd-address" (without the scheme) to it.

Please try this workaround and see if it works: use "passthrough:///unix://etcd-address" as the target you are dialing to.
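To illustrate why the "passthrough:///" prefix might help, here is a simplified sketch of splitting a target of the form scheme://authority/endpoint, the general shape described in the naming document linked at the top of this issue. splitTarget is a hypothetical helper written for this example; it is not grpc-go's actual parser and its edge-case behavior may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// target holds the three parts of a "scheme://authority/endpoint" name.
type target struct {
	Scheme, Authority, Endpoint string
}

// splitTarget is a hypothetical, simplified splitter. If the input does not
// match "scheme://authority/endpoint", the whole string is kept as the
// endpoint (an assumption made for this sketch).
func splitTarget(s string) target {
	scheme, rest, ok := strings.Cut(s, "://")
	if !ok {
		return target{Endpoint: s}
	}
	authority, endpoint, ok := strings.Cut(rest, "/")
	if !ok {
		return target{Endpoint: s}
	}
	return target{Scheme: scheme, Authority: authority, Endpoint: endpoint}
}

func main() {
	for _, s := range []string{
		"dns:///etcd-address:2379",
		"passthrough:///unix://etcd-address",
	} {
		fmt.Printf("%q -> %+v\n", s, splitTarget(s))
	}
}
```

Under this reading, the endpoint handed onward for the passthrough target is still "unix://etcd-address", so a dialer that expects to see the scheme would receive it intact.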

@aequitas

@menghanl thanks for the quick response.

As far as I can tell, Etcd passes only the 'host:port' part to DialContext. When Etcd is given the URL unix://etcd-address:0, only etcd-address:0 is passed on (Etcd requires the host:port format even for unix addresses due to input validation; the socket file it creates is named host:port in the current directory).
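A small sketch of how the scheme can get lost, assuming the URL is parsed with Go's standard net/url package and only the host part is forwarded. The thread does not show Etcd's actual parsing code, so this is an illustration only; splitListenURL is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitListenURL is a hypothetical helper that separates the scheme from the
// host:port part of a URL such as "unix://etcd-address:0".
func splitListenURL(raw string) (scheme, hostPort string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	return u.Scheme, u.Host, nil
}

func main() {
	scheme, hostPort, err := splitListenURL("unix://etcd-address:0")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// If only hostPort is passed on, the receiver can no longer tell that a
	// unix domain socket was requested.
	fmt.Printf("scheme=%q host:port=%q\n", scheme, hostPort)
}
```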

I tried adding the unix:/// and passthrough:///unix:// prefix to host on this line: https://github.com./coreos/etcd/blob/master/clientv3/client.go#L352 and sctx.addr in https://github.com./coreos/etcd/blob/master/embed/serve.go#L195 but it did not completely solve the problem. It has now moved from resetTransport to createTransport:

WARNING: 2018/02/20 20:46:23 grpc: addrConn.createTransport failed to connect to {clients.sock:0 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: lookup clients.sock: no such host". Reconnecting...

I missed the difference between these error messages when debugging earlier, and also that there was a second entry point in etcd/embed/serve.go, so I will pick up my investigation with this information. If you have any more pointers, I'd be glad to hear them.

@menghanl
Contributor Author

@aequitas At this stage, I think it would be better to file an issue in the etcd repo and ask about this there. It's not clear to me what might have caused this problem.
Please leave a comment here if anything is needed from gRPC. Thanks.

@gyuho
Contributor

gyuho commented Feb 20, 2018

@aequitas Can you file an issue in the etcd repo?

I was planning to debug this as well, while testing etcd with unix sockets.

Thanks.

@aequitas

@menghanl will do, thanks for the help so far.

@gyuho sure thing: etcd-io/etcd#9340

@lyuxuan
Contributor

lyuxuan commented Jun 13, 2018

The issue here should be covered by #1911. Closing this now.

@lyuxuan lyuxuan closed this as completed Jun 13, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Dec 10, 2018

5 participants