Skip to content

Optionally listen on a UNIX socket instead of TCP #126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 31, 2016
Merged

Optionally listen on a UNIX socket instead of TCP #126

merged 4 commits into from
Dec 31, 2016

Conversation

muggenhor
Copy link
Contributor

When you now pass --unix $SOCK_NAME it'll bind to that path as a Unix
socket permitting Cucumber to use that as a connection point. For that
the 'unix: ' option needs to be put in the cucumber.wire file.

PS it could be implemented without the shared pointers on C++11 by taking advantage of move support. But I suspect the aim is to support C++98 given the minimal boost versions required by the CMakeLists.txt.

@paoloambrosio
Copy link
Member

Before reviewing the code... did you forget to commit tests for the new functionality you've implemented?

@muggenhor
Copy link
Contributor Author

@paoloambrosio: no I didn't write any at that time. Hopefully I can pick that up this week otherwise it'll have to wait another two.

@muggenhor
Copy link
Contributor Author

@paoloambrosio I've added a connection test to WireServerTest now. It's pretty basic, but so is the functionality.

When you now pass --unix $SOCK_NAME it'll bind to that path as a Unix
socket permitting Cucumber to use that as a connection point. For that
the 'unix: ' option needs to be put in the cucumber.wire file.
@paoloambrosio
Copy link
Member

Really liked most of your code, and the fact that you have even rebased this PR 👍

Didn't like though the coupling between TCP and Unix servers. Have made a commit in my personal repository refactoring your code introducing a class template to separate the two socket implementations. I think tests are a bit cleaner as well.

I've also changed the logic on when to remove the socket, simplifying it. It will remove it on a clean exit, but it won't try to remove it if it exists (bind will fail).

Feel free to add that commit to this PR, or let me know what you think.

@muggenhor
Copy link
Contributor Author

muggenhor commented Dec 29, 2016

I like the basic idea of your approach: using two separate types for the two different socket types. But I don't think two different base types (which the template generates) are necessary. I think that a base class like this provides for some slightly nicer code (public interface only):

struct SocketServer {
    virtual void acceptOnce() = 0;
};

Basically what I'm trying to achieve here is a reasonable amount of type erasure.

It's mostly an abstract idea right now, but let me come up with something, probably somewhere during the day tomorrow I'll submit something.

@paoloambrosio
Copy link
Member

I tried to use a single base class instead of the template class, but couldn't get it to look clean.

listen and listenEndpoint could be moved to the subclasses, removing the need for EndpointType, but I'm not sure I know how to remove ProtocolType::acceptor and ProtocolType::iostream without making a mess :-)

Hope you have a better idea!

@muggenhor
Copy link
Contributor Author

Turned out it was pretty easy to do without the template class by just using two template member functions. And those only to reduce duplication, not to solve any other kind of problem.

Tell me what you think.

Copy link
Member

@paoloambrosio paoloambrosio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a better look later, but the refactoring looks good!

Note: coding style needs fixing (upper camel case methods).

boost::filesystem::remove(unixPath);

DoListen(acceptor, stream_protocol::endpoint(unixPath));
socketToRemove = unixPath;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you have not pulled my simplified socket removal logic (remove socket when the server is destroyed), and kept your quite aggressive one (on top of mine, remove on startup any socket already present and then remove it again).

Do you disagree on when the file should be removed? I think it's bad practice to remove something that is already there, as it could belong to another instance already running. I believe common practice is to fail if the socket already exists.

My commit was also removing socketToRemove that becomes irrelevant given that if can be inferred by the endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the socket file on startup is actually pretty common practice for unix sockets (I'm not sure whether it's more or less prevalent than failing on startup though). Necessarily so, because something like reuse_addr (permit binds on existing endpoints that nothing is currently bound to) doesn't exist for unix sockets. Not doing this requires manual intervention, which for server startup is probably not desired. I do have the file-type checking there to prevent accidental deletion of files containing data though.

Note that interference with other programs shouldn't happen if the socket-file-name is chosen wisely (which is a task for the user in this case, choosing a unique name) and interfering with other instances of the same cucumber-cpp program can only happen if it's started twice, which is also something a user shouldn't do.

I missed the removal of socketToRemove, added it now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done some digging and it is more common to remove what is there (sometimes not even checking is it's a socket). So I'm happy to remove it if present before binding and at the end.

BTW, SO_REUSEADDR in TCP is related to the TIME_WAIT state so I would not compare the two.


// when
(void)stream_protocol::iostream(server->listenEndpoint()); // connect and disconnect
UnixSocketServerTest::TearDown();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you create a protected method

void tearDownServer() {
    delete server;
    server = NULL;
}

then the TearDown method can be implemented only in SocketServerTest, removing both implementations on subclasses and the warning on when it must be called.

Copy link
Contributor Author

@muggenhor muggenhor Dec 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I already felt the ordering stuff placed in the inherited TearDown was nasty. This is nicer.

Additionally I added the ability to cancel from another thread, that way it doesn't depend on a connect/disconnect cycle.

@muggenhor
Copy link
Contributor Author

Hmm, I don't know why, but for some reason on OSX one of the two test runs fails almost all the time. But rarely the same one, it's swapping between GMock 1.7 and 1.8 for failing. And every time on the same test case. But not in an explainable way. This feels really race-condition-like to me, but through code inspection I cannot find it, nor do I have any OSX machines myself to test it (and using Travis for debugging is just not a decent option).

@paoloambrosio
Copy link
Member

As written in another comment, I'm happy to remove the Unix socket if present before binding and at the end, but I'm really confused by all the logic that you have put around removing it.

Questions:

  • How can you create an abstract socket using the command line? I'd rather not introduce feature bloating, and in this case I don't think Cucumber-Ruby supports it, even if we were to introduce it in Cucumber-CPP
  • Why is removeUnixSocket() called in UnixSocketServer::listen given that it should not do anything? When listen(...) is called, the acceptor should not be open. I'd rather forbid listening if the acceptor is open, to prevent it from being called twice.

@muggenhor
Copy link
Contributor Author

muggenhor commented Dec 30, 2016

Regarding abstract sockets: I agree, there's no point in supporting it from the command line. But do we want to terminate in a bad way when someone's created them (by implementing their own main)? I.e. fs::remove() will throw if you try to delete an abstract socket with it. So that's the only reason for detecting them when removing the socket. That's also the only reason why a test case is necessary for that one.

As for the call to removeUnixSocket() from listen: true, that's only necessary when you want to interpret a new listen() call as rebinding. If you remove that rebinding and instead throw when trying to do that removeUnixSocket's contents can be moved completely to the destructor.

@paoloambrosio
Copy link
Member

This class is not distributed as part of an external library but only as part of Cucumber-CPP. We should work in an agile way and provide the implementation that fits our purpose. If there is a need for abstract sockets in the future, someone can send a PR to target that specifically.

Even for the re-binding, I'd rather do the simplest thing first and change it later if needed. This also has the advantage of minimising the effort writing and reviewing PRs, as well as reducing the amount of code that needs to be maintained. Of course it's on a case-by-case basis :-)

@paoloambrosio
Copy link
Member

Good(-ish) news about OSX: I can reproduce it on a Macbook. I'll try and find why it is failing.

acceptor.async_accept(*stream->rdbuf(),
boost::bind(&SocketServer::processStream,
this, boost::asio::placeholders::error, stream));
}
Copy link
Member

@paoloambrosio paoloambrosio Dec 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that you changed this. Why was it needed? Doesn't it make code more complicated?

I've dedicated hours to this PR and if things keeps changing for no reason it is less likely to get merged... and I'd like to merge it ASAP.

Copy link
Contributor Author

@muggenhor muggenhor Dec 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be able to call .cancel() on the acceptor. Which is because you can only cancel asynchronous requests, not synchronous ones.

This is only necessary for the single test case that verifies socket file deletion. I agree this is messy, I'll revert to the connect, then immediately disconnect version.

@muggenhor
Copy link
Contributor Author

muggenhor commented Dec 30, 2016

The last version (b162c41) doesn't contain the abstract namespace stuff. Only the single extra unix-socket-file deletion testcase and throws when attempting to bind twice. No other differences.

@paoloambrosio
Copy link
Member

It looks good. I'll take a look at the build error on OSX hopefully tomorrow.

@paoloambrosio
Copy link
Member

Wasn't able to understand exactly why we have random failures on OSX. It looks like a race condition of some sort in the tests, given that the first UnixSocketServerTest works by adding serverThread->yield() after client.connect(server->listenEndpoint());.

Wasn't able to get the second one to work though without having traffic flowing in the socket. So I've created a test that exercises the entire journey. I ran it on a loop without errors thousands of times in isolation, and dozens with the other tests.

Can you pull my commit into this PR? Please do not squash commits as it's important to keep track of what was done.

@muggenhor
Copy link
Contributor Author

I noticed it was throwing from the accept() call as well. But for as far as I can see at that point we don't access any shared datastructure, and the sockets themselves, being OS-level primitives, should be safe to access from a different thread than the one that created it. As far as I can see it's never accessed before being created and bound anyway (we call listen before even creating the thread). So this just leaves me confused. I suspect that if we'd use only asynchronous IO inside SocketServer and a single thread this problem would disappear. But without using coroutines (available only in a much more recent Boost version) that would complicate the code quite a bit (loads of callbacks and binding them) only to the advantage of the tests.

Anyway thanks for your time and help with this. Polishing this proved to be a bit more time consuming than I expected.

@paoloambrosio paoloambrosio merged commit edb9060 into cucumber:master Dec 31, 2016
@paoloambrosio
Copy link
Member

Merged! Well done! 👍

It might have looked like a simple PR but it was not. It's clear that you know what you are doing, but it takes time to learn how to collaborate remotely on a project, especially given that we are doing it on our free time.

@muggenhor muggenhor deleted the unix-socket-support branch December 31, 2016 19:29
@muggenhor
Copy link
Contributor Author

True, it's been a while since remote communication was so important for me out of work (where video conferences and IM make things a lot easier). And while I usually know what I've done, most of the time it's still a discovery process to find the right solution. Your feedback helped a lot with that, so thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants