The Qpid Proton C transport code is responsible for receiving and sending all the protocol bytes into and out of the Proton-C engine. It encapsulates all the protocol processing. Effectively turning the wire level protocol into more abstract AMQP model state changes and vice versa turning AMQP model state changes into wire level protocol to send.
The transport code is divided into 3 layers that divide up the protocol processing:
- SSL
- SASL
- AMQP
Loosely speaking, SSL is mostly associated with encrypting the connection, SASL with authenticating the connection, and of course AMQP carries the actual messaging protocol we're "really" interested in. However despite common knowledge, actually SSL and SASL can both do encryption and authentication of the connection. It's important to note that the SSL and SASL layers are optional in the sense that if authentication and/or encryption are not required then they need not be used - This will be covered in more detail later on.
In the current Proton codebase (0.8) the SSL layer is fairly comprehensively implemented (using the OpenSSL and Windows SChannel implementations) and allows access to both encryption and authentication (although SSL authentication is somewhat complex to set up and use).
Conversely the SASL layer is very rudimentary and essentially punts to the application to handle the SASL protocol interchange. The SASL code does have some built in capability to use the ANONYMOUS mechanism and some code to simplify implementing the PLAIN SASL protocol exchange. Using the Proton-C 0.9 SASL codebase requires the application to directly read the SASL protocol bytes from the peer process them and send raw protocol bytes back to the peer.
I have been re-implementing the SASL layer using the Cyrus-SASL library which supports many SASL mechanisms via a range of plugins (including ANONYMOUS, EXTERNAL, PLAIN, DIGEST-MD5, CRAM-MD5, GSSAPI etc.) and which is extensible to new SASL mechanisms by writing new mechanism plugins.
This work has led me re-evaluate some of the transport API as a whole as well as more specifically the SASL API which more directly needs reworking because of these changes.
Specifically I'd like to somewhat simplify using authentication and encryption with the transport by unifying the most commonly used concepts into the transport code and only requiring delving directly into the SSL/SASL APIs for more complex uses.
API Changes
From the authentication point of view the API flow is much simpler than before:
Instead of the Proton-C 0.9 situation where the application has to directly read and write AMQP SASL frames using pn_sasl_recv()
, pn_sasl_send()
etc. the meat of the protocol interchange happens behind the scenes; the application only has to set up authentication parameters, and then allow Proton-C to handle SASL without application intervention.
At the client the parameters that can be set up include username/password (with pn_connection_set_user/pn_connection_set_password()
), if necessary the SASL mechanisms used can be set (pn_sasl_allowed_mechs()
).
By default the server will adapt to whatever layers the client uses to communicate. But the server application can specify that the connection must be encrypted or authenticated (using pn_transport_require_encryption(
) and pn_transport_require_auth()
). Also if required the server application can force the SASL layer to use the ANONYMOUS mechanism or to exclude some mechanisms that might be installed on the system (by using pn_sasl_force_anonymous()
or pn_sasl_exclude_mechs()
). The location and name of the configuration file used by the SASL implementation can be changed by using the pn_sasl_config_name()
and pn_sasl_config_path()
APIs.
The outcome of the authentication process at both server and client sides is indicated by a transport event. In the case of authentication failure it will be a PN_TRANSPORT_ERROR
event, and the case of success it will a PN_TRANSPORT_AUTHENTICATED event. Once the authentication event has been received the server can discover who it is talking to by using the pn_transport_get_user()
API. If required the mechanism used can be recovered using pn_sasl_get_mech()
.
Note that on the server you can also use the PN_CONNECTION_REMOTE_OPEN event to signal that authentication has succeeded if you don't need to deal with authentication by itself.
SASL
The biggest functional changes happen in the SASL layer code where the API has largely changed, very little backwards compatibility has been kept, because it is fairly clear that very few (if any) people have been using the current API to implement their own SASL mechanisms.
If this turns out to be untrue, then we could add some further measure of backwards compatibility as required.
Functions Removed
void pn_sasl_client(pn_sasl_t *sasl)
void pn_sasl_server(pn_sasl_t *sasl)
These functions were used to specify the SASL layer as either a client or server, they have been deprecated since functionality went into the transport code to specify whether it is an authentication client or server. Currently, on creation, the SASL layer will determine from the transport whether it should be a server or a client. As there is a good deal of SASL churn in these current changes this is a good opportunity to remove these deprecated APIs.
pn_sasl_state_t pn_sasl_state(pn_sasl_t *sasl)
size_t pn_sasl_pending(pn_sasl_t *sasl)
ssize_t pn_sasl_recv(pn_sasl_t *sasl, char *bytes, size_t size)
ssize_t pn_sasl_send(pn_sasl_t *sasl, const char *bytes, size_t size)
These functions are the ones which support the client directly reading and writing the SASL protocol frames. They are not needed as this functionality has been removed and superseded by internal functionality.
void pn_sasl_mechanisms(pn_sasl_t *sasl, const char *mechanisms)
const char *pn_sasl_remote_mechanisms(pn_sasl_t *sasl)
These functionality here has been replaced by pn_sasl_exclude_mechs()
and pn_sasl_get_mech()
.
void pn_sasl_plain(pn_sasl_t *sasl, const char *username, const char *password)
This function is a helper function that constructs and sends a SASL PLAIN mechanism frame. As directly sending the frames is no longer the responsibility of the application it is not needed, however it is broadly replaced with pn_transport_set_user_password()
which allows the client to set the authentication username and password.
void pn_sasl_allow_skip(pn_sasl_t *sasl, bool allow)
This function is replaced with
although the sense is reversed. By default anonymous connections are allowed. This is not really a change as previously there was no API guard against the SASL ANONYMOUS mechanism which is no better authenticated than not using SASL at all.pn_transport_require_auth(pn_transport_t *transport, bool required)
Functions Added
const char *pn_sasl_get_user(pn_sasl_t *sasl)
This is usually used at the the server end to find the name of the authenticated user. On the client it will merely return whatever user was passed in topn_transport_set_user_password()
.
The returned value is only reliable after thePN_TRANSPORT_AUTHENTICATED
event has been received. If the SASL layer was not negotiated then 0 is returned If the ANONYMOUS mechanism is used then the user will be "anonymous" Otherwise a string containing the authenticated user is returned.const char *pn_sasl_get_mech(pn_sasl_t *sasl)
Return the selected SASL mechanism: The returned value is only reliable after thePN_TRANSPORT_AUTHENTICATED
event has been received.void pn_sasl_allowed_mechs(pn_sasl_t *sasl, const char *mechs)
Specify SASL mechanisms that are to be considered for authentication. This can be used on either the client or the server to restrict the SASL mechanisms that may be used.
Multiple allowed mechanisms are space separated in the string that is passed into this function.
As a special case on the client if the only mechanism allowed is "ANONYMOUS" then this will activate a mode ("Pipelined mode") where the client will assume that the server will accept anonymous SASL connections and will not wait for the SASL protocol interchange, but will just send all the relevant SASL frames unconditionally.
However an important note is that if this is set on a client and the connected server really does require authentication then the server will fail and disconnect the connection with no useful feedback to the application as the SASL layer state will already be assuming it has authenticated.
There is no similar server pipelined mode as it is not possible together with detecting and adapting to whatever AMQP protocol layers the connecting client is using.void pn_sasl_config_name(pn_sasl_t *sasl, const char *name)
This is used to construct the SASL configuration filename. In the current implementation ".conf" is added to the name and the file is looked for in the configuration directory.
If not set it will default to "proton-server" for a sasl server and "proton-client" for a client.void pn_sasl_config_path(pn_sasl_t *sasl, const char *path)
This is used to tell SASL where to look for the configuration file. In the current implementation it can be a colon separated list of directories.
The environment variable PN_SASL_CONFIG_PATH can also be used to set this path, but if both methods are used then this pn_sasl_config_path() will take precedence.
If not set the underlying implementation default will be used.
Transport
Functions Added
const char *pn_transport_get_user(pn_transport_t *transport)
This is usually used at the the server end to find the name of the authenticated user, On the client it will merely return whatever user was passed in to
pn_connection_set_user()
.The returned value is only reliable after the PN_TRANSPORT_AUTHENTICATED event has been received.
If a the user is anonymous (either no SASL layer is negotiated or the SASL ANONYMOUS mechanism is used) then the user will be "anonymous", otherwise a string containing the user is returned.
void pn_transport_require_auth(pn_transport_t *transport, bool required)
Set whether a non authenticated transport connection is allowed.
There are several ways within the AMQP protocol suite to get unauthenticated connections:
Use no SASL layer (with either no SSL or SSL without client certificates)
Use an SASL layer but the ANONYMOUS mechanism
The default if this option is not set is to allow unauthenticated connections.
bool pn_transport_is_authenticated(pn_transport_t *transport)
Returns whether the transport is authenticated.
This property may not be stable until after the PN_CONNECTION_REMOTE_OPEN event is received.
void pn_transport_require_encryption(pn_transport_t *transport, bool required)
Set whether a non encrypted transport connection is allowed.
There are several ways within the AMQP protocol suite to get encrypted connections:- Use SSL
- Use SASL with a mechanism that supports security layers
bool pn_transport_is_encrypted(pn_transport_t *transport)
Returns whether the transport is encrypted.
This property may not be stable until after the PN_CONNECTION_REMOTE_OPEN event is received.
Connection
Functions Added
void pn_connection_set_user(pn_connection_t *connection, const char *user)
Set the authentication username for a client transport
If not set then no authentication will be negotiated unless the client sasl layer is explicitly created (this would be for something like Kerberos where the credentials are implicit in the environment, or to explicitly use the ANONYMOUS SASL mechanism).
void pn_connection_set_password(pn_connection_t *connection, const char* password)
Set the authentication password for a client transport. Note that there is no way to retrieve the password from the API as them implementation is supposed to use it and then zero it out of memory as soon as it is finished with it.
pn_connection_get_user(pn_connection_t *connection)const char *
Get the authentication username for a client transport.
New Events
- PN_TRANSPORT_AUTHENTICATED
This event is sent when a server successfully authenticates a client connection (or when it accepts an unauthenticated connection that does not need to be authenticated).On the client it is sent when the client successfully authenticates to the server. If encryption is configured for the connection, it will be also be established by the point at which this event is received. If the authentication or encryption handshake is unsuccessful then both server and client applications will receive aPN_TRANSPORT_ERROR
event.
Authentication and Encryption Combinations
This table summarises the possible combinations of SSL and SASL use. The implicit assumption of the table is that there is no point in doing double encryption or double authentication so this is not captured in the table. Double encryption would be using both SSL and SASL to encrypt the connection and likewise double authentication would be using both for authentication.
Classically in protocols where SASL was not optional the way to avoid double authentication was to use the EXTERNAL SASL mechanism. With AMQP, SASL is optional, so if SSL is used for client authentication the SASL layer could be entirely omitted and so using EXTERNAL is not necessary.
Similarly in protocols where the SASL layer is not optional the ANONYMOUS mechanism is used when client authentication is not required. With AMQP as the SASL layer can just be entirely omitted, and the SSL layer if present used only for encryption (and server authentication).
↓Auth\Encryp→ | None | SSL | SASL |
---|---|---|---|
None | 1. | Privacy only | |
SSL | 2. | ||
SASL | Auth only | 4. | 3. |
In this table "auth" means client authentication.
- Entries in pink aren't allowed/don't make any sense
- All other entries have some use
The numbered boxes are the combinations that are reasonably important:
- Anonymous, "public" connection
- Encrypted connection with client certificate
- Kerberos or DIGEST-MD5
- Usually SSL encryption with PLAIN authentication.
- All of the SSL encryption possibilities have some use and all could be used to authenticate the server for the client to avoid "man-in-the-middle" attacks.
- The right 2 columns correspond to an encrypted connection
- The bottom 2 rows correspond to an authenticated connection
Current State of Code
A substantial portion of this work is complete and ready to be included for the 0.10 Proton release (subject to code review of course). You can follow along with my changes by looking in my Github Proton repository:
https://github.com/astitcher/qpid-proton/commits/PROTON-334
The items of work that will remain to be completed are:
- Tie the SSL code into the more unified transport API:
The current code base has no changes in the SSL code for these changes. - Consequently,
pn_transport_require_encryption()
is not implemented. - There needs to be a way to inject a byte stream that already comes from SSL without autodetection
as this is how the qpidd broker will be able to use SSL with this SASL implementation. - Using SASL for encryption is currently not implemented.
- Example code.
- An implementation of the PLAIN SASL mechanism that can be used when Cyrus SASL is not available.
This will involve some application interaction as the application will have to authenticate the user, password combination itself and signal to proton whether the incoming connection is authenticated or not.
The likely mechanism will be to send a transport event and for the application to use pn_sasl_done() to signal the outcome.
27 Comments
Robbie Gemmell
I'm going to post my comments here and via email, as I dont think many (except maybe you) will actually see them on the wiki ;)
Assuming I've read the email thread correctly, you do plan on implementing EXTERNAL so that clients can be authenticated using SSL client auth + EXTERNAL. I think that is a good idea, as even though it can be ommitted not all brokers or clients will want to support doing so. There may also be cases where implementations handle the SSL by themselves and only want to do SASL via proton (albeit by reading/writing bytes currently). Possibly less so on the C side, but almost everyone seems to do that with Proton-J. Mentioning Proton-J, are there any plans there?
The completion of the SASL stage only being indicated by an event, combined with removal of the old read/write sasl bytes methods, seems to end any prospect of not using the events if you want SASL (at least, not without moving to intercepting the raw SASL frames before the transport). Does this mark the end of the old non-events API?
The method for excluding certain server mechanisms seems a bit odd to me. A method to configure the mechanisms to be used feels like it might be a better fit (either ignoring mechanism that arent installed, or throwing to indicate they arent present, or making that a choice), what do you think?
How does pn_transport_require_auth interact with old 'allow skip' functionality that let transports with SASL configured permit non-SASL connections? The two have different defaults, as skipping wasnt allowed previously (hence the old method being added), whereas apparently it now would be by default. That itself feels a little wrong to me, I think people should have to opt-in to that behaviour as they did previously.
You mention removing deprecated APIs in the form of pn_sasl_client/pn_sasl_server. I think its worth mentioning these were only to become deprecated in 0.9.
Andrew Stitcher
You can ignore the specific authentication event and at any point where you know you got an incoming AMQP connection you know it must perforce have been authenticated - if authentication had failed for any reason the server would just not be seeing the new AMQP connection.
Robbie Gemmell
3. Do you really mean to have only ANONYMOUS for 0.9 for Proton-J? That would render it fairly useless if the existing transport/sasl API is removed since that would prevent people using any existing implementation they already have (ANONYMOUS, PLAIN, CRAM-MD5 thus far for the JMS client, with EXTERNAL sitting on the TODO list).
4. I do still think there is a change (by having to handle the SASL previously, you would always know the authentication result before the connection ever happened, and you had to call done on the server to say it was, whereas now you get a transport event in either case), but I take your point that you could adjust to waiting for the successful connection open state/event to be reached and both approaches would then appear to work the same from that point.
5. I'd say the positive list of mechs to use is more straight forward in many scenarios, and the exclusion list could work better in some others. Offering both options would let people do whatever they want.
- The exclusion list might be something you used initially to e.g prevent use of ANONYMOUS, but you have that scenario covered by pn_transport_require_auth already.
- After that, it is presumably to be used for things like disabling mechs that are either not considered secure enough (e.g PLAIN, because you are requiring everying uses SCRAM-SHA-<foo> today because its Friday), or are installed but not configured/usable (e.g GSSAPI as you mentioned). Between those, you would need to know the name of every mechanism that could ever be present, and then exclude them all in case they should be installed (possibly later). Or, you might better get what you want by being able to specify only the mechs you want to be enabled and have already installed as a result.
- On what to do to do if a positively-requested mechanism isnt available, as I said we could posibly give the choice of of throwing to indicate the requirement couldn't be met or potentially making that optional, because if you positively-specify more than one you probabyl dont care which is used too much.
6. I saw the deprecation, but I wasnt aware the default had already been changed for allow skip, was that ever mentioned anywhere, e.g a JIRA raised to cover a significant change in behaviour? I think its the wrong decision personally, and code using the engine should have to opt-in to permitting non-SASL connections (and ANONYMOUS). If the rest of the current SASL API is going away and the default remains changed, then it seems allow skip should probably just be removed too, since its very likely the only people using it will be configuring the new behaviour and there is a new API for anyone wanting to configure the historical default behaviour when they realise (this will presumably be release-noted?) it becomes necessary to secure it with the new release.
Andrew Stitcher
[Ok, I'm not enjoying the confluence discussion abilities as I can't easily quote what I'm replying to, but at least all the discussion is all together, sigh, soldiering on]
I'm glad you've raised these Proton-J related issues, as I've given it very limited attention up to now (except to fiddle with the stubs to make the tests carry on working). I'm not going to address them here.
5. The exclusion list is not meant to be used to exclude ANONYMOUS in any way, as ANONYMOUS won't even be offered if authentication is required.
I think having both exclusion and inclusion in the API is the worst option, because then the coder has to understand how the two lists interact (and in the past exactly this sort of issue has been a security risk). As it stands, with the cyrus sasl code you do have both options, just at different levels, you can include mechs by using the cyrus sasl configuration files, and exclude mechs using the API.
I think the idea of forcing a specific single mech has some merit - a generalisation of the proposed "force ANONYMOUS" option - it still leaves open the question of interaction with the exclusion list though.
6. I don't (didn't?) think the change of default is that significant as all the server code in the proton tree previously allowed (or even required) anonymous , in that code even though SASL was required the ANONYMOUS mech was specified. All that the new default has done is to remove a level of unnecessary processing. Of course the situation in the Java tree might be different.
In any case, I find it hard to believe that someone writing a production secured server wouldn't think about security enough to know whether they wanted authenticated and/or encrypted connections and specify them as such.
I will emphasize that the old "skip SASL" option was not in itself related to authentication or not as the ANONYMOUS SASL mech was still allowed. The new code unifies the default across the layers and allows you to skip an unnecessary layer - This change should definitely be release noted.
Your point about just removing
pn_sasl_allow_skip()
is noted (and I agree), given my previous reasoning forpn_sasl_client()
andpn_sasl_server()
it should go (and it will force anyone using it to understand the change in default too)Robbie Gemmell
Looks like you figured the quoting out from your other comments.
I realise, I was just using that as an example of one of reasons I might use an exclude list, and highlighted that it wouldnt be required in that case because of the alternative method.
That said, ANONYMOUS presumably will be offered if the 'force anonymous' method is also in use?
I dont think we need both either, just the positive/inlusion list, but I also dont see it needing to be that confusing if we did, with the simplest case just being to make specifying them mutually exclusive.
I think if you generalise the forcing then it simply becomes a 'positive/inclusion' list that can take 1 or more entries, which removes any need for wondering about interaction with an exclusion lists or specific methods for forcing ANONYMOUS etc (which itself already has interesting interaction issues with 'requires auth' and the exclusion list, since someone might put ANONYMOUS in there even if that isnt whats intended).
We may be discussing things from different viewpoints. I dont care about the server code in the proton tree, since we can change that to do whatever we like based on what the engine does. That code is not the only use of the engine.
That you believe it to be so certain to be specified is more reason in my mind to default to the previous behaviour.
As above, it might have been allowed in our code, but that is not the only code to consider.
Unfortunately the only people likely to be using it will be the people who wouldnt really need to understand the change in default.
Andrew Stitcher
I think we are talking somewhat at cross purposes. It seems that you're talking about the Java code, and I'm talking about the C code.
To the extent that we feel the APIs need to be identical that is important, but the current usage and the history of the use of the code is different.
I'm fairly confident that the authentication issues you bring up above are just not relevant to the C code as no one (at all) has implemented anything beyond the existing ANONYMOUS and PLAIN mechanisms, and so the change of default is irrelevant, as no one who cares is using unencrypted AMQP and then they could only be using PLAIN or client certificate authentication with it. That is all the Messenger API code can do, and all the in tree API examples explicitly use ANONYMOUS.
Also note that the default and the API naming are strongly linked so if the default changes from unauthenticated allowed/unencrypted allowed the the API needs to changed too:
pn_transport_require_auth() → pn_transport_allow_unauthenticated()
pn_transport_require_encryption() → pn_transport_allow_unencrypted()
Also note that it makes no sense for the default for encryption and authentication to be different (from a naming and a security point of view)
I can see that for Java where some users have implemented further mechanisms on top of the existing API that we will need to bring those mechanisms and/or a plugin API to support extra mechanisms or use a Java library with similar capability to the C Cyrus SASL library. So either we would need to retain the existing API (and tests) for Java and make the C skip them, or delay landing the work until the Java work was at the same stage as the C work - in the interest of getting the code out there I'd favour leaving the Java API as is for the moment.
Robbie Gemmell
I'm talking about the C code changes, with mind of the history of the C engine as I know it (which has proven to be wrong at least once due to unannounced changes ), and some interest to how it might affect the Java code down the line.
As I mentioned, I don't really care about the downstream code in the Proton tree, which we can adapt to do whatever we like based on the engine behaviour at that time. I am more interested in what the engine itself is doing and how changes might affect other users of the engine. The change in default is relevant in that context since ANONYMOUS will be offered by default when it wasnt before, and non-SASL connections will be permitted by default when they werent before (prior to that recent change). Yes there are API changes, and yes people should notice such changes, but there are always some. In both cases I think the old default was better anyway personally.
It may be probable noone has implemented anything else with the C engine, but I'm not sure we can say so definitively. I'm more focused on suitability of the new API to replace what theyve done though so I dont hugely care (If I really did I'd be suggesting we dont break their existing code so abruptly, even if they are only doing PLAIN+ANONYMOUS).
I dont personally think we should need to rename them like that given their boolean arguments, but I'd also be fine with doing that if considered necessary.
As would I. Certainly it seems overly late to start changing such things if the existing SASL work does go into 0.9, and even if it was to go into a quick-turnaround 0.10 instead it would likely still need/warrant more time.
Andrew Stitcher
Cut-n-paste from proton mailing list from Alan Conway:
Andrew Stitcher
[My reply cut-n-pasted from mailing list]
(Hardly an ignorant question!) You make a very good point, and this
design may indeed be a little simplistic - largely because I've not
implemented the encryption side yet!
1. I doubt that max ssf is all that useful in practice.
2. Effectively pn_transport_require_encryption() is the same as setting
min ssf >1, but is simpler to understand! An alternative might be
pn_transport_require_ssf(int) however that isn't as clear and it's not
obvious how to choose the ssf value. Perhaps the '1' should be
configurable differently.
Some input from those who did the similar work in qpidd might be useful.
Just some random wittering.
Andrew Stitcher
[Cut-n-paste from Jakub Scholz on the proton mailing list]
I'm definitely not a Proton expert, so please excuse me if I missed
something.
But I find this part a bit dangerous:
"Classically in protocols where SASL was not optional the way to avoid
double authentication was to use the EXTERNAL SASL mechanism. With AMQP,
SASL is optional, so if SSL is used for client authentication the SASL
layer could be entirely omitted and so using EXTERNAL is not necessary."
I understand the idea and I would even agree that this is the proper way
how to do it in the long term. But I'm not sure whether all brokers support
this concept. For example, I'm not sure whether you can configure the Qpid
C++ broker in a way to accept AMQP 1.0 connections with SSL Client
Authentication without SASL EXTERNAL while at the same time accepting AMQP
0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that
allowing SSL Client Authentication only without SASL could cause some
serious incompatibilities - I think both should be possible / supported.
Regards
Jakub
Andrew Stitcher
[My reply cut-n-pasted from the mailing list]
On Wed, 2015-02-25 at 10:27 +0100, Jakub Scholz wrote:
> ...
> But I find this part a bit dangerous:
> "Classically in protocols where SASL was not optional the way to avoid
> double authentication was to use the EXTERNAL SASL mechanism. With AMQP,
> SASL is optional, so if SSL is used for client authentication the SASL
> layer could be entirely omitted and so using EXTERNAL is not necessary."
>
This is really just a statement about how AMQP 1.0 works - if you like -
it is an aside praising the good protocol design sense of the standard's
authors (you know who you are!).
> I understand the idea and I would even agree that this is the proper way
> how to do it in the long term. But I'm not sure whether all brokers support
> this concept. For example, I'm not sure whether you can configure the Qpid
> C++ broker in a way to accept AMQP 1.0 connections with SSL Client
> Authentication without SASL EXTERNAL while at the same time accepting AMQP
> 0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that
> allowing SSL Client Authentication only without SASL could cause some
> serious incompatibilities - I think both should be possible / supported.
And both are supported.
The qpidd 0-10 support is not going to change. The qpidd 1.0 support is
on a different code path so there is little bleed over in functionality.
The proton server code can auto detect which protocol layers the client
is using, and subject to it being an allowed protocol configuration,
authenticate it.
Other AMQP 1.0 implementations may not support leaving out the SASL
layer and so you can certainly always tell the client to use it (even if
it adds no useful functionality as in the ANONYMOUS and EXTERNAL cases).
So as far as the current plans for proton go if you require SSL client
authentication it will happen whether or not a SASL layer is there.
As EXTERNAL and better SSL integration with the transport code is not
yet implemented there may be something significant I've missed in this
analysis, in which case it's all subject to change!
Andrew Stitcher
[cut-n-pasted from Rafi Schloming on proton mailing list]
This seems like a sensible direction in general. I'm definitely interested
in seeing some example usages, but it looks like that's on your list
already.
A couple random comments on the API changes.
I noticed in your description of the API you have added
pn_transport_set_remote_hostname. It's a bit odd to have the setter but no
getter. Secondly there is a terminology conflict here. In general the API
uses the convention that remote_blah refers to information supplied by the
remote peer. You are not using it that way, you are using it to communicate
expected information about the remote peer, and that is actually local
information, not remote information. It really doesn't ever make sense to
have a public set_remote_blah anywhere given the API conventions since the
only place you are ever going to set those fields is with information
supplied to you over the wire by the remote peer.
Also, I'm not convinced you actually need this at all since AMQP defines
the connection hostname and there is already a setter for this. The value
that goes into the hostname frame of a connection is defined to be the
expected/desired hostname that the client will authenticate against (think
vhosts). It is also required to be the same as the one specified in the
sasl-init frame if present. That means pn_connection_set_hostname and
pn_transport_set_remote_hostname as you have defined it are effectively
aliases which is a confusing state of affairs on its own even without the
general conventions around what the remote prefix means.
Another getter/setter asymmetry is the the get_user which doesn't have a
corresponding set_user directly, but does have an indirect one through
set_user_password.
I know the getter/setter stuff is a bit niggling, but beyond being nice to
follow the general conventions, it's actually quite awkward when it comes
to the bindings. In general get/set_foo is mapped into a property, e.g.
obj.foo/obj.foo = blah, and this breaks down if you start having set_foo
without get_foo (write only attributes are kind of ugly/unexpected).
I'd suggest you have get/set_user with set_user being independent of
setting the password. That will map more naturally into an ordinary "user"
property in the bindings, and the setter can return an error code for a
server transport if you want it to be read-only. I also think this is nicer
in general since I expect there are cases where you may want to supply the
password in a different place than you configure the user, and with them
both bundled into the same call, that gets awkward.
Andrew Stitcher
Philosophically I don't agree that you need to make all properties read/write. I see no particular reason to make these properties readable since they will never change once they are set, or in the case of the password should actually not be accessible once set (because the implementation should be scrubbing the bytes from memory after use). In my view if the application needs the value again it already has it.
In the case of the read-only property authenticated user I definitely think that needs to be read only.
Having said that, I don't feel that strongly about getters for the client username and hostname.
I take your point - I'm happy to remove "remote" from the API name - would "connected" be all right?
pn_transport_set_hostname()
just doesn't seem specific enough to me - it might just as well be telling the transport what our hostname is.pn_transport_set_<something>_hostname()
Yes these are definitely redundant, and I would need to deprecate the connection version and set it from the transport when binding them together - good catch.
The transport version must be primary, as (in principle at least, if not in the current implementation) you don't need the connection object until you have authenticated the transport and authentication and encryption may to know need the hostname you are connecting to. I think it would have to be an error to rebind (on reconnect) a connection with a different hostname than the transport hostname.
That would make sense. However from this conversation I'm wondering actually if we should more carefully distinguish the client and server ends. And so have a client only API to set user/password and a server only API to extract the authenticated username.
So in conclusion how about:
pn_transport_set_remote_hostname()
→pn_transport_set_connect_hostname()
(connect/connected/connection?)pn_transport_get_connect_hostname()
(connect/connected/connection?)pn_connection_set/get_hostname()
in favour ofpn_transport_set/get_connect_hostname()
Actually changing the
pn_transport_bind()
code would be required too.pn_transport_set_user_password()
andpn_transport_get_user()
pn_transport_set_client_user()
,pn_set_client_password()
,pn_transport_get_client_user()
andpn_transport_get_authenticated_user()
.You'll note I've added in the getters that I don't think are necessary, but that you do (hostname and client_user), but I've maintained the password property as write only and the authenticated user as read only.
Andrew Stitcher
[Pasted from email from Rafi]
>
> 1. Setters with no getters
> Philosophically I don't agree that you need to make all properties
> read/write. I see no particular reason to make these properties readable
> since they will never change once they are set, or in the case of the
> password should actually not be accessible once set (because the
> implementation *should* be scrubbing the bytes from memory after use).
> In my view if the application needs the value again it already has it.
> In the case of the read-only property authenticated user I definitely
> think that needs to be read only.
> Having said that, I don't feel that strongly about getters for the
> client username and hostname.
>
> There's actually an important point here worth noting. With reactive use,
I don't think it's true, pragmatically speaking, that the application has
the value again when it needs it. When your primary means of operating on a
connection is through handling events, the only state you have easy access
to is what is provided by those events. Taking your suggestion, if I wanted
to do something simple like log a debug statement from an event handler and
include the hostname and/or username of the connection in that info, my
only recourse would be to malloc some sort of ancillary struct and attach
it to the connection and fill it in with the very same hostname that the
connection is holding internally, or alternatively access some sort of
global state that holds a map from the connection back to that same
information. If your point is that this is possible then of course that's
true, but it doesn't seem at all reasonable.
>
> 1. inconsistency with existing use of "remote" in API
> I take your point - I'm happy to remove "remote" from the API name -
> would "connected" be all right? pn_transport_set_hostname() just
> doesn't seem specific enough to me - it might just as well be telling the
> transport what *our* hostname is.
> 2. Redundancy of pn_connection_set_hostname() and
> pn_transport_set_<something>_hostname()
> Yes these are definitely redundant, and I would need to deprecate the
> connection version and set it from the transport when binding them together
> - good catch.
> The transport version must be primary, as (in principle at least, if
> not in the current implementation) you don't need the connection object
> until you have authenticated the transport and authentication and
> encryption may to know need the hostname you are connecting to. I think it
> would have to be an error to rebind (on reconnect) a connection with a
> different hostname than the transport hostname.
>
> This isn't consistent with how the connection and transport are actually
used. With the reactive API, the user creates the connection endpoint first
and configures it with all the relevant details that it cares about. The
transport isn't created at all until the client actually opens the
connection (which could be somewhere completely different from where it
configures the connection). It's also important to note that the user
doesn't actually create the transport at all. The default handlers do this
when the PN_CONNECTION_LOCAL_OPEN event occurs. The users don't even need
to be aware that a transport object exists at all if they don't care to
customize it. This is a nice property and would be difficult to maintain if
you start pushing connection level configuration like hostname into the
transport.
I think if you flip things around it actually makes more sense. As a server
you are going to have a transport prior to having a connection, and in this
case you want to access the hostname-that-your-remote-peer desires for
vhost purposes, but it makes no sense to actually set it. As a client, a
transport is pretty much useless until it is bound to a connection, as you
can't safely do much more than send the protocol header, so the natural
sequence is to create the connection first and set the hostname you want to
connect to, and not worry about the Transport.
>
> 1. Having a separate set_user/set_password
> That would make sense. However from this conversation I'm wondering
> actually if we should more carefully distinguish the client and server
> ends. And so have a client only API to set user/password and a server only
> API to extract the authenticated username.
>
> So in conclusion how about:
>
> - Changing pn_transport_set_remote_hostname() →
> pn_transport_set_connect_hostname() (connect/connected/connection?)
> - Adding pn_transport_get_connect_hostname()
> (connect/connected/connection?)
> - Deprecating pn_connection_set/get_hostname() in favour of
> pn_transport_set/get_connect_hostname()
> Actually changing the pn_transport_bind() code would be required too.
> - Removing pn_transport_set_user_password() and pn_transport_get_user()
> - Replacing them with pn_transport_set_client_user(),
> pn_set_client_password(), pn_transport_get_client_user() and
> pn_transport_get_authenticated_user().
>
> It might make sense to follow a similar pattern as the hostname here, i.e.
in the client scenario you configure the connection with
pn_connection_get/set_user() prior to the existence of the transport, and
in the server scenario the transport simply provides
pn_transport_remote_user(...).
> You'll note I've added in the getters that I don't think are necessary,
> but that you do (hostname and client_user), but I've maintained the
> password property as write only and the authenticated user as read only.
>
I wasn't suggesting that you need to provide an accessor for the password,
I was suggesting that if you don't provide one, you shouldn't call the
"setter" pn_transport_set_password(...) since that will violate
expectations by not having a corresponding pn_transport_get_password(...).
--Rafael
Andrew Stitcher
This is exactly why all event based systems have a way to attach some arbitrary application context. Obviously I take your point that it's a lot easier if you don't have to use it, and being able to extract values you passed into the API is a good way to achieve this. So I concede this point for many properties.
However, I totally disagree, with the idea that a property is some sort of typographical concept. As far as I can tell the in modern programming languages (C#, Python are the ones I know well enough to state this about) properties are associated with getters and setters, but can still be read-only or write-only as well as read-write.
So I cannot agree that having a foo_set_blah() implies a foo_get_blah() must exist. I agree that in a C style API with no language support for properties foo_set_blah() implies that blah is a property of foo, but that is all, so if password is a write only property (and if not then what is it) ..._set_password() seems to be the correct thing to call it.
The C API naming scheme is noun_verb or noun_verb_noun (isn't it - is this documented somewhere? Emails don't count as documentation) so pn_connection_password() doesn't fit - what does password as a verb mean?. And if it is an noun, without consulting documentation, does it set the password or return it?
Incidentally, I hate the whole convention that adds these 'set' and 'get' elements to API names, but in C it's hard to much else.
Ah, Thank you for this insight, as fas as I can tell none of the test code dealing with SASL works (obviously) like this, and so I'd not considered that the client usage is significantly different from the server usage - I'm going to have to mull this over a bit.
vhost functionality is not my use case at all, and I'm not sure it fits in the transport code in any picture - it seems a purely AMQP level construct (meaning I don't think this proposal needs to care).
I think that the remote hostname is only important at the client end (for authentication and encryption purposes at least) so I guess the API I'm talking about here needs to be on the connection object then - So it would seem the transport should take it upon binding to a connection and we don't need an extra API.
Similarly given the above understanding it makes sense to make user/password properties of the connection (as they are also purely set in the client)
Is it always true (without exception) that you need a connection before connecting using the transport at the client end?
Rafael H. Schloming
I think the one way that vhosts intersect with this area is that in order for them to be useful, you need to be able to somehow make authentication be scoped to the vhost. The vhost (the host the client believes it is attempting to connect to) is actually carried in three places on the wire. One place is inside TLS somewhere (maybe an extension), the other place is one of the sasl frames (sasl-init I think) and finaly of course in the open frame of the connection.
It's likely as a server that you would want access to the host the client is trying to connect to quite early on, possibly before you have bothered constructing a connection. I think it's reasonable to assume that each layer (TLS, SASL, and AMQP) would each want to be able to have completely independent configuration for different hosts.
I think that all makes sense, with the possible exception of the vhost scenario mentioned above.
Strictly speaking you don't have to create the connection first on the client side, but there is very little the transport can do without it. The transport supports processing an incoming open frame while unbound in order to provide early access to certain transport related things in the open frame (max frame size, idle timeout, etc), however because the open frame can contain stuff that needs to be supplied by the application, the transport has to wait in all cases until it is bound to send the outgoing open frame.
Andrew Stitcher
The connected to hostname is only used by the SASL client, the SASL server doesn't have any access to it - at least not for the Cyrus SASL library AFAIK, so it's going to have to come from SSL or the AMQP open frame.
I think this answer is equal to yes for the purposes of my question! As the question was just about getting the connected to hostname and user/password.
Gordon Sim
I had hoped to see an API that allowed the application to more easily control the SASL exchange itself, with particular integrations layered on to of that. The current approach seems to be even more of an all-or-nothing approach; you get the built in cyrus-sasl integration or you do it all yourself. Having events and methods to control a SASL dialogue would be nice. This may still be coming at least in simplified form if I understand the following:
However that does raise the question whether PLAIN without cyrus-sasl is currently possible with your work from the client side?
Andrew Stitcher
Could you explain why you would want easier control of the SASL interchange (where a Cyrus SASL implementation is available) rather than just having it taken care of? In the case where the Cyrus SASL implementation is unavailable my plan would indeed be to have a built in PLAIN implementation - although this is not currently implemented on either client or server side - the client side implementation is straightforward; the server side requires an extra event to tell the server to authenticate the user/password pair and set the authentication outcome.
Gordon Sim
Better layering and less of an all-or-nothing approach. I may want to use some other sasl library for example.
Gordon Sim
In handling a PN_TRANSPORT_ERROR, how would one get sufficient detail about the issue (e.g. sasl authentication error, certificate problem with ssl, invalid peername with ssl etcl? The event would allow access to the transport, but how would one get at any associated ssl or sasl object?
So for SASL, on the server this event is dispatched when the outcome is ready to be sent to the client? And on the client it is dispatched when the outcome has been received? What would be the intended uses of this event? I.e. what sorts of things might an application do here? Could/should the opening of a connection be delayed to this point?
If SSL is used beneath SASL, and SSL based client authentication is turned on, do we get two different authenticated events? What about for SSL server authentication?
Andrew Stitcher
You can always retrieve the ssl object using
pn_transport_ssl(
) and the sasl object usingpn_transport_sasl()
. You may think of these as constructors, but a more accurate description would be accessor with lazy construction. It's very likely that the precise error indications need to be thought through.The PN_TRANSPORT_AUTHENTICATED event is dispatched just once, when the transport can tell that the transport constraints have been met. To be more concrete: when the AMQP 1.0 header is encountered in the protocol stream without any error (authentication passed one way or another, we have adequate ssf etc.) then this event is sent. Because it's more general than just authentication I think I will rename this to PN_TRANSPORT_AUTHORIZED or perhaps PN_SECURITY_AUTHORIZED.
Currently I can't think of a real use for this on the client side, but adding it makes client and server more symmetric. On the server side this is currently redundant as you can use the PN_CONNECTION_REMOTE_OPEN to indicate almost the same state, however in principle you don't need to create a
pn_connection
on the server side until you have a good authenticated transport connection. Currently this isn't possible because you need thepn_connection
object to receive events.As the event is triggered by reaching a specific point in the protocol interchange you will only get this event once. If there is redundant authentication (and you care about it for some reason) you will need to go and inspect the underlying SSL/SASL objects after the event is received.
Gordon Sim
Do you anticipate any changes to the configuration of SSL from how it is now (which from my limited experience is fairly intuitive yet rich)?
Andrew Stitcher
I don't anticipate removing any API, however I would by default use the higher level connected hostname and user name instead of requiring them to be set using the SSL API - I haven't looked at this in detail yet. After looking at the detail I might think that deprecating the SSL versions of those APIs makes sense.
The other aspect of that work is to extract the SSF and client authentication for use in EXTERNAL/avoiding double encryption.
Gordon Sim
Repeating here a point made on the list, it would be very useful to have some examples to understand the intended usage for different scenarios including e.g.
If this is already available somewhere in your code, a link to that would suffice.
Andrew Stitcher
Some of this will come in the modifications to the current code. However this is more explicitly the "Example code" work item.
Andrew Stitcher
Note that I've updated the main text to reflect the latest state of the code, which addresses many/most of the comments - specifically the ones from Rafi and Robbie.