Background
Geode allows querying data stored in the regions through the usage of a query language, namely OQL
. This query language provides, between other things, the ability to invoke methods on the objects stored within the region as part of the query itself.
This particular ability was not limited to specific calls, but it allowed any method invocation on the objects, including setters
(which, by definition, change the object itself) and, through reflection, internal Geode and JDK methods available on the classpath. Due to the range of methods that could be invoked, several security risks were identified that could impact the integrity of the data, the integrity of the region, and/or the integrity of the platform running Geode. As part of the effort to mitigate these security flaws, GEODE-3247 was created and fixed in Geode 1.3.0 such that the OQL
engine now operates through an all or nothing approach: when a SecurityManager
is enabled Geode throws a NotAuthorizedException
whenever a method, not belonging to the following whitelist, is invoked:
- String object: any method.
- Boolean object: booleanValue.
- Region.Entry object: getKey or getValue.
- Any Object: equals, compareTo, or toString.
- Number object: intValue, longValue, shortValue, etc.
- Date or Timestamp object: after, before, getNanos, or getTime.
- Map, Collection, or Region object: keySet, entrySet, values, containsKey or get.
This approach has proven to be way too restrictive (no getter
methods can be invoked on deployed objects, as an example) and lacks the ability the customize the behavior, leaving the user stuck with the all or nothing configuration.
Objectives / Goals
- Enhance the OQL
engine to support customizing the whitelist of allowed methods.
- Enable the cluster configuration service to store, delete and change the customized whitelist through gfsh
commands.
Current Behavior
There's a single internal interface, MethodInvocationAuthorizer
, responsible of verifying whether a method can be invoked on a certain object for a particular region. The authorizer is tied to the QueryService
and accessed / invoked by the classes MethodDispatch
and AttributeDescriptor
every time the query engine needs to invoke a object's method as part of the OQL
execution.
Geode instantiates the DefaultQueryService
class every time the user requests a QueryService
for executing OQL
queries, within the initialization:
- If the security manager is not enabled, or if the flag
QueryService.allowUntrustedMethodInvocation
is set astrue
, theMethodInvocationAuthorizer
is configured as a no-op class. - If the security manager is enabled and the flag
QueryService.allowUntrustedMethodInvocation
is set asfalse
(default), theRestrictedMethodInvocationAuthorizer
is configured. Whenever invoked, this class first verifies that the method is included within the list of whitelisted methods and, if it is, it checks whether the user has the required privileges to execute queries on that particular region (DATA:READ:regionName
).
Proposal
Query Engine
Split the MethodInvocationAuthorizer
in two different interfaces:
RegionInvocationAuthorizer
: internal with a default implementation provided out of the box. It should just check whether the user can execute queries on the region (DATA:READ:regionName
).MethodInvocationAuthorizer
: public, the default implementation out of the box will beRestrictedMethodInvocationAuthorizer
. This interface is intended to be implemented by users whenever they want a more fine grained control about which methods should be whitelisted, it can extend the default implementation or totally override it. The interface will have only one method and it should throwNotAuthorizedException
(non-checked exception) whenever it detects that the user should not be allowed to execute that particular method during theOQL
execution. The whole information about the method to invoke and the target class can be obtained from theMethod
class itself, so the signature won't have the region as part of the parameters.
public interface MethodInvocationAuthorizer extends Declarable {
/**
* Checks whether the method is authorized to be executed or not.
* The implementation should only consider the method and the object on which it will be executed,
* region access has been verified earlier.
*
* @param method The method to authorize.
* @throws NotAuthorizedException If the method can't be executed due to security restrictions.
*/
void authorizeMethodInvocation(Method method) throws NotAuthorizedException;
}
The DefaultQueryService
will use an empty implementation of both interfaces if security is disabled or the flag QueryService.allowUntrustedMethodInvocation
is set as true
(as it works right now). If security is enabled the default RegionInvocationAuthorizer
will be used, and the MethodInvocationAuthorizer
will be set as the default (blacklists everything) or, if present, the custom implementation will be used. In runtime, the RegionInvocationAuthorizer
will be always invoked first; if it succeeds and the user has privileges to query the region, the configured implementation of MethodInvocationAuthorizer
will be executed.
Once the split is done, Geode could provide several implementations out of the box to match common use cases (like whitelisting methods specified by custom patterns), and even integrate some or all of those implementations with gfsh
commands to manage the behavior in runtime.
Configuration
Add <query-service>
as a new element to the schema definition.
The <query-service>
element must be defined at the cache level and it will contain any configuration related to OQL
, including the custom MethodInvocationAuthorizer
. This will also allow further additions and configuration options in the future, or even replace the current system properties used to configure the QueryService
with new child elements and/or attributes in the <query-service>
. The element would look like the following (the properties are just examples):
<cache>
<query-engine>
<queryVerbose>true</queryVerbose>
<queryHeterogeneousObjects>true</queryHeterogeneousObjects>
<allowUntrustedMethodInvocation>false</allowUntrustedMethodInvocation>
<method-authorizer>
<class-name>com.customer.MyMethodAuthorizer</class-name>
<parameter name="url">
<string>jdbc:mysql://myHost/whitelistDatabase</string>
</parameter>
</method-authorizer>
</query-engine>
<cache>
This new element and its properties should be stored as part of the cluster configuration service, and should also be modifiable through gfsh
commands.
14 Comments
Jinmei Liao
I like having the notion of method authorizer to have user define what methods can invoked and what not. It's the configuration that's a bit complicated. User needs to deploy a class to the server (or start the server with that jar on the classpath) first and then configure it using gfsh or cache.xml. I don't think we want user to change this configuration after the server is running. What about making query.service.method.authorizer a gemfire system property?
Juan Ramos
Hello Jinmei Liao,
Having the system property is doable, but that also requires deploying the class to the server and modifying the startup scripts (or java source code) to set the environment accordingly anyway, so there's no gain in usability using that approach. Having a separate configuration element for the query engine seems the right approach if you ask me, even if we might not want to change the method authorizer in runtime at the moment, that could change in the future and, moreover, we could use the element to entirely configure other aspects of the query engine in the future (new features or flags as example), so introducing the new element now will ease the addition of new flags/features.
Just my two cents.
Anthony Baker
Have you considered using annotations instead of configuration?
Juan Ramos
No I haven't... do you mean creating a new annotation, let's say
MethodAuthorizer
, and scan every deployed jar to check whether the class is supposed to be used as the custom method authorizer implementation?. Or adding a new annotation, let's sayAuthorizedMethod
, that a user must add to the allowed methods on their domain model and somehow keep that information on the objects so the OQL engine knows that the method can be invoked in runtime?.Anthony Baker
This really comes down to where we draw the trust boundary. At the same time we need to balance usability and security.
I'm not sure that allowing any
get*
method invocation is safe. On the other hand forcing the user to specify configuration via XML, callbacks, or annotations is an additional usability cost.Can we automatically detect that a jar was deployed by a privileged user and therefore trusted? I think trying to protect against malicious code deployed by a user who has appropriate privileges is unattainable.
Juan Ramos
Hello Anthony Baker,
I guess we could try detecting all jars deployed by the user and directly trust any
get*
method on those objects, but what if the user has some custom accessor methods, no just plainget*,
that they want to execute during anOQL
query?, a simple calculation based on the object's attributes as an example, or anything else they want to allow during the query execution. These use cases are completely left out if we choose to trust onlygetters
.Agreed about the usability... however, the two currently offered approaches will still be there: allow everything or deny almost everything (
RestrictedMethodInvocationAuthorizer
), and they require practically nothing on the user side. This new configuration should only be added whenever the user wants a more fine grained control about the method authorization, and we can deliver an out of the box implementation (like with PDX) to cover some basic use cases (allow only methods from objects deployed throughgfsh deploy
, allow methods that match a regular expression, etc.).Cheers.
Anthony Baker
We can separate the implementation approach from the end-user UX. As a developer I want zero-config, no intrusion on my domain objects. Queries should just work .
I agree that
get*
is an arbitrary and probably insufficient convention that is also potentially exploitable. What I'm suggesting is just trusting user-deployed code. Any method (maybe exceptinggetClass
) is invokable as long as we know the code was deployed by a user with correct privs.Juan Ramos
Hello Anthony Baker,
Sounds good to me, but I still believe some uses cases are left out if we don't provide a way for users to inject their own implementation of the
MethodInvocationAuthorizer
interface... as Jason Huynhpreviously said: customers can deploy a bunch of domain classes to the servers and store them within the regions, but they might also want to restrict the list of methods from those objects that can be executed within an OQL (queries are widely used throughPULSE
andgfsh
by operators instead of developers).What about adding an out of the box implementation of the
MethodInvocationAuthorizer
to only trustget*
on user-deployed code and those methods already whitelisted inRestrictedMethodInvocationAuthorizer,
using this implementation as the default whenever asecurity-manager
is enabled?. This will guarantee zero-config + no intrusion on customer's domain objects, and queries will just work out of the box using the defaults. If the user wants to customize further the behavior, they can just provide a new implementation ofMethodInvocationAuthorizer
and do whatever is needed.Thoughts?.
Jason Huynh
I am a bit worried about allowing any method on a deployed class. Mostly due to setters and any mutators. A trusted user may deploy a domain object with getters/setters but not necessarily expect a user with read privileges to be able to invoke a setter.
I do like the annotations idea, allowing a developer to annotate methods they allow anyone with read/query privileges to invoke. Not sure if there are any downsides to that approach.
Jason Huynh
How would annotations work if the user never intends to load the class, like for pdx? If the user did puts with pdx objects and does not have the class on the server, the oql engine would not have annotations to determine what fields are callable. I guess that's one downside of annotations?
Anthony Baker
If the domain classes don't exist on the server, we don't need to worry about preventing access right? The trust boundary continues to exclude the domain classes because they don't exist. However, queries that use PDX serialized data should still work because
PDXInstance
is inside the trust boundary. It's only when the domain objects are present that we need to define the trust boundary.John Blum
It is a moot point anyway since you cannot invoke a method on a PDX serialized object without causing a deserialization first.
For example, I might have a Person class with an "age" property defined as...
Clearly the "age" property is not going to have a "field" in the PdxInstance. So, in order to determine a person's age in a query...
Then PDX must deserialize PDX bytes to a `Person` type first.
Juan Ramos
Jason Huynh Anthony Baker : any updates on this?, how should we proceed regarding the proposal?.
Anthony Baker
I don't think we've got consensus yet on what the new trust boundary should be and whether that needs to allow user configuration.