Proxying a UI using Knox
PLEASE NOTE THAT THIS IS A WORK IN PROGRESS

This page covers adding a UI proxy service to Apache Knox.

You may want to review the Apache Knox User’s Guide and Developer’s Guide before reading this. Kevin Minder’s excellent article Adding a Service to Knox, which is used as a template here, is also worth reading first so that you have the basics of REST services, service definition files, rewrite files and rewrite rules under your belt, along with how they all fit together.

If you are new to Knox you may also want to check out ’Setting up Apache Knox in three easy steps’.

The example UI that we will proxy through Knox is the angular2-quickstart project at https://github.com/sumitg/angular2-quickstart

There is a README that should help you get started. If you already have Node.js and npm installed, the crux of it is this:
git clone https://github.com/sumitg/angular2-quickstart.git my-proj
cd my-proj
npm install
npm start

This will get you to a very basic hello world page on localhost:3000. So point your browser to this URL:
http://localhost:3000/
We now want to get to a URL like the one below which goes through Knox. 

 

https://localhost:8443/gateway/sandbox/example/

 

So the fundamental job of the gateway is to translate the effective request URL it receives into the target URL and then transfer the request and response bodies. We will at first ignore the request and response bodies and focus on the request URL, and then do some interesting things to the response body to make it all look right. For all this to happen we will create a new service called 'EXAMPLEUI'.
 
Let’s take a look at how the two request URLs are related.

Source    URL
Gateway   https://localhost:8443/gateway/sandbox/example
Direct    http://localhost:3000


We can start by breaking down the Gateway URL and understanding where each of its parts comes from.

Part       Details
https      The gateway has SSL/TLS enabled: See ssl.enabled in gateway-site.xml
localhost  The gateway is listening on 0.0.0.0: See gateway.host in gateway-site.xml
8443       The gateway is listening on port 8443: See gateway.port in gateway-site.xml
gateway    The gateway context path is ‘gateway’: See gateway.path in gateway-site.xml
sandbox    The topology file that includes the EXAMPLEUI service is named sandbox.xml
example    The unique root of all EXAMPLEUI service URLs. Identified in the service’s service.xml

In contrast we really only care about one part of the service’s Direct URL.
Part                   Details
http://localhost:3000  The network address of the service itself.
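
As a rough illustration of the two tables above, the following Python sketch splits a gateway URL into the listed parts and maps it to the direct URL. The helper names (split_gateway_url, to_direct_url) and the hard-coded constants are assumptions made for this example; this is not how Knox performs the translation internally.

```python
from urllib.parse import urlsplit

# Assumed values taken from the example on this page.
SERVICE_ROOT = "example"               # unique root of the EXAMPLEUI URLs
SERVICE_URL = "http://localhost:3000"  # direct address of the UI


def split_gateway_url(url):
    """Break a gateway URL into the parts described in the table."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "gateway_path": segments[0],
        "topology": segments[1],
        "service_root": segments[2],
        "remainder": "/".join(segments[3:]),
    }


def to_direct_url(url):
    """Hypothetical helper: map a gateway URL to the service's direct URL."""
    info = split_gateway_url(url)
    if info["service_root"] != SERVICE_ROOT:
        raise ValueError("not an EXAMPLEUI URL: " + url)
    return SERVICE_URL + "/" + info["remainder"]


print(to_direct_url("https://localhost:8443/gateway/sandbox/example/styles.css"))
```

Only the scheme, host and port of the direct URL matter for the mapping; everything after the service root is carried over unchanged.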

 

Now we need to get down to the business of actually making the gateway proxy this service. To do that we will use the configuration-based extension model introduced in Knox 0.6.0. This involves adding two new files under the <GATEWAY_HOME>/data/services directory and then modifying a topology file.

Note: The <GATEWAY_HOME> here represents the directory where Apache Knox is installed.

First you need to create a directory to hold your new service definition files. There are two conventions at work here that ultimately (but only loosely) relate to the content of the service.xml it will contain. Below the <GATEWAY_HOME>/data/services directory you will need to create a parent and child directory exampleui/0.0.1. As a convention the names of these directories duplicate the values in the attributes of the root element of the contained service.xml.
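
The directory naming convention can be sketched in Python as follows. The install location /opt/knox and the helper name service_definition_dir are assumptions for illustration only; the attribute values are the ones used by the service.xml on this page.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Root element of the service.xml used in this example (attributes only).
SERVICE_XML = '<service role="EXAMPLEUI" name="exampleui" version="0.0.1"/>'


def service_definition_dir(gateway_home, service_xml):
    """Derive <GATEWAY_HOME>/data/services/<name>/<version> from service.xml."""
    root = ET.fromstring(service_xml)
    return PurePosixPath(
        gateway_home, "data", "services", root.get("name"), root.get("version")
    )


# "/opt/knox" is a hypothetical <GATEWAY_HOME>.
print(service_definition_dir("/opt/knox", SERVICE_XML))
```

With the assumed install location this yields /opt/knox/data/services/exampleui/0.0.1, which is where both service.xml and rewrite.xml would be placed.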

Create the two files with the content shown below and place them in the directories indicated. The links also provide the files for your convenience.

 

<GATEWAY_HOME>/data/services/exampleui/0.0.1/service.xml

<service role="EXAMPLEUI" name="exampleui" version="0.0.1">
    <policies>
        <policy role="webappsec"/>
        <policy role="authentication" name="Anonymous"/>
        <policy role="rewrite"/>
        <policy role="authorization"/>
    </policies>
    <routes>
        <route path="/example">
        </route>
        <route path="/example/**">
        </route>
    </routes>
    <dispatch classname="org.apache.hadoop.gateway.dispatch.PassAllHeadersDispatch"/>
</service>


<GATEWAY_HOME>/data/services/exampleui/0.0.1/rewrite.xml

 
<rules>
    <rule dir="IN" name="EXAMPLEUI/exampleui/inbound/root" pattern="*://*:*/**/example/">
        <rewrite template="{$serviceUrl[EXAMPLEUI]}/"/>
    </rule>
    <rule dir="IN" name="EXAMPLEUI/exampleui/inbound/path" pattern="*://*:*/**/example/{**}">
        <rewrite template="{$serviceUrl[EXAMPLEUI]}/{**}"/>
    </rule>
</rules>
 

Once that is complete, the topology file must be updated to activate this new service in the runtime. In this case the sandbox.xml topology file is used, but you may have another topology file such as default.xml. Edit whichever topology file you prefer and add the… markup shown below. If you aren’t using sandbox.xml, be careful to replace sandbox with the name of your topology file throughout these examples.


<GATEWAY_HOME>/conf/topologies/sandbox.xml


 
<topology>
  ...
  <service>
    <role>EXAMPLEUI</role>   
    <url>http://localhost:3000</url>
  </service>
</topology>

With all of these changes made you must restart your Knox gateway server. Often this isn’t necessary, but adding a new service definition under <GATEWAY_HOME>/data/services requires a restart.

You should now be able to use the URL from way back at the top, which accesses the UI via the gateway.

 

https://localhost:8443/gateway/sandbox/example/

 

*Please note* the slash at the end of the above URL. We need to do some more work to get rid of that dependency, which is explained in a later section.
For now you should be able to access the page after getting past a certificate warning if you don’t have a ‘proper’ certificate installed.
Now that the new service definition is working, let’s go back and connect all the dots. This should help take some of the mystery out of the configuration above. The most important and confusing aspect is how values in different files are interrelated, so I will focus on that.

service.xml

The service.xml file defines the high level URL patterns that will be exposed by the gateway for a service. If you are getting HTTP 404 errors there is probably a problem with this configuration.

 

<service role="EXAMPLEUI"

  • The role/implementation/version triad is used throughout Knox for integration plugins.
  • Think of the role as an interface in Java.
  • This attribute declares what role this service “implements”.
  • This will need to match the topology file’s <topology><service><role> for this service.

 

<service name="exampleui"

  • In the role/implementation/version triad this is the implementation.
  • Think of this as a Java implementation class name relative to an interface.
  • As a matter of convention this should match the directory beneath <GATEWAY_HOME>/data/services
  • The topology file can optionally contain <topology><service><name> but usually doesn’t. This would be used to select a specific implementation of a role if there were multiple.

 

<service version="0.0.1"
  • As a matter of convention this should match the directory beneath the service implementation name.
  • The topology file can optionally contain <topology><service><version> but usually doesn’t. This would be used to select a specific version of an implementation if there were multiple. This can be important if the protocols for a service evolve over time.

 

<service><routes><route path="/example/**"

  • This tells the gateway that all requests starting with /example/ are handled by this service.
  • Due to a limitation this will not include requests to /example (i.e. no trailing /), so the service.xml declares a separate /example route for that.
  • The ** means zero or more paths similar to Ant.
  • The scheme, host, port, gateway and topology components are not included (e.g. https://localhost:8443/gateway/sandbox)
  • Routes can, but typically don’t, take query parameters into account.
  • In this simple form there is no direct relationship between the route path and the rewrite rules!
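
The route behavior described in the bullets above, including the trailing-slash limitation, can be modeled with a small Python sketch. This is a simplified illustration and not Knox's actual matcher; route_matches and ROUTES are names chosen for this example, and only a trailing ** is handled.

```python
# The two routes declared in the example service.xml.
ROUTES = ["/example", "/example/**"]


def route_matches(route_path, request_path):
    """Simplified model: '**' at the end matches zero or more path segments."""
    if route_path.endswith("/**"):
        prefix = route_path[:-2]  # "/example/**" -> "/example/"
        return request_path.startswith(prefix)
    return request_path == route_path


def handled_by_service(request_path):
    """True if any declared route accepts the (topology-relative) path."""
    return any(route_matches(route, request_path) for route in ROUTES)


# "/example" is only covered because of the separate "/example" route.
print(route_matches("/example/**", "/example"))
print(handled_by_service("/example"))
```

Note that the paths passed in are relative to the topology, i.e. the scheme, host, port, gateway and topology components have already been stripped.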

    <policies>
        <policy role="webappsec"/>
        <policy role="authentication" name="Anonymous"/>
        <policy role="rewrite"/>
        <policy role="authorization"/>
    </policies>

  • This sets up the policies (providers) to be used by this specific service, overriding the topology-level providers for the same roles. The "Anonymous" authentication provider is the key one here. If you do not list these policies, you either need a topology file that specifies an Anonymous authentication provider, or you will be challenged for authentication in the browser, depending on which authentication mechanism the topology uses.

 

    <dispatch classname="org.apache.hadoop.gateway.dispatch.PassAllHeadersDispatch"/>

 

  • This service specifies a special dispatch that passes through all the headers. Unlike the default dispatch used for REST API invocations, this dispatch does not attempt any authentication or Kerberos handshake on behalf of the original request.

 

rewrite.xml

The rewrite.xml is configuration that drives the rewrite provider within Knox. It is important to understand that at runtime for a given topology, all of the rewrite.xml files for all active services are combined into a single file. This explains some of the seemingly complex patterns and naming conventions.

 

<rules><rule dir="IN"
  • Here dir means direction and IN means it should apply to a request.
  • This rule is a global rule meaning that any other service can request that a URL be rewritten as they process URLs. The rewrite provider keeps distinct trees of URL patterns for IN and OUT rules so that services can be specific about which to apply.
  • If it were not global it would not have a direction and probably not a pattern in the element.

 

<rules><rule name="EXAMPLEUI/exampleui/inbound"

  • Rules can be explicitly invoked in various ways. In order to allow that they are named.
  • The convention is role/name/<service specific hierarchy>.
  • Remember that all rules share a single namespace.

 

<rules><rule pattern="*://*:*/**/example/{**}"

  • Defines the URL pattern for which this rule will apply.
  • The * matches exactly one segment of the URL.
  • The ** matches zero or more segments of the URL.
  • The {**} matches zero or more remaining path segments and captures them for use in the rewrite template.
  • The values from matched {…} segments are “consumed” by the rewrite template below.

 

<rules><rule><rewrite template="{$serviceUrl[EXAMPLEUI]}/{**}"
  • Defines how the URL matched by the rule will be rewritten.
  • The {$serviceUrl[EXAMPLEUI]} looks up the <service><url> for the <service><role>EXAMPLEUI. This is implemented as a rewrite function and is another custom extension point.
  • The {**} inserts the path segments that were captured when the rule’s pattern matched.
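
To make the interplay of pattern and template more concrete, here is a Python approximation of the inbound rule. Knox patterns are URL templates, not regular expressions, so the regular expression below is only an assumed stand-in for *://*:*/**/example/{**}, and rewrite_inbound is a hypothetical name.

```python
import re

# Stand-in for the <service><url> lookup done by {$serviceUrl[EXAMPLEUI]}.
SERVICE_URLS = {"EXAMPLEUI": "http://localhost:3000"}

# Rough regex equivalent of the pattern *://*:*/**/example/{**}:
# any scheme, any host, any port, any leading segments, then "example/".
INBOUND = re.compile(
    r"^[^:/]+://[^/:]+:\d+/(?:[^/]+/)*?example/(?P<rest>.*)$"
)


def rewrite_inbound(url, role="EXAMPLEUI"):
    """Return the target URL, or None if the rule does not apply."""
    match = INBOUND.match(url)
    if match is None:
        return None
    # Equivalent of the template {$serviceUrl[EXAMPLEUI]}/{**}
    return SERVICE_URLS[role] + "/" + match.group("rest")


print(rewrite_inbound("https://localhost:8443/gateway/sandbox/example/styles.css"))
```

The captured remainder plays the role of the values "consumed" by the template, and the dictionary lookup plays the role of the rewrite function.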

sandbox.xml

 

<topology><service><role>EXAMPLEUI
  • This causes the service definition with role EXAMPLEUI to be loaded into the runtime.
  • Since <name> and <version> are not present, a default is selected if there are multiple options.

 

<topology><service><url>http://localhost:3000

  • This populates the data used by {$serviceUrl[EXAMPLEUI]} in the rules with the correct target URL.
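
The relationship between the topology file and {$serviceUrl[EXAMPLEUI]} can be sketched in Python like this. The functions service_urls and expand_service_url are illustrative names, and the regular expression is only an assumed way of emulating the rewrite function; Knox resolves it internally.

```python
import re
import xml.etree.ElementTree as ET

# The relevant part of the example sandbox.xml topology.
TOPOLOGY_XML = """
<topology>
  <service>
    <role>EXAMPLEUI</role>
    <url>http://localhost:3000</url>
  </service>
</topology>
"""


def service_urls(topology_xml):
    """Collect a role -> url mapping from the topology's <service> entries."""
    root = ET.fromstring(topology_xml.strip())
    urls = {}
    for service in root.findall("service"):
        role = service.findtext("role", default="").strip()
        url = service.findtext("url", default="").strip()
        if role and url:
            urls[role] = url
    return urls


def expand_service_url(template, urls):
    """Replace each {$serviceUrl[ROLE]} with the URL configured for ROLE."""
    return re.sub(
        r"\{\$serviceUrl\[([^\]]+)\]\}",
        lambda m: urls[m.group(1)],
        template,
    )


print(expand_service_url("{$serviceUrl[EXAMPLEUI]}/", service_urls(TOPOLOGY_XML)))
```

If the role in the topology did not match the role used in the rewrite rules, the lookup would fail, which is why the two must agree.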

 

 

Taking care of slash’ness

So now the question of the trailing slash comes in, and with it the opportunity to talk about rewriting parts of the response body.
So again, the goal is to get the gateway URL to work without the trailing slash: https://localhost:8443/gateway/sandbox/example
The problem is that the page has links like the ones below (this can be seen by viewing the page source in the browser).
<link rel="stylesheet" href="styles.css">

<!-- Polyfill(s) for older browsers -->

<script src="node_modules/core-js/client/shim.min.js"></script>

<script src="node_modules/zone.js/dist/zone.js"></script>

<script src="node_modules/reflect-metadata/Reflect.js"></script>

<script src="node_modules/systemjs/dist/system.src.js"></script>

<script src="systemjs.config.js"></script> 
All these ‘href’ and ‘src’ attributes work great if the main page is at the root of the URL, like http://localhost:3000, but once the URL has a path, such as the additional ‘gateway/sandbox/example’ needed by the gateway, they no longer resolve correctly without the trailing slash. For instance, without the trailing slash the browser treats ‘example’ as a file name rather than a directory, so href="styles.css" resolves relative to /gateway/sandbox/ instead of /gateway/sandbox/example/.
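
The effect of the trailing slash on relative links can be checked with Python's standard urljoin, which follows the same relative-reference resolution rules that browsers apply. The two base URLs are the gateway URLs from this page.

```python
from urllib.parse import urljoin

with_slash = "https://localhost:8443/gateway/sandbox/example/"
without_slash = "https://localhost:8443/gateway/sandbox/example"

# With the trailing slash, "example/" is treated as a directory.
print(urljoin(with_slash, "styles.css"))
# https://localhost:8443/gateway/sandbox/example/styles.css

# Without it, "example" is treated as a file and replaced by the link.
print(urljoin(without_slash, "styles.css"))
# https://localhost:8443/gateway/sandbox/styles.css
```

The second result points outside the EXAMPLEUI service root, so the gateway has no route for it.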
To remedy this, we are going to rewrite these ‘href’ and ‘src’ attributes so that the path is fully qualified, i.e. styles.css should become ‘/gateway/sandbox/example/styles.css’.
Here are the new rules we need.

new rewrite.xml rules

 
<rule dir="OUT" name="EXAMPLEUI/exampleui/outbound/systemjs" pattern = "systemjs.config.js">
    <rewrite template="{$frontend[path]}/example/systemjs.config.js"/>
</rule>
<rule dir="OUT" name="EXAMPLEUI/exampleui/outbound/styles" pattern="styles.css">
    <rewrite template="{$frontend[path]}/example/styles.css"/>
</rule>
<rule dir="OUT" name="EXAMPLEUI/exampleui/outbound/nodemodules" pattern="node_modules/{**}">
    <rewrite template="{$frontend[path]}/example/node_modules/{**}"/>
</rule>
 
Now the rules include rewriting the response in the OUTBOUND direction, going out from Knox to the browser. These rules leverage a Rewrite function $frontend[path] to get the ‘/gateway/sandbox’ portion of the URL.
Please note that every time you change the rewrite or service files you need to redeploy the topology. This can be either done by touching the topology file or by deleting the deployed topology directory and restarting Knox.
Now the links shown previously for the main page should look like this:
<link rel="stylesheet" href="/gateway/sandbox/example/styles.css">

<!-- Polyfill(s) for older browsers -->

<script src="/gateway/sandbox/example/node_modules/core-js/client/shim.min.js"></script>

<script src="/gateway/sandbox/example/node_modules/zone.js/dist/zone.js"></script>

<script src="/gateway/sandbox/example/node_modules/reflect-metadata/Reflect.js"></script>

<script src="/gateway/sandbox/example/node_modules/systemjs/dist/system.src.js"></script>

<script src="/gateway/sandbox/example/systemjs.config.js"></script> 
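
The combined effect of the three outbound rules can be approximated in Python as follows. This is only a sketch of the observable result: the regular expression, the FRONTEND_PATH constant (standing in for {$frontend[path]}) and the function names are assumptions, and Knox's own HTML rewriting works differently under the hood.

```python
import re

# Value that {$frontend[path]} yields for the sandbox topology in this example.
FRONTEND_PATH = "/gateway/sandbox"

# Matches href="..." and src="..." attribute values.
ATTRIBUTE = re.compile(r'\b(href|src)="([^"]*)"')


def rewrite_link(value):
    """Apply the three OUT rules: systemjs.config.js, styles.css, node_modules/{**}."""
    if value in ("systemjs.config.js", "styles.css") or value.startswith("node_modules/"):
        return FRONTEND_PATH + "/example/" + value
    return value


def rewrite_html(html):
    """Rewrite every href/src attribute value that one of the rules matches."""
    return ATTRIBUTE.sub(
        lambda m: '{}="{}"'.format(m.group(1), rewrite_link(m.group(2))),
        html,
    )


print(rewrite_html('<script src="node_modules/zone.js/dist/zone.js"></script>'))
```

Links that match none of the three patterns are passed through unchanged, which mirrors the fact that each outbound rule only fires for its own pattern.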
The page, however, still doesn’t render correctly. This is because, while we can now load the JS files, they contain unresolvable paths in the JavaScript itself. If this were a static page of old, we would have been done a while ago!
By navigating to the URL https://localhost:8443/gateway/sandbox/example/systemjs.config.js you can see the issue in this snippet:
...
 System.config({
    paths: {
      // paths serve as alias
      'npm:': 'node_modules/'
    },
    // map tells the System loader where to look for things
    map: {
      // our app is within the apps folder
      app: 'apps',
...
Both ‘node_modules/’ and ‘apps’ need the proper context. Let’s go back to the rewrite file and fix this.

Final rewrite.xml changes

Here are the final additions you need. A neat trick here is the use of a rules filter to restrict the rules to content of type JavaScript.
Here is the final file: Rewrite File
<rule dir="OUT" name="EXAMPLEUI/exampleui/outbound/apps">
    <rewrite template="example/apps"/>
</rule>
<rule dir="OUT" name="EXAMPLEUI/exampleui/outbound/nodemodule">
    <rewrite template="example/node_modules"/>
</rule>

<filter name="EXAMPLEUI/exampleui/outbound/app">
    <content type="application/javascript">
        <apply path="apps" rule="EXAMPLEUI/exampleui/outbound/apps"/>
        <apply path="node_modules" rule="EXAMPLEUI/exampleui/outbound/nodemodule"/>
    </content>
</filter>


Tying it back to service.xml

The filter needs to be tied back to the service file. So now the service file looks like this in its entirety.
Here is the final file Service File
 
<service role="EXAMPLEUI" name="exampleui" version="0.0.1">
    <policies>
        <policy role="webappsec"/>
        <policy role="authentication" name="Anonymous"/>
        <policy role="rewrite"/>
        <policy role="authorization"/>
    </policies>
    <routes>
        <route path="/example">
        </route>
        <route path="/example/**">
            <rewrite apply="EXAMPLEUI/exampleui/outbound/app" to="response.body"/>
        </route>
    </routes>
    <dispatch classname="org.apache.hadoop.gateway.dispatch.PassAllHeadersDispatch"/>
</service>
 

Now the page should load fine. If you check the JS snippet now, you should see this:
...
 System.config({
    paths: {
      // paths serve as alias
      'npm:': 'example/node_modules/'
    },
    // map tells the System loader where to look for things
    map: {
      // our app is within the example/apps folder
      app: 'example/apps',

... 
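
What the filter does to the JavaScript body can be approximated with the following Python sketch. It is a simplified model under stated assumptions: the regular expression, the RULES mapping and the rewrite_js name are illustrative, and Knox's own filter evaluation may differ in detail.

```python
import re

# apply path -> replacement produced by the referenced OUT rule.
RULES = {"apps": "example/apps", "node_modules": "example/node_modules"}

# Match the paths as whole tokens that are not already prefixed.
TOKEN = re.compile(
    r"(?<![\w/])(" + "|".join(re.escape(key) for key in RULES) + r")(?!\w)"
)


def rewrite_js(body, content_type):
    """Rewrite matching paths, but only for JavaScript responses."""
    if content_type != "application/javascript":
        return body
    return TOKEN.sub(lambda m: RULES[m.group(1)], body)


snippet = "'npm:': 'node_modules/'\napp: 'apps',"
print(rewrite_js(snippet, "application/javascript"))
```

Because the model replaces every matching token in the body, it also touches comments, which is consistent with the rewritten comment in the snippet above. Responses of any other content type pass through untouched.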

 

Hopefully all of this provides an introduction to adding a UI service to Apache Knox. There are certainly advanced situations as UIs can get complex. One source of inspiration would be to look at existing service and rewrite files. A good example is Ambari’s UI service as it has to deal with a fair amount of complexity.
 
If you have more questions, comments or suggestions please join the Apache Knox community. In particular you might be interested in one of the mailing lists.