JAX-RS Search


Advanced Search Queries

CXF supports mapping the advanced query expressions to the typed Search API with the help of query language specific parsers.   

Supported Query Languages

Feed Item Query Language

Feed Item Query Language (FIQL) has been supported since CXF 2.3.0.

For example, the following query

?_s=name==CXF;version=ge=2.2

lets users search for all the Apache projects with the name 'CXF' and a version greater than or equal to '2.2'. The initial '=' separates the name of the query parameter '_s' from the FIQL expression, while '==' and '=ge=' mean 'equal to' and 'greater than or equal to' respectively.
An expression such as "name==CXF*" can be used for a partial equality check (in this example, the name should start with "CXF").

More complex composite expressions can also be expressed easily enough, for example:

// Find all employees younger than 25 or older than 35 living in London
/employees?_s=(age=lt=25,age=gt=35);city==London

// Find all books on math or physics published in 1999 only.
/books?_s=date=lt=2000-01-01;date=gt=1999-01-01;(sub==math,sub==physics)

Here is a summary of FIQL operators:

Operator    Description
"=="        Equal
"!="        Not Equal
"=lt="      Less Than
"=le="      Less or Equal
"=gt="      Greater Than
"=ge="      Greater or Equal
";"         AND
","         OR

The last two operators, ","(OR) and ";"(AND) are used to concatenate and build composite (possibly nested) expressions, while the first 6 operators are used to build so called primitive expressions.
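The way primitive and composite expressions combine can be sketched with a few plain string helpers. The FiqlSketch class and its method names below are hypothetical and are not part of CXF; the sketch only assembles the expression text and performs no escaping of values.

```java
// Hypothetical helper, not a CXF class: assembles FIQL expression text only.
final class FiqlSketch {

    // Primitive expressions: one property, one comparison operator, one value.
    static String eq(String name, Object value) { return name + "==" + value; }
    static String ne(String name, Object value) { return name + "!=" + value; }
    static String lt(String name, Object value) { return name + "=lt=" + value; }
    static String le(String name, Object value) { return name + "=le=" + value; }
    static String gt(String name, Object value) { return name + "=gt=" + value; }
    static String ge(String name, Object value) { return name + "=ge=" + value; }

    // ';' joins expressions with AND.
    static String and(String... parts) {
        return String.join(";", parts);
    }

    // ',' joins expressions with OR; the group is parenthesized so it can be nested.
    static String or(String... parts) {
        return "(" + String.join(",", parts) + ")";
    }

    public static void main(String[] args) {
        // Employees younger than 25 or older than 35 living in London
        System.out.println(and(or(lt("age", 25), gt("age", 35)), eq("city", "London")));
    }
}
```

The OR group is always wrapped in parentheses here, which keeps the nesting explicit when it is combined with AND.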

Starting from CXF 2.7.7, a single '=' operator can optionally be supported instead of '==': set the "fiql.support.single.equals.operator" contextual property to "true".

As you can see, FIQL is rich enough for service implementations to offer a more interesting search experience around well-known data, while still keeping the complexity of URI expressions under control. This makes it simpler to share such URI queries and to use the same query language no matter what data store the service uses internally.

Note that when FIQL queries are passed via URI query parameters, either the '_search' or the '_s' query parameter has to be used to mark a FIQL expression so that it does not interfere with other optional query parameters. Starting from CXF 2.7.2 it is also possible to use the whole query component to convey a FIQL expression, for example:

// Find all books on math or physics published in 1999 only.
/books?date=lt=2000-01-01;date=gt=1999-01-01;(sub==math,sub==physics)

Note that no "_s" or "_search" query parameter is present; the whole query string after "?" represents the actual FIQL expression.
Set the "search.use.all.query.component" contextual property for this option to be supported.
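The two ways of conveying an expression can be illustrated with a small sketch that picks the FIQL text out of a raw query string. This is only an illustration of the rules described above, not CXF's actual implementation; the FiqlQueryExtractor class and its boolean flag (standing in for the "search.use.all.query.component" property) are hypothetical.

```java
// Hypothetical sketch, not CXF code: locates the FIQL expression in a raw query string.
final class FiqlQueryExtractor {

    // Returns the FIQL expression, or null if the query carries none.
    static String extract(String rawQuery, boolean useAllQueryComponent) {
        if (rawQuery == null || rawQuery.isEmpty()) {
            return null;
        }
        if (useAllQueryComponent) {
            // The whole query component is the expression.
            return rawQuery;
        }
        // Otherwise only the value of '_s' or '_search' is the expression.
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("_s=")) {
                return pair.substring("_s=".length());
            }
            if (pair.startsWith("_search=")) {
                return pair.substring("_search=".length());
            }
        }
        return null;
    }
}
```

Only the first '=' after the parameter name is consumed, so the '==' and '=ge=' operators inside the expression are left intact.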

Alternatively the expressions can be encoded as URI path segments, see the sections below for more information.

Open Data Protocol

CXF 3.0.0-milestone2 supports the $filter query defined as part of Open Data Protocol, courtesy of Apache Olingo.

The $filter query can have a number of the logical operators, here is a summary of the operators supported in scope of Search API:

Operator    Description
"eq"        Equal
"ne"        Not Equal
"lt"        Less Than
"le"        Less or Equal
"gt"        Greater Than
"ge"        Greater or Equal
"and"       AND
"or"        OR


Please see the specification text for some examples.

Please note that OData protocol is not supported by CXF Search API, only the $filter query is supported (only logical operators for now) for querying the application data with CXF Search API. Users should work directly with Apache Olingo to get the OData protocol supported as part of the application flow.

Some of the following examples on this page may often refer to FIQL due to the fact FIQL has been supported for a long time, but the same examples will work with OData $filter expressions. For example, replace the "_s=name==CXF" query with "$filter=name eq CXF".

When to use advanced queries

Consider a typical query expression such as "a=avalue&c=cvalue". This can mean either "find all resources with 'a' and 'c' properties equal to 'avalue' and 'cvalue' respectively" or "find all resources with an 'a' property equal to 'avalue' or a 'c' property equal to 'cvalue'". Whether multiple query properties are combined with "and" or "or" is application specific.

It is also difficult to capture conditional expressions with a custom language, for example "find all resources with the 'a' property less than 123", when the number of properties is large or the entities which can be searched are created dynamically.

Use FIQL or OData for capturing queries of simple or medium complexity, typically in cases where the set of properties that a user can specify is well-known. For example, a book store resource will let users search for books given a number of useful properties (those of Book and/or of the Library a given book is available in, etc).

Furthermore, consider using FIQL/OData and SearchConditionVisitor for the purpose of generalizing the search code, when the number of properties and entities is large, dynamic, etc.

Dependencies and Configuration

The following dependency is required starting from CXF 2.6.0:

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-rs-extension-search</artifactId>
    <version>2.6.0</version>
</dependency>

<!-- If working with OData -->
<!--
<dependency>
    <groupId>org.apache.olingo</groupId>
    <artifactId>olingo-odata2-core-incubating</artifactId>
    <version>1.1.0</version> 
</dependency>
-->

Additionally, starting from CXF 2.6.0, SearchContextProvider needs to be registered as jaxrs:provider.

Working with the queries

SearchContext needs to be injected into the application code and used to retrieve a SearchCondition representing the current FIQL/OData query. This SearchCondition can be used in a number of ways for finding the matching data.

In this section we assume that the data to be matched are already available in memory. The follow-up section on converting the queries will show how the queries can be converted to some other query language typed or text expression.

So, suppose a list or map of Book instances is available. Here is one possible approach:

@Path("books")
public class Books {

    private Map<Long, Book> books;
    @Context
    private SearchContext searchContext;

    @GET
    public List<Book> getBook() {

        SearchCondition<Book> sc = searchContext.getCondition(Book.class);
        // SearchCondition#isMet method can also be used to build a list of matching beans

        // iterate over all the values in the books map and return a collection of matching beans
        List<Book> found = sc.findAll(books.values());
        return found;
    }
}

Note that a searchContext.getCondition(Book.class) call may return an arbitrarily complex SearchCondition: it can be a simple primitive expression or a more complex, composite one.
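Conceptually, matching in-memory data amounts to testing every bean against a condition, whether that condition is primitive or composite. The sketch below shows the idea with java.util.function.Predicate instead of the CXF types; the InMemorySearchSketch class, its nested Book bean and its methods are hypothetical stand-ins and do not reflect how SearchCondition is implemented.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch, not CXF code: a condition is modelled as a Predicate.
final class InMemorySearchSketch {

    static final class Book {
        final String name;
        final int version;

        Book(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    // Plays the role of a 'find all matching beans' call over an in-memory collection.
    static <T> List<T> findAll(Collection<T> items, Predicate<T> condition) {
        List<T> found = new ArrayList<>();
        for (T item : items) {
            if (condition.test(item)) {
                found.add(item);
            }
        }
        return found;
    }

    // Composite condition equivalent to the FIQL text "name==CXF;version=ge=2".
    static Predicate<Book> nameIsCxfAndVersionAtLeast2() {
        Predicate<Book> nameIsCxf = b -> "CXF".equals(b.name);
        Predicate<Book> versionAtLeast2 = b -> b.version >= 2;
        return nameIsCxf.and(versionAtLeast2);
    }
}
```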

Capturing the queries

For the query expression to be captured, a bean like Book.class is instantiated and has all the search properties injected into it. A complex composite expression will be 'injected' into a number of Book instances - something that may have to be optimized.

Note that by default, a bean such as the Book class needs to have a matching property for every property name found in the FIQL expression. For example, given a 'name==b;id==123' expression, the Book class would need to have 'name' and 'id' properties available. The reason for this strict mode being enabled by default is that ignoring a property which cannot be captured may lead to a false or unexpected match. For example, if the Book 'name' property has been renamed to 'title', then ignoring the 'name' property will lead to a wider match. Thus, if the property does not exist, org.apache.cxf.jaxrs.ext.search.PropertyNotFoundException will be thrown; catching it lets the application return an empty response or retry in the more lax mode, see the next paragraph.

When a more lax parsing of FIQL expressions is expected, for example, where the primitive expressions are joined by "OR", using SearchBean (see one of the next subsections) or setting a contextual property "search.lax.property.match" will help. The former option is better when you need to know the list of all the properties which have been used in the expression, even those which will not be possible to use for the actual search; the latter option will simply have the unrecognized properties ignored.
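The difference between the strict and lax modes can be sketched with the standard java.beans introspection API. This is an illustration only: the PropertyMatchSketch class is hypothetical, and it throws IllegalArgumentException where CXF would report PropertyNotFoundException.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not CXF code: strict versus lax handling of query property names.
final class PropertyMatchSketch {

    public static class Book {
        private String name;
        private long id;

        public String getName() { return name; }
        public long getId() { return id; }
    }

    // Returns the query property names that the bean can capture.
    // Strict mode (lax == false) rejects an unknown name; lax mode skips it.
    static List<String> usableProperties(Class<?> beanClass, List<String> queryNames, boolean lax) {
        Set<String> known = new HashSet<>();
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
                known.add(pd.getName());
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        List<String> usable = new ArrayList<>();
        for (String name : queryNames) {
            if (known.contains(name)) {
                usable.add(name);
            } else if (!lax) {
                throw new IllegalArgumentException("Unknown search property: " + name);
            }
        }
        return usable;
    }
}
```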

Note that a "search.decode.values" property can be used to have the 'reserved' characters such as FIQL ',' or ';' characters passed as percent-encoded characters as part of the search property values.
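On the client side, a value containing reserved characters would be percent-encoded before being placed into the expression. The following sketch uses the JDK URLEncoder and URLDecoder (the Charset overloads require Java 10 or later) to show the round trip; the ReservedCharsSketch class is hypothetical, and how CXF itself decodes the values is not shown here.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: percent-encoding FIQL reserved characters inside a value.
final class ReservedCharsSketch {

    // ',' becomes %2C and ';' becomes %3B, so they no longer read as OR / AND.
    static String encodeValue(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    static String decodeValue(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The value "a,b;c" is a single value, not three expressions.
        System.out.println("name==" + encodeValue("a,b;c"));
    }
}
```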

Mapping of query properties to bean properties

As noted above, when a 'typed' bean such as Book.class is used to capture the expressions, a property found in the query expression that cannot be mapped to a specific Book property will lead to an exception being reported, or it can optionally be ignored. In reality, there are a number of reasons why a direct match between the properties found in query expressions and those of the capturing beans may not be ideal:

  • Capturing beans may evolve independently of the actual queries; for example, a working query such as "name==b" will break if a Book 'name' gets renamed to 'title' which will make it difficult to have the queries bookmarked.
  • Direct match will simply not work for cases where an actual bean property does not belong to the capturing bean itself but to one of its child properties; for example, a JPA2 Book entity may have an OwnerInfo bean with Name bean property which does contain a primitive 'name' property.

The preferred approach, when working with typed beans, is to register a bean properties map, using a "search.bean.property.map" contextual property or directly with SearchContext. For example, given

public class Book {

    private int id;
    private OwnerInfo ownerinfo;
    //setters and getters omitted for brevity
}

@Embeddable
public class OwnerInfo {

    private Address address;
    private Name name;
    //setters and getters omitted for brevity
}

@Embeddable
public class Name {

    private String name;
    //setters and getters omitted for brevity
}

and the following map:

<map>
 <!-- 'oname' is alias for the actual nested bean property -->
 <entry key="oname" value="ownerinfo.name.name"/>
</map>

With this map, users can type and bookmark queries like this one without seeing them produce unexpected results:

//Find all the books owned by Fred with id greater than 100
/books?_s=id=gt=100;oname==Fred

Note, a property name such as "ownerinfo.name.name" uses '.' to let the parser navigate to the actual Name bean which has a 'name' property. This can be optimized in cases where the owner bean is known to have either a constructor or static valueOf() method accepting the 'name' property, for example, given

public class Name {

    private String name;
    public Name() {
    } 
    public Name(String name) {
        this.name = name;
    }
    //setters and getters omitted for brevity
}

the mapping between "oname" and "ownerinfo.name" will work too.

You can also have many to one mappings, for example

<map>
 <!-- 'oname' and 'owner' are aliases for the 'ownerinfo.name.name' bean property -->
 <entry key="oname" value="ownerinfo.name.name"/>
 <entry key="owner" value="ownerinfo.name.name"/>
</map>
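How an alias such as "oname" might be resolved against nested beans can be sketched with plain reflection. This is not the CXF parser; the PropertyPathSketch class and its resolve method are hypothetical, and the sketch assumes every path segment has a conventional public getter.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical sketch, not CXF code: resolves an alias to a dotted bean property path.
final class PropertyPathSketch {

    public static class Name {
        private final String name;
        public Name(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public static class OwnerInfo {
        private final Name name;
        public OwnerInfo(Name name) { this.name = name; }
        public Name getName() { return name; }
    }

    public static class Book {
        private final OwnerInfo ownerinfo;
        public Book(OwnerInfo ownerinfo) { this.ownerinfo = ownerinfo; }
        public OwnerInfo getOwnerinfo() { return ownerinfo; }
    }

    // Looks up the alias in the map, then follows each '.'-separated segment via its getter.
    static Object resolve(Object bean, String alias, Map<String, String> aliases) {
        String path = aliases.getOrDefault(alias, alias);
        Object current = bean;
        for (String segment : path.split("\\.")) {
            String getter = "get" + Character.toUpperCase(segment.charAt(0)) + segment.substring(1);
            try {
                Method method = current.getClass().getMethod(getter);
                current = method.invoke(current);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot resolve '" + segment + "' in " + path, e);
            }
        }
        return current;
    }
}
```

Because several aliases can point at the same path, a many to one mapping needs no extra code in this sketch.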

Dealing with mistyped property names

Consider a case where a documented search property is named as 'address' (lower case) and a query contains a mistyped 'Address' instead. In this case, unless a "search.lax.property.match" property is set, PropertyNotFoundException will be thrown.

Supporting case-insensitive property mapping is easy: register the "search.bean.property.map" map (mentioned earlier) as a Java TreeMap with the case-insensitive String.CASE_INSENSITIVE_ORDER Comparator.

However, this will not help if the 'address' property was mistyped as 'adress'. In this case, "search.bean.property.map" might still be useful if a few more keys are added for typical typos, for example 'adress' - 'address', 'addres' - 'address', etc.
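A minimal sketch of such a map, using only the JDK, is shown below. The CaseInsensitiveMapSketch class name and the chosen typo keys are illustrative; registering the map as the "search.bean.property.map" contextual property is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of a property map that tolerates case differences and a few known typos.
final class CaseInsensitiveMapSketch {

    static Map<String, String> propertyMap() {
        // Keys are compared ignoring case, so 'Address' and 'ADDRESS' find the same entry.
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("address", "address");
        // Extra keys for typical typos, all pointing at the real property.
        map.put("adress", "address");
        map.put("addres", "address");
        return map;
    }
}
```

A typo that has no key of its own, for example 'adres', still yields no match with this approach.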

Starting from CXF 3.1.5, org.apache.cxf.jaxrs.ext.search.PropertyNameConverter is available and can be used for a more sophisticated conversion of mistyped property names to the correct names.

The implementation can be registered as a "search.bean.property.converter" endpoint contextual property.

Parser properties

The parser properties are the ones which tell the parser how to treat the conversion of Date values and the unrecognized search property names.

As explained above, "search.lax.property.match" can be used to tell the parser that it should ignore the search property names which have no corresponding bean properties.

"search.date-format" and "search.timezone.support" tell the parser how to convert the date values, see the "Using dates in queries" section.

More properties may be supported in the future.

All of these properties can be set as endpoint contextual properties or directly with SearchContext.

Mapping of query properties to column/field names

When converting FIQL queries to SQL or other untyped query language expressions, as well as when using Lucene converter, it can be useful to be able to map between an actual query parameter and the column or field name. All FIQL converters shipped with CXF have constructors accepting a map for mapping the queries to columns/fields. See the next "SearchBean" section for one example.

Note this property is not the same as the one described in the "Mapping of query properties to bean properties" section. The latter (the one described in the previous section) is required for getting FIQL queries captured into typed, domain specific beans like Book, and it can be sufficient for JPA2 which also has annotations like @Column.

SearchBean

org.apache.cxf.jaxrs.ext.search.SearchBean is a utility bean class which can simplify analyzing the captured FIQL expressions and converting them to the other language expressions, in cases where having to update the bean class such as Book.class with all the properties that may need to be supported is not practical or the properties need to be managed manually. For example:

// ?_s="level=gt=10"
SearchCondition<SearchBean> sc = searchContext.getCondition(SearchBean.class);

Map<String, String> fieldMap = new HashMap<String, String>();
fieldMap.put("level", "LEVEL_COLUMN");

SQLPrinterVisitor<SearchBean> visitor = new SQLPrinterVisitor<SearchBean>(fieldMap, "table", "LEVEL_COLUMN");
sc.accept(visitor);
assertEquals("SELECT LEVEL_COLUMN FROM table "
             + "WHERE LEVEL_COLUMN > '10'",
             visitor.getQuery());

Converting the queries

SearchCondition can also be used to convert the search requirements (originally expressed in FIQL/OData) into other query languages.
A custom SearchConditionVisitor implementation can be used to convert SearchCondition objects into custom expressions or typed objects. CXF ships visitors for converting expressions to SQL, JPA 2.0 CriteriaQuery or TypedQuery, Lucene Query.

SQL

org.apache.cxf.jaxrs.ext.search.sql.SQLPrinterVisitor can be used for creating SQL expressions. For example:

// ?_s="name==ami*;level=gt=10"
SearchCondition<Book> sc = searchContext.getCondition(Book.class);
SQLPrinterVisitor<Book> visitor = new SQLPrinterVisitor<Book>("table");
sc.accept(visitor);
assertEquals("SELECT * FROM table "
             + "WHERE name LIKE 'ami%' "
             + "AND level > '10'",
             visitor.getQuery());

Note that SQLPrinterVisitor can also be initialized with the names of columns and the field aliases map:

// ?_s="level=gt=10"
SearchCondition<Book> sc = searchContext.getCondition(Book.class);

Map<String, String> fieldMap = new HashMap<String, String>();
fieldMap.put("level", "LEVEL_COLUMN");

SQLPrinterVisitor<Book> visitor = new SQLPrinterVisitor<Book>(fieldMap, "table", "LEVEL_COLUMN");
sc.accept(visitor);
assertEquals("SELECT LEVEL_COLUMN FROM table "
             + "WHERE LEVEL_COLUMN > '10'",
             visitor.getQuery());

The fields map can help hide the names of the actual table columns/record fields from the Web frontend. Example, the users will know that the 'level' property is available while internally it will be converted to a LEVEL_COLUMN name.

Warning: Using the SQLPrinterVisitor may leave your service open to SQL injection attacks. Please take appropriate steps to avoid these attacks (for example validating queries using a custom PropertyValidator, or manually escaping the input values).
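One common mitigation, independent of CXF, is to keep user-supplied values out of the SQL text altogether and bind them as parameters of a java.sql.PreparedStatement. The sketch below builds such a statement text; the SafeSqlSketch class is hypothetical, and it assumes that the table name is a trusted constant and that column names come only from a server-side field map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not CXF code: SQL text with '?' placeholders plus a separate value list.
final class SafeSqlSketch {

    private static final Set<String> OPERATORS =
        new HashSet<>(Arrays.asList("=", "<>", "<", "<=", ">", ">="));

    final String sql;
    final List<String> params;

    private SafeSqlSketch(String sql, List<String> params) {
        this.sql = sql;
        this.params = params;
    }

    // Each condition is {property, operator, value}; conditions are joined with AND.
    static SafeSqlSketch build(String table, Map<String, String> fieldMap, String[][] conditions) {
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(table);
        List<String> params = new ArrayList<>();
        for (String[] condition : conditions) {
            String column = fieldMap.get(condition[0]);
            if (column == null || !OPERATORS.contains(condition[1])) {
                // Unknown properties and operators are rejected, never copied into the SQL.
                throw new IllegalArgumentException("Unsupported condition on " + condition[0]);
            }
            sb.append(params.isEmpty() ? " WHERE " : " AND ");
            sb.append(column).append(' ').append(condition[1]).append(" ?");
            params.add(condition[2]); // bound later with PreparedStatement.setString(...)
        }
        return new SafeSqlSketch(sb.toString(), params);
    }
}
```

A value such as "10' OR '1'='1" ends up in the parameter list rather than in the statement text, so it cannot change the structure of the query.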

JPA 2.0

CXF 2.6.4 and CXF 2.7.1 introduce org.apache.cxf.jaxrs.ext.search.jpa.JPATypedQueryVisitor and org.apache.cxf.jaxrs.ext.search.jpa.JPACriteriaQueryVisitor which can be used to capture FIQL/OData expressions into
javax.persistence.TypedQuery or javax.persistence.criteria.CriteriaQuery objects.

For example, given:

public class Book {

    private String title;
    private Date date;
    private OwnerInfo ownerinfo;
    //setters and getters omitted for brevity
}

@Embeddable
public class OwnerInfo {

    private Address address;
    private Name name;
    //setters and getters omitted for brevity
}

@Embeddable
public class Name {

    private String name;
    //setters and getters omitted for brevity
}

@Embeddable
public class Address {

    private String street;
    //setters and getters omitted for brevity
}

the following code can be used:

import javax.persistence.EntityManager;
import javax.persistence.TypedQuery;

// init EntityManager as required
private EntityManager entityManager;

// Find the books owned by Barry who lives in London, published starting from the first month of 2000 
// ?_s="date=ge=2000-01-01;ownername==barry;address==london"

// this map will have to be set as a contextual property on the jaxrs endpoint
// it assumes that Book bean has nested OwnerInfo bean with nested Address and Name beans, 
// with the latter containing 'street' and 'name' property respectively

Map<String, String> beanPropertiesMap = new HashMap<String, String>();
beanPropertiesMap.put("address", "ownerInfo.address.street");
beanPropertiesMap.put("ownername", "ownerInfo.name.name");

// the actual application code
SearchCondition<Book> sc = searchContext.getCondition(Book.class);
SearchConditionVisitor<Book, TypedQuery<Book>> visitor = 
    new JPATypedQueryVisitor<Book>(entityManager, Book.class);
sc.accept(visitor);

TypedQuery<Book> typedQuery = visitor.getQuery();
List<Book> books = typedQuery.getResultList();

Using CriteriaQuery is preferred in cases when the actual result has to be shaped into a bean of different type, using one of JPA2 CriteriaBuilder's shape methods (array(), construct() or tuple()). For example:

// Find the books owned by Barry who lives in London, published starting from the first month of 2000 
// ?_s="date=ge=2000-01-01;ownername==barry;address==london"

// this map will have to be set as a contextual property on the jaxrs endpoint
Map<String, String> beanPropertiesMap = new HashMap<String, String>();
beanPropertiesMap.put("address", "ownerInfo.address.street");
beanPropertiesMap.put("ownername", "ownerInfo.name.name");

// the actual application code
// Only Book 'id' and 'title' properties are extracted from the list of found books
 
SearchCondition<Book> sc = searchContext.getCondition(Book.class);
JPACriteriaQueryVisitor<Book, Tuple> visitor = 
    new JPACriteriaQueryVisitor<Book, Tuple>(entityManager, Book.class, Tuple.class);
sc.accept(visitor);

List<SingularAttribute<Book, ?>> selections = new LinkedList<SingularAttribute<Book, ?>>();
// Book_ class is generated by JPA2 compiler
selections.add(Book_.id);
selections.add(Book_.title);

visitor.selectTuple(selections);

TypedQuery<Tuple> query = visitor.getQuery();

List<Tuple> tuples = query.getResultList();
for (Tuple tuple : tuples) {
  Integer bookId = tuple.get("id", Integer.class);
  String title = tuple.get("title", String.class);
  // add bookId & title to the response data
}

Note that JPACriteriaQueryVisitor will automatically set aliases for an expression like "tuple.get('id', Integer.class)" to work.
JPACriteriaQueryVisitor will be enhanced to support more of JPA2 advanced constructs in time.

Or, instead of using Tuple, use a capturing bean like BookInfo:

public static class BookInfo {
    private int id;
    private String title;

    public BookInfo() {
            
    }
        
    public BookInfo(Integer id, String title) {
        this.id = id;
        this.title = title;
    }
    //setters and getters omitted for brevity
 }

// actual application code:

SearchCondition<Book> sc = searchContext.getCondition(Book.class);
JPACriteriaQueryVisitor<Book, BookInfo> visitor = 
    new JPACriteriaQueryVisitor<Book, BookInfo>(entityManager, Book.class, BookInfo.class);
sc.accept(visitor);

List<SingularAttribute<Book, ?>> selections = new LinkedList<SingularAttribute<Book, ?>>();
// Book_ class is generated by JPA2 compiler
selections.add(Book_.id);
selections.add(Book_.title);

visitor.selectConstruct(selections);

TypedQuery<BookInfo> query = visitor.getQuery();

List<BookInfo> bookInfo = query.getResultList();
return bookInfo;

JPA2 typed converters also support join operations in cases when explicit collections are used, for example, given:

@Entity(name = "Book")
public class Book {

    private List<BookReview> reviews = new LinkedList<BookReview>();
    private List<String> authors = new LinkedList<String>();
    // other properties omitted

    @OneToMany
    public List<BookReview> getReviews() {
        return reviews;
    }

    public void setReviews(List<BookReview> reviews) {
        this.reviews = reviews;
    }

    @ElementCollection
    public List<String> getAuthors() {
        return authors;
    }

    public void setAuthors(List<String> authors) {
        this.authors = authors;
    }
}

@Entity
public class BookReview {
    private Review review;
    private List<String> authors = new LinkedList<String>();
    private Book book;
    // other properties omitted    

    public Review getReview() {
        return review;
    }

    public void setReview(Review review) {
        this.review = review;
    }

    @OneToOne
    public Book getBook() {
        return book;
    }

    public void setBook(Book book) {
        this.book = book;
    }

    @ElementCollection
    public List<String> getAuthors() {
        return authors;
    }

    public void setAuthors(List<String> authors) {
        this.authors = authors;
    }

    public static enum Review {
        GOOD,
        BAD
    }
}

the following will find "all the books with good reviews written by Ted":

SearchCondition<Book> filter = new FiqlParser<Book>(Book.class).parse("reviews.review==good;reviews.authors==Ted");
// in practice, map "reviews.review" to "review", "reviews.authors" to "reviewAuthor" 
// and have a simple query like "review==good;reviewAuthor==Ted" instead

SearchConditionVisitor<Book, TypedQuery<Book>> jpa = new JPATypedQueryVisitor<Book>(em, Book.class);
filter.accept(jpa);
TypedQuery<Book> query = jpa.getQuery();
return query.getResultList();

org.apache.cxf.jaxrs.ext.search.jpa.JPALanguageVisitor for converting FIQL/OData expressions into JPQL expressions has also been introduced.

Count expressions

Count expressions are supported at two levels.

First, one may want to get the count of records matching a given search expression. This can be done by checking the size of the result list:

TypedQuery<Book> query = jpa.getQuery();
return query.getResultList().size();

However, this can be very inefficient for a large number of records, so using a CriteriaBuilder count operation is recommended, for example:

SearchCondition<Book> filter = new FiqlParser<Book>(Book.class).parse("reviews.review==good;reviews.authors==Ted");

JPACriteriaQueryVisitor<Book, Long> jpa = new JPACriteriaQueryVisitor<Book, Long>(em, Book.class, Long.class);
filter.accept(jpa);
long count = jpa.count();


Second, only when using FIQL, a count extension can be used. For example, one may want to find 'all the books written by at least two authors or all the books with no reviews'.
If a collection entity such as BookReview has a non-primitive type, then typing "reviews==0" is all that is needed; otherwise a count extension needs to be used, for example: "count(authors)=ge=2".

Lucene

Mapping of FIQL/OData expressions to Lucene (4.0.0-BETA) Query is supported starting from CXF 2.7.1. Please notice that starting from CXF 3.0.2, the Lucene version has been upgraded to 4.9.0 in order to benefit from query builders and other improvements.

org.apache.cxf.jaxrs.ext.search.lucene.LuceneQueryVisitor can be used to support the default (content) field or specific custom field queries.
Queries for specific terms and phrases are supported.

Example, "find the documents containing a 'text' term":

import org.apache.lucene.search.Query;

SearchCondition<SearchBean> filter = new FiqlParser<SearchBean>(SearchBean.class).parse("ct==text");
LuceneQueryVisitor<SearchBean> lucene = new LuceneQueryVisitor<SearchBean>("ct", "contents");
lucene.visit(filter);
org.apache.lucene.search.Query termQuery = lucene.getQuery();
// use Query

Note, "new LuceneQueryVisitor<SearchBean>("ct", "contents");" is a simple constructor which creates a mapping between the "ct" name used in the query and the actual default content field. It is not required to use this mapping, but it is recommended as it keeps the query expression shorter and does not leak the actual internal Lucene field name.

All the FIQL operators have been mapped to related Lucene Query objects. Queries such as "Less than", or "Greater than and less than" will work fine against the typed fields like "org.apache.lucene.document.IntField". The visitor can be configured with a "primitiveFieldTypeMap" map property to help it map a given query name, example "id" to Integer.class.

Phrases are supported too. Suppose you have a few documents, each of them containing name and value pairs like "name=Fred" and "name=Barry", and you'd like to list only the documents containing "name=Fred":

SearchCondition<SearchBean> filter = new FiqlParser<SearchBean>(SearchBean.class).parse("name==Fred");
LuceneQueryVisitor<SearchBean> lucene = new LuceneQueryVisitor<SearchBean>("contents");
lucene.visit(filter);
org.apache.lucene.search.Query phraseQuery = lucene.getQuery();
// use query

In this example, the visitor is requested to create Lucene org.apache.lucene.search.PhraseQuery against the specified contents field ("contents"). The visitor can also accept a contentsFieldMap map property when different phrases may need to be checked against different contents fields.

Starting from CXF 3.0.2, the typed Date range queries are supported by LuceneQueryVisitor. However, this feature should be used together with 'primitiveFieldTypeMap' in order to hint the visitor which fields are temporal and should be treated as such in the filter expressions. For example:

Map< String, Class< ? > > fieldTypes = new LinkedHashMap< String, Class< ? > >();
fieldTypes.put( "modified", Date.class);

SearchCondition<SearchBean> filter = new FiqlParser<SearchBean>(SearchBean.class).parse("modified=gt=2007-09-16");
LuceneQueryVisitor<SearchBean> lucene = new LuceneQueryVisitor<SearchBean>("ct", "contents");
lucene.setPrimitiveFieldTypeMap(fieldTypes);
lucene.visit(filter);

org.apache.lucene.search.Query query = lucene.getQuery();

LuceneQueryVisitor supports a wide range of date formats, while still providing the option to customize the format using the 'search.date-format' property. This property accepts a date/time pattern expression in the SimpleDateFormat format. Also, since CXF 3.0.2, LuceneQueryVisitor can be configured to use a Lucene analyzer. The reason to use an analyzer is that during Lucene query construction the visitor can use the per-field filters and tokenizers, taking into account stemming, stop-words, lower-casing, etc., and thus process the filter expression properly. For example:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);

// Lower-case filter and stop-words filter are part of the StandardAnalyzer
SearchCondition<SearchBean> filter = new FiqlParser<SearchBean>(SearchBean.class).parse("contents==pears and APPLES");
LuceneQueryVisitor<SearchBean> lucene = new LuceneQueryVisitor<SearchBean>("contents", analyzer);
lucene.visit(filter);

org.apache.lucene.search.Query query = lucene.getQuery();


LDAP

Mapping of FIQL/OData expressions to LDAP queries as defined by RFC-4515 is supported starting from CXF 2.7.1 with the help of org.apache.cxf.jaxrs.ext.search.ldap.LdapQueryVisitor. Use this visitor when working with LDAP or OSGi.

Here is a summary of LDAP filter operators:

Operator    Description
"="         Equal
"!"         Not Equal
"<="        Less Or Equal
">="        Greater or Equal
"&"         AND
"|"         OR

FIQL "=le=" and "=lt=" will both map to "<=", while "=ge=" and "=gt=" to ">=".
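The operator mapping can be sketched as a few string helpers. This is an illustration only and not the LdapQueryVisitor implementation; the LdapFilterSketch class is hypothetical, it reproduces the compact negation form used on this page, and it does not escape values.

```java
// Hypothetical sketch, not CXF code: maps FIQL operators to LDAP filter text.
final class LdapFilterSketch {

    static String primitive(String name, String fiqlOperator, String value) {
        switch (fiqlOperator) {
            case "==":
                return "(" + name + "=" + value + ")";
            case "!=":
                return "(!" + name + "=" + value + ")";
            case "=lt=":
            case "=le=":
                // LDAP has no strict 'less than', so both map to '<='.
                return "(" + name + "<=" + value + ")";
            case "=gt=":
            case "=ge=":
                // Likewise both 'greater' operators map to '>='.
                return "(" + name + ">=" + value + ")";
            default:
                throw new IllegalArgumentException("Unsupported operator: " + fiqlOperator);
        }
    }

    // FIQL ';' (AND) becomes a prefix '&' group.
    static String and(String... filters) {
        return "(&" + String.join("", filters) + ")";
    }

    // FIQL ',' (OR) becomes a prefix '|' group.
    static String or(String... filters) {
        return "(|" + String.join("", filters) + ")";
    }
}
```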

For example:

FIQL                              LDAP
"name==bar*"                      "(name=bar*)"
"name!=bar"                       "(!name=bar)"
"name!=bar;id=gt=10"              "(&(!name=bar)(id>=10))"
"name!=bar;(id=gt=10,id=lt=5)"    "(&(!name=bar)(|(id>=10)(id<=5)))"

The converter is created like all other converters:

// FIQL "oclass==Bar"
SearchCondition<Condition> filter = new FiqlParser<Condition>(Condition.class).parse("oclass==Bar");

// map 'oclass' used in the FIQL query to the actual property name, 'objectClass'
LdapQueryVisitor<Condition> visitor = 
   new LdapQueryVisitor<Condition>(Collections.singletonMap("oclass", "objectClass"));

filter.accept(visitor);
String ldap = visitor.getQuery();

Note that since CXF 3.2.5 the query values are encoded by default, to prevent possible LDAP injection attacks. If you want to support wildcard searching with the LdapQueryVisitor from CXF 3.2.5 onwards, it is necessary to set the 'encodeQueryValues' property of LdapQueryVisitor to 'false'.
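To see why encoding the values disables wildcard searching, consider the escaping rules that RFC 4515 defines for filter values. The sketch below applies those rules with plain JDK code; the LdapEscapeSketch class is hypothetical, and it is an assumption that CXF's own encoding follows the same rules.

```java
// Hypothetical sketch: RFC 4515 style escaping of an LDAP filter value.
final class LdapEscapeSketch {

    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break; // the wildcard becomes a literal asterisk
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An injection attempt is neutralized, but so is a legitimate 'bar*' wildcard.
        System.out.println("(name=" + escape("bar)(uid=*") + ")");
    }
}
```

Once the '*' is escaped it matches only a literal asterisk, which is the trade-off between injection safety and wildcard support described above.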

Custom visitors

In cases when a custom conversion has to be done, a converter for doing the untyped (example, SQL) or typed (example, JPA2 TypedQuery) conversions can be provided.

Untyped converters

public class CustomSQLVisitor<T> extends AbstractSearchConditionVisitor<T, String> {

    private String tableName;
    // created lazily on the first visit() call, see below
    private StringBuilder sb;

    public void visit(SearchCondition<T> sc) {
        
        if (sb == null) {
            sb = new StringBuilder();
            // start the expression as needed, example
            // sb.append("Select from ").append(tableName);
        }
        
        PrimitiveStatement statement = sc.getStatement();
        if (statement != null) {
                // ex "a > b"
                // use statement.getValue()
                // use statement.getConditionType() such as greaterThan, lessThan
                // use statement.getProperty();
                // to convert "a > b" into SQL expression
                sb.append(toSQL(statement));         
        } else {
            // composite expression, ex "a > b;c < d"
            for (SearchCondition<T> condition : sc.getSearchConditions()) {
                // pre-process, example sb.append("(");
                condition.accept(this);
                // post-process, example sb.append(")");
            }
        }
    }

    public String getQuery() {
        return sb.toString();
    }
}

Typed converters

import org.custom.search.Query;

public class CustomTypedVisitor<T> extends AbstractSearchConditionVisitor<T, Query> {

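    private Stack<List<Query>> queryStack = new Stack<List<Query>>();

    {
        // root list, so that peek() also works for a top-level primitive or composite expression
        queryStack.push(new ArrayList<Query>());
    }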

    public void visit(SearchCondition<T> sc) {
                
        PrimitiveStatement statement = sc.getStatement();
        if (statement != null) {
                // ex "a > b"
                // use statement.getValue()
                // use statement.getConditionType() such as greaterThan, lessThan
                // use statement.getProperty();
                // to convert "a > b" into Query object
                Query query = buildSimpleQuery(statement);
                queryStack.peek().add(query);                 

        } else {
            // composite expression, ex "a > b;c < d"
            queryStack.push(new ArrayList<Query>());

            for (SearchCondition<T> condition : sc.getSearchConditions()) {
                condition.accept(this);
            }

            boolean orCondition = sc.getConditionType() == ConditionType.OR;
            List<Query> queries = queryStack.pop();
            queryStack.peek().add(createCompositeQuery(queries, orCondition));
        }
    }

    public Query getQuery() {
        return queryStack.peek().get(0);
    }
}
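The stack-based approach can be tried out without any CXF dependencies. The sketch below is only an illustration under stated assumptions: Cond and Query are hypothetical stand-ins for SearchCondition and the custom Query type, and composite queries are represented as plain strings. It shows why a root list is pushed before the first visit: every finished query, including the top-level one, is added to the list on top of the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-ins: Cond plays the role of SearchCondition, Query of the custom query type.
class TypedSketch {

    static final class Cond {
        final String primitive;      // non-null for a primitive expression such as "a>5"
        final boolean or;            // only meaningful for composites
        final List<Cond> children;

        Cond(String primitive) {
            this.primitive = primitive;
            this.or = false;
            this.children = new ArrayList<Cond>();
        }

        Cond(boolean or, Cond... children) {
            this.primitive = null;
            this.or = or;
            this.children = Arrays.asList(children);
        }
    }

    static final class Query {
        final String text;

        Query(String text) {
            this.text = text;
        }
    }

    private final Deque<List<Query>> stack = new ArrayDeque<List<Query>>();

    TypedSketch() {
        // The root list receives the query built for the top-level condition
        stack.push(new ArrayList<Query>());
    }

    void visit(Cond c) {
        if (c.primitive != null) {
            stack.peek().add(new Query(c.primitive));
        } else {
            // Collect the child queries in a fresh list, then combine them
            stack.push(new ArrayList<Query>());
            for (Cond child : c.children) {
                visit(child);
            }
            List<Query> parts = stack.pop();
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < parts.size(); i++) {
                if (i > 0) {
                    sb.append(c.or ? " OR " : " AND ");
                }
                sb.append(parts.get(i).text);
            }
            stack.peek().add(new Query(sb.append(')').toString()));
        }
    }

    Query getQuery() {
        return stack.peek().get(0);
    }
}
```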

Custom parsing

If needed you can access a FIQL/OData query directly and delegate it further to your own custom FIQL handler:

@Path("/search")
public class SearchEngine {
    @Context
    private UriInfo ui;

    @GET
    public List<Book> findBooks() {
        MultivaluedMap<String, String> params = ui.getQueryParameters();
        String query = params.getFirst("_s"); // or $filter, etc
        // delegate to your own custom handler and return the books it finds

        // note that the original search expression can also be retrieved 
        // using a SearchContext.getSearchExpression() method
    }
}
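What a custom handler does with the raw expression is entirely up to the application. As a minimal sketch, assuming the handler only needs to understand primitive expressions joined by ';' (AND), the hypothetical SimpleFiqlHandler below splits such an expression into property, operator and value triples. It is not a full FIQL parser: parentheses and ',' (OR) are rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical custom handler; it only understands primitives joined by ';' (AND).
class SimpleFiqlHandler {

    static final class Condition {
        final String property;
        final String operator;
        final String value;

        Condition(String property, String operator, String value) {
            this.property = property;
            this.operator = operator;
            this.value = value;
        }
    }

    // A property name, one of the six FIQL comparison operators, then the value
    private static final Pattern PRIMITIVE =
        Pattern.compile("([^=!;,()]+)(==|!=|=lt=|=le=|=gt=|=ge=)([^;,()]+)");

    static List<Condition> parse(String expression) {
        List<Condition> result = new ArrayList<Condition>();
        for (String part : expression.split(";")) {
            Matcher m = PRIMITIVE.matcher(part);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported expression: " + part);
            }
            result.add(new Condition(m.group(1), m.group(2), m.group(3)));
        }
        return result;
    }
}
```

For "name==CXF;version=ge=2.2" the sketch returns two conditions, the second one with the property "version", the operator "=ge=" and the value "2.2".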

Converting the queries with QueryContext

QueryContext is the helper context available from CXF 2.7.1 which makes it simpler for the application code to
get the converted query expression, with the actual converter/visitor registered as the jaxrs contextual property, for example:

import java.util.ArrayList;
import java.util.List;
import org.apache.cxf.jaxrs.JAXRSServerFactoryBean;
import org.apache.cxf.jaxrs.ext.search.QueryContextProvider;
import org.apache.cxf.jaxrs.ext.search.SearchBean;
import org.apache.cxf.endpoint.Server;
import org.apache.cxf.jaxrs.ext.search.visitor.SBThreadLocalVisitorState;
import org.apache.cxf.jaxrs.ext.search.sql.SQLPrinterVisitor;

import books.BookStore;

// Register the visitor:
JAXRSServerFactoryBean sf = new JAXRSServerFactoryBean();
List<Object> providers = new ArrayList<Object>();
providers.add(new QueryContextProvider());
sf.setProviders(providers);

SQLPrinterVisitor<SearchBean> sqlVisitor = new SQLPrinterVisitor<SearchBean>("books");
sqlVisitor.setVisitorState(new SBThreadLocalVisitorState());
sf.getProperties(true).put("search.visitor", sqlVisitor);

sf.setResourceClasses(BookStore.class);
Server server = sf.create();

and convert the queries:

@Path("/")
public class BookStore { 
    @GET
    @Path("/books/{expression}")
    @Produces("application/xml")
    public List<Book> getBookQueryContext(@PathParam("expression") String expression, 
                                      @Context QueryContext searchContext) 
        throws BookNotFoundFault {
        String sqlExpression = searchContext.getConvertedExpression(expression, Book.class);
        // pass it to the SQL DB and return the list of Books
    }
}

where the client code may look like this:

String address = "http://localhost:8080/bookstore/books/id=ge=123";
WebClient client = WebClient.create(address);
client.accept("application/xml");
Collection<? extends Book> books = client.getCollection(Book.class);

Note that because SQLPrinterVisitor is shared between multiple requests, it has to be made thread-safe by injecting a thread-local
org.apache.cxf.jaxrs.ext.search.visitor.SBThreadLocalVisitorState. This is not required when the visitor is created in the code on a per-request basis.

Custom visitors which are expected to be singletons and which accumulate state between multiple visit calls have to be thread-safe. The utility org.apache.cxf.jaxrs.ext.search.visitor.ThreadLocalVisitorState class can be used for this.
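The general idea behind a thread-local visitor state can be sketched with the JDK alone. The class below is a hypothetical illustration, not the CXF utility: a singleton visitor keeps its accumulating buffer in a ThreadLocal, so that concurrent requests served on different threads never share it, and the buffer is cleared once the query has been read.

```java
// Hypothetical singleton visitor that keeps its per-request buffer in a ThreadLocal.
class ThreadSafeVisitorSketch {

    // Each thread lazily gets its own StringBuilder
    private final ThreadLocal<StringBuilder> state = ThreadLocal.withInitial(StringBuilder::new);

    // Called several times per request while the expression tree is walked
    void visit(String fragment) {
        state.get().append(fragment);
    }

    // Returns the accumulated query and clears the state for the next request on this thread
    String getQuery() {
        String query = state.get().toString();
        state.remove();
        return query;
    }
}
```

Clearing the state in getQuery() matters when threads are pooled, as otherwise the next request handled by the same thread would start with the leftovers of the previous one.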

Plain queries to FIQL conversion

If you'd like to generalize the processing of search queries and use FIQL visitors, you may want to consider setting up a contextual property "search.use.plain.queries" to "true" and get the plain query expressions converted to FIQL expressions internally.

// GET /search?a=a1&a=a2
String exp = searchContext.getSearchExpression();
assertEquals("(a==a1,a==a2)", exp);

// GET /search?a=a1&b=b1
exp = searchContext.getSearchExpression();
assertEquals("(a==a1;b==b1)", exp);

Also, by default, if a query property name ends with "From" then "=ge=" (greater or equals to) will be used, and if it ends with "Till" then "=le=" (less or equals to) will be used, for example:

// GET /search?ageFrom=10&ageTill=20
String exp = searchContext.getSearchExpression();
assertEquals("(age=ge=10;age=le=20)", exp);

This allows the plain query expressions to be mapped to typed bean properties and then used with all the existing converters.
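To make the conversion rules concrete, here is a minimal sketch of how plain query parameters could be turned into a FIQL expression. It is an illustration only, not the CXF implementation: the class name PlainToFiqlSketch is hypothetical, repeated values of one parameter are joined with ',' (OR), different parameters are joined with ';' (AND) as in the "(a==a1;b==b1)" example, and the "From" and "Till" suffixes are assumed to map to "=ge=" and "=le=".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only; this is not the CXF implementation.
class PlainToFiqlSketch {

    static String convert(Map<String, List<String>> params) {
        List<String> groups = new ArrayList<String>();
        for (Map.Entry<String, List<String>> entry : params.entrySet()) {
            String name = entry.getKey();
            String op = "==";
            if (name.endsWith("From")) {
                op = "=ge=";
                name = name.substring(0, name.length() - 4);
            } else if (name.endsWith("Till")) {
                op = "=le=";
                name = name.substring(0, name.length() - 4);
            }
            List<String> alternatives = new ArrayList<String>();
            for (String value : entry.getValue()) {
                alternatives.add(name + op + value);
            }
            // Repeated values of one parameter are alternatives, joined with ',' (OR)
            String group = String.join(",", alternatives);
            boolean wrap = alternatives.size() > 1 && params.size() > 1;
            groups.add(wrap ? "(" + group + ")" : group);
        }
        // Different parameters must all match, joined with ';' (AND)
        return "(" + String.join(";", groups) + ")";
    }
}
```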

Search Expressions in URI Path segments

By default, a FIQL expression is expected to be available in either the '_s' or the '_search' query parameter.
For example, "find all the books with an 'id' property value less than 123":

GET /books?_s=id=lt=123

Starting from CXF 2.6.2, it is possible to work with FIQL expressions included in URI path segments, for example, the same query can be expressed
in a number of ways:

GET /books/id=lt=123
GET /books[id=lt=123]
GET /books(id=lt=123)
GET /books;id=lt=123

//etc, etc

Such expressions can be captured in the code using JAX-RS annotations:

@Path("search")
public class BooksResource {
   @Context
   private SearchContext context;

   //GET /books[id=lt=123]
   @GET
   @Path("books[{search}]") 
   public List<Book> findBooksInBrackets(@PathParam("search") String searchExpression) {
       return doFindSelectedBooks(searchExpression);
   }

   //GET /books(id=lt=123)
   @GET
   @Path("books({search})") 
   public List<Book> findBooksInParentheses(@PathParam("search") String searchExpression) {
       return doFindSelectedBooks(searchExpression);
   }

   //GET /books/id=lt=123
   @GET
   @Path("books/{search}") 
   public List<Book> findBooksInSegment(@PathParam("search") String searchExpression) {
       return doFindSelectedBooks(searchExpression);
   }

   //GET /books;id=lt=123
   @GET
   @Path("books;{search}") 
   public List<Book> findBooksAfterSemicolon(@PathParam("search") String searchExpression) {
       return doFindSelectedBooks(searchExpression);
   }

   public List<Book> doFindSelectedBooks(String searchExpression) {
       SearchCondition<Book> sc = context.getCondition(searchExpression, Book.class);
   
       // JPA2 entity manager is initialized earlier
       JPATypedQueryVisitor<Book> visitor = new JPATypedQueryVisitor<Book>(entityManager, Book.class);
       sc.accept(visitor);
   
       TypedQuery<Book> typedQuery = visitor.getQuery();
       return typedQuery.getResultList();
   }

}

Note that if an expression is added to a URI path segment with a ";" character acting as a separator, for example "/books;id=lt=123",
or if the expression itself includes ";", for example "/books[id=gt=123;id=lt=300]" ("find all the books with id greater than 123 and less than 300"),
then the boolean contextual property "ignore.matrix.parameters" has to be set to "true" so that the runtime does not split the path segment into the path value and matrix parameters.

Queries involving multiple entities

Basic queries

Consider a query like "find chapters with a given chapter id from all the books with 'id' less than 123".
One easy way to manage such queries is to make FIQL and JAX-RS work together. For example:

@Path("search")
public class BooksResource {
   @Context
   private SearchContext context;

   //GET /books[id=lt=123]/chapter/1
   @GET
   @Path("books[{search}]/chapter/{id}") 
   public List<Chapter> findSelectedChapters(@PathParam("search") String searchExpression,
                                       @PathParam("id") int chapterIndex) {
       return doFindSelectedChapters(searchExpression, chapterIndex);
   }

   public List<Chapter> doFindSelectedChapters(String searchExpression, int chapterIndex) {
       SearchCondition<Book> sc = context.getCondition(searchExpression, Book.class);
   
       // JPA2 entity manager is initialized earlier
       JPATypedQueryVisitor<Book> visitor = new JPATypedQueryVisitor<Book>(entityManager, Book.class);
       sc.accept(visitor);
   
       TypedQuery<Book> typedQuery = visitor.getQuery();
       List<Book> books = typedQuery.getResultList();

       List<Chapter> chapters = new ArrayList<Chapter>(books.size());
       for (Book book : books) {
           chapters.add(book.getChapter(chapterIndex)); 
       }   
       return chapters;
   }

}

Complex queries

In the previous section we had the properties of two entities, Book and Chapter, used in the query. The query was considered 'simple' because it was really only the simple book properties that were checked, and the only chapter property was a chapter id, assumed to be equal to a chapter list index.

Consider "Find all the chapters with id less than 5 for all the books with id greater than 300".

One way to handle this is to follow the example from the previous section with a few modifications:

@Path("search")
public class BooksResource {
   @Context
   private SearchContext context;

   //GET /books(id=gt=300)/chapters(id=lt=5)
   @GET
   @Path("books({search1})/chapters({search2})") 
   public List<Chapter> findSelectedChapters(@PathParam("search1") String bookExpression,
                                       @PathParam("search2") String chapterExpression) {
       return doFindSelectedChapters(bookExpression, chapterExpression);
   }

   public List<Chapter> doFindSelectedChapters(String bookExpression, String chapterExpression) {
       // find the books first
       
       SearchCondition<Book> bookCondition = context.getCondition(bookExpression, Book.class);
   
       JPATypedQueryVisitor<Book> visitor = new JPATypedQueryVisitor<Book>(entityManager, Book.class);
       bookCondition.accept(visitor);
       TypedQuery<Book> typedQuery = visitor.getQuery();
       List<Book> books = typedQuery.getResultList();

       // now get the chapters
       SearchCondition<Chapter> chapterCondition = context.getCondition(chapterExpression, Chapter.class);
       List<Chapter> chapters = new ArrayList<Chapter>();
       for (Book book : books) {
           chapters.addAll(chapterCondition.findAll(book.getChapters())); 
       }   
       return chapters;
   }

}

In this case two conditions are created and the 2nd condition is used to filter the chapters from the books filtered by the 1st condition.
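The two-step filtering can be illustrated without JPA or CXF. In the sketch below, which uses hypothetical minimal Book and Chapter classes, java.util.function.Predicate stands in for SearchCondition: the first predicate selects the books and the second one selects the chapters of those books.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical minimal entities; Predicate stands in for SearchCondition.
class TwoStepFilterSketch {

    static final class Chapter {
        final int id;

        Chapter(int id) {
            this.id = id;
        }
    }

    static final class Book {
        final int id;
        final List<Chapter> chapters;

        Book(int id, Chapter... chapters) {
            this.id = id;
            this.chapters = Arrays.asList(chapters);
        }
    }

    static List<Chapter> findChapters(List<Book> books,
                                      Predicate<Book> bookCondition,
                                      Predicate<Chapter> chapterCondition) {
        List<Chapter> result = new ArrayList<Chapter>();
        for (Book book : books) {
            if (!bookCondition.test(book)) {
                continue; // first condition: select the books
            }
            for (Chapter chapter : book.chapters) {
                if (chapterCondition.test(chapter)) {
                    result.add(chapter); // second condition: select the chapters
                }
            }
        }
        return result;
    }
}
```

Calling findChapters(books, b -> b.id > 300, c -> c.id < 5) corresponds to "books(id=gt=300)/chapters(id=lt=5)". Note that in this sketch both steps run in memory, whereas in the example above the first step is delegated to the database.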

Perhaps a simpler approach, especially in the case of JPA2, is to start looking for Chapters immediately, assuming each Chapter has a bidirectional many-to-one relationship with its Book:

public class Chapter {
   private int id;
   private Book book;

   @ManyToOne
   public Book getBook() {
       return book;
   }
}

@Path("search")
public class BooksResource {
   @Context
   private SearchContext context;

   //GET /chapters(bookId=gt=300;id=lt=5)
   @GET
   @Path("chapters({search})") 
   public List<Chapter> findSelectedChapters(@PathParam("search") String chapterExpression) {
       
       SearchCondition<Chapter> chapterCondition = context.getCondition(chapterExpression, Chapter.class);
   
       JPATypedQueryVisitor<Chapter> visitor = new JPATypedQueryVisitor<Chapter>(entityManager, Chapter.class);
       chapterCondition.accept(visitor);
       TypedQuery<Chapter> typedQuery = visitor.getQuery();
       return typedQuery.getResultList();
   }

}

Note this code assumes that "bookId" is mapped to "Book.id" property with the help of the contextual "search.bean.property.map" property as explained earlier.

Validation

The first option is to have the bean capturing specific property values do a domain-specific validation. For example, Book.class may have its setName(String name) method validate the name value.
Another option is to inject a custom validator into the visitor which is used to build the untyped or typed query.

Finally, avoid letting users use properties whose values cannot be validated well in the application code. Using a typed capturing bean like Book.class is a good way to limit the supported properties to the ones known to be related to Books.

Bean Validation 1.1 can also be used.
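As a small sketch of the first option, a capturing bean can reject suspicious values in its setter. The class name ValidatedBook and the validation rule (at most 64 characters, limited to word characters, spaces, '.', '*' and '-') are assumptions made up for this example; a real application would apply its own domain rules.

```java
// Hypothetical capturing bean; the validation rule itself is only an example.
class ValidatedBook {

    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        // Reject values that would be unsafe or meaningless in a downstream query
        if (name == null || name.isEmpty() || name.length() > 64
            || !name.matches("[\\w .*-]+")) {
            throw new IllegalArgumentException("Invalid book name: " + name);
        }
        this.name = name;
    }
}
```

With this rule a value such as "CXF*" is accepted, while a value containing quote characters is rejected before it can reach a query converter.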

Building the queries

FIQL

CXF 2.4.0 introduces SearchConditionBuilder which makes it simpler to build FIQL queries. SearchConditionBuilder is an abstract class that returns a FIQL builder by default:

SearchConditionBuilder b = SearchConditionBuilder.instance();
String fiqlQuery = b.is("id").greaterThan(123).query();

WebClient wc = WebClient.create("http://books.com/search");
wc.query("_s", fiqlQuery);
// find all the books with id greater than 123 
Collection<? extends Book> books = wc.getCollection(Book.class);

Here is an example of building more complex queries:

// OR condition
String ret = b.is("foo").greaterThan(20).or().is("foo").lessThan(10).query();
assertEquals("foo=gt=20,foo=lt=10", ret);

// AND condition
String ret = b.is("foo").greaterThan(20).and().is("bar").equalTo("plonk").query();
assertEquals("foo=gt=20;bar==plonk", ret);

// Complex condition
String ret = b.is("foo").equalTo(123.4).or().and(
            b.is("bar").equalTo("asadf*"), 
            b.is("baz").lessThan(20)).query();
assertEquals("foo==123.4,(bar==asadf*;baz=lt=20.0)", ret);

Note, starting from CXF 2.7.1 the following can be used to make connecting multiple primitive expressions simpler:

// AND condition, '.and("bar")' is a shortcut for "and().is("bar")", similar shortcut is supported for 'or'
String ret = b.is("foo").greaterThan(20).and("bar").equalTo("plonk").query();
assertEquals("foo=gt=20;bar==plonk", ret);

More updates to the builder API are available on the trunk:

// OR condition
String ret = b.is("foo").equalTo(20).or().is("foo").equalTo(10).query();
assertEquals("foo==20,foo==10", ret);

// Same query, shorter expression
String ret = b.is("foo").equalTo(20, 10).query();
assertEquals("foo==20,foo==10", ret);

and

// Connecting composite or() and and() expressions will add "()" implicitly:
String ret = b.is("foo").equalTo(20, 10).and("bar").lessThan(10).query();
assertEquals("(foo==20,foo==10);bar=lt=10", ret);

// wrap() method can be used to wrap explicitly:

String ret = b.is("foo").equalTo(20).and("bar").lessThan(10).wrap().or("bar").greaterThan(25).query();
assertEquals("(foo==20;bar=lt=10),bar=gt=25", ret);


Using dates in queries

By default, the date values have to have the following format: "yyyy-MM-dd", for example:

?_search=date=le=2010-03-11

A custom date format can also be supported by setting the "search.date-format" contextual property. For example, "search.date-format"="yyyy-MM-dd'T'HH:mm:ss" will let users type:

?_search=time=le=2010-03-11T18:00:00

If needed, "search.timezone.support" can be enabled to get the timezones supported too.

At the moment, for custom date formats to be recognized by SearchConditionBuilder, FiqlSearchConditionBuilder has to be created explicitly:

Map<String, String> props = new HashMap<String, String>();
props.put("search.date-format", "yyyy-MM-dd'T'HH:mm:ss");
props.put("search.timezone.support", "false");

DateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
Date d = df.parse("2011-03-01 12:34:00");
        
FiqlSearchConditionBuilder bCustom = new FiqlSearchConditionBuilder(props);
        
String ret = bCustom.is("foo").equalTo(d).query();
assertEquals("foo==2011-03-01T12:34:00", ret);
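The effect of a date format pattern can be checked with the JDK alone. The sketch below is not the CXF builder; the hypothetical DateFormatSketch class only shows how a pattern such as "yyyy-MM-dd'T'HH:mm:ss" turns a java.util.Date into the value part of a FIQL expression, using java.text.SimpleDateFormat and the default time zone for both parsing and formatting.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative helper; not part of the CXF API.
class DateFormatSketch {

    // Parses a date string, wrapping the checked exception for brevity
    static Date parse(String value, String pattern) {
        try {
            return new SimpleDateFormat(pattern, Locale.ROOT).parse(value);
        } catch (ParseException ex) {
            throw new IllegalArgumentException("Unparseable date: " + value, ex);
        }
    }

    // Builds "property==formattedDate" using the given pattern
    static String toFiql(String property, Date date, String pattern) {
        return property + "==" + new SimpleDateFormat(pattern, Locale.ROOT).format(date);
    }
}
```

For a date parsed from "2011-03-01 12:34:00" with the pattern "yyyy-MM-dd HH:mm:ss", toFiql("foo", date, "yyyy-MM-dd'T'HH:mm:ss") gives "foo==2011-03-01T12:34:00".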


Relative dates

A date value can also be specified as a duration relative to the current date/time, using the string representation "PnYnMnDTnHnMnS".
The resulting date is calculated as the current date plus the specified duration. For example:

?_search=date=ge=-P90D


This query will search for a date which is 90 days in the past or newer.
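The "current date plus duration" calculation can be reproduced with the JDK's javax.xml.datatype.Duration, which understands the "PnYnMnDTnHnMnS" representation including a leading minus sign. The following is a minimal sketch rather than the CXF code: the RelativeDateSketch class is hypothetical, and the calculation is done in UTC here so that daylight saving changes do not affect the result.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.Duration;

// Illustrative helper; not part of the CXF API.
class RelativeDateSketch {

    // Resolves a duration such as "-P90D" against the given reference date
    static Date resolve(String duration, Date now) {
        try {
            Duration d = DatatypeFactory.newInstance().newDuration(duration);
            Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
            cal.setTime(now);
            d.addTo(cal); // a negative duration moves the calendar into the past
            return cal.getTime();
        } catch (DatatypeConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

With this helper, resolve("-P90D", new Date()) returns the instant 90 days before now, which is the lower bound that "date=ge=-P90D" compares against.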

Alternative query languages

Custom org.apache.cxf.jaxrs.ext.search.SearchConditionParser implementations can be registered as a "search.parser" contextual property starting from CXF 3.0.0-milestone2.

OData


Please use a "search.query.parameter.name" contextual property to indicate to the runtime that an OData '$filter' query option needs to be checked for the query expression and a "search.parser" property to point to the instance of org.apache.cxf.jaxrs.ext.search.odata.ODataParser, as shown in this test, see the startServers function.

And here is also an XML Spring configuration example (using SearchBean in this specific case):

<cxf:bus>
  <cxf:properties>
    <entry key="search.query.parameter.name" value="$filter" />
    <entry key="search.parser">
      <bean class="org.apache.cxf.jaxrs.ext.search.odata.ODataParser">
        <constructor-arg value="#{ T(org.apache.cxf.jaxrs.ext.search.SearchBean) }" />
      </bean>
    </entry>
  </cxf:properties>
</cxf:bus>
 


Also note that Apache Olingo offers its own visitor model which can be used to work with JPA2, etc.

Content Extraction

Starting from CXF 3.0.2, content extraction support has been added to complement the search capabilities with text extraction from various document formats (PDF, ODF, DOC, TXT, RTF, ...). It is based on Apache Tika and is available in two forms: raw content extraction (TikaContentExtractor) and Lucene document content extraction (TikaLuceneContentExtractor).

Using TikaContentExtractor

The purpose of the Tika content extractor is to provide essential support for text extraction from the supported document formats. Depending on the document format, metadata (author, modified, created, pages, ...) is extracted as well. The TikaContentExtractor accepts the list of supported parsers and returns the extracted metadata together with the extracted content in the desired format (raw text by default). For example:

TikaContentExtractor extractor = new TikaContentExtractor(new PDFParser(), true);
TikaContent content = extractor.extract(Files.newInputStream(new File("testPDF.pdf").toPath()));

By default, the TikaContentExtractor also performs content type detection and validation, which can be turned off using the 'validateMediaType' constructor argument.

Using TikaLuceneContentExtractor

The TikaLuceneContentExtractor is very similar to TikaContentExtractor, but instead of raw content and metadata it returns a prepared Lucene document. However, in order to properly create a Lucene document which is ready to be indexed, TikaLuceneContentExtractor accepts an additional parameter, LuceneDocumentMetadata, with the field types and type converters. For example:

LuceneDocumentMetadata documentMetadata = new LuceneDocumentMetadata("contents").withField("modified", Date.class);
TikaLuceneContentExtractor extractor = new TikaLuceneContentExtractor(new PDFParser(), true);
Document document = extractor.extract(Files.newInputStream(new File("testPDF.pdf").toPath()), documentMetadata);

At this point, the document is ready to be analyzed and indexed. The TikaLuceneContentExtractor uses LuceneDocumentMetadata to create the properly typed document fields and currently supports DoubleField, FloatField, LongField, IntField, TextField (for content) and StringField (also used to store dates).

To demonstrate the CXF 3.0.2 content extraction and search capabilities, the demo project 'jax_rs_search' has been developed and is distributed in the samples bundle. The project can be found in the official Apache CXF GitHub repository. It integrates Apache CXF, Apache Lucene and Apache Tika, and shows some advanced features related to custom analyzers and different filter criteria (keyword and phrase search).

