User Tagging Design
Allow a user to tag a book
- Example: Allow erik to tag A10 with "lucene"
Allow a user to remove a tag from a book, or all instances of a tag they have used.
Remove all tags that were added by a specific user
Show number of books tagged with each tag, restricted by the user's current search results and filters.
- Example: user submits a search of "title:lucene", and the resulting tag counts are "lucene(2), solr(2), excellent(1)"
Notice that the count is the number of books with that tag, not the number of tags: there are 3 "lucene" tags on the books, but only 2 books are tagged "lucene".
Show number of books tagged by each user.
- Example: user submits a search of "title:lucene", and the resulting user counts are "erik(2), yonik(1)"
Notice that the count is the number of books tagged, not the number of tags on books.
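The difference between counting books and counting tag instances can be sketched in a few lines. This is only an illustration: the (book, user, tag) triples and the book IDs B1 to B3 are hypothetical toy data, and the function names are invented for the example.

```python
from collections import Counter

# Hypothetical (book, user, tag) triples; not the example data of this page.
TAGGINGS = [
    ("B1", "erik", "lucene"),
    ("B1", "yonik", "lucene"),
    ("B2", "erik", "lucene"),
    ("B2", "erik", "solr"),
    ("B3", "yonik", "solr"),
]

def facet_counts(taggings, matching_books, key):
    """Count distinct matching books per user (key=1) or per tag (key=2)."""
    pairs = {(t[key], t[0]) for t in taggings if t[0] in matching_books}
    return Counter(value for value, _book in pairs)

def instance_counts(taggings, matching_books):
    """Count raw tag instances on matching books, for contrast."""
    return Counter(tag for book, _user, tag in taggings if book in matching_books)

results = {"B1", "B2"}  # assumed result set of the current search
print(facet_counts(TAGGINGS, results, key=2))   # books per tag
print(instance_counts(TAGGINGS, results))       # tag instances per tag
print(facet_counts(TAGGINGS, results, key=1))   # books per user
```

With this toy data, "lucene" has 3 instances but is counted for only 2 books, which is the behavior the use cases above ask for.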
When a user is tagging a book, allow them to type in the first few letters and then give a dropdown list of existing tags to choose from. Sort by tag popularity, optionally show counts.
Tag popularity: number of users using that tag, or number of books
with that tag? Either could work if necessary.
- Example1: user types "so" into the textbox when tagging a book, and
is automatically shown "solr(2), solrflare(1)" (uses #books tagged)
- Example2: user types "so" into the textbox when tagging a book, and
is automatically shown "solr(3), solrflare(1)" (uses #tag instances)
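The choice of popularity measure only changes what is counted per tag. A minimal sketch, assuming taggings are available as (book, user, tag) triples; the data, the `suggest` function and its `by` parameter are hypothetical and chosen so that the two examples above can be reproduced.

```python
from collections import defaultdict

# Hypothetical (book, user, tag) triples, for illustration only.
TAGGINGS = [
    ("B1", "erik", "solr"),
    ("B1", "yonik", "solr"),
    ("B2", "erik", "solr"),
    ("B2", "erik", "solrflare"),
    ("B3", "yonik", "lucene"),
]

def suggest(prefix, taggings, by="books"):
    """Return (tag, count) pairs for tags starting with prefix, most popular first.

    by="books"     counts distinct books carrying the tag
    by="users"     counts distinct users who applied the tag
    by="instances" counts every (book, user, tag) tagging
    """
    index = {"books": 0, "users": 1}
    seen = defaultdict(set)
    for triple in taggings:
        tag = triple[2]
        if tag.startswith(prefix):
            # For "instances" the whole triple is the unit being counted.
            unit = triple if by == "instances" else triple[index[by]]
            seen[tag].add(unit)
    counts = {tag: len(units) for tag, units in seen.items()}
    # Sort by descending count, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(suggest("so", TAGGINGS, by="books"))
print(suggest("so", TAGGINGS, by="instances"))
```

On this toy data, counting books yields solr(2), solrflare(1), while counting instances yields solr(3), solrflare(1).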
User selects an existing tag to narrow their search results by.
Any displayed results (including facet counts) must have all tags that
have been selected by the user.
- Example: narrow search results by the tag "solr"
Show all books a specific user tagged with a specific tag, or restrict search results by the same.
- Example: restrict matches to books with erik's "solr" tag => restricts to A11
Allow the user to narrow their search results by typing in
a tag instead of selecting it from a list. When the user has typed
one or two letters, automatically pop up a list of tags starting
with that prefix. Optionally sort tags by the number of books they apply to
in the current search results.
- Example: search "title:lucene", user types "so" and is presented with solr(1)
Restrict books to those tagged by a specific user.
- Example: search "title:*" restrict to books tagged by erik => A10,A11,A12
- Example2: search "title:*", facet by tag, restrict to books tagged by erik:
(note that this does *not* restrict shown tag counts to erik's tags)
Restrict *tags* to those of a specific user.
- Example: search "lucene", facet by tag, restrict to erik's tags:
Restrict books to those tagged by specific users.
- Example: search "title:*" restrict to books tagged by erik or yonik => A10,A11,A12,A13
- Example2: search "title:*" restrict to books tagged by erik and yonik => A10,A11,A12
When searching for a specific tag, increase the relevance of books that have more instances of that tag.
- Example: search for tag "lucene" and show A11 before A10
Restrict to tags added in the last year (or time period)
Machine Tags or Triple Tags
Flat Schema #1
Add tags directly to the documents as a single user/tag token.
The latter looks simpler, but the former allows phrase queries to match different components of a tag with a single query. A Lucene PhraseQuery across multiple fields would also work for the latter if this capability is needed.
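As an illustration of the two token layouts, the sketch below builds field values for a list of (user, tag) pairs. The layouts are inferred from the example queries on this page (`~` marks a user, `#` marks a tag, `utag` holds one combined token, `utag2` holds separate tokens); the function names are hypothetical, and the code assumes user names and tags contain no `~`, `#`, or whitespace.

```python
def utag_token(user, tag):
    """Single combined token, e.g. '~erik#solr' (assumed layout)."""
    return "~" + user + "#" + tag

def utag2_tokens(user, tag):
    """Separate user and tag tokens, e.g. ['~erik', '#solr'] (assumed layout)."""
    return ["~" + user, "#" + tag]

def field_values(taggings):
    """Map (user, tag) pairs to values for the assumed utag and utag2 fields.

    Each utag2 value is a whitespace-separated pair, so that a
    whitespace tokenizer yields adjacent user and tag tokens and a
    phrase query can match them together.
    """
    return {
        "utag": [utag_token(u, t) for u, t in taggings],
        "utag2": [" ".join(utag2_tokens(u, t)) for u, t in taggings],
    }

print(field_values([("erik", "solr"), ("yonik", "lucene")]))
```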
Relevancy Calculations for Tags
To leverage Relevancy calculations, you'd include the tag as part of the regular fulltext search (q), vs. just adding it as a filter (fq).
If multiple users have tagged a document with "lucene", then that field's density for the term will be higher, so Relevancy will also tend to be higher. However, another document with only 1 tag, which happens to be 'lucene', will likely still rank higher than a heavily tagged document with only 40% of the tags equal to 'lucene', given Lucene's default relevancy formulas.
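The ranking claim can be checked with a little arithmetic. The sketch below uses the term-frequency and length-normalization factors of Lucene's classic TF-IDF similarity, tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms). It is a simplification: idf is omitted because it is the same for both documents, and boosts, query normalization, and the lossy encoding of norms are ignored. The field lengths are made-up numbers.

```python
import math

def tf_norm_score(term_freq, field_length):
    """tf * lengthNorm under classic TF-IDF: sqrt(freq) * 1/sqrt(length)."""
    return math.sqrt(term_freq) * (1.0 / math.sqrt(field_length))

# One tag in total, and it is 'lucene'.
single = tf_norm_score(term_freq=1, field_length=1)

# Ten tags in total, four of them (40%) are 'lucene'.
heavy = tf_norm_score(term_freq=4, field_length=10)

print(single, heavy)
```

Here the single-tag document scores 1.0 and the heavily tagged one about 0.632, so the single-tag document ranks first under these assumptions.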
More advanced relevancy models would need more sophisticated implementations, for example perhaps a custom Similarity class.
q="utag:~erik*", get set of documents, remove all tags starting with ~erik
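The removal step for one matching document might look like the sketch below. It assumes the combined `~user#tag` token layout and includes the `#` separator in the prefix so that removing erik's tags does not also remove those of a hypothetical user "erika"; the function name is invented for the example. The document would then have to be re-submitted to the index with the remaining values.

```python
def strip_user_tags(utag_values, user):
    """Drop every combined token that belongs to user (assumed '~user#tag' layout)."""
    prefix = "~" + user + "#"
    return [value for value in utag_values if not value.startswith(prefix)]

before = ["~erik#lucene", "~erika#lucene", "~yonik#solr", "~erik#solr"]
print(strip_user_tags(before, "erik"))
```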
q="title:lucene" facet.field=utag2 facet.prefix=#
q="title:lucene" facet.field=utag2 facet.prefix=~
- Example1: facet.field=utag2 facet.prefix=#so
- Example2: not easily doable... would require more work within solr to count up tf's
- OR fq=utag2:"~erik #solr"
q=title:lucene facet.field=utag2 facet.prefix=#so
q=lucene fq=utag2:~erik facet.field=utag facet.prefix=~erik
- Example: q=title:* fq=utag2:(~erik OR ~yonik)
- Example2: q=title:* fq=utag2:(+~erik +~yonik)
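Requests like the ones listed above could be assembled programmatically. The following is a minimal sketch, assuming the `utag2` field with `~`-prefixed user tokens; the helper names are hypothetical. `q`, `fq`, `facet`, `facet.field` and `facet.prefix` are standard Solr request parameters. Escaping of query-syntax characters is not handled here, and depending on the query parser a leading `~` may need to be escaped.

```python
from urllib.parse import urlencode

def user_filter(users, require_all=False):
    """Build an fq clause on the assumed utag2 field for one or more users."""
    if not users:
        raise ValueError("at least one user is required")
    tokens = ["~" + user for user in users]
    if require_all:
        body = " ".join("+" + token for token in tokens)  # every user must match
    else:
        body = " OR ".join(tokens)                        # any user may match
    return "utag2:(" + body + ")"

def select_params(query, users=(), require_all=False,
                  facet_field=None, facet_prefix=None):
    """Return a URL-encoded parameter string for a search request."""
    params = [("q", query)]
    if users:
        params.append(("fq", user_filter(users, require_all)))
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
        if facet_prefix:
            params.append(("facet.prefix", facet_prefix))
    return urlencode(params)

print(user_filter(["erik", "yonik"]))
print(user_filter(["erik", "yonik"], require_all=True))
print(select_params("title:lucene", facet_field="utag2", facet_prefix="#so"))
```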
??? reserve another prefix for fields like time