Crowd Sourcing

Crowdsourcing is the act of outsourcing tasks, traditionally performed by an employee or contractor, to a large group of people or community (a crowd), through an open call [Reference].

Purpose

To develop and maintain a reusable software application that can be used for crowdsourcing. The software will be composed of several key subsystems:

  • Importing Data - The system will not import data itself. Instead, it will provide a well-defined API for writing plugins that retrieve data from a third-party source such as Solr/Lucene, Google, Bing, or Sphinx.
  • Exporting Data - Several formats should be available, such as TREC, CSV and XML. The data sets available for export should be: judged, not judged, and all.
  • User Interfaces
    • Users can enter queries and then judge the results (as deep as they want, but at a minimum the top 10). All aspects of what they do are captured (the query, the results, the judgments).
    • Users can submit a whole set of queries (e.g. the TREC ones) and provide judgments. The same information is captured as in the previous case.

High Level Description

Functional Requirements

The functional requirements for crowd sourcing outline the technical hurdles that must be overcome for the crowd sourcing application to be effective as a tool for ORP.

Importing Data

Importing data will make new material available for judging, such as corpora and their associated annotation sets. Corpora can be text, image, or video based.
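
The plugin API is not specified on this page, so the following is only a minimal sketch of what a source plugin could look like. The names ImportPlugin, Result, search and InMemoryPlugin are hypothetical and chosen for illustration; a real plugin would call Solr/Lucene, Bing, or another backend instead of searching a fixed list.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Result:
    """One search result; the URI identifies the document (hypothetical type)."""
    uri: str
    title: str = ""


class ImportPlugin(ABC):
    """Hypothetical base class that every data-source plugin would implement."""

    @abstractmethod
    def search(self, query: str, limit: int = 10) -> List[Result]:
        """Return up to `limit` results for `query` from the backing source."""


class InMemoryPlugin(ImportPlugin):
    """Toy plugin that searches a fixed list, standing in for a real backend."""

    def __init__(self, documents: Iterable[Result]):
        self._documents = list(documents)

    def search(self, query: str, limit: int = 10) -> List[Result]:
        # Case-insensitive substring match on the title, in document order.
        needle = query.lower()
        hits = [d for d in self._documents if needle in d.title.lower()]
        return hits[:limit]


plugin = InMemoryPlugin([
    Result("http://example.org/1", "Open Relevance Project"),
    Result("http://example.org/2", "Crowd sourcing relevance judgments"),
    Result("http://example.org/3", "Unrelated page"),
])
hits = plugin.search("relevance")
```

The default limit of 10 mirrors the minimum judging depth mentioned under Purpose. Keeping the interface to a single method means the judging user interface would not need to know which backend produced the results.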

Exporting Data
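
This section has no detail yet. As a rough illustration of the export requirement listed under Purpose (TREC and CSV formats; judged, not judged, and all data sets), the sketch below assumes that "TREC" means the four-column qrels layout read by trec_eval (topic, iteration, document id, relevance). The row layout and the function names select, to_csv and to_trec_qrels are assumptions, not part of any existing ORP code.

```python
import csv
import io
from typing import List, Optional, Tuple

# (query_id, uri, judgment): judgment is True, False, or None when not judged.
Row = Tuple[str, str, Optional[bool]]


def select(rows: List[Row], subset: str) -> List[Row]:
    """Pick the data set to export: 'judged', 'not_judged', or 'all'."""
    if subset == "judged":
        return [r for r in rows if r[2] is not None]
    if subset == "not_judged":
        return [r for r in rows if r[2] is None]
    if subset == "all":
        return list(rows)
    raise ValueError(f"unknown subset: {subset}")


def to_csv(rows: List[Row]) -> str:
    """Write rows as CSV; an unjudged result gets an empty judgment cell."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["query_id", "uri", "judgment"])
    for query_id, uri, judgment in rows:
        writer.writerow([query_id, uri, "" if judgment is None else int(judgment)])
    return buf.getvalue()


def to_trec_qrels(rows: List[Row]) -> str:
    """Write judged rows as qrels lines: topic, iteration (0), doc id, relevance."""
    judged = select(rows, "judged")
    return "".join(f"{q} 0 {uri} {int(j)}\n" for q, uri, j in judged)


rows = [
    ("q1", "http://example.org/1", True),
    ("q1", "http://example.org/2", None),
]
qrels_text = to_trec_qrels(rows)
csv_text = to_csv(select(rows, "not_judged"))
```

Only judged rows are written to the qrels output in this sketch, since that layout has no column value for "not judged".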

Judging Modules

At this time the Judging module is required to present the following options:

  • Relevant
  • Not Relevant
  • Skip This

This set of choices is available for each result returned from a query.
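
One possible way to represent these three choices is an enumeration, with one judgment stored per result URI. This is a sketch only; the names Judgment and record_judgment, and the string values, are hypothetical.

```python
from enum import Enum
from typing import Dict


class Judgment(Enum):
    """The three options the Judging module presents for each result."""
    RELEVANT = "relevant"
    NOT_RELEVANT = "not_relevant"
    SKIP = "skip"


def record_judgment(judgments: Dict[str, Judgment], uri: str, choice: str) -> None:
    """Store the user's choice for one result, keyed by the result's URI.

    Judgment(choice) raises ValueError if the choice is not one of the three options.
    """
    judgments[uri] = Judgment(choice)


judgments: Dict[str, Judgment] = {}
record_judgment(judgments, "http://example.org/1", "relevant")
record_judgment(judgments, "http://example.org/2", "skip")
```

Using a closed enumeration means an unexpected value from the user interface is rejected at the point of capture rather than ending up in the archive.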

Data Archive

The data archive should contain the following information:

  • Unique Query Identifier
  • Result set based on query
    • Result set should contain a URI for the result
  • Judgement set based on result set that was judged
  • Metrics
    • precision
    • recall
  • User UUID that ties the usage to a registered user or an anonymous user.
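
The fields above could be held in a record like the one sketched below. Several things here are assumptions rather than decisions recorded on this page: the name ArchiveRecord and its field names, the treatment of skipped results as unjudged (and so excluded from precision), and the fact that recall needs the total number of relevant documents for the query to be supplied from elsewhere.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ArchiveRecord:
    """Hypothetical archive entry for one judged query."""
    query_id: str
    result_uris: List[str]
    # uri -> True (relevant) or False (not relevant); skipped results are absent.
    judgments: Dict[str, bool]
    # A fresh UUID stands in for an anonymous user when none is supplied.
    user_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))

    def precision(self) -> Optional[float]:
        """Relevant judged results divided by all judged results."""
        judged = [u for u in self.result_uris if u in self.judgments]
        if not judged:
            return None
        return sum(self.judgments[u] for u in judged) / len(judged)

    def recall(self, total_relevant: int) -> Optional[float]:
        """Relevant results found divided by the known number of relevant documents."""
        if total_relevant <= 0:
            return None
        found = sum(1 for u in self.result_uris if self.judgments.get(u))
        return found / total_relevant


record = ArchiveRecord(
    query_id="q1",
    result_uris=["uri:a", "uri:b", "uri:c", "uri:d"],
    judgments={"uri:a": True, "uri:b": False, "uri:c": True},  # uri:d was skipped
)
p = record.precision()   # 2 relevant out of 3 judged
r = record.recall(4)     # 2 found out of 4 known relevant
```

Both metrics return None when they cannot be computed, so an archive reader can tell "no data" apart from a score of zero.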

Data Visualization

Nothing planned at this time.

Server System Requirements

The software should run on most operating systems.

Client System Requirements

The software should run inside an end user's web browser. Later versions of the software should support an API for submitting queries, downloading results, submitting judgements and exporting past query/result/judgement sets.

Non-Functional Requirements

Related Materials

TREC Format
