InterPublish is a web-based application for the automated monitoring, capture and distribution of internet content. Its purpose is to keep patrons aware of the latest published developments in their field or fields of interest.
Broadly speaking, the following steps are required to set up and operate InterPublish for this purpose:
A) InterPublish setup
The nominated application administrator in the Library will generally need some assistance from Prosentient Systems when setting up InterPublish for the first time. Setup generally involves specifying the following information within the application:
1) Metadata elements:
the metadata elements you wish to extract from any new documents that you find, such as Author, Date and Title. These depend on your own specific requirements. Metadata provides context to aid interpretation of a document and is used to catalogue documents you decide to store in a repository.
2) Target sites:
targets are the internet sites that are to be searched. InterPublish needs to know their URL, how they should be scanned and how it should extract the key metadata (such as Author, Date and Title) from documents on the site. It is possible to monitor a single site or several sites. It is also possible to attempt to monitor the internet as a whole, by making an internet search engine site (such as Google) the target site you are monitoring.
3) Email addresses:
these are a simple record of the name and email address of any patron who is to receive document alerts from InterPublish.
4) Email lists:
lists consisting of groups of patrons who share a common research interest and wish to receive alerts about the same kinds of new content. A common interest might be based purely on subject matter, e.g. 'indigenous issues', or it might be associated with a role, e.g. 'senior management'. Email lists consist of email addresses and are associated with search groups.
5) Search groups:
a collection of one or more sites, together with the search and filter terms for locating documents of interest on those sites. Search groups are associated with email lists of patrons who wish to receive alerts about new documents that are found by the search group. The search group contains all the business rules for implementing a document scan and notifying interested patrons of the results.
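The relationships between the five kinds of setup information can be sketched as a small data model. This is an illustrative sketch only, not InterPublish's actual schema: the class names, field names, extraction rule format and example addresses are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Patron:
    """An email address record: a patron's name and address."""
    name: str
    email: str


@dataclass
class EmailList:
    """A group of patrons sharing a common research interest or role."""
    name: str
    patrons: list[Patron] = field(default_factory=list)


@dataclass
class TargetSite:
    """A site to be scanned."""
    url: str
    # Hypothetical: maps a metadata element name to a site-specific extraction rule.
    extraction_rules: dict[str, str] = field(default_factory=dict)


@dataclass
class SearchGroup:
    """Sites plus search and filter terms, linked to the email lists to notify."""
    name: str
    sites: list[TargetSite]
    search_terms: list[str]
    email_lists: list[EmailList] = field(default_factory=list)
    positive_filter: str = ""
    negative_filter: str = ""

    def recipients(self) -> list[str]:
        """Return each address once, in first-seen order, across all linked lists."""
        seen: dict[str, None] = {}
        for email_list in self.email_lists:
            for patron in email_list.patrons:
                seen.setdefault(patron.email, None)
        return list(seen)


# Example data (all values invented for illustration).
alice = Patron("Alice Example", "alice@example.org")
bob = Patron("Bob Example", "bob@example.org")
group = SearchGroup(
    name="Indigenous issues",
    sites=[TargetSite("https://example.org/publications", {"Title": "h1"})],
    search_terms=["indigenous health"],
    email_lists=[
        EmailList("Researchers", [alice, bob]),
        EmailList("Senior management", [alice]),
    ],
)
```

In this sketch a patron who belongs to two lists attached to the same search group is returned only once by `recipients()`, so they would be alerted once per document.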
B) InterPublish in operation
When a scheduled or manual scan is performed, InterPublish uses the site details and the search terms found in the search group to look for new documents that may be of interest to those on the search group's email lists. If new documents are located on the search group's site or sites, a three-step process is triggered.
1) Initial site scan:
First, the new documents are parsed using the site search terms to extract the metadata of each document (e.g. Author, Date, Title). A transcript and a PDF of each new document are also generated. This information is saved in InterPublish's own database. The result of this first scan is an initial list of candidate documents, some of which may not be relevant.
2) Filter scan:
Next, InterPublish applies the positive and negative filters (if provided) to refine the candidate list by eliminating documents that may be off-topic. The positive filter specifies terms that must be present in the document, and can target these more precisely through the use of logical operators and metadata fieldnames. In a similar way, the negative filter specifies terms that must not be present in the document. This is very useful for reducing the false positives that occur when the search terms have necessarily been broad. The outcome of the filter scan is an initial shortlist of documents for the librarian to review.
3) Review and action:
The librarian will now use the Review functions to visually inspect the shortlist, accepting those documents that are of interest and rejecting or passing over those they consider off-topic.
In the Review and Alert function, acceptance by the librarian will trigger an email notification, containing the relevant metadata and a hyperlink for each of the selected documents, to all the addressees on the email list associated with the search group.
In the Review and Publish function the metadata of accepted documents will be mapped to MARC and/or Dublin Core and the records will be written to the library's repository application. InterPublish currently supports Koha and DSpace as document repositories but can be integrated with any similar software.
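The filter scan and the metadata mapping described above can be sketched as follows. This is a simplified illustration under stated assumptions, not InterPublish's implementation: the real filter syntax (logical operators and fieldnames) is not reproduced here, so each filter is modelled as a plain list of (term, fieldname) pairs, and the function names, the `Text` field and the mapping table are hypothetical. The Dublin Core element names `creator`, `date` and `title` are standard.

```python
from __future__ import annotations


def matches(document: dict[str, str], term: str, fieldname: str | None = None) -> bool:
    """Case-insensitive substring test against one field, or all fields if none is named."""
    values = [document.get(fieldname, "")] if fieldname else list(document.values())
    return any(term.lower() in value.lower() for value in values)


def filter_scan(candidates, positive, negative):
    """Keep documents that contain every positive term and no negative term.

    Assumption: each filter is a list of (term, fieldname) pairs,
    where a fieldname of None means "any field".
    """
    shortlist = []
    for doc in candidates:
        has_all = all(matches(doc, term, name) for term, name in positive)
        has_none = not any(matches(doc, term, name) for term, name in negative)
        if has_all and has_none:
            shortlist.append(doc)
    return shortlist


# Hypothetical mapping from extracted metadata names to Dublin Core elements.
DUBLIN_CORE_MAP = {"Author": "dc:creator", "Date": "dc:date", "Title": "dc:title"}


def to_dublin_core(document: dict[str, str]) -> dict[str, str]:
    """Map known metadata elements to Dublin Core; unmapped fields are dropped."""
    return {DUBLIN_CORE_MAP[k]: v for k, v in document.items() if k in DUBLIN_CORE_MAP}


# Example candidate list (invented data).
candidates = [
    {"Author": "J. Smith", "Date": "2015-03-01",
     "Title": "Indigenous health outcomes", "Text": "A report on remote clinics."},
    {"Author": "A. Jones", "Date": "2015-03-02",
     "Title": "Indigenous plants for the home garden", "Text": "Gardening tips."},
]
positive = [("indigenous", "Title")]   # term must appear in the Title field
negative = [("garden", None)]          # term must not appear in any field

shortlist = filter_scan(candidates, positive, negative)
records = [to_dublin_core(doc) for doc in shortlist]
```

Here the broad term 'indigenous' matches both candidates, and the negative term 'garden' removes the off-topic one, leaving a single document on the shortlist for review.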