We are pleased to announce the third alpha release of the Aperture framework.
Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (e.g. file systems, web sites, mailboxes) and from the file formats (e.g. documents, images) occurring in these systems.
The most notable feature in this release is the new IcalCrawler. It works with
iCal files generated by many calendaring applications (Apple iCal, KOrganizer,
Lotus Notes …). It uses an iCal-to-RDF mapping developed by the W3C RDF
Calendaring group. Apart from that, there are numerous small improvements and
bug fixes. The tutorial has been expanded with more code examples and UML
diagrams to help new users get started.
This is the last release before the switch to the RDF2Go framework.
(The curious can already examine the RDF2Go branch in CVS.)
The project homepage:
Aperture 2006.1-alpha-3 can be downloaded from here:
What’s new in alpha-3?
– new IcalCrawler
– added MIME type detection for many formats:
– improved MIME type detection of MHTML files (web archives)
– introduced HtmlParserUtil, containing large parts of the HtmlExtractor
implementation, as HTML (fragments) may occur in other document types
as well (e.g. saved mails, see MimeExtractor)
– added ThreadedExtractorWrapper class, for catching and interrupting
– added RepositoryAccessData, an AccessData implementation storing its
information in a Repository
– added ability to specify a port number for an IMAP source
– set target platform to Java 5
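
To illustrate the general idea behind a threaded wrapper such as the
ThreadedExtractorWrapper mentioned above, here is a minimal sketch that runs a
task on a separate thread and interrupts it if it exceeds a time limit. It is
not Aperture's actual API: the class TimedTask, the method runWithTimeout and
the fallback value are hypothetical names chosen for this example, and it uses
only java.util.concurrent (available since Java 5). The demo in main uses
lambdas for brevity, which require Java 8 or later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of Aperture.
public class TimedTask {

    /**
     * Runs the task on a worker thread and returns its result. If the task
     * has not finished after timeoutMillis, the worker is interrupted and
     * the fallback value is returned instead.
     */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis, T fallback)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker thread if it is still running
            future.cancel(true);
            return fallback;
        } finally {
            // Release the worker thread in every case
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A task that finishes well within the limit
        String fast = runWithTimeout(() -> "extracted text", 1000, "timed out");

        // A task that stands in for a hanging extraction
        String slow = runWithTimeout(() -> {
            Thread.sleep(5000);
            return "never returned";
        }, 100, "timed out");

        System.out.println(fast);
        System.out.println(slow);
    }
}
```

Interruption is cooperative: a task that never blocks and never checks its
interrupt flag will keep running in the background even after the caller has
received the fallback value.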