
Frequently Asked Questions

 

The following questions are presented in no particular order. Use your browser's search facility or the search at the top right of this page to find any particular topic.

 

What do you mean by Email harvesting?

 

We offer the facility to create an Email account entry in the Harvester, including password details, which allows the Harvester to periodically fetch all new Emails from that account.  The harvest cycle is decided by the administrator, and could be anything from 3 hours to a few minutes.  All harvested Emails are stored in the document server and may be extracted or viewed with your preferred Email client.  Attachments are separated and are stored (and indexed) as individual documents in the Lava document server.

 

We process thousands of Emails every day.  What are the practical limits for the Email management system?

 

This number of Emails may be handled either through the Email proxy server or through multiple Email harvesters.  The proxy server can cope with several thousand Emails per day without problems.  Each harvester should be able to handle around 1000 Emails per day with acceptable cycle times, so this approach also works provided that enough harvesters are deployed.  Each harvester should run on a separate machine.  This can be a workstation rather than a server, since the harvester itself does not store any data.
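As a rough planning aid, the sizing rule above can be turned into a small calculation. This is only a sketch: the function name and the constant are illustrative, and the figure of 1000 Emails per harvester per day is the approximate value quoted in the answer, not a measured limit.

```python
import math

# Approximate per-harvester capacity quoted in the answer above.
EMAILS_PER_HARVESTER_PER_DAY = 1000


def harvesters_needed(emails_per_day: int,
                      per_harvester: int = EMAILS_PER_HARVESTER_PER_DAY) -> int:
    """Return the minimum number of harvesters for a daily Email volume."""
    if emails_per_day < 0 or per_harvester <= 0:
        raise ValueError("volumes must be non-negative and capacity positive")
    # Round up, and always plan for at least one harvester.
    return max(1, math.ceil(emails_per_day / per_harvester))


if __name__ == "__main__":
    print(harvesters_needed(4500))  # 5
```

Because each harvester should run on its own machine, the result is also the number of workstations to set aside.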

 

Are there facilities to import our old Email data into the system?

 

Provided your Email archives are in PST files, this should not be a problem.  The system includes facilities to import PST files directly into the document server, and will process them quite quickly.

 

If your archive is in another format, you may have to take a detour - your first target is to import your archive into a current version of Outlook, from which you can then export it to a PST file.  In order to do this, you might have to convert your data into a format supported by Outlook.

 

Alternatively, there may be a utility to convert your existing Email archive directly into PST format.

 

Once your archive has been converted to PST, it can be imported into the document server.

 

You mention "secure document transfer" in your specification. What exactly do you mean by this?


The term secure document transfer refers specifically to transfer of files or documents across the Internet. The majority of Internet transfers are done via FTP or even HTTP, or alternatively e-mail. None of these is secure in any way - firstly, it is too easy to intercept the file in transit. Secondly, arrival of the file at the destination is not guaranteed - transfers often fail. Thirdly (and this applies specifically to e-mail) the time of arrival is highly variable, and may be hours after transmission.

 

The Lava Vault uses a proprietary file transfer protocol completely unrelated to any of these. In addition to using two independent encryption algorithms to ensure data security, the protocol also ensures that every file is tested at the destination for integrity - the file either arrives as a perfect bit-for-bit replica of the original (which is almost always the case) or it is rejected entirely; there is no possibility of a "slightly corrupt" file. Lastly, the transfer is executed on demand - the file arrives as quickly as the bandwidth allows, with no random delays.

 

Is there a difference between "secure document transfer" and "secure file transfer"?

 

No, the two are exactly equivalent.

 

There is no web interface to the document server - why is this?


The real question is why would you want one? For the minor convenience of being able to access the Lava Vault without having to install an application, you have the major inconvenience of a clumsy and slow browser interface forever. Added to this, browsers are nowhere near as secure as a custom interface.

 

The main reason almost all software houses use a browser interface for access across the Internet is that they do not have the technology to use a decent Windows application interface. They would if they could. We do have this technology due to the distributed SQL database (Lava Distributed SQL) underlying the Lava Vault, and as a result we can present the same interface to users on the local area network and anywhere on the Internet. The advantages are a vastly improved user experience both in terms of the quality of the interface and the responsiveness of the information, as well as improved security. It's no contest, really.

 

How does the workflow system interact with users?

 

The system automatically displays prompts to users when required - the user does not have to check any list and only has to log in.  All interaction is fully automated and requires no user intervention other than responding to workflow prompt windows.

 

What distinguishes your document server from other document servers?


There are many distinguishing characteristics - a fairly complete list can be found in the Lava Vault Brochure, available from the Information menu. Probably the most significant of these is the combination of secure document transfer across the Internet with powerful and fast full-text search. Another important distinction is that the Lava Document Vault does not use a third-party database like Oracle or Microsoft SQL Server - as capable as these databases are, they are general-purpose databases with high administrative overhead and no dedicated document handling facilities. Vendors using these databases have to patch document handling onto the database, which is always a second-best choice. The Lava Database is inherently designed to manage, search and transfer documents and files, and does so far better than any generic database will ever do.

 

You state that the full-text search is powerful and fast. What exactly do you mean by full-text search, and how fast is it really?

 

The Lava Vault can extract the text from most popular document formats, including PDF and Microsoft Word (all versions including DOCX). This extraction is done immediately on inserting the document into the Vault, and typically takes less than a second. This text is then indexed - again automatically - for rapid searching. The indexing process indexes not only every word in the document, but also two metaphone equivalents (sounds-like) to compensate for possible spelling errors either in the document or in search terms.

 

The search is very fast indeed. A search comprising 5 or 6 search terms on a Vault containing 100,000 documents will typically execute in less than a tenth of a second. The search facility is also very flexible - the user may search for all words, any words, exact spelling, sounds-like (amongst others) and can also specify words which will exclude documents if found.
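To illustrate the search modes described above, here is a minimal sketch of an inverted index in Python. It is not the Lava Vault implementation: the `Index` class, its methods and the `sound_key` function are hypothetical names, and `sound_key` is a deliberately crude stand-in for a real metaphone encoding, used only to show how a sounds-like index can tolerate spelling errors.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sound_key(word):
    """Crude sounds-like key (NOT Metaphone): keep the first letter,
    drop later vowels and collapse repeated letters."""
    key = word[0]
    for ch in word[1:]:
        if ch in "aeiouy" or ch == key[-1]:
            continue
        key += ch
    return key


class Index:
    """Hypothetical inverted index with exact and sounds-like postings."""

    def __init__(self):
        self.exact = defaultdict(set)   # word -> document ids
        self.sounds = defaultdict(set)  # sound key -> document ids

    def add(self, doc_id, text):
        for word in tokenize(text):
            self.exact[word].add(doc_id)
            self.sounds[sound_key(word)].add(doc_id)

    def _lookup(self, word, sounds_like):
        word = word.lower()
        if sounds_like:
            return self.sounds.get(sound_key(word), set())
        return self.exact.get(word, set())

    def search(self, terms, mode="all", exclude=(), sounds_like=False):
        """mode='all' requires every term; mode='any' requires at least one.
        Documents containing any word in `exclude` are dropped."""
        hits = [self._lookup(term, sounds_like) for term in terms]
        if not hits:
            return set()
        if mode == "all":
            result = set.intersection(*hits)
        else:
            result = set.union(*hits)
        for term in exclude:
            result = result - self._lookup(term, sounds_like)
        return result


if __name__ == "__main__":
    idx = Index()
    idx.add(1, "Please receive the invoice")
    idx.add(2, "Invoice draft for review")
    print(idx.search(["recieve"], sounds_like=True))  # {1}
```

Each search term is resolved with a dictionary lookup followed by set operations, which is one reason index-based search can stay fast as the number of documents grows.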

 

I cannot justify allocating days of my time to evaluate this system. How long will it take for me to get the system up and running?


The initial installation will take a minute or two at most, and this will provide everything that you need to start up a Document Vault and connect with Secure Mail.

 

The quick-start guide will run you through the simple set of steps to get the default installation working. Within a few minutes you should be transferring and retrieving documents.

 

How does Secure Mail transfer documents and files?

 

It uses a secure and proprietary protocol, completely unrelated to e-mail or FTP - neither of which is secure or guaranteed.

 

Each file transferred is segmented, double encrypted with independent encryption algorithms, and controlled through multiple cyclic redundancy checks (CRC) to ensure perfect transmission of content.
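The general idea of segmenting a file and checking each segment with a CRC can be sketched as follows. This is an illustration only and not the proprietary Secure Mail protocol: the function names and segment size are assumptions, the standard CRC-32 from Python's `zlib` module stands in for whatever checks the product uses, and the double encryption step is omitted entirely.

```python
import zlib


def split_with_crc(data: bytes, segment_size: int = 4096):
    """Split data into (payload, crc32) segments. Encryption is omitted."""
    segments = []
    for start in range(0, len(data), segment_size):
        payload = data[start:start + segment_size]
        segments.append((payload, zlib.crc32(payload)))
    return segments


def reassemble(segments, expected_crc: int) -> bytes:
    """Rebuild the file, rejecting it entirely if any check fails."""
    out = bytearray()
    for number, (payload, crc) in enumerate(segments):
        if zlib.crc32(payload) != crc:
            raise ValueError(f"segment {number} failed its CRC check")
        out += payload
    data = bytes(out)
    # A second, whole-file check guards against missing or reordered segments.
    if zlib.crc32(data) != expected_crc:
        raise ValueError("whole-file CRC mismatch")
    return data


if __name__ == "__main__":
    original = b"example payload " * 1000
    segments = split_with_crc(original)
    assert reassemble(segments, zlib.crc32(original)) == original
```

The receiver either gets back the exact original bytes or an error, which mirrors the all-or-nothing behaviour described above.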

Transfer is immediate and takes only as long as the available bandwidth requires to transfer the file. Transfers are not scheduled; they occur when you request them and complete as you watch. Multiple files may be flagged for transfer, and will transfer directly one after the other.

 

Can I run the Vault on Oracle instead?

 

No. Conventional databases like Oracle, SQL Server or MySQL simply do not have the facilities or the performance to deliver this kind of functionality. The Lava Vault is a highly specialized software system, and no port to any other database is planned.


Is the interface slow when operating across the Internet?


No. Due to innovative distribution technology, the interface operates as fast as if you were directly connected to the document server. Only document transfers are slower due to available Internet bandwidth.

 

How can I find documents on the Document Vault?


Even Secure Mail provides powerful means of filtering the full document set down to a short list in which a particular document is easy to locate. Mechanisms for filtering the listed documents include:

  • Selected document group (similar to a document folder on a PC)
  • Originating user
  • Creation date bracketing
  • Document type
  • Document class (a superset of document types; for example, both Word documents and Acrobat PDF files are classed as word processor documents)

In addition, the resultant list is presented in an intelligent datagrid which allows sorting, grouping and searching.
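The filters listed above combine naturally: each one that is set narrows the list further. The following Python sketch shows that idea under stated assumptions. The `Document` fields, the `filter_documents` function and the `DOC_CLASS` table are hypothetical and do not reflect the Secure Mail data model; only the mapping of Word and PDF files to the word processor class comes from the text, and the spreadsheet entry is invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Document class is a superset of document types.
# The "xls" entry is an invented example, not taken from the product.
DOC_CLASS = {
    "doc": "word processor",
    "docx": "word processor",
    "pdf": "word processor",
    "xls": "spreadsheet",
}


@dataclass
class Document:
    name: str
    group: str      # document group, similar to a folder
    user: str       # originating user
    created: date   # creation date
    doc_type: str   # for example "pdf" or "docx"


def filter_documents(docs, group=None, user=None, created_from=None,
                     created_to=None, doc_type=None, doc_class=None):
    """Return the documents matching every filter that is not None."""
    result = []
    for d in docs:
        if group is not None and d.group != group:
            continue
        if user is not None and d.user != user:
            continue
        if created_from is not None and d.created < created_from:
            continue
        if created_to is not None and d.created > created_to:
            continue
        if doc_type is not None and d.doc_type != doc_type:
            continue
        if doc_class is not None and DOC_CLASS.get(d.doc_type) != doc_class:
            continue
        result.append(d)
    return result
```

For example, `filter_documents(docs, group="Contracts", doc_class="word processor")` would return both the Word and the PDF documents in a group named Contracts.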


If you additionally license Secure Search, this allows extremely fast full-text searching on a number of document types including Adobe Acrobat (PDF) and Microsoft Office documents.


I will need to store well over a million documents. How will the Document Vault cope with this load?


No problem. The document vault will easily scale to a million or more documents, provided that your hard drives are large enough and fast enough (SAS drives would be appropriate for this number of documents).


Can I interface the Document Vault to the Web via a browser?


No, and no such interface is planned or intended. The Web is good for some things, but certainly not for security. The Document Vault is completely independent of all Web services and protocols, and will stay that way.


Can I interface the Document Vault to my e-mail server?


No. As with the Web interface, no such connection is planned. E-mail is neither secure nor reliable, and we have no intention of interfacing to e-mail at any time.


Is it possible to restrict access to certain documents?


Yes. Even in Secure Mail, users may be provided access to certain parts of the document group hierarchy, while excluding others. With Secure Control, there is much flexibility in allowing or denying users access to individual groups or group subtrees, as well as specifying list, fetch, insert and replace permissions on each group.
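One way to picture per-group permissions that apply to whole subtrees is shown below. This is a minimal sketch, not the Secure Control implementation: the `AccessRules` class, the slash-separated group paths and the rule that the nearest ancestor's setting wins are all assumptions made for illustration. Only the four permission names come from the answer above.

```python
# Permission names taken from the answer above.
PERMISSIONS = {"list", "fetch", "insert", "replace"}


class AccessRules:
    """Hypothetical per-user rules on a document group hierarchy."""

    def __init__(self):
        self._rules = {}  # (user, group_path) -> frozenset of permissions

    def grant(self, user, group_path, *perms):
        unknown = set(perms) - PERMISSIONS
        if unknown:
            raise ValueError(f"unknown permissions: {sorted(unknown)}")
        self._rules[(user, group_path)] = frozenset(perms)

    def deny(self, user, group_path):
        # An empty permission set blocks the group and its subtree.
        self._rules[(user, group_path)] = frozenset()

    def allowed(self, user, group_path, perm):
        """Walk up from the group to the root; the nearest rule decides."""
        parts = group_path.split("/")
        while parts:
            rule = self._rules.get((user, "/".join(parts)))
            if rule is not None:
                return perm in rule
            parts.pop()
        return False  # no rule anywhere on the path: no access


if __name__ == "__main__":
    rules = AccessRules()
    rules.grant("alice", "Finance", "list", "fetch")
    rules.deny("alice", "Finance/Payroll")
    print(rules.allowed("alice", "Finance/Invoices/2010", "fetch"))  # True
    print(rules.allowed("alice", "Finance/Payroll/May", "list"))     # False
```

Defaulting to no access when no rule matches keeps the sketch on the safe side; a real system may choose differently.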


What operating system can I use for the server / client? What is the recommended operating system?


It is possible to run either server or client on any of the following:

  • Windows XP
  • Windows Vista
  • Windows Server 2003
  • Windows Server 2008
  • Windows Server 2008 R2
  • Windows 7

For all of the above, either the 32-bit or the 64-bit variant may be used.

 

We recommend Windows Server 2008 or 2008 R2 for the Lava Vault, and Windows 7 for Secure Mail. There are significant network and memory management improvements in these releases which improve performance and reliability. Windows Vista works fine as a client, but Windows 7 does seem to be quicker and more stable. If you use Windows XP as a client, ensure that you have Service Pack 3 installed. XP is not the best option for a client, though, since neither the network layer nor the memory manager is at the same standard as in Windows 7 or even Vista.

 

Older operating systems (Windows 2000 and before) are not supported at all. No Linux or Apple release is planned at this time.


I expect to support over a thousand clients from my database server. How will the Lava Vault behave under this kind of load? What kind of hardware will I need to plan for?


To support more than 1000 clients you will need a fairly powerful server, and most especially a server with multiple processors and fast memory. A server with at least an 8-core processor would be recommended, with as fast a clock speed as can be obtained. Your hard disk subsystem should also be fast - SAS hard drives on a good caching controller would be a good choice. Although both AMD and Intel processors will probably do the job, AMD processors are faster at task switching and will probably do better with this scale of load.

 

Given that you use the right operating system and hardware, you will support your 1000 clients or even more with ease. Peak server load should not exceed 50% under normal conditions, and typical load should be closer to 20%. With the server described you should be able to support around 2000 clients.

 

The scan function does not provide OCR. Why?

 

Optical character recognition is a very specialized technology, and like Hewlett-Packard we have decided that this is not a technology we wish to invest in. However, we are investigating third-party OCR libraries and (as HP has done) will license an appropriate OCR library for inclusion in the product as soon as user demand justifies this.

 

I have many documents in formats not supported by the indexing function for full-text search, for example WordPerfect. When will additional formats be supported?

 

Each document format requires very specific analysis to extract the embedded text, and there are unfortunately no generic techniques for doing this. We intend to add formats to the indexing functionality on an ongoing basis, working from the most popular formats downward. Unfortunately, not all document formats are in the public domain and it is therefore not always easy to add these formats to the indexer. The WordPerfect format is fortunately published by Corel, and it will therefore be possible to extract the text. A release supporting WordPerfect should be available by August or September of 2010. In the interim, you may have to output documents from WordPerfect into PDF, or use a batch converter such as "Word Perfect Document Converter" which will allow multiple documents to be converted to PDF with a single command.

 

Can the workflow system interact with other databases?

 

Absolutely.  Through the use of LavaStream procedure nodes, programming interfaces of arbitrary complexity may be written to derive input from or write output to other databases or systems.  This data can be interfaced to workflow variables using special functions provided, to allow comprehensive interaction between the workflow and third-party systems.

 

 

 
 
