3.9 Construction of a Google Drive Database for use in spreadsheets - taking advantage of built-in OCR features in Google Drive
Horst Engels and Pedro Santos
Google Drive offers good possibilities for building a (personal) database from documents stored in your Google Drive account. One account gives you 15 GB of free storage, but if you use 10 different accounts you can have up to 150 GB of free storage space for your database. We are currently constructing a database of collections of papers about biodiversity on the Iberian Peninsula, which can be accessed via a Google Drive spreadsheet. The final database will also make it easy to build Google Fusion Tables in order to visualize the potential migration paths of the species in the database.*
Storage of documentation in Google Drive
The process we use for storage of papers and other documentation is as follows:
- In a first step, you construct a spreadsheet that you would like to link to a bibliography database in Google Drive, for example the spreadsheet “Biodiversity on the Iberian Peninsula (Parte II - Northern Portugal)”.
- In a second step, you build a directory structure of folders in Google Drive in which you store the results of document searches (run on your Google Drive account), for example folders for genera, families and orders of mammals. Afterwards you can always add new search results from your Google Drive account to those folders.
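The folder structure described above can be planned before creating anything in Google Drive. The following is only a minimal sketch: the root folder name "Mammalia" and the function name `folder_paths` are our own assumptions for illustration, and the two records simply use genera mentioned in this post. It lists every folder that would be needed, parents before children.

```python
# Illustrative records: (order, family, genus). Extend with your own taxa.
TAXA = [
    ("Rodentia", "Cricetidae", "Microtus"),
    ("Carnivora", "Mustelidae", "Lutra"),
]


def folder_paths(taxa, root="Mammalia"):
    """Return all folder paths needed, parents first, without duplicates.

    `root` is a hypothetical top-level folder name, not something
    prescribed by Google Drive.
    """
    paths = {}  # dicts keep insertion order, so this works as an ordered set
    for order, family, genus in taxa:
        parts = [root, order, family, genus]
        for depth in range(1, len(parts) + 1):
            paths["/".join(parts[:depth])] = None
    return list(paths)


if __name__ == "__main__":
    for path in folder_paths(TAXA):
        print(path)
```

Each printed path corresponds to one folder to create by hand (or with a script of your choice) in Google Drive.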
What is the advantage?
The advantage is that Google Drive search uses very good OCR technology that scans PDFs and even JPGs for text. You can transfer these search results to one or more Google Drive folders (normally in the form of pointers to the documents), and in a next step you can link your spreadsheet to those results.
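If you prefer to run such searches from a script instead of the web interface, the Drive API accepts a query string in the `q` parameter of its `files.list` call. The sketch below only builds that string. It assumes that the API's `fullText contains` operator draws on the same OCR-based index as the search box, which we have not verified for every file type; the function name `drive_fulltext_query` is our own.

```python
def drive_fulltext_query(term, mime_types=("application/pdf", "image/jpeg")):
    """Build a Drive API v3 `q` string that searches document text for `term`.

    Backslashes and single quotes in `term` are escaped, as the query
    syntax requires. `mime_types` restricts the file types; pass an
    empty tuple to search all types.
    """
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    clauses = [f"fullText contains '{escaped}'"]
    if mime_types:
        types = " or ".join(f"mimeType = '{m}'" for m in mime_types)
        clauses.append(f"({types})")
    clauses.append("trashed = false")  # ignore files in the bin
    return " and ".join(clauses)


if __name__ == "__main__":
    print(drive_fulltext_query("Microtus"))
```

The resulting string can be passed to whatever Drive client you use; authentication and the actual request are deliberately left out here.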
(In the old Google Drive version you can do this simply by selecting the documents found in the search and dragging them to the target folder while holding down a modifier key. If you hold the CTRL key, pointers to the documents are stored in the new folder, so one document can appear in several places (folders). If you hold the SHIFT key, you actually move the documents to the folder, which means that a document cannot be stored in more than one folder at the same time. The CTRL key option is therefore much better for this purpose. In the new Google Drive version this drag-and-drop feature no longer works, but you can add files and folders to multiple locations by using the SHIFT+Z key combination.)
Try the link to “Microtus”
- In a third and last step, you link the “bibliography” cells in your spreadsheet (via hyperlink) to the folders in Google Drive that hold your selected bibliography. You then have access to your selected bibliography from within your spreadsheet.
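The hyperlinks for the “bibliography” cells can also be generated rather than typed by hand. The following sketch produces `HYPERLINK` formulas that can be pasted into a Google sheet. The folder ID shown is a placeholder, not a real folder, and the function name `hyperlink_formula` is hypothetical. Note that some spreadsheet locales expect `;` instead of `,` as the argument separator.

```python
def hyperlink_formula(folder_id, label, separator=","):
    """Return a Google Sheets HYPERLINK formula pointing at a Drive folder.

    `folder_id` is the last part of the folder's URL in Google Drive.
    Double quotes in `label` are doubled, which is how spreadsheet
    formulas escape them.
    """
    url = f"https://drive.google.com/drive/folders/{folder_id}"
    safe_label = label.replace('"', '""')
    return f'=HYPERLINK("{url}"{separator} "{safe_label}")'


if __name__ == "__main__":
    # Placeholder ID: replace it with the ID of your own folder.
    print(hyperlink_formula("FOLDER_ID_MICROTUS", "Microtus"))
```

Pasting the printed formula into a cell gives a clickable label that opens the corresponding folder.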
You can now distribute your spreadsheet and grant selective access to your private or public library from within the spreadsheet. If not all folders in Google Drive are public, a user of the spreadsheet has to ask permission to access a folder (the request is made automatically and the permission can be granted via email). Storing the data in Google Drive has a further advantage: links in your spreadsheet do not necessarily have to be updated if URLs change, as the documents are always stored in your Google Drive account.
A disadvantage is that your selections in Google Drive are not updated automatically. You still have to do this manually (but updates have to be made only in Google Drive, not in your spreadsheet!).
An example of an occurrence map and a map with undirected graphs of potential migration paths, created with the aid of the database and Google Fusion Tables, is given here for the otter (Lutra lutra). Shown are the potential migration paths from or to protected sites in Northern Portugal with respect to Ria de Aveiro. The procedure for creating these graphs will be shown in the blog post “3.10 Construction of Fusion tables for visualization of potential migration paths of animals with aid of a Google sheet.”
Potential migration paths for Lutra lutra from protected sites in Northern Portugal to Ria de Aveiro
(Link to Fusion Table “Lutra”)
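The idea behind such a straight-line graph can be sketched in a few lines. This is not the Fusion Tables procedure itself, only an illustration under our own assumptions: each protected site is connected to one hub by an undirected edge, and the edge length is the great-circle distance. The coordinates for Ria de Aveiro are approximate, and the two named sites are placeholders, not real occurrence data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def star_edges(hub_name, hub, sites):
    """Connect every site to the hub; return edges sorted by length."""
    edges = [(name, hub_name, haversine_km(coords, hub))
             for name, coords in sites.items()]
    return sorted(edges, key=lambda edge: edge[2])


if __name__ == "__main__":
    ria_de_aveiro = (40.7, -8.7)  # approximate position, an assumption
    sites = {                     # placeholder sites, NOT real records
        "Site A": (41.8, -8.1),
        "Site B": (41.3, -7.7),
    }
    for site, hub, km in star_edges("Ria de Aveiro", ria_de_aveiro, sites):
        print(f"{site} - {hub}: {km:.0f} km")
```

Every edge is a straight line on the map, which is exactly the limitation discussed in the footnote.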
*It must be stressed here that the simple graphs shown in this post can give only a rudimentary idea of the potential migration paths, as spatial conditions such as topography, ambient conditions and species-specific requirements are not taken into consideration in the straight-line graph (fish are unlikely to fly in a straight line from one location to another). In addition, the data necessary for estimating acceptable migration paths (with or without more sophisticated methods and programs such as “Maxent”) do not exist at the moment or are unavailable (at least not publicly or for free). Furthermore, even much more elaborate methods such as “maximum entropy” estimation always rely on positive information about the presence of species. Absence of a species, which might be true absence or only a lack of confirmation, therefore cannot be taken adequately into consideration. If information on the presence of a species is not available because its distribution is largely unknown, as is the case for many or most of the smaller and less “attractive” species (on the Iberian Peninsula and worldwide), no good estimation of potential migration paths can be made at all. Another reason may be a lack of attention to endangered species that are not categorized as endangered because the data have not been updated (for example with respect to climate change).
The principal purpose of the database, besides the collection of papers, is therefore to show whether a species exists in protected locations and whether migration and/or reintroduction from nearby, and hopefully genetically similar and well-adapted, populations is possible when the species becomes locally endangered or extinct. The second objective is to show (and in future fill) the big gaps in information that still exist, especially for the less well-known and species-rich systematic groups such as invertebrates, lichens and bryophytes. The graph method using Fusion Tables is at the moment a by-product, which can however be very useful for visualizing relations between locations or species.