TEXTUAL RESOURCES
The Expedition Reports, Fish Bulletin, Fish Bulletin Fish Catch “Landing” Statistics, and Oral Histories were digitized using different methods and formats. The use of OCR, rekeying, HTML, SGML, and XML is discussed below.
Expedition Reports
Expedition reports commonly include the track of the vessel, a list of personnel and ports of call, the expedition objective, and the scientific results of the expedition. Expeditions with multiple vessels were divided into cruises, with a cruise representing the track of each individual vessel. Lengthy cruises were divided into legs, each representing the work of the vessel between two points. When funding agencies no longer required expedition reports, some scientists continued the practice of writing them and others did not. For this project, Scripps expeditions important for their contributions to geophysics were selected, along with the expedition reports for each of those expeditions. The reports were then used to select photographs, track charts, correspondence, cruise narratives, and other content that illustrate the expedition, the scientists, and the work at sea.

SPI Technologies scanned, OCR’d, and encoded 1,000 pages of expedition reports. For preservation purposes, each page (including the cover, the title page, back matter, and advertisements) was scanned at 600 dpi as a TIFF image with Group IV lossless compression and burned to CD-ROM media. The text was encoded in SGML using Level 4 TEI-Lite, with 218 accompanying tables, graphs, and photographs embedded throughout as 300 dpi GIF images. The encoded text was checked for accuracy and consistency, parsed, and enhanced. The Expedition Reports are delivered through the search and retrieval tool of the Online Archive of California (OAC 2.5), which uses the DLXS software platform. As a result of this new platform, the SGML files have been converted to XML (full TEI).
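As a rough illustration of what the encoded text makes possible, the sketch below parses a small TEI-style XML fragment with Python's standard library and lists each section heading together with its embedded images. The fragment is simplified (no namespace), and its content, file names, and the function name summarize_sections are invented for illustration; the project's actual markup is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-style fragment (invented content and file names).
SAMPLE = """
<TEI>
  <text>
    <body>
      <div type="section">
        <head>Narrative of the Cruise</head>
        <p>The vessel departed San Diego.</p>
        <figure><graphic url="fig001.gif"/></figure>
      </div>
      <div type="section">
        <head>Scientific Results</head>
        <p>Soundings were taken along the track.</p>
        <figure><graphic url="fig002.gif"/></figure>
        <figure><graphic url="fig003.gif"/></figure>
      </div>
    </body>
  </text>
</TEI>
"""


def summarize_sections(xml_text):
    """Return a (heading, [image urls]) pair for each <div> in the fragment."""
    root = ET.fromstring(xml_text)
    summary = []
    for div in root.iter("div"):
        # The heading is the <head> child of the division, if present.
        head = div.findtext("head", default="").strip()
        # Collect every <graphic> nested anywhere inside this division.
        images = [g.get("url") for g in div.iter("graphic")]
        summary.append((head, images))
    return summary


sections = summarize_sections(SAMPLE)
for heading, images in sections:
    print(heading, images)
```

Because the headings and figures are marked up explicitly, a delivery platform can index them separately from the running text, which is one practical benefit of encoding over plain OCR output.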
Fish Bulletin
The California Department of Fish and Game's Fish Bulletin is a core resource for the study of fish and fisheries in California. Published continuously as a monographic series since 1913, the Fish Bulletin contains in-depth monographs on a variety of primarily marine topics, including some non-fish marine species. Some Fish Bulletin titles are of specialized interest to scientists, state officials, and those with fishery management interests. Many titles, however, are of general public interest and serve as general works on marine species. These general-interest titles cover marine fish (including specific titles on sardine, grunion, halibut, and tuna), sharks, sea lions, clams and mussels, abalone, squid, historical shore whaling, and historical commercial fishing, among other subjects. The Fish Bulletin is uncommonly held among California libraries, particularly in complete runs, and online publication will greatly improve access to this resource.

SPI Technologies scanned, OCR’d, and encoded 16,500 pages of the Fish Bulletin from the original source material. For preservation purposes, each page (including the cover, the title page, back matter, and advertisements) was scanned at 600 dpi as a TIFF image with Group IV lossless compression and burned to CD-ROM media. The text was encoded in SGML using Level 4 TEI-Lite, with 11,165 accompanying tables, graphs, and photographs embedded throughout as 300 dpi GIF images. The encoded text was checked for accuracy and consistency, parsed, and enhanced. The Fish Bulletin is delivered through the search and retrieval tool of the Online Archive of California (OAC 2.5), which uses the DLXS software platform. As a result of this new platform, the SGML files have been converted to XML (full TEI).
Fish Bulletin Fish Catch “Landing” Statistics
The Fish Bulletin also published an important collection of fish catch "landing" statistics for California. The statistics appeared under various titles: The commercial fish catch of California for the years..., The marine fish catch of California for the (years)..., California marine fish catch for (year), and California marine fish landings for (year). Covering the years 1916 through 1986, these statistics are not available online to the public, and they provide a rich source of information for those who study the utilization and management of California fisheries.

Over 3,000 pages of fish catch statistic tables were triple-blind rekeyed by SPI Technologies. Locally, each table was photocopied and magnified for legibility. A master codebook and spreadsheet samples were created to describe the more than forty table types. Each table was assigned a specific table type, which included the bulletin number, the table number, the page number, and data elements from the original source. The master codebook, spreadsheet samples, and photocopied tables were delivered to PDCC and then returned as MS Excel spreadsheets. Data from the spreadsheets were closely examined for accuracy and consistency, and a thesaurus of terms and a list of available data were then prepared. The spreadsheet data were converted into SAS data sets. Using tools the UCSD Libraries have developed for Web analysis of economic datasets, a user-friendly search and browse front end was built for the SAS data sets. Users can locate, display, and graph data of interest, or simply download the data for use with statistical or spreadsheet software.
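The value of independent rekeying can be sketched as a cell-by-cell comparison. The example below assumes that "triple blind" means three operators keyed each table independently and that the results were then reconciled. The majority-vote rule, the function name reconcile, and the sample values are illustrative assumptions, not the vendor's documented procedure or real catch figures.

```python
from collections import Counter


def reconcile(keyings):
    """Compare independently keyed versions of one table, cell by cell.

    keyings: list of tables, each a list of rows of strings.
    Returns (merged_table, disputed_cells). A cell with no strict majority
    is set to None in merged_table and its (row, col) index is recorded
    in disputed_cells for manual review against the source page.
    """
    merged, disputed = [], []
    for r, rows in enumerate(zip(*keyings)):
        merged_row = []
        for c, cells in enumerate(zip(*rows)):
            value, votes = Counter(cells).most_common(1)[0]
            if votes * 2 > len(cells):  # strict majority of keyers agree
                merged_row.append(value)
            else:
                merged_row.append(None)
                disputed.append((r, c))
        merged.append(merged_row)
    return merged, disputed


# Invented sample: three keyings of the same two-row table.
keying_a = [["Sardine", "1916", "1520"], ["Tuna", "1916", "8731"]]
keying_b = [["Sardine", "1916", "1520"], ["Tuna", "1916", "8781"]]
keying_c = [["Sardine", "1916", "1528"], ["Tuna", "1916", "8137"]]

merged, disputed = reconcile([keying_a, keying_b, keying_c])
print(merged)
print(disputed)
```

A single keying error is outvoted by the other two versions, while a cell on which all three keyers disagree is flagged rather than guessed, which suits dense numeric tables where OCR is unreliable.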

The metadata for these datasets is marked up in XML using the Data Documentation Initiative (DDI) Document Type Definition (DTD).
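To give a sense of what DDI markup looks like, the following minimal sketch builds a small codebook record with Python's standard library. The element names (codeBook, stdyDscr, citation, titlStmt, titl, dataDscr, var, labl) follow the DDI Codebook vocabulary; the title, variable names, and labels are hypothetical, and the record is far smaller than a complete DDI document would be.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, variables):
    """Build a minimal DDI-style codebook element tree.

    variables: list of (name, label) pairs describing dataset columns.
    """
    codebook = ET.Element("codeBook")
    # Study description: codeBook/stdyDscr/citation/titlStmt/titl
    stdy_dscr = ET.SubElement(codebook, "stdyDscr")
    citation = ET.SubElement(stdy_dscr, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title
    # Data description: one <var> per column, each with a <labl>.
    data_dscr = ET.SubElement(codebook, "dataDscr")
    for name, label in variables:
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
    return codebook


# Hypothetical title and variables, for illustration only.
codebook = build_codebook(
    "California marine fish landings (illustrative record)",
    [
        ("species", "Common name of species"),
        ("year", "Year of landing"),
        ("pounds", "Weight landed, in pounds"),
    ],
)
xml_text = ET.tostring(codebook, encoding="unicode")
print(xml_text)
```

Describing each variable in a structured record like this is what allows a front end to offer search and browse by data element rather than by file name alone.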
Oral Histories
The San Diego Historical Society owns dozens of oral histories that document the rise and decline of the California fishing industry. Over 50 oral history transcripts on topics of popular interest, such as tuna fishing, lobster harvesting, agar processing, and whaling, have been digitized.

Original typescripts of 52 oral history interviews, ranging in length from 3 to 68 leaves, were scanned in black and white at 200 dpi on an HP Scanjet 5370C. Automated OCR processing was performed with ScanSoft’s OmniPage Pro 11. The interviews were saved as MS Word 2000 documents and encoded in HTML for Web presentation; they were also saved as ASCII text files and burned to CD-ROM media.

CONTENTdm was used to create and store metadata records for the oral history interviews. Fourteen metadata elements were used and mapped to Dublin Core.
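A field mapping of this kind can be sketched as a simple lookup table. The source does not list the fourteen elements that were used, so the local field names below are hypothetical and only a handful are shown; the target names are elements of the fifteen-element Dublin Core Metadata Element Set. The function name to_dublin_core and the sample record are likewise invented for illustration.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical local field names mapped to Dublin Core elements.
# These are NOT the project's actual fourteen fields.
FIELD_TO_DC = {
    "Interview Title": "title",
    "Interviewee": "creator",
    "Interviewer": "contributor",
    "Interview Date": "date",
    "Topics": "subject",
    "Summary": "description",
    "Usage Rights": "rights",
}


def to_dublin_core(record):
    """Translate a local metadata record into Dublin Core element lists."""
    dc = {}
    for field, value in record.items():
        element = FIELD_TO_DC.get(field)
        if element is None:
            continue  # unmapped local fields are left out of the export
        # Dublin Core elements are repeatable, so values are kept in lists.
        dc.setdefault(element, []).append(value)
    return dc


# Invented sample record.
sample = {
    "Interview Title": "Sample interview on tuna fishing",
    "Interviewee": "Example Narrator",
    "Topics": "Tuna fishing",
    "Shelf Note": "internal use only",
}
dc_record = to_dublin_core(sample)
print(dc_record)
```

Mapping local fields to a shared vocabulary lets records created in one system be exchanged with or harvested by others without exposing fields that are only meaningful locally.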




© Copyright UCSD, All Rights Reserved. This site may not be reproduced.
UCSD Libraries, 9500 Gilman Drive #0175, La Jolla, CA 92093, 858-534-3336