Issues in Science and Technology Librarianship
Winter 2004
DOI:10.5062/F47S7KQ9

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

[Refereed article]

Not Just Full-Text Articles: Comparing the Search Function Among Chemistry Electronic Journals' Web Sites

Song Yu
Chemistry Librarian
Columbia University
sy2133@columbia.edu

Abstract

Nearly every electronic journal's web site offers a "Search" function for its users in addition to supplying tables of contents and full-text access to articles. The purpose of this study was to analyze and compare the search features on five chemistry journal publishers' web sites to which Purdue University has full-text access. We compared the search interface, search capabilities, and the output and display. We found that although none of the "Search" functions on the five web sites is as powerful as those of commercial databases, they offer an efficient way for users to perform a citation or bibliographic search.

Introduction

A search on a publisher's web site differs from that of a database or general web search engine in three aspects: coverage of the searchable data, authority of the collection, and complexity of the search. A database is defined as "a usually large collection of data organized especially for rapid search and retrieval (as by a computer)" (Merriam-Webster's Collegiate Dictionary). New items are selected and entered into a database from a number of resources chosen by experts in that discipline. Much time and energy are spent on indexing each item to guarantee the rapid retrieval of the data. Ideally, the same search done in a database at different times will return the same set of results plus new items available after the previous search.

On the Internet, users cannot be assured of the validity and accuracy of many of the web pages created. The World Wide Web expands so rapidly that even the most powerful web search engine cannot claim that it indexes all the web pages, and a search on the same search engine at different times may retrieve quite different sets of results.

Publishers' web sites provide access to only select publications. Some publishers make the entire contents of their journals available while others offer less coverage, for example, a few years of articles. Their collections are relatively small compared to those of major databases or the Internet. Since publishers' primary concerns are to collect quality manuscripts and to provide access to articles for readers (Tenopir and King 2001), they are less interested in indexing the items effectively or in building powerful search engines to ensure fast and accurate data retrieval. Unlike users of a web search engine, however, users of a publisher's site can be sure that the retrieved items come from reliable rather than questionable sources.

Chemical Abstracts is the largest scientific database in chemistry and related fields (Ridley 2002). Scientists, researchers, librarians, and information specialists may wonder what the benefit is of searching on publishers' sites since they cover only a small amount of the chemistry literature as compared to Chemical Abstracts. In academic libraries, SciFinder Scholar is a popular search interface used to search Chemical Abstracts, Medline, and other chemical databases. Subscription to SciFinder Scholar is so expensive that many libraries can barely afford a license with one user at off-peak hours. Libraries that subscribe with multiple user licenses still see a large number of users turned away at peak times.

On the other hand, a search on a publisher's web site is free, quick, and always available. Often a user just wants to check, verify, or locate a citation from a certain journal, or find articles in several frequently read journals, rather than do a serious literature search. Some people do use the search function on the journal's web site. Mercer (2000) noted that there were 1,169 searches on the Journal of Biological Chemistry's web site from Washington University in December 1999. To many academic librarians and information specialists, these sites are useful supplements to the expensive, heavily used commercial databases.

The purpose of this study was to analyze and compare the search features on five chemistry journal publishers' web sites to which Purdue University has full-text access.

Methods

Purdue University Libraries subscribe to thousands of electronic journals. In the Mellon Library of Chemistry, there are more than 300 current subscriptions, two-thirds of which have electronic full-text access. Five publishers' web sites were selected for comparison. Two are from professional society publishers: American Chemical Society (ACS) Publications and Royal Society of Chemistry (RSC), and three are from commercial publishers: ScienceDirect from Elsevier Science, Springer LINK, and Wiley InterScience. The URLs of these publishers can be found in the Appendix.

The sites chosen represent the major publishers in Chemistry. Their journals are typically part of the chemistry collections in academic libraries. These journals are also representative of the peer reviewed journal titles that chemists read. We can focus on evaluating the performance of the "Search" function instead of worrying about the quality of the journal articles.

No criteria for evaluating a publisher's web "Search" function have been reported in the literature. The criteria for web search engine evaluation by Froehlich (2001) and for CD-ROM evaluation by Harry and Oppenheim (1993) were adapted for use in this study. Since no two publishers can publish the same article, there is no overlap of the contents between any two publishers' web sites and no need to evaluate the contents or the coverage of the sites.

Three major criteria were selected: the search interface, search capabilities, and the output and display. The sub-criteria under each category were modified to suit the characteristics of publishers' web site searching.

The criteria to evaluate the "Search" functions on publishers' web sites are listed as follows:

All the web sites were accessed and evaluated in June 2002.

Results and Discussions

The detailed results under each major criterion for the five publishers' web sites are shown in Tables 1-3 and discussed below.

Search Interface - Table 1

Generally, it is easy to locate the search pages from the selected web sites. The links are either on every journal's front page or on the navigation bar on the publisher's web site.

Two of the publishers, ACS Publications and RSC Publications, have one search interface while three others have separate basic/advanced search pages. The search interfaces for Springer LINK are unnecessarily complex. There are four different search interfaces in the Expert (advanced) Search: Bibliographic, Command Line, Full Text, and DOI (note 1). This complexity presents potential confusion to users in selecting the right search interface at the outset.

Basic bibliographic data, such as author, title, abstract, and key words, are searchable on all the sites. All the sites, except Wiley InterScience, clearly state that the user can search full-text. Some sites have added other searchable fields like affiliation, ISSN, or reference.

Apart from the Springer LINK site with its multiple search interfaces, the search interfaces are quite understandable. Most of the sites add a brief explanation to the search form, or provide examples for users to follow. There is some confusion about basic vs. advanced search. Some sites use the term basic search for interfaces with forms that help users choose the searchable fields and Boolean operators, while others call the same kind of interface advanced search. Likewise, search interfaces that ask users to build a command line using Boolean operators and nesting are called advanced search on some sites and basic search on others.

All of the five web sites have placed the help button (or help link) in a convenient or obvious place. A user can find the help from the navigation bar of the site, or hints and tips beside the search fields.

Search Capability - Table 2

Besides searching online and pre-print journal articles, ACS Publications and ScienceDirect allow searches of archived articles dating back to volume one, issue one (note 2). Being able to search old articles on the publisher's web site lets users retrieve more information from one search, although it does not mean that one can always get the old full-text articles online (that depends on each institution's subscription).

The search mechanism for all the sites is based on text searching. Chemical structure and reaction searching are not available since the publishers do not have proper indexing for structures and reactions in their publications. Chemical formula searching is possible using a simple molecular formula as a searchable text string, but it won't guarantee the best retrieval. Some sites suggest using chemical name searching instead.
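The limits of treating a molecular formula as a plain text string can be illustrated with a minimal Python sketch. The function name `formula_text_search` and the sample records are hypothetical and are not taken from any of the five sites; the sketch only assumes that the formula is matched as a literal substring of the record text.

```python
def formula_text_search(formula: str, records: list[str]) -> list[str]:
    """Return the records whose text contains the formula as a literal substring.

    Assumption for illustration: no chemical indexing is available,
    so the formula is handled like any other character string.
    """
    return [record for record in records if formula in record]


# Hypothetical record titles, invented for this example.
records = [
    "Oxidation of C2H6O over platinum",   # relevant hit
    "Synthesis of C2H6O2 derivatives",    # false hit: a different compound
    "Dehydration of CH3CH2OH",            # missed: same compound, other notation
    "Ethanol as a solvent",               # missed: compound named, not drawn
]

hits = formula_text_search("C2H6O", records)
```

In this sketch the query "C2H6O" retrieves the first two records only: one relevant article, one false hit for a different compound, and two relevant records are missed. This is consistent with the advice on some sites to search by chemical name instead.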

All the web sites can use AND, OR, and NOT to combine the search terms. ScienceDirect can even do a proximity search using "w/" (within) followed by the number of words between the two search terms. Except for ACS Publications, the web sites are not case sensitive when users input search terms. The commonly used "?", "*", or "!" symbols are also used for truncation or wildcard searching on most of the publishers' sites. Users need to consult each site's help file to find out which symbol it uses.
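As a rough illustration of how wildcard and proximity operators behave, the following Python sketch matches terms against the words of a text. It is not any publisher's implementation. The helper names (`wildcard_to_regex`, `within`) are hypothetical, and the sketch assumes that "*" stands for any run of characters, "?" for exactly one character, and that a proximity value n means at most n words between the two terms.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a search term with '*' or '?' wildcards into a regex.

    Assumed convention: '*' matches any run of word characters,
    '?' matches exactly one word character. Matching ignores case.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def within(text: str, term_a: str, term_b: str, n: int) -> bool:
    """Return True if term_a and term_b occur with at most n words between them."""
    words = re.findall(r"\w+", text.lower())
    pat_a = wildcard_to_regex(term_a)
    pat_b = wildcard_to_regex(term_b)
    pos_a = [i for i, w in enumerate(words) if pat_a.fullmatch(w)]
    pos_b = [i for i, w in enumerate(words) if pat_b.fullmatch(w)]
    # abs(i - j) - 1 is the number of words lying between the two positions.
    return any(abs(i - j) - 1 <= n for i in pos_a for j in pos_b if i != j)


text = "Gas chromatography was coupled with mass spectrometry"
near = within(text, "chromatograph*", "spectrom*", 5)      # four words in between
too_far = within(text, "chromatograph*", "spectrom*", 2)
```

Here `near` is True and `too_far` is False, because four words separate the two matched terms. Since each site chooses its own symbols, the same query string can behave differently from one publisher to the next.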

Since users often visit publishers' web sites looking for a known article, it is very helpful that all the web sites enable users to input the volume, issue numbers, and page numbers to locate articles. ACS Publications and Springer LINK also allow users to input a DOI, which is not useful unless the users know how to locate the DOI and are willing to type in a long string that consists of a combination of numbers and characters.

Another equally important feature is the ability to locate articles by author, a search that can cause a lot of confusion on some of the web sites. In late May of 2002, there was a discussion on the mailing list CHMINF-L (Baysinger 2002) about tips for searching the ACS journal archives. ACS doesn't distinguish the last name, first name/initial, and middle name/initial parts of an author name in the record. The search engine takes a query and searches for every component across the whole author field. As a result, users may get false results, since the last name, first name/initial, and middle name/initial can each appear in a different author's name. To make things more complicated, users need to worry about upper/lower case in names ending with "-ing", because the search engine cuts off the ending and does a word stem search if the word does not start with an upper case character. As a result, an author search for "pauling" may retrieve results for paul, Paul, Pauling, and Pauline ("Search Tips" 2002). This problem is not unique to ACS Publications. Other publishers' web sites have more or less the same problem.
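The two author-search pitfalls described above can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the ACS search engine: the function names (`naive_author_match`, `stem_ing`, `stem_search`) and the sample author names are hypothetical, and stemming is modelled simply as removing a trailing "ing" from terms that do not start with an upper case letter, followed by a prefix match.

```python
import re


def naive_author_match(query: str, author_field: str) -> bool:
    """Match if every component of the query occurs anywhere in the author field.

    Assumption: name parts are not tied to a single author, so each
    component may be found in a different author's name.
    """
    field_tokens = set(re.findall(r"[A-Za-z]+", author_field.lower()))
    query_tokens = re.findall(r"[A-Za-z]+", query.lower())
    return all(token in field_tokens for token in query_tokens)


def stem_ing(term: str) -> str:
    """Strip a trailing 'ing' unless the term starts with an upper case letter."""
    if term[:1].isupper() or not term.endswith("ing"):
        return term
    return term[:-3]


def stem_search(term: str, names: list[str]) -> list[str]:
    """Return the names that begin with the (possibly stemmed) term, ignoring case."""
    stem = stem_ing(term).lower()
    return [name for name in names if name.lower().startswith(stem)]


# Invented author field: no single author is "Smith, J. A.".
false_hit = naive_author_match("Smith J A", "Smith, R. B.; Jones, J. K.; Lee, A.")

names = ["paul", "Paul", "Pauling", "Pauline", "Smith"]
lower_case_hits = stem_search("pauling", names)
upper_case_hits = stem_search("Pauling", names)
```

In this sketch `false_hit` is True even though the three name parts come from three different authors. The lower case query "pauling" returns paul, Paul, Pauling, and Pauline, while "Pauling" returns only Pauling.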

Realizing the above difficulties, Springer LINK recommends that the user enter just the last name with a wildcard for an author search. The site has a separate window beside each search result listing all the authors of that article to help clean up the author search. However, the lack of authority control makes this otherwise helpful extra step difficult to use. For example, the author list displayed for the following article is P. Pierobon, L. De, R. Minei, and V. Di, which truncates the compound surnames De Petrocellis and Di Marzo:

"Arachidonic acid as an endogenous signal for the glutathione-induced feeding response in Hydra." P. Pierobon, L. De Petrocellis, R. Minei, V. Di Marzo. Cellular and Molecular Life Sciences. 53(1), 1997, 61-68.

Help files are often poorly developed; the Royal Society of Chemistry's help page, for example, is overly simplistic. It doesn't help the users to understand the search interface better, or to build up an improved search query. Other pages, like the Springer LINK help page, are so long that one almost gets lost while scrolling up and down the page. ScienceDirect has a well-designed help page with sufficient examples.

Display and Output Options - Table 3

Except for ACS Publications and RSC Publications, the sites return the search results with the search query appearing at the top of the "results display" page. Most of the sites offer a way to refine the search by providing a "Refine Search" window at the bottom of the results display page, though it is not as powerful as the original search interface: users cannot specify which field to search for the added term(s). Thus, the best way to fine-tune the search is to click on the "Back" button of the web browser and modify the search from the original search interface.

All the sites display the search results by showing the bibliographic information (i.e., title, author, journal title, volume and issue, pages) as well as links to the abstract and full text. One would expect search terms to be highlighted in the result list to give instant feedback on the search. Only two of the five sites, ScienceDirect and Springer LINK, have this feature. Even on these two sites, the search terms are highlighted not on the "results display" page but on the abstract page, which requires a user to do more clicking through the site. Without the help of highlighted search terms in the results, users have to depend entirely on the default ranking system of the site.

Most of the sites sort the results by relevance. ACS Publications has a brief explanation of how its ranking system works. It clearly explains why additions/corrections often appear with a high relevance from an ACS publication search. Wiley InterScience gives a vague explanation for the "Search Score" returned with the search results. Other sites don't mention anything about the ranking mechanism at all. For a simple search, such as searching "gas chromatography", it is quite possible that all the results are ranked 100% by relevance, which doesn't help the users decide which articles to choose.
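Because most sites do not document their ranking mechanism, the following Python sketch uses a purely hypothetical scoring rule to show how a simple query can produce ties at 100%. The function `coverage_score` and the sample titles are assumptions made for illustration: the score is the percentage of query terms that appear in a record.

```python
def coverage_score(query: str, text: str) -> int:
    """Score a record as the percentage of query terms it contains.

    Hypothetical rule for illustration only; it is not the documented
    ranking method of any of the five sites.
    """
    terms = query.lower().split()
    if not terms:
        return 0
    words = set(text.lower().split())
    matched = sum(1 for term in terms if term in words)
    return round(100 * matched / len(terms))


# Invented record titles.
titles = [
    "Fast gas chromatography of solvents",
    "Gas chromatography and mass spectrometry",
    "Liquid chromatography methods",
]

scores = [coverage_score("gas chromatography", title) for title in titles]
```

Under this rule the first two titles both score 100 and the third scores 50. Any record that contains every query term receives the maximum score, so a short query gives the user no basis for choosing among the top results.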

A search does not necessarily end with retrieving a large number of records. Users should also be able to manage the results. Besides ranking the results, the possibilities include choosing/marking the records, saving queries/records for further reference, and exporting records without the help of the "Print" and "Save" functions of the web browser. ScienceDirect and Wiley InterScience are the two sites that give users these options. They offer free registration to their sites. Once you log in with your user ID, you can save your search and retrieve it at a later time. ScienceDirect even goes one step further. As a registered user, you can turn on the search history so that you have a list of all your search queries, and you can combine, edit, or delete your searches and save the history to your account. You can also export search results from ScienceDirect to EndNote.

Conclusion

With the increasing popularity of electronic publishing (Tenopir and King 2000; Wilkinson 1998), the publishers need to pay more attention to building up a fast and reliable search engine on their sites. If well designed, a powerful search engine can bring to users not only highly relevant full-text articles, but also web enhanced information such as discussion forums, technical updates, 3-D images, and so on (Porteous 1997).

The five publishers' search sites all have advantages and disadvantages of their own in the three major criteria discussed above. For example, users can search the full text of the articles, which is often ignored by commercial databases (Huber 2000). Few offer options for exporting search results to citation management software, which most commercial databases are capable of.

The search feature provides a convenient way for users to locate information on the e-journal sites. It is not the top priority for publishers to build a search feature as elaborate as those found on commercial databases. For users who occasionally need to search an e-journal site, it is sufficient to have a search feature that is easy to locate, that has a simple and understandable search interface with a handful of searchable fields and powerful capabilities, and that allows one to locate highly relevant articles and other related information.

Appendix

Publishers' Web sites in this article:

Notes

1. DOI: Digital Object Identifier, a persistent and reliable identification of digital objects. For more information, check: http://www.doi.org/

2. ACS Publications has the ACS Archives (http://pubs.acs.org/archives/) for all its publications with the earliest one dating back to 1879. Elsevier ScienceDirect started its "Backfiles" program in early 2001. All of the Chemistry backfiles were available in early 2002 ({http://www-east.elsevier.com/sdinfo/backfiles/collections/chemistry/index.shtml}).

References

Baysinger, G. 2002. ACS Journal Archives - Search tips. May 21, 2002. In: CHMINF-L [CHMINF-L@indiana.edu]. Archived at: {https://listserv.indiana.edu/cgi-bin/wa-iub.exe?A2=ind0205&L=chminf-l&T=0&P=17095} [July 1, 2002].

Froehlich, T. J. 2001. Case Study 5.1: Developing search engine evaluation criteria. In: Library Evaluation: A Casebook and Can-do Guide, pp. 183-198. Englewood, Colo.: Libraries Unlimited.

Harry, V.; Oppenheim, C. 1993. Evaluations of electronic databases, part I: criteria for testing CDROM products. Online & CDROM Review 17(4): 211-222.

Huber, C. F. 2000. Electronic journal publishers: a reference librarian's guide. Issues in Science and Technology Librarianship Summer 2000. [Online]. Available: http://www.istl.org/00-summer/article2.html [September 23, 2002].

Mercer, L. S. 2000. Measuring the use and value of electronic journals and books. Issues in Science and Technology Librarianship Winter 2000. [Online]. Available: http://www.istl.org/00-winter/article1.html [July 2, 2002].

Merriam-Webster's Collegiate Dictionary. [Online]. Available: {http://www.merriam-webster.com/} [June 20, 2002].

Porteous, J. 1997. Plugging into electronic journals. Nature (London) 389: 137-138.

Ridley, D. 2002. Information retrieval: SciFinder and SciFinder Scholar. Chichester: Wiley.

Search Tips. 2002. American Chemical Society Publications. [Online]. Available: http://pubs.acs.org/archives/popups/searchtips.html [July 2, 2002].

Tenopir, C.; King, D. W. 2000. Towards Electronic Journals: Realities for scientists, librarians, and publishers. Washington, D.C.: Special Libraries Association.

Tenopir, C.; King, D. W. 2001. Lessons for the future of journals: Science journals can continue to thrive because they provide major benefits. Nature (London) 413: 672-674.

Wilkinson, S. L. 1998. Electronic publishing takes journals into a new realm: Publications slip off restrictions of print world and carve out a unique identity. Chemical & Engineering News May 18, 1998. 10-18.

Table 1: Search interface comparison among five publishers' web sites: ACS Publications, RSC Publications, ScienceDirect, Springer LINK, and Wiley InterScience

| Criterion | ACS Publications | RSC Publications | ScienceDirect | Springer LINK | Wiley InterScience |
| --- | --- | --- | --- | --- | --- |
| Where to locate the search page? | From journal's web page, or from the web site navigation bar. | From journal's web page | From the web site navigation bar. | From journal's web page, or from the web site navigation bar. | From the web site navigation bar. |
| Is there a Basic/Advanced search choice, or one search interface? | One search interface. | One search interface. | Basic/advanced search choice | Basic/several advanced search choice | Basic/advanced search choice |
| What are the searchable fields? | BBD (a), Full text | BBD (a), Addresses, Full text | BBD (a), Journal Title, References, ISSN, Affiliation, Full text | BBD (a), Affiliation, Full text | BBD (a), Affiliation, All fields (b) |
| Is it easy to find Help for searches? | "Search Tips" link at the bottom of the search interface. A new browser window will open. | "?" linked to help placed at the end of each search window. | From the web site navigation bar | A "Help" button beside each search window. | From the web site navigation bar |

(a) BBD: basic bibliographic data (author, abstract, article title, and keywords).
(b) It is unclear whether full text can be searched.

Table 2: Search capabilities of the search function among five publishers' web sites: ACS Publications, RSC Publications, ScienceDirect, Springer LINK, and Wiley InterScience

| Criterion | ACS Publications | RSC Publications | ScienceDirect | Springer LINK | Wiley InterScience |
| --- | --- | --- | --- | --- | --- |
| Time coverage | From Vol. 1, Iss. 1 to preprint articles | Online journals only | From Vol. 1, Iss. 1 to preprint articles | Online journals only | Online journals only |
| Publication date search | By month/year | By year | By year | By month/year | By month/year |
| Chemical formula search | No | No | Possible but not recommended | Possible but not recommended | No |
| Case sensitivity | Yes | No | No | Users can turn this feature on or off. | No |
| Boolean and/or positional operators | and, or, not | and, or | and, or, and not, w/ | and, or, but not | and, or |
| Truncation or wildcards | Truncate terms ending with -ing. | No | Singular/plural, !, * | *, ? | Truncation in Abstract searching |
| Other search capabilities | Journal titles, citation search, or by DOI (a) | Citation search | Journal titles, citation search | PACS classification (b), journal titles, language, or DOI. | Journal titles, citation search |
| Utility of the Help file | Brief with few examples | Brief with few examples | Well structured help with sufficient examples. | Help is very long with some examples. | Brief with some examples. |

(a) DOI: Digital Object Identifier.
(b) PACS: Physics and Astronomy Classification Scheme. For more information, check this URL: http://www.aip.org/pacs/

Table 3: Display and Output options for search results among five publishers' web sites: ACS Publications, RSC Publications, ScienceDirect, Springer LINK, and Wiley InterScience.

| Criterion | ACS Publications | RSC Publications | ScienceDirect | Springer LINK | Wiley InterScience |
| --- | --- | --- | --- | --- | --- |
| Search query displayed | No | No | Yes | Yes | Yes |
| Accessibility to search history | No | No | Yes (a) | No | No |
| Search terms highlighted | No | No | Yes, on the summary plus page. | Yes, on the abstract page | No |
| Sorting options | Relevance, Date, Journal | Relevance | Relevance, Date | Relevance | Relevance, Date |
| Capability to choose and save the results (b) | No | Check box available, but no further actions can be taken. | Yes (a). Results are saved in ASCII format. | No | Yes (a). Can choose, and view the results (only one record can be viewed every time). |
| Save searches for future retrieval (b) | No | No | Yes (a). Can also export results to EndNote® | No | Yes (a) |

(a) Need to be a registered member. The registration is free.
(b) The criterion is evaluated without using the "Print" and "Save" function of the web browser.
