Google as shill for fee-based services

I was consoling a friend yesterday who is an expert in online and database searching. “Everyone wants to hear about Google,” he said, “my job is becoming all Google all the time.” I paraphrase, but we all know how it is. I’ve become increasingly leery of Google lately as they form more and more partnerships with fee-based publishers and vendors and also index their sites for Google’s master index. Can anyone explain to me why a Google search for jessamyn ineligible academy [backstory] nets me five results, one of which is a PDF with no accompanying “show as HTML” link and with flavortext from the article itself [or its abstract] that is not available via the linked site except through a subscription? I’m sure there’s an obvious explanation, like maybe the article was online for free and now it’s not, but why no HTML link, and where did that text come from if it’s not in the linked page? I sent Google a note and trolled their FAQ for details, but all I can determine is that, according to the current FAQ, Google isn’t supposed to do that. I’d love to hear some reasons why it does.

Note from a reader: apparently Google Scholar may crawl full text, and show the abstract in the results, even if it only allows access to a citation. Is it too much to ask that Google have a way to avoid these fee-based results, or mark them somehow? I know how to remove PDFs from my search results, but not how to remove all non-full-text sources. Even my library can do that. Then again, they’re not trying to make money off of their search results.

About Google Scholar crawling the full text from certain publisher sites — here’s what a Google spokesperson told us today: “…where we have permission to crawl a doc we will do so, but will only show an abstract.”

LoC Catalog Enrichment Initiative

The titles that libraries are removing to remote storage facilities often are the same ones that have the least rich library records, thus dooming them forever to being less and less frequently accessed. What to do? Enter the Library of Congress Catalog Enrichment Initiative.

users who rely on browsing the library shelf for the purposes of discovery and selection risk missing more and more material that might be of interest. Anecdotal and transaction log evidence has it that few use the browse feature of library online catalogs, not only because it is uninteresting visually but because the information users need in order to select what they want is not present. Until recently there has been no recourse except to the stacks.

hi – 20nov

Hi. The hardest thing about having a whole life and a whole blog is when you have to make choices between one and the other. I’m away this weekend at a wedding; I’ll be back for a few days and then I’m off to Australia with indeterminate access for two weeks. Of course, these notices were more important in pre-RSS days to keep you from clicking through to my page, getting annoyed that I never updated, and then never coming back. In any case, there is always more to say and I think heading into what we affectionately call the “big blue room” for a few days can’t hurt, can it?

google scholar, some more perspectives

Jeremy at Digital Librarian has a few more words about Google Scholar [or as some are calling it, schoogle] that sum up a lot of how I feel about it. [see also: metafilter and slashdot]

We need to stop being reactive, and start being proactive. Our vendors are not going to move us forward in the ways we need; they are reactive to our needs, not to our future. It is very easy to be passive as a community, and to let outside forces map our route. It is much harder to take control of the wheel and do the mapping ourselves. But until we do, the “Where do we want to go today?” will continue to be the rhetorical question that is only answered by the company (or vendor community) that asks it.

ranganathan’s laws, updated

My pal Fred from ibiblio said he met Lennart Björneborn this week. I checked out his site and he’s adapted Ranganathan’s five principles of library science to the web world. Even though they are copyrighted [?], I’ll include them here:

  • Links are for use – the very essence of hypertext
  • Every surfer his or her link – the rich diversity of links across topics and genres
  • Every link its surfer – ditto
  • Save the time of the surfer – visualizing web clusters and small-world shortcuts
  • The Web is a growing organism