AUSTRALIAN SCIENCE AND TECHNOLOGY HERITAGE CENTRE

Web Indexing Workshop - Introduction

ITC, The University of Melbourne, Tuesday 11 July 2000, 9.30am - 4.30pm

Introductions
Your Austehc workshoppers:
Class introductions - introduce yourself, say what you want to get out of this workshop, and describe your experience of the Web.
What is Web Indexing? | Search engines - Enemy or Ally? | Metadata | Subject Gateways
 
What is Web Indexing?
  • From the introduction to the Web INDexers MAILing List:

    "In the most general sense, Web indexing means providing access points for online material which is available through the use of World Wide Web browsing software. The key to this is the use of human intellectual input to analyse and categorise material, rather than reliance on computerised searching tools.

    Some of the issues falling under this heading would be:

    • Uploading of 'traditional' indexes (and the documents to which they refer) on to the Web to provide a wider audience with access to them.
    • 'Micro' indexing of a single Web page, in order to provide users with hyperlinked access points to the material on the page.
    • 'Midi' indexing of multiple pages, largely or wholly contained within a single Web site and falling under the responsibility of a single Webmaster.
    • 'Web-wide' indexing, providing users with centralised access to widely scattered material which falls under a single heading (e.g. every Web page dealing authoritatively with breast cancer).
    • 'Macro' schemes designed to simplify or unify access to large numbers of Web pages falling under many different headings (e.g. every Web page dealing authoritatively with any medical topic).
    • The addition of comments and annotations to provide users with some guidance before they link to selected sites and pages."

  • Does the Indexing Evaluation Checklist apply to web indexing? Are there other criteria?
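As a rough illustration of the 'micro' indexing described in the list above, the sketch below formats an alphabetical set of hyperlinked access points for a single page. The page name, anchor ids and index terms are hypothetical; the intellectual work lies in the indexer's choice of terms, and the code only lays them out.

```python
from html import escape

def build_micro_index(page_url, entries):
    """Return HTML for an alphabetical index of one page.

    entries maps an index term (chosen by a human indexer) to the id of
    the anchor on the page that the term should link to.
    """
    lines = ["<ul>"]
    for term in sorted(entries, key=str.lower):
        href = f"{page_url}#{entries[term]}"
        lines.append(f'  <li><a href="{escape(href)}">{escape(term)}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)

# Hypothetical terms and anchors for a page like this one.
entries = {
    "subject gateways": "gateways",
    "Dublin Core": "metadata",
    "search engines": "search",
}
index_html = build_micro_index("intro.html", entries)
print(index_html)
```

Several terms can point to the same anchor, which is one way an index offers more access points than a page's own headings.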
Search engines - Enemy or Ally?
  • Search Tools - a comprehensive guide to web site search tools. [ See: What is a Web Site Search Tool And Why Would I Want One? ]

  • Search Engine Watch - a site for information about search engines, from how they work to how you can use them better. Have a look at information for first timers.

  • Search Engine Showdown - the user's guide to Web searching.

  • Trade-offs - search engines are cheap, expandable and fast, but they are undiscerning and limited to the text being searched.

  • Optimal use - search engines can work well until a critical mass is reached; where that point lies depends on the nature of the content and the nature of the searching. Most of the big search engines have turned to some sort of human judgement. [ See: Google - ranks using link analysis and Northern Light - human categorisers ]

  • Can search engines aid the indexer? Do we need to make users smarter?

  • Local search engines - befriend your enemy. In our first case study we will look at how a local search engine and an index can work together.
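The points above about search engines being "undiscerning and limited to text being searched" can be made concrete with a toy full-text index. This is a minimal sketch, not how any particular search engine works; the page names and texts are hypothetical.

```python
import re
from collections import defaultdict

def build_inverted_index(pages):
    """Map each lower-cased word to the set of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages containing every word of the query (AND search)."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical page texts.
pages = {
    "wheat.html": "Rust diseases of wheat crops in Victoria",
    "cars.html": "Preventing rust on car bodies",
    "grain.html": "Fungal infection of cereals",
}
text_index = build_inverted_index(pages)

# The engine cannot tell the plant disease from corrosion,
# and it misses grain.html because that page never uses the word.
rust_hits = search(text_index, "rust")

# A human indexer can supply the discriminating access point.
human_index = {"rust (plant disease)": {"wheat.html", "grain.html"}}
```

Here the text search returns both the wheat and the car page for "rust", while the hand-built entry separates the two senses and adds the page the search missed. A local search engine and an index can therefore complement each other.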

Metadata
  • The metadata movement sprang up as the WWW search engines began returning thousands of pages in response to queries. Some search engines began to use metadata tags in their algorithms in order to provide better discrimination of results. Then creators started to add misleading metadata tags in order to get their sites noticed!

  • The Dublin Core metadata initiative began when a group of 'computer science, librarianship, online information services, abstracting and indexing, imaging and geospatial data, museum and archive control, and other' professionals met in Dublin, Ohio, in March 1995 'to address and advance the state of the art in the development and extension of methods, standards, and protocols to facilitate the description, organization, discovery, and access of network information resources.'
    [Ref: OCLC/NCSA Metadata Workshop: The Essential Elements of Network Object Description]

  • From the initial and subsequent meetings has come a set of 15 elements for resource description that make up the Dublin Core Metadata Element Set. Australia has been a key encourager of DC metadata through the library and recordkeeping communities. [ See: MetaMatters and Australian Government Locator Service ]

  • Some metadata examples:

  • Possible roles for indexers -
    • Helping in the creation of metadata - schemas and protocols for DC.Subject, DC.Description, DC.Coverage.
    • Harvesting metadata into indexes - adding value to auto-harvesting via organising and context enhancement.

  • In our second case study we will look at metadata indexing of a resource and how it may be incorporated in a subject gateway.

  • Issues
    • Lots of metadata is being created, but how much is being utilised?
    • Is the real power of metadata the ability to make combined searches/indexes across metadata elements, e.g. a subject keyword with a certain date coverage? [ See: Agrigate Advanced Search ]
    • Scalability - how to ensure efficient management and production as resources grow?
    • Complexity - trying to make a simple schema do too much.
    • Part of a solution - not the whole solution.
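The ideas of harvesting metadata into indexes and of combined searches across elements can be sketched as follows. This assumes Dublin Core elements are embedded in HTML as meta tags with names such as DC.Subject, and that DC.Coverage holds a year range written as 'YYYY-YYYY'; that coverage format is an assumption made for this example, not something Dublin Core requires. The sample pages and file names are hypothetical.

```python
from html.parser import HTMLParser

class DCMetaHarvester(HTMLParser):
    """Collect <meta name="DC.*" content="..."> tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("dc.") and content is not None:
            # Elements such as DC.Subject may repeat, so keep a list.
            # Keys are stored lower-cased, e.g. "dc.subject".
            self.record.setdefault(name, []).append(content)

def harvest(html_text):
    """Return a dict of Dublin Core element name -> list of values."""
    parser = DCMetaHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.record

def combined_search(records, subject, year):
    """Return URLs whose DC.Subject values include `subject` and whose
    DC.Coverage range (assumed 'YYYY-YYYY') includes `year`."""
    hits = []
    for url, record in records.items():
        subjects = [s.lower() for s in record.get("dc.subject", [])]
        if subject.lower() not in subjects:
            continue
        for coverage in record.get("dc.coverage", []):
            start, _, end = coverage.partition("-")
            if start.isdigit() and end.isdigit() and int(start) <= year <= int(end):
                hits.append(url)
                break
    return hits

# Hypothetical pages carrying embedded Dublin Core metadata.
PAGE_A = """<html><head>
<meta name="DC.Title" content="Wheat breeding in Victoria">
<meta name="DC.Subject" content="wheat">
<meta name="DC.Subject" content="plant breeding">
<meta name="DC.Coverage" content="1890-1920">
</head><body></body></html>"""

PAGE_B = """<html><head>
<meta name="DC.Title" content="Modern wheat varieties">
<meta name="DC.Subject" content="wheat">
<meta name="DC.Coverage" content="1980-2000">
</head><body></body></html>"""

records = {"a.html": harvest(PAGE_A), "b.html": harvest(PAGE_B)}
matches = combined_search(records, "wheat", 1900)
```

Both pages share the subject keyword, but only the first covers the year 1900, so the combined query narrows the result in a way a keyword search alone could not. The harvested records are only as good as the subject terms and coverage values supplied, which is where consistent schemas and protocols matter.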
Subject Gateways
  • Individual and collaborative initiatives to locate and index Web resources for an information domain. Subject gateways range from a simple categorised list of links through to distributed databases of content powered by expensive software.

  • Is every Web page a subject gateway?

  • Any subject gateway needs an information model, an appropriate database for collecting and storing information, selection criteria, classification protocols, a way of producing output for the Web, and navigational aids to that Web output.

  • Subject Gateway initiatives

  • In our second case study we will look at metadata indexing of a resource and how it may be incorporated in a subject gateway.
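The components of a subject gateway listed above can be sketched at the "simple categorised list of links" end of the range. This is an illustration under stated assumptions only: the record fields, the single selection rule, the category names and the example.org URLs are all hypothetical, and an in-memory list stands in for the database.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class Resource:
    """Information model: the fields recorded for each Web resource."""
    title: str
    url: str
    category: str        # assigned under the gateway's classification protocol
    authoritative: bool  # outcome of a human quality assessment

def select(resources):
    """Selection criteria: keep only resources judged authoritative."""
    return [r for r in resources if r.authoritative]

def render(resources):
    """Web output: a categorised list of links with a category menu on top."""
    by_category = {}
    for r in sorted(resources, key=lambda r: (r.category, r.title)):
        by_category.setdefault(r.category, []).append(r)
    # Navigational aid: a menu linking to each category heading.
    # Category names are used directly as ids, so keep them to single words.
    menu = " | ".join(
        f'<a href="#{escape(c)}">{escape(c)}</a>' for c in by_category
    )
    out = [f"<p>{menu}</p>"]
    for category, items in by_category.items():
        out.append(f'<h2 id="{escape(category)}">{escape(category)}</h2>')
        out.append("<ul>")
        for r in items:
            out.append(f'  <li><a href="{escape(r.url)}">{escape(r.title)}</a></li>')
        out.append("</ul>")
    return "\n".join(out)

# Stand-in for the gateway's database of collected resources.
resources = [
    Resource("Wheat rust guide", "http://example.org/rust", "Agriculture", True),
    Resource("Miracle cures", "http://example.org/cures", "Medicine", False),
    Resource("Breast cancer overview", "http://example.org/bc", "Medicine", True),
]
gateway_html = render(select(resources))
print(gateway_html)
```

Keeping the model, the selection step and the output step separate means the stored records can later be published in other forms without being re-entered.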



© The University of Melbourne 2000. Disclaimer and Copyright Information.
Created: July 2000
Last modified: 10 July 2000
Authorised by: Director, Austehc
Maintained by: Joanne Evans
Email: joanne@austehc.unimelb.edu.au