January 26, 2004

Small World Networks

Mark Buchanan's "Small World: Uncovering Nature's Hidden Networks" (2002) explores the mathematics of large networks, or graphs. The book is interesting as much for the unexpected mathematics it explains about large real-world networks as for its possible application to large information architectures such as that of the University of Miami's Richter Library. Essentially, Buchanan's book unpacks Duncan Watts's and Steven Strogatz's pathbreaking 1998 and 2001 Nature articles on the mathematics of networks. In this respect the book presents some higher-end mathematics for a wider 'information architecture/web development' audience.

Watts and Strogatz's pathbreaking article was titled "Collective dynamics of 'small-world' networks" (Nature 393, 440-442 (1998)) and describes the unexpected behavior of 'small-world networks'. The unexpected main idea is that a small number of 'weak or random links' in a highly clustered or orderly network radically improves network efficiency and 'information propagation speed'. For example, in a social network where people are closely tied, a weak link from one person to someone outside that close circle radically increases the 'connection' possibilities between the different networks and creates a 'small world'. With regard to, say, large library information architectures or 'tree-like networks', the idea is that a couple of unexpected random links on a homepage act as a network bridge that radically shortens travel time and navigation, creating a 'small world'. Watts and Strogatz put it this way:

"Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power and synchronizability. . . . Their distinctive combination of high clustering (expected nodes) with short characteristic path length cannot be captured by traditional approximations such as those based on regular lattices or random graphs. . . . the alarming and less obvious point is how few 'short cuts' are needed to make the world small."

To unpack this, Strogatz deals with idealized "networks" that he terms neither completely ordered nor completely random.
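A rough way to see the effect for oneself is to build such an in-between network and count hops. The sketch below is only an illustration under stated assumptions: it starts from a fully ordered ring of 200 nodes and adds ten random long-range links (a simplification, since the 1998 paper rewires existing links rather than adding new ones), and the function names ring_lattice, add_shortcuts and average_path_length are my own, not anything from the paper or the book.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ordered network: n nodes in a ring, each linked to its k nearest
    neighbours on either side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, count, rng):
    """Copy the network and add `count` random links between nodes that
    are not yet connected (the 'weak' or random links)."""
    new_adj = {node: set(nbrs) for node, nbrs in adj.items()}
    nodes = list(new_adj)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in new_adj[a]:
            new_adj[a].add(b)
            new_adj[b].add(a)
            added += 1
    return new_adj


def average_path_length(adj):
    """Mean number of hops between pairs of nodes, found by running a
    breadth-first search from every node."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(42)  # fixed seed so the run is repeatable
ordered = ring_lattice(200, 2)
small_world = add_shortcuts(ordered, 10, rng)
ordered_length = average_path_length(ordered)
small_world_length = average_path_length(small_world)
print(f"ordered ring:      {ordered_length:.1f} hops on average")
print(f"with 10 shortcuts: {small_world_length:.1f} hops on average")
```

The ordered ring has 400 links, so the ten shortcuts are a small fraction of the wiring, yet the average number of hops drops noticeably. How large the drop is depends on where the random links happen to land.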
In a later article, "Exploring complex networks" (Nature 410, 268-276, 2001), Strogatz expands on the earlier ideas, exploring what he calls the complexity of 'network anatomy'. (Mathematical knot theory could also be important here but isn't mentioned.)

In the later paper, Strogatz delineates several ways in which networks are complex:

1. Structural complexity (intricate tangle)
2. Network evolution: the wiring diagram changes over time (take our library website and its internal, external or 'broken' links)
3. Connection diversity: the links between nodes have different weights, directions and signs (e.g. our homepage versus a low-level information page)
4. Dynamical complexity
5. Node diversity
6. Meta-complication

In 2001, though, Strogatz summarizes the 1998 results more clearly:

"Watts and Strogatz conjectured that 'short paths' and 'high clustering' hold for many natural and technological systems. Furthermore, they conjectured that dynamical systems coupled in this way would display enhanced signal propagation speed, synchronizability and computational power as compared with regular lattices of the same size. The intuition is that the short paths provide the high-speed communication channels between distant parts of the system, thereby facilitating any dynamical process (like synchronization or computation) that requires global coordination and information flow."

Fascinating ideas, but also dense, hence the necessity of Buchanan. Buchanan's book unpacks what these 'weak links' could be in technological, social and biological contexts, and also historically contextualizes the genesis of Watts and Strogatz's ideas. The wider historical contextualization is useful:


On social networks and information transfer: Ego will have a collection of close-knit friends, most of whom are in touch with each other - a densely knit clump. In addition, Ego will have a collection of acquaintances, few of whom know each other. Each acquaintance has his own close friends and is enmeshed in a closely knit clump, but one different from Ego's. The weak tie between Ego and his or her acquaintance therefore becomes not merely trivial but rather a crucial bridge between two densely knit clumps. Therefore, weak ties bind the larger, otherwise isolated community (46).

On Xerox's Internet Ecologies Division: "The sheer reach and structural complexity of the Web make it an ecology of knowledge with relationships, information 'food chains' and dynamic interactions that could soon become as rich as, if not richer than, many natural ecosystems." (These ideas are Gregory Bateson's; see the Bateson entry.)

On Paul Baran (RAND) and types of networks: In his early RAND papers, Paul Baran considered two different kinds of 'distributed networks': one looks like a fishing net, the other like what Baran termed "hierarchical decentralized". Baran pointed to the fishnet as being more survivable, yet the internet has naturally evolved into the 'hierarchical decentralized' form, with the 'weak links' providing the strong bonding.

There is a lot to reflect on from this book - the proof will be in actual application. Can we build more robust information architectures that improve 'information propagation speed' through 'weak links'? How exactly is this to be achieved? Buchanan and Strogatz conjecture that the biological models are in place. The question remains as to how to transfer these models to a practical 'web-centric' information architecture.

January 21, 2004

Inspiration

Tim Berners-Lee's "Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor" (1999) is a visionary text. Berners-Lee, the man behind WWW, HTTP, URL and URI, goes through the steps - how it happened, where we were and where we are going. To say this text is from the horse's mouth is an understatement. Berners-Lee is concerned not so much with peculiarities and coding languages as with conceptualization and explicating big-picture thinking - for this alone, the book is worth reading. With regard to the movement of library technologies, there is a lot to be learned - methodologies, larger ways of thinking.

This book is filled with amazing statements, and because of this it's best to let Berners-Lee speak:

On Web principles: The idea of universality was key: The basic revelation was that one information space could include all documents, giving huge power and consistency. . .The need to make all documents in some way equal was also essential. The system should not constrain the user; a person should be able to link with equal ease to any document wherever it happened to be stored (33).

On Information Systems Planning: One of the beautiful things about physics is its ongoing quest to find simple rules that describe the behavior of very small simple objects. Once found, these rules can often be scaled up to describe the behavior of monumental systems in the real world. . . . The art was to define a few basic, common rules of protocol that would allow one computer to talk to another, in such a way that when all computers everywhere did it, the system would thrive, not break down. For the Web, those elements were, in decreasing order of importance, universal resource identifiers (URIs, later locators), the Hypertext Transfer Protocol (HTTP) and the Hypertext Markup Language (HTML). . . . there was nothing else beyond these three elements (36).

The fundamental principle behind the web was that once someone somewhere made available a document, database, graphic, sound, video or screen at some stage in an interactive dialogue, it should be accessible by anyone with any type of computer in any country. And it should be possible to make a reference - a link - to that thing, so that others could find it. This was a philosophical change (37).

Librarians and the early web: When Paul returned to the Stanford Linear Accelerator Center (1991), he shared the web with Louise Addis, the librarian who oversaw the library at SLAC. She saw it as a godsend and a way to make SLAC's substantial internal catalogue of online documents available to physicists worldwide. Louise persuaded a colleague to write the appropriate program, and under her encouragement SLAC started the first Web server outside of CERN (46).

On the web as a collaborative medium: The Web, which I designed to be a medium for all sorts of information, from the very local to the very global, grew decidedly in the direction of the very global, and as a publication medium but less of a collaboration medium (57).

On Visionary Ted Nelson (Project Xanadu) and Copyright: For Ted, hypertext was the opposite of copyright. The whole idea of Xanadu was driven by his feeling that anybody should be able to publish information, and if someone wanted to use that information, the creator ought to be automatically recompensed (65).

The web is more a social creation than a technical one. I designed it for a social effect - to help people work together - and not as a technical toy. (123)

The new Web must allow me to learn by crossing boundaries. It has to help me reorganize the links in my own brain so I can understand those in another person's. It has to enable me to keep the frameworks I already have, and relate them to new ones. Meanwhile, we as people will have to get used to viewing as communication rather than argument the discussions and challenges that are a necessary part of this process. When we fail, we will have to figure out whether one framework or another is broken, or whether we just aren't smart enough yet to relate them. (207)

January 15, 2004

ASIS&T, SIG VIS - New Millennia Catalog

Recently, I was asked by Diane Neale, Chair of SIG VIS, University of North Texas, to design the website for the Special Interest Group on Visualization, Images and Sound (SIG VIS) of the American Society for Information Science and Technology (ASIS&T). The invitation was largely based on previous years' work at UM's Richter Library (the first ARL library with a site-wide Flash interface) and on the research and theoretical work in this weblog. The challenge was to spur thinking about prototype possibilities for new-millennium visual subject-heading catalogs. The result, now on the ASIS&T SIG VIS homepage, was a type of 3D typographic fly-through, or 3D cognitive map, of the information 'space' of information visualization. This entry goes into detail about this object/application, which adapts the source code of Flash visionary Jared Tarbell. What exactly is it, and why do I think this fly-through can give us a few clues and directions as to what it can mean to re-envision the larger dataset of a library catalog?

What will the next generation of online library catalog building entail, and what are its future directions? The purpose in writing and thinking about the potential of this visualization as the ground for a new-millennium visual catalog is twofold: on the one hand, it gets ideas down; on the other, it hopefully invites other ASIS&T academics, developers and library webmasters to take up the visual metaphor.

To begin theoretically, the larger and quite simple idea is that 'keywords' and 'subject headings' may be 'mapped' to a 3D space (i.e. an x,y,z Cartesian coordinate system) based on 'subject proximity' (see the previous Descartes entry).

[Image: 22194.gif]


As far as I know, this has not yet been accomplished for any online academic library catalog, or anywhere else to any larger extent. To give credit, there are people on parallel paths: notably Tim Bray, co-editor of the XML specification, from a library visualization 'semantic' standpoint, and, more experimentally and aesthetically, a group of young-gun visionaries of Macromedia ActionScript programming.

To contextualize this historically, grand systematizer Melvil Dewey's main nineteenth-century innovation in the DDC, or Dewey Decimal System (1876), was to map the universe of knowledge to a base-ten decimal system (i.e. 100, 200, 300 = various subject categories; 110, 120, 130 = various divisions of those categories).


Dewey Decimal System

000 Generalities
100 Philosophy & psychology
200 Religion
300 Social sciences
400 Language
500 Natural sciences & mathematics
600 Technology (Applied sciences)
700 The arts
800 Literature & rhetoric
900 Geography & history


The spectrum or universe of knowledge is 'mapped' to a well-known, humanly created 'decimal system' of categoric/taxonomic divisions. The next level of innovation or intervention into this history is to map the universe of knowledge not to a base-ten numbering system but to a visualized x,y,z Cartesian coordinate system. This allows visualization online and eventually lets 'networked' users find each other in graphic 'information space'. It also allows the inclusion of other media (film, sound, images, datasets) and interactivity (links) within this information space. The simple 'clickable' prototype above deals with a small subset of a single taxonomic category of this 'catalog': 'information visualization' (and related synonyms and parallel subject headings). It is hopefully not too much of a stretch to see entire libraries eventually mapped in this way. This would enable easier searching, 'pattern' recognition, knowledge creation and also, importantly, knowledge 'collaboration'.
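To make the move from decimal number to coordinate concrete, here is a toy sketch. It assumes the crudest possible mapping, in which the three digits of a Dewey class number become the three axes: the main class is x, the division is y and the section is z. The axis scaling, the function names and the five-entry catalog are all made up for illustration. A real catalog would need a far better measure of 'subject proximity' than digit arithmetic.

```python
import math

# Assumed weighting: a change of main class moves a heading much further
# than a change of division, which moves it further than a change of section.
AXIS_SCALE = (100.0, 10.0, 1.0)


def dewey_to_xyz(class_number):
    """Turn a Dewey class number such as '595' into an (x, y, z) point.
    Digits after the decimal point are ignored in this toy version."""
    whole = int(float(class_number))
    if not 0 <= whole <= 999:
        raise ValueError(f"not a Dewey class number: {class_number!r}")
    hundreds, tens, units = whole // 100, (whole // 10) % 10, whole % 10
    return (hundreds * AXIS_SCALE[0], tens * AXIS_SCALE[1], units * AXIS_SCALE[2])


def nearest_headings(heading, catalog, count=2):
    """Headings closest to `heading` by straight-line distance in the space."""
    origin = dewey_to_xyz(catalog[heading])
    others = [name for name in catalog if name != heading]
    others.sort(key=lambda name: math.dist(origin, dewey_to_xyz(catalog[name])))
    return others[:count]


# A hypothetical five-heading catalog using class numbers from this entry.
catalog = {
    "Natural sciences & mathematics": "500",
    "Zoological sciences": "590",
    "Other invertebrates": "595",
    "Technology (Applied sciences)": "600",
    "The arts": "700",
}

for name, number in catalog.items():
    print(f"{number}  {dewey_to_xyz(number)}  {name}")
print(nearest_headings("Zoological sciences", catalog))
```

With this mapping the neighbours of "Zoological sciences" come out as "Other invertebrates" first and "Natural sciences & mathematics" second, which is the kind of proximity a fly-through would want to show on screen.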

[Image: cartesian3d.gif]


The Path Ahead

In the ASIS&T prototype, the example deals with "words" or subject headings in 3D space. The patron or user flies through categories as a plane flies through a cloud landscape. To make a few specifications: these 'units' or words for a catalog may be envisioned as 'keywords', subject headings (i.e. LCSH) or 'author names' and, on another level, as books and discrete pieces of information (images, text, video, sound). Each visualized 'information unit' then represents a book and its descriptive details. Significantly, the book and its cover may be mapped to a specific, unique coordinate (x,y,z) in the space. When the user clicks or mouses over a specific item, the book and its details are highlighted or 'zoom out' into a larger 'image' context.
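For readers wondering how a word at (x,y,z) ends up on a flat screen during a fly-through, the usual trick is a perspective projection. The sketch below is a generic textbook version, not Tarbell's ActionScript, and the function name, the focal_length value and the sample point are assumptions chosen for illustration.

```python
def project(point, camera_z, focal_length=300.0):
    """Perspective-project a 3D point for a camera sitting on the z axis at
    `camera_z` and looking toward larger z. Returns (screen_x, screen_y,
    scale), or None when the point is level with or behind the camera."""
    x, y, z = point
    depth = z - camera_z
    if depth <= 0:
        return None  # the viewer has already flown past this item
    scale = focal_length / depth
    return (x * scale, y * scale, scale)


# A hypothetical subject heading placed at x=100, y=50, z=600.
heading_position = (100.0, 50.0, 600.0)

# Moving the camera forward makes the heading grow and drift outward.
for camera_z in (0.0, 300.0, 450.0, 600.0):
    print(camera_z, project(heading_position, camera_z))
```

The returned scale can drive the font size of the word, so headings swell as the user approaches them and vanish once passed, which is the cloud-landscape effect described above.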

Various Media Types: The 'unit' being navigated through here is the subject heading. It is also not difficult to envision this space as containing graphics of books (JPEGs) or, on another level, static images (for image archives), moving images, etc. An art history archive like Artstor now uses an online photographic 'contact sheet' metaphor. On another level, moving images (MPEG, DivX, Real) can also easily be mapped to this space (x,y,z) for moving image archives. Hopefully, libraries of the future will contain all of these media types in symbiosis in a larger 'universe' of knowledge or 3D 'information archive'. This space could follow, record or trace researchers' information-foraging paths, bring researchers around the globe 'informationally' together and aid our understanding of how the 'textual' component of 'discovery' or invention takes place.

Color and Sound: The next stage of this prototype should also hopefully make use of the semiotics and potentialities of color and sound. When a unit is highlighted, 'flown over' or 'clicked', it would change color. When a knowledge cloud or cluster of books shares a subject heading or taxonomic place in the information hierarchy, say 500 (Natural Science), it could all be coded as a single color cloud so that users know to navigate by color. These models are readily translatable from cartography and GIS. Sound could also be used here 'harmonically' and semiotically. Users could acoustically differentiate gradients from 'subject heading' to 'subject heading', find semantic 'synch' and harmony with various knowledge configurations, and see and hear as yet unseen and unheard polyphonies of knowledge 'coordinate' patterns for discovery.

Views and Zoom: Currently, this information space is built as a prototype using Tarbell's Flash ActionScript algorithms and a single microcosmic cloud, or microcosmic perspectival view. Taking Ben Shneiderman's view, and that of current video games, of navigating information space, we can think of the voyage of finding a book through the metaphor of a plane landing. From an entire 'overview' of the 'knowledgescape' or 'universe', one navigates and 'zooms' to the specifics. From the knowledge cloud skies, the entire 'universe of knowledge' is seen 'globally'. One then 'navigates' to one's particular continent of knowledge, say 'Natural Sciences'. From here one navigates further down the taxonomic chain to, say, 'Zoology'. One then lands in a specific 'cluster' of 'subject headings' or 'book items', say Butterflies, in that specific category. This may sound fanciful - try to imagine it with regard to the above model.

In Dewey, the above described methodology is mapped to an incremental decimal system. For example, beginning generally and moving more specifically:

500--Natural Science
590--Zoological Sciences
595--Other invertebrates
595.7--Insects
595.78--Lepidoptera
595.789--Butterflies

In the system described above, the user would make the same journey as a more intuitive and humanly comprehensible 3D landscape fly-through, from subject-area overview to specific item.
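The decimal chain above is regular enough that the stops along a 'landing' can be worked out from the final number alone. The following is a small sketch of that idea; zoom_path is a name invented here, and the labels are simply the ones listed in this entry.

```python
def zoom_path(dewey_number):
    """List the Dewey numbers passed on the way down from the main class
    to `dewey_number`, most general first."""
    whole, _, fraction = dewey_number.partition(".")
    whole = whole.zfill(3)
    # Main class, division, section: 500, 590, 595.
    path = [whole[0] + "00", whole[:2] + "0", whole]
    # Then one further decimal digit per step: 595.7, 595.78, 595.789.
    for end in range(1, len(fraction) + 1):
        path.append(f"{whole}.{fraction[:end]}")
    # Drop repeats, so that a main class such as 500 yields a single stop.
    unique = []
    for stop in path:
        if stop not in unique:
            unique.append(stop)
    return unique


# Labels taken from the list in this entry.
LABELS = {
    "500": "Natural Science",
    "590": "Zoological Sciences",
    "595": "Other invertebrates",
    "595.7": "Insects",
    "595.78": "Lepidoptera",
    "595.789": "Butterflies",
}

for depth, stop in enumerate(zoom_path("595.789")):
    print("  " * depth + f"{stop}  {LABELS.get(stop, '?')}")
```

Each stop could serve as a waypoint for the camera: the overview at 500, the continent at 590, and so on down to the butterflies.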

There are many possibilities for the above 'next generation' library catalog model. Perhaps the most practicable and applicable is a hybrid of the above model with the single "text box" model now in place, e.g. at the University of Miami Libraries. The challenge remains to take the best from both of these models - integrate, synthesize, pave the way for the future.

This entry welcomes constructive criticism with regard to possible directions and innovations. As the above prototype is not a 'working application' but a first stab at a 'new metaphor' and paradigm shift, it should be taken in this light. I have little interest in hearing why this won't work, what is wrong with the prototype or the value of the 'tried, true and well established'. I am historically well aware of the value of 'card cabinet catalogues'. Several deficiencies and 'weaknesses' of the present attempt at a theoretical synthesis of information mapping and visual online potential still need to be worked out (e.g. zooming velocity). On the other hand, this code does contain an idea in gestation. I am interested in hearing how this idea can be developed, nourished, remixed, remodeled, remade - to flower. A new generation of library interface will eventually be brought to fruition in which users around the globe who are working in the same 'solution space', and looking for answers through similar questions or information 'foraging', will meet. Their paths will be traceable and they will form 'knowledge hunting and gathering tribes'. I welcome serious thinking about these ideas to generate dialogue.


In these respects, I am grateful for the forward-looking leadership of the American Society for Information Science and Technology (ASIS&T). They took a chance in letting me put this project up in 2004 for the membership to begin to ponder in 2005. They also gave me free rein as to possibilities and dreams and, without question, hosted and highlighted this model in a prominent position on their site. My hope is that there are other interested parties, and that Tarbell's code can be built upon and put into synthesis with the ideas presented here. The collective intelligence of the net, old worthy hacker visionaries, new generation web renegades, overturning established complacencies - anarchically, revolutionarily storming an ever-changing 'reality studio'. What does it take to bring the energy of 'prototypes' and dreams to an 'established' or at least an ascendant reality? At the least, I know this project has already generated a little dialogue, a wink, a few smiles - lol!

January 14, 2004

Shhhhh!!! Semiotic Possibilities of Sound

Sound has traditionally been anathema in the library. This is changing in 2004 with the digital library and the multimedia semiotic possibilities of sound. Because the music library site was recently redone, exploring a variety of sound technologies, this entry looks a little more closely at the methodology, software, freeware (a lot of it excellent!) and semiotic possibilities and requirements of sound software for a digital library archive or, say, an online 'music library'. Library online terminals can now easily be outfitted with 'headphones'. The new millennium is ripe for this type of innovation.

Sound files, loops, ambient sound, noises etc. - where to get them? On the net there are a variety of sound search engines, one of the best being AltaVista, limiting by the top category "MP3/Audio". This produces a number of hits in various sound formats, 'MP3' and 'MIDI' currently being the most suitable (read: smallest file size), with many files being 'local' and free of copyright restrictions.

(Turn on your speakers now to hear a four-minute ambient background loop MP3 created with some of the described software and streaming in at a little less than 300k!)

In terms of programs, there is a variety of excellent freeware for manipulating sound files on a PC. On a Macintosh, the tried and true but very functional SoundEdit 16 still gives the higher-end programs a run for their money.

To begin more basically: on a standard PC, to get tracks from a CD into the computer they must be 'ripped'. They may be effectively ripped using the somewhat malevolent-sounding "CD Buzzsaw Ripper". Name notwithstanding, the application does an excellent job. Next, on a PC, "Audacity" (freeware) can very easily import the ripped tracks, edit them down to loops, add basic fade-in, fade-out and other effects, and export to the ubiquitous "MP3" format. For this, the "Razor Lame Codec" is needed as an add-on to the program, but again, this can easily be found through Google.
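For the curious, a fade is nothing more than multiplying the start and end of a clip by a rising or falling volume. The sketch below shows the simplest, linear version on a plain list of sample values. It is an illustration of the idea only, not how Audacity implements its effects, and apply_fades is a made-up name.

```python
def apply_fades(samples, fade_length):
    """Return a copy of `samples` with a linear fade-in over the first
    `fade_length` samples and a matching fade-out over the last ones."""
    n = len(samples)
    fade_length = min(fade_length, n // 2)  # keep the two fades from overlapping
    faded = list(samples)
    for i in range(fade_length):
        gain = i / fade_length  # rises from 0.0 toward 1.0
        faded[i] = samples[i] * gain
        faded[n - 1 - i] = samples[n - 1 - i] * gain
    return faded


# Eight samples of constant full volume, faded over four samples at each end.
loop = apply_fades([1.0] * 8, 4)
print(loop)  # [0.0, 0.25, 0.5, 0.75, 0.75, 0.5, 0.25, 0.0]
```

Because this loop starts and ends at silence, it can repeat in the background without an audible click at the join.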

Sound on the web comes in a variety of file formats (e.g. AIFF, WAV, MP3, MIDI, Real Audio etc.), and it is sometimes necessary, and for file-size considerations actually preferable, to convert between formats. Another excellent freeware utility available for this purpose is the "dBpowerAMP Music Converter". Essentially, this converter does on-the-fly translations from, say, "Real Audio" to "MP3". A separate 'encoder' or 'decoder' component must be downloaded and added to the program for each of the various file formats and translations, but again this is fairly standard and easily done.

With regard to players, there are a number of stand-alone audio players out there (e.g. Winamp), but there are also a number of multimedia players, the most popular being RealOne Player (Helix), QuickTime Player and Windows Media Player. Each of these has its pluses and minuses, which are beyond the scope of this discussion. To generalize, if one wishes to 'encode' for the widest possible audience, encode to "MP3", import into Flash and then publish as an SWF file. This makes for a small file size and wide browser penetration.

Sound, like the visual, can act as a 'signifier' or an 'audio' cue. The possibilities here remain untapped for libraries. Buttons, menus, interfaces etc. would do well to draw on 'musicological' tropes and begin bringing aural metaphors into our interface design. While this entry has largely refrained from delineating or speculating on these capabilities, they should be explored further in future entries, as these aspects are rich and largely untapped.

January 6, 2004

New Millennia Search Engines

The new-millennium search engines that are beginning to come out (2004) either increasingly take advantage of visual metaphors (Grokker) or alternatively take advantage of longer-term historical library and information science methodologies (Vivisimo) that have been largely overlooked for the web. What is interesting is that these new "tools" are not rebuilding the search engine or building new ones but building "on top of existing engines" to better organize massive lists of results.

Grokker can be downloaded for a 30-day free trial and essentially acts as an interface that sits on top of a search engine's API (e.g. Google, AltaVista etc.). Grokker uses all manner of information visualization tools to group search results into stratified categories. While very pretty and colorful, and even somewhat useful for cruder searches, Grokker is really still in its infancy and not quite there. While the animated zooming function is cool, its aid in searching is somewhat questionable. The writing, though, is on the wall for future applications. What is needed is the same thing that "Microsoft Windows" accomplished with Apple's original Xerox-inspired interface - commodification, simplification and marketing normalization for the larger information-seeking public.

Closer to being useful is Vivisimo's "Document Clustering Engine". Here documents are clustered into groupings according to the words they share, and a "nested tree" is created on the left side of the screen. The tree can be expanded by clicking into its various subtopics, so that a large amount of information is displayed in a small amount of space.
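The general idea of clustering by shared words can be shown in a few lines. What follows is a deliberately naive sketch and not Vivisimo's algorithm, which is not described here: it groups result titles under any non-trivial word that two or more of them share and prints a one-level tree. The stopword list, the function names and the sample titles are all invented for the example.

```python
from collections import defaultdict

# A tiny, assumed list of words too common to be useful as cluster labels.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def cluster_by_keyword(titles, min_size=2):
    """Group titles under each word that at least `min_size` of them share."""
    groups = defaultdict(list)
    for title in titles:
        words = {word.strip(".,:;!?").lower() for word in title.split()}
        for word in sorted(words - STOPWORDS):
            if word:
                groups[word].append(title)
    return {word: found for word, found in sorted(groups.items())
            if len(found) >= min_size}


def print_tree(clusters):
    """Print the clusters as a simple expandable-looking tree."""
    for word, found in clusters.items():
        print(f"+ {word} ({len(found)})")
        for title in found:
            print(f"    {title}")


# Made-up search results.
results = [
    "Butterflies of North America",
    "Butterflies and moths",
    "Moths of Europe",
    "History of Europe",
]
clusters = cluster_by_keyword(results)
print_tree(clusters)
```

A title can sit in more than one cluster ("Butterflies and moths" appears under both words). A production engine would also have to choose readable labels and nest the clusters more than one level deep.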


What both of these tools show are directions for next-generation search engines. These will be visual and will also take advantage of already explored "information science" classification methodologies, bringing them into a fruitful synthesis with current developments and tools already on the web.