LITA 2011 – recap

This year the focus of LITA was clearly on data and cloud-based IT solutions. I showed up early to give a preconference on cloud computing and really enjoyed working with the fourteen attendees. The session built on the preconference I did at Code4Lib in 2011, but this time I added more on monitoring and server configuration tools. The conference agenda and technical workbook are available on my publications and presentations page.

I also had the chance to present on the summer exploration project that I worked on with Susan Smith. We explored sources of data and applications to support digital humanities research, and we used our time to show off a few of the tools and data sources we found.

Data visualization and digital humanities research: a survey of available data sets and tools. LITA National Forum, St. Louis, MO, September 2011.

The strong data theme meant that I got to see institutions that are pursuing flavors of digital humanities or e-science initiatives. Robert Olendorf gave an exciting overview of the e-science/digital humanities program at the University of New Mexico (Managing the data flow: from the spring of ideas to the pool of knowledge, Robert Olendorf, Zoe Chao, Amy Jackson). New Mexico’s project was impressive in part because they have some case studies of how the practice of making data public is leading to new discoveries. I was also impressed with their approach to curating data using a recursive data model (XDFU).

Jason Battles, Thomas Wilson and Shawn Averkamp discussed the University of Alabama approach to a digital humanities initiative (Building a Habitat for the Digital Humanities: Adding Digital Project Support to the Library Services, Jason J. Battles, Thomas C. Wilson, Shawn Averkamp). They also discussed success stories and talked about the different skills and roles that are important in these collaborations.

Posted in 2011 LITA

Aditi Muralidharan, Wordseer, digital humanities

Last Tuesday I attended the first Maryland Institute for Technology in the Humanities Digital Dialogs talk of the fall semester. The talk, titled Large Scale Text Analysis in the Digital Humanities: Methods and Challenges, featured Aditi Muralidharan, who talked about her research and experience creating a linguistic research tool called WordSeer (http://bebop.berkeley.edu/wordseer).

Aditi focused first on WordSeer, which includes discovery, annotation and visualization tools, and second on some recommendations for others engaged in digital humanities collaborations. WordSeer is based on a database of slave narratives and features a semantic search that allows the researcher to explore the relationships between words (e.g. God described as good). This type of exploration of relationships builds on the sort of searching that can be done in other similar tools and hints at the possibility of creating semantically rich maps of full text content.

Aditi demonstrated some of the visualization tools built into the system, including a heat map that shows search terms in the context of the entire corpus and a word/sentence tree that shows words in relation to all of the sentences they occur in. During the question and answer time some interesting observations were made about how tools like this are enabling the development of new research agendas surrounding literary and historical texts, in addition to helping to sustain or question previously completed research or thought in humanities fields. I found this to be an intriguing idea, particularly when considered within the context of the sort of ‘resource discovery’ research that occurs in library settings, where the research systems that libraries support are based on a very different type of data.

As an example of the integration of this sort of discovery tool in traditional library discovery services, the integration of Google Books and HathiTrust search results in catalog searches is an interesting start. I wonder what impact full text analysis tools and semantic searching would have on these discovery systems. I was left wondering whether traditional library metadata would play a valuable role in these systems or whether syntactic and semantic analysis of the entire text of the library would render traditional access points like subject headings irrelevant. It seems possible, on the surface at least, that existing metadata would give the researcher more context for topical analysis and that administrative and technical metadata could provide other useful tools.

As the discussion revolved around how to develop computational research skills in humanities scholars, I wondered how library research skills might need to be updated to work with these systems. On one hand, a firm grasp of metadata structures might make interpreting the results of these systems easier, and the ability to carry search strategies across multiple systems might also add value to a research experience. On the other hand, I get the sense that these research tools are not generally familiar to librarians, so while librarians might be good at working with these systems and helping users find and use them, they are not generally aware of them.

Posted in MITH Digital Discussions

Creating your own proxy bookmarklet

Creating bookmarklets is nothing new, and neither is creating a bookmarklet to try to get access to resources through a proxy server. Having recently switched institutions, however, I found myself increasingly frustrated by the need to re-find resources using the University’s SFX server after having stumbled across an article on Google or Mendeley.

A quick web search did not yield an existing bookmarklet for the University of Maryland campus proxy, so here it is: ProxyUM.

Simply drag the link into your toolbar and whenever you want to see if UM has access to a page you are on, hit the bookmarklet to reload the page using the UM proxy.
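For readers who want to see what such a bookmarklet does, here is a minimal Python sketch of the idea. It assumes a prefix-style proxy (the EZproxy-style `login?url=` pattern) where the address of the current page is appended to a login prefix. The prefix `https://proxy.example.edu/login?url=` and the function names are hypothetical placeholders, not the actual UM proxy address or the ProxyUM source.

```python
# Hypothetical proxy login prefix; substitute your own institution's address.
PROXY_PREFIX = "https://proxy.example.edu/login?url="


def proxied_url(page_url: str, prefix: str = PROXY_PREFIX) -> str:
    """Return the address that asks the proxy to load page_url."""
    # Prefix-style proxies take the target address appended as-is.
    return prefix + page_url


def bookmarklet(prefix: str = PROXY_PREFIX) -> str:
    """Build a javascript: link that reloads the current page via the proxy."""
    # location.href is the page the browser is showing when the link is clicked.
    return "javascript:void(location.href='" + prefix + "'+location.href)"


if __name__ == "__main__":
    print(proxied_url("http://www.example.org/article/123"))
    print(bookmarklet())
```

The string returned by `bookmarklet()` is what would go in the `href` of the link you drag to your toolbar. Whether a given proxy accepts this URL pattern is something to check against your library's own documentation.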

If you are in my Organization of Information course this fall you might find yourself re-creating this bookmarklet during an exercise exploring the connection between document models and information service design!

Posted in Programming

Organization of Information class 1 feedback

After our first class in organization of information this fall I decided to look at some different ways of categorizing our class feedback.

First attempt was a tag cloud:
Wordle: lbsc_670_class_1_feedback

We can see that when all of the words in our feedback form are considered we get some common themes. The tag cloud was limited to the top 75 words, and common English words were removed (a, and, the, ...). While there are some good themes here, it is hard to pull anything useful from the cloud.

When we limit the cloud to just the text from the “what questions do you have” question and remove the most prevalent word (information), we get a tag cloud that is more representative of the open questions but still has lots of prevalent words that make getting to the core concepts difficult.
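The filtering behind these clouds (drop common words, optionally drop a dominant word such as “information”, keep the most frequent terms) can be approximated in a few lines of Python. This is only a sketch of the counting step, not what Wordle does internally; the stop-word list and the function name `top_words` are my own illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stop list; the list Wordle actually applies is longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is",
              "we", "do", "how", "what", "can"}


def top_words(responses, limit=75, extra_stop_words=()):
    """Count words across responses, skipping stop words, and return
    up to `limit` (word, count) pairs, most frequent first."""
    stop = STOP_WORDS | {w.lower() for w in extra_stop_words}
    counts = Counter()
    for text in responses:
        # Lowercase the text and pull out runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stop:
                counts[word] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    # Invented sample responses, not the actual class feedback.
    sample = ["What is metadata?",
              "How do we define information and metadata?"]
    print(top_words(sample, extra_stop_words=["information"]))
```

Passing `extra_stop_words=["information"]` mirrors the step of removing the most prevalent word before building the second cloud.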

When I reviewed the 30 entries under “what questions do you have” and pulled out 2 keywords per entry the central questions of the class started bubbling up.


Wordle: lbsc_670_class_1_feeback_tags

Questions such as “Can we define metadata?”, “How can we use Buckland’s definitions of information?” and “How will the online class work?” are better represented in this tag cloud.

Posted in Uncategorized

More on HathiTrust overlap

For those who are interested, here are some more data that focus on the location and date distribution of matched titles.

Distribution of publication dates

Distribution of location of matched books

location_name count(location_name)
Cambria Press 1
Director’s Office 1
Docs Librarian’s Office 1
ER Librarian’s Office 1
ITC Collection 1
Philosophy 1
Romance 1
TLC 1
Gale Virtual Ref Library 2
Honors 2
Mandelbaum Reading Room 2
Microtext-NCDocs 2
Nicaragua 2
PROBLEM 2
Art Slide Library 3
Cataloging 3
MOBWIL 3
Lion American Poetry 4
STR 4
Documents Index Table 5
Documents Ref. Desk 5
PDK 5
ACQUISITIONS 6
Mathematics & Computer 7
Theatre in video 9
SYS 10
Career 15
Ebooks from Netlibrary 15
Preservation 16
Reference Website 16
Chemistry 18
Media Collection 19
Classical Music Library 24
Reference Desk 25
NC Documents 27
Physics 33
Reference Office 34
Psychology 36
TECHSERV 41
PERSONAL ORDERS 58
Vienna 80
Documents CD-ROM 86
Browsing 108
University Archives 114
Physical Education 117
NA Wom Letters / Diaries 118
Music 150
CAGE 153
Microtext 192
Oversized 204
Reserve Book 300
Microtext-USDocs 365
Documents Website 367
VEN 383
LONDON 462
SPIL 462
Anthropology 553
RARE 627
Periodicals, Main Stacks 1027
Baptist 1314
Discarded 1383
Education 1437
Military Science 1663
Periodicals, Current 1767
Off-Site Shelving 2578
Ebook Library (Demand Dr) 2672
Reference 4106
Documents 11691
WEB 13898
Circulating main stacks 317687
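The column header above suggests a SQL `GROUP BY` count over the matched records. As a rough sketch of how a table like this could be produced, the Python below runs such a query against an in-memory SQLite database. The table name `matched_titles`, its columns, and the function name are hypothetical; the actual schema and database behind these numbers are not described in this post.

```python
import sqlite3


def location_counts(rows):
    """Count matched titles per location.

    rows is an iterable of (title, location_name) pairs. Returns
    (location_name, count) pairs sorted by count, then by name.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, used only for this illustration.
    conn.execute("CREATE TABLE matched_titles (title TEXT, location_name TEXT)")
    conn.executemany("INSERT INTO matched_titles VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT location_name, COUNT(location_name) "
        "FROM matched_titles "
        "GROUP BY location_name "
        "ORDER BY COUNT(location_name), location_name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Invented sample rows, not the real overlap data.
    sample = [("Book A", "Music"), ("Book B", "Music"), ("Book C", "Physics")]
    for name, count in location_counts(sample):
        print(name, count)
```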


Posted in technology