tag:blogger.com,1999:blog-66217171272493250942024-03-06T12:02:17.020-08:00Digital+Research=BlogThis blog discusses research issues involving digital libraries, digital archiving, and almost anything else that pertains to the management of digital information.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.comBlogger24125tag:blogger.com,1999:blog-6621717127249325094.post-40629529477856549252013-05-05T13:37:00.001-07:002013-05-05T13:37:25.807-07:00Finding a research question<h2>
Introduction</h2>
Students often ask me about finding a research question, since I require one for theses of all sorts, and this blog post is an attempt to provide an answer.<br />
<br />
Many students start with a topic that they would like to research. This is natural, but in some ways secondary to the process of scholarly writing. I recommend that students start with: 1) a method based on some well-established discipline, and 2) a source of data. Let me explain.<br />
<br />
<h2>
Method</h2>
A method is the tool you use for research. If you were a painter, you might choose a variety of scenes to paint, but the brush and oils would be an essential part of how you approached it. As scholars we build up skills using certain discipline-based methods. In effect, we learn how to paint with a particular set of intellectual tools. If we ignore the tools or never learn how to use them, the chances of painting a satisfying scene are significantly less.<br />
<br />
In graduate school I settled on a set of ethnographic tools, which I have used and reused over the decades. This does not mean that my approach has been unchanging, but it has had a consistency based on long practice. I am not a trained ethnographer in the German sense of having a degree in it or even having taken classes, but cultural anthropology was part of the atmosphere of my graduate school environment, and I kept reading about it and practicing it in my dissertation and beyond. Having a method means absorbing a way of thinking. This is essential to formulating a research question.<br />
<br />
<h2>
Data</h2>
<div>
Access to data is the equivalent of providing a painter with a scene to paint. If the painter has no one who will sit for a portrait, it becomes much harder to paint a portrait. People try sometimes with varying success, but without sufficient experience with real subjects, it is hard for a painter to create a portrait in the abstract. </div>
<div>
<br /></div>
<div>
Some students want to rely entirely on existing published results and to comment on them. This is more like copying a painting than creating a new one. It can be a reasonable approach if they can do a new analysis, but for a beginner merely to comment on other people's work without actually analysing the data anew risks superficial results. </div>
<div>
<br /></div>
<div>
Data are hard to get. Many desirable sources are closed to the public, and many public sources are overworked or unreliable. Data do not have to be perfect to be used in a scholarly study, but they do need to be available and the author does need to understand and be able to explain their imperfections.</div>
<div>
<br /></div>
<h2>
The question</h2>
<div>
Once the method and a source of data are clear, the student can then reasonably begin to formulate the research question. It needs a grounding in the scholarly discourse in the field to explain why that particular question is interesting. Many students want a completely new question so that they can do something original, but wise students often take a well-researched existing research question and approach it with new data or a new method. The advantage of an existing research question is that its importance is already clear. </div>
<div>
<br /></div>
<div>
The best research questions for a thesis are ones with a straightforward answer. I generally recommend a yes/no question, or one that has a quantitative answer, or one that is a choice among reasonable alternatives. These are not the only possible research questions, but questions involving complex issues about "why" or even "how" tend to be beyond the scope and experience of even the cleverest doctoral students. The virtue of a yes/no type question is that the student can make a clear choice. A thesis with a vague answer is not a contribution to knowledge, while even a very narrowly stated and highly qualified yes/no answer can be a reasonable step forward.</div>
<div>
<br /></div>
<div>
Choosing a research question is hard, but it is probably the most important step in writing a thesis. The topic matters only in so far as data are available and the research method can reasonably apply. Topics are temporary and can change with the seasons. Good research questions grow ultimately out of the intersection of scholarly methods and quality data. </div>
<div>
<br /></div>
Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com2tag:blogger.com,1999:blog-6621717127249325094.post-88818874624861236742012-12-22T10:13:00.001-08:002012-12-22T10:16:02.700-08:00Prize Selection<br />
<div>
I am involved with a number of paper prize awards and find myself wondering how effectively the selection process works. In the end the selection comes down to a small group of people with both personal and cultural preferences. In some fields the quality of the mathematics makes for a fairly even playing field, but information science today has little clarity about its core topics or methods, and a cultural diversity that makes consensus hard. Do we really know how well the process works? Several questions come to mind that a master's student could answer in a thesis.</div>
<div>
<br /></div>
<div>
Is there evidence that papers that get an award are more
influential over time than other papers? Influence might be measured via the number of citations. The population of prizes should be restricted to awards with a long enough history to allow for publication and public reaction. The ASIS&T / Proquest awards, for example, list 15 years of winners. The JCDL student paper award is only 8 years old, but the Vannevar Bush best paper award goes back to 1998. The iConference awards are newer and there is no single list of winners. Nonetheless this data is generally available. Citations could be counted in a number of databases, or tested via Google Scholar, which would then include open source citations.</div>
<div>
<br /></div>
<div>
A related question is whether authors who win awards
also get more citations on other papers, regardless of the success of the
winning paper, and whether the authors become notable figures in the field. I recognized four of the 15 winners of the ASIS&T award immediately, and they are certainly active in the field. </div>
<div>
<br /></div>
<div>
A number of other research questions revolve around factors that influence reviewers. I see a lot of reviewer comments in my work, and so many reviewers make errors in their comments on statistical analyses that I wonder whether a moderately complex statistical analysis actually hurts a paper's chances of winning prizes. A related issue is the use of popular buzzwords. There are years when certain topics generate intense interest that is not sustained over time. Buzzwords associated with these topics may give an impression of cutting-edge work and give these papers an edge. Finding measurable answers to both of these research questions would be harder than doing a simple citation analysis, but it would give useful information both to applicants and to prize committees.</div>
Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com4tag:blogger.com,1999:blog-6621717127249325094.post-30536255464285061182012-10-16T00:04:00.000-07:002012-10-16T00:04:12.050-07:00Do not track...According to a New York Times <a href="http://www.nytimes.com/2012/10/14/technology/do-not-track-movement-is-drawing-advertisers-fire.html?emc=eta1">article </a>by Natasha Singer (13 October 2012), 9 members of the US House of Representatives questioned the Federal Trade Commission's "<span style="background-color: white; font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px;">involvement with an international group called the </span><a href="http://www.w3.org/" style="background-color: white; color: #666699; font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px;" title="The group’s site.">World Wide Web Consortium, or W3C</a><span style="background-color: white; font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px;">, which is trying to work out global standards for the don’t-track-me features." What apparently has them distressed is that "do not track" may become the default in, for example, Microsoft's new version of its Internet Explorer browser.</span><br />
<span style="background-color: white; font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px;"><br /></span>
<span style="background-color: white; font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px;">Defaults are important, as Richard Thaler and Cass Sunstein explain in their <a href="http://nudge.blog/">Nudge.blog</a>. People tend to accept defaults rather than change them, and this is true for a wide variety of topics including pension plans, health care, and privacy. Those who control the choice of the default arguably determine what a majority will decide. This is not surprising, since we accept cultural defaults all the time in matters as basic as food and clothes. </span><br />
<br />
<span style="font-family: georgia, times new roman, times, serif;"><span style="font-size: 14.999999046325684px; line-height: 21.98611068725586px;">After years of working on operating systems and on network applications, I am perhaps less concerned about privacy than many of my colleagues for quite </span><span style="font-size: 14.999999046325684px; line-height: 21.97222137451172px;">contradictory reasons: first, because I realize that anyone with the right technical skills can break ordinary privacy protections in the Internet, and second, because I realize that it is a lot of work and mostly not worth the trouble. </span><span style="font-size: 14.999999046325684px; line-height: 21.98611068725586px;"> Nonetheless some regard for privacy seems basic to how free and democratic societies operate in the HTTP environment. Microsoft seems to have consumer interests more at heart than </span></span><span style="font-family: georgia, 'times new roman', times, serif; font-size: 14.999999046325684px; line-height: 21.98611068725586px;"> the nine US congressmen. </span>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-76926002220950421352012-10-07T06:45:00.000-07:002012-10-07T06:45:00.260-07:00Information Technology historySome while ago I began to put together an <a href="https://docs.google.com/document/d/1LSjV9LAM3L5Y2e69KOIJNgLNSx9lDADs3SrYi8TfwzM/edit">historical timeline</a> on digital library developments. The timeline began relatively informally, but lately I have started to add references to source materials. It is very much a work in progress, but I would be happy to have suggestions for more entries. Anyone may view the timeline, but only I can update it at the moment.<br />
<br />
Digital libraries, and in a broader sense the world of information technology, are relatively young, but they have become old enough that some attention to their history seems increasingly warranted. ASIS&T has, for example, a webpage devoted to the <a href="http://www.asis.org/historyis.html">history of information science and technology</a>. Professional historians are starting to take an interest as well, including colleagues at Humboldt-Universität zu Berlin.<br />
<br />
The social and legal issues are complex and interesting, and increasingly students need enough background in the history of technology to discuss topics like copyright or censorship or even the effect of technology on elections (such as the current US presidential election). We also have an imperfect understanding of the interaction between innovation and users, except that in some cases users quickly adopted new developments (HTML, for example) and in other cases innovations like the mouse sat fallow for years. Questions about the innovation/demand cycle play a key role in discussions about the industrial revolution. Whether the dynamics are similar or not I have too little evidence to judge.<br />
<br />
This blog has been quiet for some time, but I plan to use it more regularly to discuss issues about the history of information technology precisely because I hope for comments from readers.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com14tag:blogger.com,1999:blog-6621717127249325094.post-44956938819213377452012-10-04T00:49:00.003-07:002012-10-04T00:49:09.824-07:0030 Years of Information Technology<br />
<div class="MsoTitle">
<span style="font-family: Arial, Helvetica, sans-serif;">Library Hi Tech has been celebrating its thirtieth year with a number of special issues, and the latest issue looks back on the development of information technology for libraries. Below is the structured abstract for my editorial. I will include a link when the issue is available online.</span></div>
<div class="MsoNormal">
<br /></div>
<blockquote class="tr_bq">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong><span lang="EN-US">Purpose</span></strong><span lang="EN-US">: This
issue of Library Hi Tech offers a retrospective over the last thirty years of
information technology as used in libraries and other memory institutions,
particularly archives and museums. This editorial will add the editors’
reflections.</span></span></blockquote>
<blockquote class="tr_bq">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong><span lang="EN-US">Method</span></strong><span lang="EN-US">: The
method uses historical documentation and relies heavily on personal
recollection.</span></span> </blockquote>
<blockquote class="tr_bq">
<strong style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US">Findings</span></strong><span lang="EN-US" style="font-family: Arial, Helvetica, sans-serif;">: </span><span lang="EN-US" style="font-family: Arial, Helvetica, sans-serif;">Thirty years ago
information technology in libraries largely had to do with ways in which
libraries could make their ordinary operations more efficient. Today the
information science frontier has broken out of the comfortable institutional
paradigm of the past and made libraries aware that they need to redefine
themselves in a world where their buildings no longer represent a storehouse of
knowledge unavailable elsewhere.</span></blockquote>
<blockquote class="tr_bq">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong><span lang="EN-US">Implications</span></strong><span lang="EN-US">: </span><span lang="EN-US">Information
technology advances have not made libraries obsolete, but they have made it
imperative that libraries redefine their role to be digital information
managers and service providers for their readers.</span></span></blockquote>
<div class="MsoNormal">
<br /></div>
Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com2tag:blogger.com,1999:blog-6621717127249325094.post-88343351033363394032012-07-11T23:27:00.000-07:002012-10-04T00:41:14.391-07:00ReCAPTCHA - a post by Estelle Shumann<br />
<div>
<span style="font-family: Arial;"><span style="white-space: pre-wrap;">Note: I have removed this post by Estelle Shumann after a number of negative comments and requests. The topic was interesting and it seemed harmless enough. Recently I received the following message:</span></span><br />
<span style="font-family: Arial;"><span style="white-space: pre-wrap;"><br /></span></span>
<br />
<blockquote class="tr_bq">
<i> You currently have a link on your site pointing to our
OnlineSchools.org website. We have recently received warning from
Google that they are suspicious of link trading schemes surrounding
this, and we want to make sure that you are taking the necessary
precautionary measures so that your site is not adversely affected.<br />
<br />We are requesting that you remove the link back to our site. </i></blockquote>
<span style="font-family: Arial;"><span style="white-space: pre-wrap;">I do not know that her post was part of this effort, but I am removing it as a precaution.</span></span><br />
<span style="font-family: Arial;"><span style="white-space: pre-wrap;"><br /></span></span>
</div>
<div>
<br /></div>
Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com4tag:blogger.com,1999:blog-6621717127249325094.post-6418134107105056652012-02-01T09:58:00.000-08:002012-02-01T09:58:51.081-08:00Writing on the iPadThis Blog entry is an experiment, as was my writing a full scale article (3500 words) on the iPad. <br />
<br />
The article was on measuring reliability in long term digital archiving. I based it on talks given in Tallinn, Estonia, and at a workshop here in Berlin, and I copied the text from the slides onto the iPad using Dropbox, though I could easily have mailed them to myself as well. Then I purchased the Apple Pages app and imported the text into Pages so that I had a ready-made outline. Actually I almost never write from an outline, so in some ways this was a bad idea, but not one for which the iPad bears any guilt. <br />
<br />
The Pages app is very easy to use once one recognizes where one has to tap to change styles, get fonts, or send backups in the form of email copies. The backups may not have been necessary, since iCloud is sharing copies among my various Apple devices, but they seemed like useful extra protection, since I could not be sure that iCloud would not instantly and automatically change every copy if I accidentally deleted a key portion of text. That is not a theoretical but a real issue. It is easy to tap the UNDO button a bit too often. Only later did I learn that I could REDO an UNDO by holding down the UNDO button longer. I found it was also surprisingly easy to highlight far too large a segment of text and then brush a key that deleted it accidentally. UNDO and REDO are really valuable options.<br />
<br />
The touchpad keyboard as such gave me no particular problems. I am a decent multiple-finger typist, but not especially fast. Nonetheless I do find that I often hit one of the keys in the bottom row rather than the space bar. The spelling checker and word-suggestion system is unexpectedly good, but the price for corrections in multiple languages is that I must constantly switch keyboards, since the spelling and keyboard choices are linked. With the English and German keyboards this is fairly simple, since only the Y and Z keys shift, but I am very accustomed to the German keyboard (all of my other devices have German keyboards) and sometimes my finger strays. The correction facility is fairly good at catching and fixing this. <br />
<br />
One advantage that I had hoped for with the iPad was that I could carry the machine with me easily and write in even quite short blocks of time, such as in the S-Bahn train (six minutes from my home station to the office). I found that this worked fairly well as long as I worked on the article so regularly that I had it mostly in mind and did not have to search back to find the thread of what I wrote. Generally I write in landscape mode because the keyboard is bigger and mistakes are therefore fewer, but portrait mode gives a far better sense of the virtual page. By the end of the article I tended to use portrait more often, especially when I was revising what I wrote.<br />
<br />
Only very recently have I returned to using Microsoft Word on my other computers, and I confess that it is really good. Mostly I don't want its features though. My articles require little fancy formatting or inserts. The Pages app is definitely no substitute for Word, but as a basic word processing tool on the iPad, I found it met my needs well.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com5tag:blogger.com,1999:blog-6621717127249325094.post-68264842726581348582011-11-01T00:40:00.000-07:002011-11-01T00:40:19.845-07:00eReadingSome years ago Elke Greifeneder and I offered a seminar in which students tested their reading experiences on a Sony eBook reader, a laptop, a desktop computer, printouts, and a bound book. The reading content consisted of German novels and the students measured the experience only by testing reading speed. The result was that there was no apparent difference between the eBook reader and other media. The students subsequently published the research (see Grzeschik, K. et al. (2010), "Reading in 2110 – reading behavior and reading devices: a case study", The Electronic Library, Vol. 29 Iss. 3, pp. 288-302, or online at <a href="http://www.emeraldinsight.com/journals.htm?articleid=1927542&show=abstract">Emerald</a>).<br />
<br />
Now a more extensive study at the University of Mainz has reached similar conclusions: "Almost all participants stated that reading from paper was more comfortable than from an e-ink reader despite the fact that the study actually showed that there was no difference in terms of reading performance between reading from paper and from an e-ink reader." The study also found that "the older participants exhibited faster reading times when using the tablet PC." (<a href="http://www.uni-mainz.de/eng/14685.php">Source</a>)<br />
<br />
The general assumption in Germany is that a strong cultural preference for print on paper is likely to persist. It may, of course, but if the US experience offers any indication, the resistance may give way to the convenience of having multiple works on a single device. On my iPhone right now I have a dissertation and 4 novels. The iPhone is a bit small for dissertation reading, but I never read scholarly works on paper any longer because I want to be able to search them and to look up references simultaneously. I also buy fewer and fewer novels in paper form because I do not want to have to carry one more object with me.<br />
<br />
A few years ago I thought that an interesting study would be to sit in the Berlin S-Bahn (elevated train) and count the number of people using eReader devices. Now such a study would be harder, because such a large number of people spend their transit time doing something on their smart phones, but whether that is reading, playing games, or sending email is hard to say. Or perhaps what they are doing is irrelevant. Whatever they are doing, it seems to involve reading on an electronic device.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-82733042472805298222011-10-27T13:51:00.000-07:002011-10-27T13:51:09.285-07:00CyberAnthropologyI spent the day at the "<a href="http://berlinsymposium.org/">1st Berlin Symposium on Internet and Society</a>". While the topics were in general very narrowly focused on policy and regulation, the session I attended this afternoon was on "<a href="http://berlinsymposium.org/session/cyberanthropology-being-human-internet">CyberAnthropology -- Being Human on the Internet</a>". Although I am not in a strict sense an anthropologist, I have explicitly used methods from cultural anthropology for close to 40 years and thus found the topic interesting.<br />
<br />
The session began with a critique of the presenters' paper in which the speaker noted that the presenters had done no original data collection of their own. In my own part of the academic world that criticism would probably be grounds for rejection from any serious scholarly symposium or conference.<br />
<br />
It further became clear that the presenters had almost no background in contemporary scholarly anthropology. Their approach was to throw out the whole empirical basis of contemporary anthropology as too narrow and to replace it with a cloud of philosophers starting with Plato and Aristotle and ending with Paul Ricoeur and Derrida. (Note: I was at the University of Chicago during the years when Ricoeur was there -- his concepts are not entirely new to me.) The presenters believe that "philosophical anthropology" and their understanding of hermeneutics eliminates the need for standards for evidence and rules for persuasion. At least that is what I heard them say over and over again. Nonetheless it is interesting that they welcome data from others. I wonder why?<br />
<br />
True to their law faculty roots, the presenters have already acquired the rights to <a href="http://www.cyberanthropology.de/CA/CyberAnthropology.html">http://www.cyberanthropology.de</a>. Others have already captured the name <a href="http://www.cyber-anthro.com/">http://www.cyber-anthro.com/</a>, so the brand is not exclusive.<br />
<br />
My objection to this symposium paper is partly that I regard it as an embarrassment to my own Philosophical Faculty 1, which houses the university's departments of philosophy and European Ethnography (cultural <a href="http://www.euroethno.hu-berlin.de/">anthropology</a>). It does not reflect our standards.<br />
<br />
<br />
<br />
<br />Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-45129176981958888582011-10-20T11:54:00.000-07:002011-10-20T11:54:37.273-07:00AnthroLib movesNancy Foster wrote yesterday that AnthroLib has moved to a new <a href="http://www.library.rochester.edu/anthrolib/">address </a>at the University of Rochester Library. A feature that I had not noticed before is the link to a bibliography in a Zotero Group. The bibliography is quite new (started apparently in September) and doesn't have much in it yet, but I suspect it will grow fast. Some links are to Proquest and seem to assume that everyone has the same level of Proquest access. From Berlin at least the links do not work. Nonetheless the list of articles is interesting.<br />
<br />
The map is a typical Google map with the odd quirk that it can start moving and be difficult to stop without reloading the location. The flaw may lie in how I touch the map screen with my cursor. It is mildly annoying. The map shows that most of the AnthroLib projects are US-based and generally east-coast, but perhaps we can get some started in Berlin.<br />
<br />
<br />Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-20535754371951081402011-10-18T05:03:00.000-07:002011-10-18T05:03:36.975-07:00O'Reilly Media Ebook reportI am thankful to Jim Campbell (University of Virginia) for sending me a link to the O'Reilly Media report on "<a href="http://www.buecher.at/rte/upload/news_11_2/ebook_report_2011_web1004c_1.pdf">The Global eBook Market: Current Conditions & Future Projections</a>" by Rüdiger Wischenbart with additional research by Sabine Kaldonek. The strong growth of eBooks and eBook popularity in the US and UK is not yet reflected in Germany, though acceptance of reading on the screen has grown since 2009. As the report says: "ebooks at this point have a difficult stand against a cultural tradition that places (printed) books and reading high on the scale for defining a person’s cultural identity."<br />
<br />
While I read a great deal on the screen and insist that I only read student papers in electronic form, I admit that I take pleasure in Berlin's excellent book stores with their intelligent selections and recommendations. As physical places, they are a delight. I wish they offered eBooks in house the way Barnes & Noble does, though. Then they would be perfect.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-24400024740620929192011-09-11T06:44:00.000-07:002011-09-11T06:44:57.490-07:00Archiving in the Networked World: Preserving Plagiarized Works (Abstract)<span class="Apple-style-span" style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;">This article will appear in<a href="http://www.emeraldinsight.com/journals.htm?issn=0737-8831" style="color: #2288bb; text-decoration: none;"> Library Hi Tech</a>, v29, no. 4, which should be available in November 2011 in preprint form. The abstract is below.</span><br />
<span class="Apple-style-span" style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"><br /></span><br />
<div class="MsoNormal">
<span><b>Purpose</b></span>:
Plagiarism has become a salient issue for universities and thus for university
libraries in recent years. This article discusses three interrelated aspects of
preserving plagiarized works: collection development issues, copyright
problems, and technological requirements. Too often these three are handled
separately even though in fact each has an influence on the other.<br />
<br />
<span><b>Methodology</b></span>: The article looks
first at the ingest process (called the Submission Information Package or SIP),
then at storage management in the archive (the AIP or Archival Information
Package), and finally at the retrieval process (the DIP or Dissemination
Information Package).<br />
<br />
<span><b>Findings</b></span>: The chief argument of
this article is that works of plagiarism and the evidence exposing them are
complex objects, technically, legally and culturally. Merely treating them like
any other work needing preservation runs the risk of encountering problems on
one of those three fronts.<br />
<br />
<span><b>Implications</b></span>: This is a problem,
since currently many public preservation strategies focus on ingesting large
amounts of self-contained content that resembles print on paper, rather than on
online works that need special handling. Archival systems also often
deliberately ignore the cultural issues that affect future usability.</div>
Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-51017444310124106252011-07-23T13:00:00.000-07:002011-07-23T13:00:57.811-07:00Microsoft Research Summit 2011 - day 3<b>Dinner Cruise</b><br />
The dinner cruise turned out not to be especially cold, since the ship had large indoor areas where we ate. Microsoft also provided an open bar at this and in fact at all the dinners. Often the wine at such functions is dubious, but even the wine was good and the quality of the selection of micro-brew beer was equally impressive.<br />
<br />
Of course the goal of the dinner was not food or drink or even the scenery along the lake, but the conversation among colleagues. I ate with fellow deans from Illinois, Michigan, and Carnegie Mellon, and even if no great research comes from our discussions, collegial discourse is an important social component in the efficient functioning of organizations and projects.<br />
<br />
<b>Day 3</b><br />
I should remember more of day 3 than I do, but jet lag had not quite lost its hold and the morning presentations, while good, left little permanent impression. The main event of day three was in any case the iSchool meeting with Lee Dirks and Alex Wade from Microsoft Research. Lee and Alex gave some sense of the projects they are working on. Lee especially has an interest in long term digital archiving that includes involvement in projects like <a href="http://www.planets-project.eu/">PLANETS</a>. While testing is an official component of PLANETS, I find that it puts less emphasis on testing than on planning and organization. Testing is, however, what is really needed and that is what I tried to suggest in the meeting -- not, I think, with great success.<br />
<br />
The other research aspect that I tried to sell, without much obvious resonance, was ethnographic research on what digital tools people really use and what they really want. Microsoft builds tools and we saw a lot of them that are oriented toward research, but I wonder how well some of them will do in the academic marketplace in the long run. Ethnographic research gives deeper insights into what people understand and misunderstand than do surveys.<br />
<br />
Just before I left we held the oral defense of the thesis for one of my very best MA students [1], whose thesis looked at how a group of literature professors at Humboldt-Universität zu Berlin (which actively supports Open Access) regard Open Access. It was striking how much they misunderstand Open Access and how little they know about it. Most of them would never have filled out a survey. This information would just have slipped away or remained as an anomaly. Microsoft could profit from research like this and had an interest in it in years past. It is less clear that it does today.<br />
<br />
[1] Name available on request, with the student's permission.Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-66403852213089347152011-07-19T16:48:00.000-07:002011-07-19T16:48:47.395-07:00Microsoft Research Summit 2011 - day 2<div class="western" style="margin-bottom: 0in;"><b>Cosmos: big data and big challenges.</b></div><div class="western" style="margin-bottom: 0in;">Pat Helland talked about massively parallel processing based on Dryad using a Petabyte store and made the point that these massive systems process information differently. Database processing in this environment involves tools like SCOPE, which is an SQL-like language that has been optimized for execution over Dryad. Saving this data long term is a problem because they are worried about bit rot. Cosmos keeps at least three copies of the data, checks them regularly, and replaces data that is damaged. Interesting how close this is to LOCKSS (which saves 7 copies). </div><div class="western" style="margin-bottom: 0in;"><br />
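The keep-several-copies, check, and replace approach can be illustrated with a short Python sketch. It is only an assumption-laden illustration, not the actual Cosmos or LOCKSS protocol: the function names, the use of SHA-256, and the simple majority vote are all hypothetical choices made for the example.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return a SHA-256 hex digest (the hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: list[bytes]) -> list[bytes]:
    """Replace replicas that disagree with the majority digest.

    Hypothetical helper: real systems poll remote peers and use
    more careful voting than this in-memory majority count.
    """
    digests = [digest(r) for r in replicas]
    winner, votes = Counter(digests).most_common(1)[0]
    if votes <= len(replicas) // 2:
        raise ValueError("no majority among replicas; cannot repair safely")
    good_copy = replicas[digests.index(winner)]
    return [r if d == winner else good_copy for r, d in zip(replicas, digests)]


# Three copies, one of which has suffered simulated bit rot.
copies = [b"research data", b"research data", b"research dat\x00"]
repaired = audit_and_repair(copies)
```

With only three copies a single damaged replica can be outvoted; more copies tolerate more simultaneous damage, which is one plausible reason to keep seven.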
</div><div class="western" style="margin-bottom: 0in;">In talking about how faster processing is not always the solution to processing problems, a speaker quoted Henry Ford as saying: “If I had asked my customers what they wanted, they would have said faster horses... “ (source: Eric Haseltine <i>Long Fuse and Big Bang</i>.)</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>NUI (Natural User Interfaces)</b></div><div class="western" style="border-bottom: 1px solid #000000; border-left: none; border-right: none; border-top: none; margin-bottom: 0in; padding-bottom: 0.03in; padding-left: 0in; padding-right: 0in; padding-top: 0in;">One of the speakers distinguished between making user interfaces imitate nature and making them feel natural. Non-verbal clues that convey meaning need to be part of the interaction. Another speaker said that we need better feedback systems. An example is a touch screen with a single button. If a user touches it and nothing happens, the user will hit it again and again. When designers changed the button to send out sparkles when touched, the repeated touching stopped, even though sparkles are not a natural result of touching a screen.<br />
<br />
In the discussion someone said that we aren't doing science if we can't go back to the data. This suggests a clear separation of data and processing that speakers about very large data sets said is no longer really possible, since the data is usable only with a degree of processing. The speaker was doing fundamentally social science research, which is more human-readable than Big Science data.<br />
<br />
Other comments of interest:<br />
<br />
<ul><li>Do we need to take the “good” into account in our interpretation of the “natural”? </li>
<li>It's not a machine that we are interfacing to any more, though the speaker is not sure what it is exactly. There is no machine, but a task.</li>
</ul><br />
Microsoft clearly has a strong interest in image management, particularly three-dimensional images such as are used in medical imaging (doubtless a good market) or gaming. They are dividing a picture into quadrants and creating mathematical representations of the edges in each square to create a hash to search for similar photos. <a href="http://photosynth.net/">Photosynth.net</a> was also demoed -- it allows the creation of three dimensional images from multiple photos.<br />
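The quadrant-and-edges idea can be sketched roughly as follows. This is not Microsoft's method, whose details were not given in the talk; it is a toy version that assumes a grayscale image stored as a list of rows with even dimensions, measures edge strength as summed differences between neighbouring pixels, and uses hypothetical function names throughout.

```python
def edge_strength(pixels, top, left, height, width):
    """Sum absolute differences between neighbouring pixels in one quadrant."""
    total = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            if x + 1 < left + width:
                total += abs(pixels[y][x + 1] - pixels[y][x])
            if y + 1 < top + height:
                total += abs(pixels[y + 1][x] - pixels[y][x])
    return total


def quadrant_edge_hash(pixels):
    """Return one bit per quadrant: 1 if it has above-average edge strength."""
    half_h, half_w = len(pixels) // 2, len(pixels[0]) // 2
    corners = [(0, 0), (0, half_w), (half_h, 0), (half_h, half_w)]
    strengths = [edge_strength(pixels, t, l, half_h, half_w) for t, l in corners]
    mean = sum(strengths) / len(strengths)
    return tuple(int(s > mean) for s in strengths)


def hamming(a, b):
    """Count differing bits; a small distance suggests similar images."""
    return sum(x != y for x, y in zip(a, b))


photo = [[0, 255, 0, 0], [0, 255, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
dimmer = [[0, 200, 0, 0], [0, 200, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
other = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 255], [0, 0, 0, 255]]
```

Here `photo` and `dimmer` hash to the same bits although their pixel values differ, while `other` does not. That is the property a similarity search needs and the opposite of what a cryptographic check-sum provides.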
<br />
<b>Evening Cruise</b><br />
Microsoft has planned a dinner cruise for the evening. It should be pleasant (I will comment tomorrow), but many of us wanted to go back to the hotel to leave computers, etc., and to change clothes because it is fairly chilly out (despite the heat wave in the rest of the US). </div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-90803380618229332532011-07-18T16:35:00.000-07:002011-07-18T16:35:43.062-07:00Microsoft Research Summit 2011 - day 1<div class="western" style="margin-bottom: 0in;">Microsoft Research invited me to the Summit as part of the iSchool deans group. This blog posting (and several that follow) has my notes and comments.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>Plenary</b></div><div class="western" style="margin-bottom: 0in;">Tony Hey opened the Summit with a talk about changes in scholarly communication, including ways of evaluating output. One of the reasons he left academia had to do with the ranking process at British universities. He emphasized that Microsoft is open (as Steve Jobs recently admitted). Microsoft is now working with the OuterCurve Foundation “to enable the exchange of code and understanding among software companies and open source communities”.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">One of the major goals of the conference – a goal that speakers emphasize repeatedly – is to network. This leads to a type of intellectual market in which researchers try to sell their ideas to others who are also trying to sell ideas. Theoretically everyone is a potential idea buyer too, but realistically mostly people want to sell to Microsoft to get research money. This makes Microsoft staff very popular.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">As is typical of conferences of computer-oriented people, the wireless network is periodically unable to keep up with the demand. Part of the problem comes from people viewing data-intensive websites related to the presentations (I tried too). Nonetheless it seems like a problem that a corporation like Microsoft should be able to overcome. Happily the problem went away once the plenary session ended. Too many people using too few access points.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>Breakout Session: Federal Worlds meet Future Worlds</b></div><div class="western" style="margin-bottom: 0in;">Howard Shrobe from DARPA talked about two models of survivability: the “Fortress” model (which is rigid and doesn't work against an enemy who is already inside) and the “Organism” model (which is adaptable). There is a balance in biology between fixed systems that address known threats, and adaptable systems that address new dangers. The underlying causes of problems in computers come from a few known sources, especially the difference between data and (executable) code. The speaker said that hardware immunity is relatively cheap to develop. Self-adaptive defensive architecture is an adaptive method for software that checks behaviors that compromise it and implements on-the-fly fixes. Instruction sets can be encrypted and randomized. Networking is a vulnerability amplifier, but if the cloud has an operating system that functions essentially as a public health system for the cloud, it may be possible to move the solutions out faster than the attack progresses. A quorum computation can check whether certain systems have been compromised. The result could involve reduced performance and randomization to confuse the attacker. Biosocial concepts are the underpinning of resilient clouds. </div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>Breakout Session: Reinventing Education</b></div><div class="western" style="margin-bottom: 0in;">Kurt Squire presented some of the educational gaming development that he is working on to get people to have richer experiences with topics like the environment (in particular, consequences for a county in Wisconsin) and medicine (in particular, identifying breast cancer). Seth Cooper presented a game called FoldIt, where the goal is to fold biochemicals. Problem-solving is fun and that is part of what makes the games interesting. Tracy Fullerton, a professional game designer, spoke about why traditional assessment takes the fun out of game design. She explained the “yes, and...” game (where you must preface each statement with “yes, and...” rather than “but” or “no”), which helps build collaboration. </div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>Closing plenary</b></div><div class="western" style="margin-bottom: 0in;">The closing plenary included a variety of speakers. One talked about how Microsoft has been trying to enhance the security of its code. Another spoke about a new app that allows people to write programs on their mobile phones. A developer spoke about echo cancellation for enhancing speech recognition (primarily for gaming). Conclusion: computing research has incredible diversity, and rarely is exclusively "basic" or "applied". </div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-86938933256328588082011-07-11T12:35:00.000-07:002011-07-11T12:35:16.990-07:00Computational Thinking<div style="margin-bottom: 0cm;">I first heard this concept during <span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;">David De Roure</span>'s talk at the Bloomsbury Conference (see Blog entry for 3 July 2011) and want to take this opportunity to define computational thinking for the sake of my students and to apply it to digital archiving (and related projects).<br />
</div><div style="margin-bottom: 0cm;"><br />
</div><div style="margin-bottom: 0cm;"><b>Definition</b></div><div style="margin-bottom: 0cm;">“Computational thinking” is (at least in my definition) processing information the way a computer processes it with the existing tools and systems. These tools and systems change over time and computational thinking has to shift over time as well. At present it implies some understanding of, for example, how to use regular expressions to identify specific text strings and how to search indexed information to find matches. Computational thinking tends to be literal (this string with this specification at this time) and tends to be unforgiving (there is no accidental recognition of what the author really meant). Computational thinking is what students ideally learn in their first computer programming class. The “born digital” generation has no advantage here and perhaps even a disadvantage, since they did not have to think computationally when they first interacted with computers. </div><div style="margin-bottom: 0cm;"><br />
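A small Python sketch may make the "literal and unforgiving" point concrete. The sample sentence and the pattern name ISO_DATE are invented for illustration; only Python's standard re module is assumed.

```python
import re

# Matches dates written exactly as four digits, two digits, two digits.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

text = "Posted 2011-07-11; revised on July 19, 2011 and again on 2011-7-23."

# A human reader sees three dates; the pattern recognizes only one.
found = [match.group(0) for match in ISO_DATE.finditer(text)]
print(found)
```

The pattern does exactly what it specifies and nothing more: "July 19, 2011" and "2011-7-23" are dates to any reader, but neither fits the specification, so neither is found.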
</div><div style="margin-bottom: 0cm;"><b>Computational Thinking and Archiving</b></div><div style="margin-bottom: 0cm;">In class I talked with students about the differences between the implicit definitions of integrity and authenticity in the analog and digital worlds. One of my favorite examples is a marginal note in a book. While we as librarians tend to discourage readers from marking in library books, we would not throw out a book as irreparably damaged because of a marginal note. A marginal note by a famous author can even add value (the Cornell CLASS project is an example). In the digital archiving world, however, we judge integrity by check-sums and hash-values. We do not look at the content, but at whether two or more check-sums agree with each other. Since a marginal comment changes the check-sums of (for example) a PDF file, we would replace that copy in a LOCKSS archive. </div><div style="margin-bottom: 0cm;"><br />
</div><div style="margin-bottom: 0cm;">If readers wanted to add a marginal comment to a file without changing its integrity (that is, its check-sum), then they could add the comment external to the file with a mechanism (search or index) to locate where it belongs in the original file. This is not necessarily trivial, but is certainly doable as long as content is not regarded as a single file, but as a set of interacting resources. Merely thinking about this choice is computational thinking.</div><div style="margin-bottom: 0cm;"><br />
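The choice described above can be sketched in Python under some loud assumptions: SHA-256 stands in for whatever check-sum an archive actually uses, the Annotation class and its fields are hypothetical, and a plain text search stands in for the locating mechanism.

```python
import hashlib
from dataclasses import dataclass


def checksum(content: bytes) -> str:
    """SHA-256 hex digest, standing in for an archive's check-sum."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Annotation:
    """A marginal note stored outside the file it comments on (hypothetical)."""
    target_checksum: str  # identifies the exact copy that was annotated
    anchor_text: str      # the passage the note is attached to
    note: str

    def locate(self, content: bytes) -> int:
        """Return the character offset of the anchor, or -1 if it is absent."""
        if checksum(content) != self.target_checksum:
            raise ValueError("annotation was made against a different copy")
        return content.decode("utf-8").find(self.anchor_text)


original = "We judge integrity by check-sums and hash-values.".encode("utf-8")

# Writing the note into the file changes the check-sum ...
inline = original + b" [reader's note: compare LOCKSS]"

# ... while an external note leaves the original bytes untouched.
external = Annotation(checksum(original), "check-sums", "compare LOCKSS")
offset = external.locate(original)
```

The inline copy would fail an integrity check and be replaced, taking the note with it. The external note survives because the file and the comment are treated as separate, interacting resources.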
</div><div style="margin-bottom: 0cm;"><b>Computational Thinking and digital cultural migration</b></div><div style="margin-bottom: 0cm;">Computational thinking is needed in order to recognize content that may not be readily comprehensible in future eras. These are words or phrases that will likely be obscure to future (human) readers, but the machine needs specific rules to follow. For example the city of New York is likely to remain familiar in 100 years, but Saigon (now Ho Chi Minh City) may well be hard for ordinary readers to recognize, unless the name changes back, in which case readers may need help with Ho Chi Minh City instead. </div><div style="margin-bottom: 0cm;"><br />
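What "specific rules" might look like can be sketched in Python. The rule table, the NameRule class, and the bracketed gloss format are all assumptions made for illustration; a real migration tool would need a dated and far larger set of rules.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRule:
    """One hypothetical migration rule: a name and what it later became."""
    former_name: str
    later_name: str


RULES = (NameRule("Saigon", "Ho Chi Minh City"),)


def gloss(text: str, rules: tuple[NameRule, ...] = RULES) -> str:
    """Append a bracketed gloss after each exact occurrence of a former name."""
    for rule in rules:
        pattern = re.compile(rf"\b{re.escape(rule.former_name)}\b")
        text = pattern.sub(
            lambda m, r=rule: f"{m.group(0)} [later {r.later_name}]", text
        )
    return text


print(gloss("She left Saigon in 1975 and settled in New York."))
```

New York passes through untouched because no rule mentions it, and a lower-case "saigon" would also pass through, since the machine follows the rule as written. If the name changed back, the table would need a rule in the other direction.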
</div><div style="margin-bottom: 0cm;">Of course computational thinking may change substantially when natural speech recognition improves to the point that computer-based comprehension is not fundamentally worse than human comprehension. Or it may require a different type of computational thinking. Any assumptions will likely err in some direction.</div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com3tag:blogger.com,1999:blog-6621717127249325094.post-40355828469889373412011-07-03T08:03:00.000-07:002011-07-03T08:03:13.425-07:00ICE Forum and Bloomsbury Conference<div class="western" style="margin-bottom: 0in;">This post reports on two interrelated and back-to-back meetings: the International Curation Education (ICE) Forum (sponsored by JISC) and the Bloomsbury Conference (sponsored by University College London or UCL). Both took place in the Roberts Building at UCL (which is, interestingly enough, next to where I often lived in London in the 1970s at the now-vanished Friends International Centre). </div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">Overlap among the attendees was only partial – I would estimate that about a third of the registered attendees took part in both. The University of North Carolina and Pratt had particularly strong representation, the former because of research projects, the latter because of a summer school for students. This post will not discuss all of the presentations, only a few points that seemed important to me. </div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>ICE Forum</b> </div><div class="western" style="margin-bottom: 0in;">My own talk at the beginning of the ICE Forum addressed the question of whether the world needs digital curators. My answer talked about the need for digital cultural migration to make content comprehensible over long periods of time. The first half explained what this meant and the second looked at how we can design software to help migrate content. When I have talked about this to library groups, the audience largely sees the need as obvious. Many archivists in this audience felt outraged. One argued that archivists ought to leave it to future generations to interpret content. Another listener felt that machine-based interpretation and migration was too mechanical and allowed too little scope for human sense-making – though she grew thoughtful when I suggested that writing code to interpret a file was not fundamentally different than other forms of writing about it. I will say more about digital cultural migration in a future post.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">Seamus Ross (Toronto) gave the closing talk at the ICE Forum, in which he quoted Doron Swade that “software is a cultural artifact”. His argument followed my own theme closely in saying that information needs to be annotated and reannotated to be useful for the future. He emphasized the need for case studies like those in law or business school, and he recommended accrediting not the schools but the graduates. We talked about whether effective accreditation was possible without legal requirements and agreed that it would help. Some in the audience disliked the idea of individual accreditation as creating an elite. This did not bother either Seamus or me.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"><b>Bloomsbury Conference</b> </div><div class="western" style="margin-bottom: 0in;"><a href="" name="internal-source-marker_0.32788289384916425"></a>Carol Tenopir (Tennessee) discussed a research project to test a hypothesis that scholars who use social media read less. (Turns out that that is not true.) Some of her statistics were especially interesting. Among scholars:</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;"></div><div class="western" style="margin-bottom: 0in;">Electronic sources</div><ul><li><div class="western" style="margin-bottom: 0in;">In 2011 88% of scholarly reading in the UK came from an electronic source (94% of those readings from a library). </div></li>
<li><div class="western" style="margin-bottom: 0in;">In 2005 54% of the scholarly reading in the US was from an electronic source.</div></li>
</ul><div class="western" style="margin-bottom: 0in;">Screen reading</div><ul><li><div class="western" style="margin-bottom: 0in;">In 2011 45% of the scholarly reading was done on the computer screen and 55% of scholars printed a copy. </div></li>
<li><div class="western" style="margin-bottom: 0in;">In 2005 19% of the scholarly reading was done on the computer screen.</div></li>
</ul><div class="western" style="margin-bottom: 0in;">While the studies were done in different locations (US & UK) at different times, the expectation is that the country makes no significant difference. A substantial decline of personal subscriptions combined with a substantial improvement in the quality of computer screens could be significant factors. Carol's article is <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021101">online</a> in PloS One.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">David De Roure (Oxford eResearch Centre) talked about Tony Hey's <a href="http://research.microsoft.com/en-us/collaboration/fourthparadigm/contents.aspx">book</a> on the “Fourth Paradigm”. Data-centric research is talked about as if it is new, but (David pointed out) the arts and humanities have done it for a long time. One of the challenges is to get people to think computationally. People also need to stop thinking in terms of “semantically enhanced publication” and to shift their thinking toward “shared digital research objects.” As an alternative to thinking in “paper-sized chunks”, Elsevier now offers an “executable paper grand challenge”. Perhaps Library Hi Tech should too.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">Carolyn Hank (McGill) gave another notable talk. Her dissertation research was on scholars who blog and the blogs themselves. She did purposeful sampling drawing from the academic blog portal. Of 644 blogs 188 fit her criteria and 153 completed the sample. 80% of the authors considered their blog to be a part of the scholarly record. 68% also said that their blog was subject to critical review. 76% believed that their blog led to invitations to present at a conference. 80% would like to have their blogs preserved for access and use for the “indefinite future”.</div><div class="western" style="margin-bottom: 0in;"><br />
</div><div class="western" style="margin-bottom: 0in;">The last presentation that I heard was by Claire Ross, a doctoral student in digital humanities at UCL. While talking about the effect of social media, she told how she tweeted about her interests when she arrived at UCL and almost immediately got a response from a person at the British Museum that led to a research project. She uses her <a href="http://claireyross.wordpress.com/">blog</a> to show her research activities and argued that Twitter enables a more participatory conference culture. I confess that blogging about conferences makes me listen more closely. Perhaps I should try twittering too. Among her (many) interests is the internet-of-things (especially museum objects), which fits well with the Excellence Cluster (Bild Wissen Gestaltung) that we are developing at Humboldt-Universität zu Berlin.</div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-86442132478676744192011-06-23T10:48:00.000-07:002011-06-23T10:48:28.771-07:00Archiving in the Networked World: Metrics for Testing (abstract)<div style="background-color: transparent; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">This article will appear in<a href="http://www.emeraldinsight.com/journals.htm?issn=0737-8831"> Library Hi Tech</a>, v29, no. 3, which should be available in August 2011 in preprint form. The abstract is below.<br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Purpose</span><span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">: This column looks at how long term digital archiving systems are tested and what benchmarks and other metrics are necessary for that testing to produce data that the community can use to make decisions.</span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Methodology</span><span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">: The article reviews recent literature about digital archiving systems involving public and semi-public tests. It then looks specifically at the rules and metrics needed for doing public or semi-public testing for three specific issues: 1) triggering migration; 2) ingest rates; and 3) storage capacity measurement.</span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Findings</span><span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">: Important literature on testing exists but common metrics do not, and too little data is available at this point to establish them reliably. Metrics are needed to judge the quality and timeliness of an archive’s migration services. Archives should offer benchmarks for the speed of ingest, but that will happen only once they come to agreement about starting and ending points. Storage capacity is another area where librarians are raising questions, but without proxy measures and agreement about data amounts, such testing cannot proceed.</span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Implications</span><span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">: Testing is necessary to develop useful metrics and benchmarks about performance. At present the archiving community has too little data on which to make decisions about long term digital archiving, and as long as that is the case, the decisions may well be flawed.</span></div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-23870274646622853642011-06-18T00:33:00.000-07:002011-06-18T00:33:53.821-07:00German Library Conference in Berlin<div style="background-color: transparent; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span id="internal-source-marker_0.9101943271234632" style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The German Library Conference (</span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><i>Bibliothekartag </i></span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">auf Deutsch) took place last week in Berlin at the <a href="http://www.estrel.com/rw_e11e/main.asp?WebID=estrel">Estrel </a>Conference Center in the (far) south 
east corner of Berlin. The theme of the conference was “Libraries for the Future; the Future for Libraries” and as the theme implies German libraries are aware that the information world is changing in ways that they cannot simply ignore. A friend describes the conference as a purely incestuous association meeting. The </span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><i>Bibliothekartag </i></span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">is certainly more like the American Library Association meeting than purely scholarly conferences like JCDL or <a href="http://www.tpdl2011.org/">TPDL </a>(formerly ECDL). Nonetheless such meetings are important both to measure the readiness of ordinary libraries to make changes and as an opportunity to educate the profession about topics that they approach with considerable reserve.</span><br />
<span class="Apple-style-span" style="font-size: 16px; white-space: pre-wrap;"><br />
</span><br />
<span class="Apple-style-span" style="font-size: 16px; white-space: pre-wrap;">I attended only a few sessions because of concurrent meetings at my University. One was by Lynn Connaway from OCLC Research. Lynn was one of relatively few speakers who spoke in English – which the German audience understood without any apparent problems. She spoke about a JISC funded project in which her task was to find common results among a number of user-studies. A point that she passed over quickly in the talk (but which we spoke about in greater detail privately) was the difficulty in finding exactly how some of the studies were done: how the subjects were chosen, how exactly the data were gathered, or how they were analyzed. Among the common conclusions that she reported were:</span><br />
<span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<ul><li style="background-color: transparent; color: black; font-family: Verdana; font-size: 10pt; font-style: normal; list-style-type: disc; text-decoration: none; vertical-align: baseline;"><div style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;"><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><b>Virtual Help</b></span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Users sometimes prefer online help even in the library because they do not want to get out of their chairs. </span></div></li>
<li style="background-color: transparent; color: black; font-family: Verdana; font-size: 10pt; font-style: normal; list-style-type: disc; text-decoration: none; vertical-align: baseline;"><div style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;"><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><b>Squirreling instead of reading</b></span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Many users squirrel away information and spend relatively little time working actively with contents.</span></div></li>
<li style="background-color: transparent; color: black; font-family: Verdana; font-size: 10pt; font-style: normal; list-style-type: disc; text-decoration: none; vertical-align: baseline;"><div style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;"><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><b>Libraries = books</b></span><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Many people think of libraries primarily as collections of physical books and often do not realize the library's role in providing electronic resources. They also criticize the physical library and its traditional collections.</span></div></li>
</ul><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span><br />
<div style="margin-bottom: 0pt; margin-top: 0pt; text-align: justify;"><span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I also attended a session that was entitled “Networked Libraries: Service providers for networked data.” Many of the talks discussed linked data or linked open data. Jakob Voss gave the initial lecture and used a visual metaphor of bridges to make the point both about the need for connections and their fragility (one of his slides showed a bridge that had collapsed). The final presentations in this session focused on digital archiving. The first looked at research data with the idea that “data is the new oil”. One major step forward is that DFG and NSF both now require data management plans for data from supported projects. A serious issue is the long term costs for archiving research data, which both nestor and MIT are beginning to examine. The second archiving talk was mine on the LuKII Project (LOCKSS und KOPAL: Infrastruktur und Interoperabilität). In my overview I mentioned the need to understand cultural as well as technical migration -- that is, our cultural understanding of information changes over time, just as do the formats. This evoked some interest during the discussion.</span><br />
<span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br />
</span><br />
<span style="background-color: transparent; color: black; font-family: 'Times New Roman'; font-size: 12pt; font-style: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br />
</span></div></div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-48051328411397900602011-06-01T12:15:00.000-07:002011-06-01T12:15:44.986-07:00AnthroLib mapNancy Foster, one of the editors of "<a href="http://docushare.lib.rochester.edu/docushare/dsweb/View/Collection-4436">Studying Students: the Undergraduate Research Project at the University of Rochester</a>", has published a <a href="http://maps.google.com/maps/ms?hl=en&ie=UTF8&oe=UTF8&msa=33&msid=206970219449377719813.00049c5d6169d6b252cf9&abauth=4de663dchsm0N8hdUJATBYQV7EG35zM1FCg">map</a> of anthropologically based library studies that not only shows their geographic distribution, but also reveals an impressive number of projects.<br />
<br />
In Berlin's "winter" semester (that begins in October 2011), I plan to offer a seminar with a colleague where some students work with her using psychological experiments to evaluate digital libraries, and I use ethnographic methods. Many of the projects in Nancy's map seem to address physical library spaces, but my interest is in digital space.<br />
<br />
In talking about cultural anthropology with students I often quote Clifford Geertz:<br />
<blockquote>The "essential task of theory building here is … not to generalize across cases but to generalize within them. ...The diagnostician doesn't predict measles; he decides that someone has them…" (Geertz, 1973, p. 26)</blockquote>In looking at a particular digital resource the goal should not be to generalize about all users, but to understand the culture and quirks of those who use (or do not use) that particular source. This is very much like the goal of those studying students at particular libraries, except that the users are not necessarily physically there -- which can be a problem.<br />
<br />
For the seminar we may well use some portion of the digital presence of our own university library. One possibility might be to repeat some experiments from Studying Students (for example, the one in which students redesign the library website), but to use culturally different sets of users (humanists and natural scientists? German students and Germanists in the US?) to understand different design preferences.<br />
<br />
If anyone has done ethnographic experiments in digital space, I would be interested in hearing about them.<br />
<br />
<b><span class="Apple-style-span" style="font-family: Verdana, sans-serif;">References</span></b><br />
<span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-size: xx-small;">Geertz, C. (1973) Thick Description: Toward an Interpretive Theory of Culture, In: The Interpretation of Cultures: Selected Essays. New York, Basic Books, pp.3-32.</span>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-18921006021120234522011-05-28T07:34:00.000-07:002011-05-29T01:53:55.166-07:00RosettaOn Friday I heard a presentation by Ex Libris staff about their <a href="http://www.exlibrisgroup.com/category/RosettaOverview">Rosetta</a> long term digital preservation system. Marketing presentations generally do not interest me, but the presenter was the project manager and could in fact answer questions about technical issues. <br />
<div><br />
</div><div><b>Bitstream Integrity</b></div><div><br />
</div><div>Basically this is not a problem that Rosetta addresses directly, but it also does not deny its importance. They have in fact talked with David Rosenthal about it. The system structure separates the bit-management from other layers and allows multiple solutions, including those that do active integrity checking. When Rosetta must manage the storage directly, it uses checksums and does periodic integrity testing against the stored copy. But if the copy's checksum does not match the stored checksum, then they can only ask for someone else to give them a new copy, which could be troublesome in 100 years or so. </div><div><br />
</div><div>We talked about whether LOCKSS might integrate with Rosetta at this level. The general answer seemed to be yes, or at least that it might be worth a try.</div><div><br />
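The periodic integrity testing described under Bitstream Integrity can be illustrated with a minimal sketch. It is not Rosetta's implementation, which the presentation did not describe in detail: the SHA-256 algorithm, the manifest layout (file name mapped to recorded checksum), and the names sha256_of and audit are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> list[str]:
    """Compare each stored file with its recorded checksum (hypothetical manifest format).

    Returns the sorted names of files that are missing or whose current
    checksum no longer matches the value recorded at ingest.
    """
    damaged = []
    for relative_name, recorded in sorted(manifest.items()):
        candidate = root / relative_name
        if not candidate.is_file() or sha256_of(candidate) != recorded:
            damaged.append(relative_name)
    return damaged
```

A check of this kind can only report damage. Repair still needs an intact copy from somewhere else, which is exactly the limitation noted above and the gap that replication-based systems such as LOCKSS aim to fill.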
</div><div><b>Authenticity</b></div><div><br />
</div><div>Rosetta does maintain provenance information, but has no way to link back to check against the original to make sure that the authenticity remains synchronized. This problem is not unique to Rosetta. The digital preservation field really needs to develop reasonable criteria for authenticity testing.</div><div><br />
</div><div><b>Access</b></div><div><br />
</div><div>Here Rosetta seems to do a good job in making various access copies and controlling the access rules. </div><div><br />
</div><div><b>Risk Manager</b></div><div><br />
</div><div>This feature appears to function something like the migration manager in koLibRI. It uses a database to keep track of technical metadata about formats and versions. Rosetta has a knowledge base that allows institutions "to share their formats, related risks and applications". Rosetta has a work-group to enhance the knowledge base as well. <span style="font-size: xx-small;">[Thanks to Ido Peled for this addition.]</span><br />
<br />
We talked about the risk problem generally with format change. It is not really 0 or 1, but more likely a scaled reduction of access to certain formats. Clearly there needs to be more thinking about when to trigger migration and what kind of migration (on-the-fly or preventative) makes sense. </div><div><br />
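The point that format risk is scaled rather than 0 or 1 can be made concrete with a small sketch. Everything in it is hypothetical: the idea of measuring risk as the share of users who can still render a format, the two thresholds, and the function name migration_decision are illustrative assumptions, not features of Rosetta or koLibRI.

```python
def migration_decision(renderable_share: float,
                       watch_below: float = 0.9,
                       migrate_below: float = 0.6) -> str:
    """Map a scaled access measure onto an action.

    renderable_share is the (hypothetical) fraction of users, between 0.0
    and 1.0, who can still open a given format with current software.
    The thresholds are placeholders an institution would have to set itself.
    """
    if not 0.0 <= renderable_share <= 1.0:
        raise ValueError("renderable_share must lie between 0.0 and 1.0")
    if renderable_share < migrate_below:
        return "migrate"   # access has eroded enough to trigger migration
    if renderable_share < watch_below:
        return "watch"     # access is declining; monitor the format
    return "keep"          # format is still widely renderable
```

Where exactly such thresholds should sit, and whether crossing one should lead to on-the-fly or preventative migration, is precisely the open question raised above.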
</div><div><b>Load Speeds</b></div><div><br />
</div><div>I was pleased to see that Rosetta has tested its performance loading different sizes of data and that the information is publicly <a href="https://docs.google.com/viewer?url=http%3A%2F%2Fwww.exlibrisgroup.com%2Ffiles%2FProducts%2FPreservation%2FRosettaScalingProofofConcept.pdf">accessible</a> (see figure 3). I have talked with a number of publishers recently that have concerns about the ability of archiving systems to load their contents in a timely manner and I think other systems should test their ingest times. </div><div><br />
</div><div><b>Conclusion</b></div><div><br />
</div><div>The session ended with our agreeing to talk more about potential collaboration in the research arena. </div><div><br />
</div><div><br />
</div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com1tag:blogger.com,1999:blog-6621717127249325094.post-71475799188088199152011-05-24T22:51:00.000-07:002011-05-27T09:16:57.512-07:00ANADP in Tallinn - day 3<b>Educational Alignment Panel</b><div><br /></div><div>At the EU level, there is an effort to get more involved with professional bodies. Internships play no significant role in UK digital preservation education, since the masters there tends to be a one-year degree. Knowledge Exchange is interested in these developments. The Library of Congress has collaborated with a number of US schools to establish internships, which benefit both the Library and the interns, as well as the professions that they later enter.</div><div><br /></div><div>One of the biggest challenges is how to identify the essential skills needed for digital preservation. What correlates to bookbinding in the digital world? It may be programming. The students need much more technical competence. While adding courses step by step may be insufficient, finding the time for curriculum reform is challenging. Addressing the funding dilemma is a key aspect and George Coulbourne (LC) suggests corporate partnerships to share costs and responsibilities. In the question and answer period, the question of a "new" profession vs mainstreaming the new skills in the old profession arose. We need to remain aware of the difference between education and mere training that focuses only on particular skills and belongs to ongoing professional development. </div><div><br /></div><div><b>Economic Alignment Panel</b></div><div><br /></div><div>Costs are a vital issue for any digital archive. Sharing tools and collaboration are ways to manage costs. Examples include LOCKSS and NDIIPP. In Italy MiBAC offers a legal deposit service for its small institution partners. We can also learn from failed initiatives. 
PADI was, for example, discontinued after 10 years (see <a href="http://crln.acrl.org/content/71/10/562.full">ACRL</a>), largely for economic reasons because the national library ended up having to do most of the funding. Neil Grindley used an analogy with the computer game Asteroids -- in his version funders like JISC fire money at big problems like digital preservation in the hope of breaking the problem (asteroid) up. But what angle should the funder take when it funds? Looking back at the JISC funding efforts, Neil wonders whether someone should write a "really good" history of digital funding. JISC has been doing some cost-modeling. Archival storage is consistently a small portion (15%) of overall project costs. Repairing problems costs significantly more than initial preservation. The UK is building a higher education cloud infrastructure. <a href="http://edina.ac.uk/projects/peprs/">PEPRS</a> (Piloting an e-journals preservation registry service) is trying to build similar infrastructure for preservation.</div><div><br /></div><div>In the Czech Republic a funding problem is that digital preservation is invisible and often ignored in favor of digitizing more documents. Electronic deposit began only in 2011 as a pilot project, but digitization began in the 1990s with historical manuscripts, and with endangered newspapers and monographs. The aim is to digitize 26 million pages by 2014. The budget is 12 million Euros. </div><div><br /></div><div>Digital preservation is the flipside of collection development. At Auburn University in Alabama they are using distributed digital preservation in a Private LOCKSS Network. 7 institutions have joined the Alabama PLN and it has been self-supporting since 2008. The fee-base varies from $300 to $4800 per year. Governance took longer to establish. The guiding principles: keep it simple, keep it cheap, don't build something new if you don't have to. 
Recommendation: stop chasing soft money and start making tough choices about local commitment.</div><div><br /></div><div><b>Breakout session: Education</b></div><div><br /></div><div>A former student from the Royal School in Copenhagen suggested that we consider the <a href="http://eacea.ec.europa.eu/erasmus_mundus/">Erasmus Mundus</a> program and put together a focused program for that funding source. The students would like more specific job expectations, but expectations vary widely. Employers look for the right mindset, not the right skill-set. Squeezing in internships is hard. From the employer perspective, an internship is like a year-long interview. Many of the schools have active hands-on programs that emphasize teamwork and practical problem-solving.</div><div><br /></div><div><b>Summary Presentations</b></div><div><br /></div><div>Benchmarking takes data from content providers and some are ready to make data available. We also need to communicate about benchmarking and other tests. </div><div><br /></div><div>Cliff Lynch offered an "opinionated synthesis." What does this term "alignment" mean? Making our limited economic and intellectual resources go further through collaboration is obviously beneficial. Another aspect of alignment is that a common case should speak more effectively to governments. </div><div><br /></div><div>In the tech discussion there were valuable conversations about benchmarking and testing. We need to be clear what we mean by interoperability, what we want to accomplish and what we want to get out of it. Two topics were missing: monoculture and hubris. We will have more confidence that we know what we need to do in 100 years and diversity in the system is a valuable antidote to the mistakes we make. We need to focus on the bit-storage layer and there will be a lot of money flowing in this area. Security and integrity are topics that were mentioned but need more focus. Imagine a wiki-leaks type leak of embargoed content. 
It would undermine the trust in cultural preservation institutions. </div><div><br /></div><div>Strategies inside the national level were not discussed as much as they should have been. The question of the replication of material among organizations also needs more discussion. It is interesting that we see standards in so many roles in digital preservation. The legal issues are becoming more and more dominant and we need to look for more opportunities to collaborate here. The one thing he would note on education is that the discussion needs to feed back into the national discussions. We did not talk much about scale in the discussion about economics. The risk management tradeoff for digitizing needs assessment.</div><div><br /></div><div>The elephant in the room is e-science and e-scholarship. There is a lot of money involved here and big investments. This is not a place where many national libraries have been involved, though universities often are. This is driving both technology and some educational efforts. Other smaller elephants are audiovisual material and the newborn digital contents. </div><div><br /></div><div>There are two additional axes that matter. One is making the case outside our community. The second is collecting policy. News has been a fundamental part of the public record and we know that it is fundamentally changing its character. Also software, personal records, social media. 
Cliff hopes that this is helpful in providing a frame for the conversation we have had in the last two days.</div><div><br /></div><div><br /></div><div><b>Links</b></div><div><br /></div><div>Another blog about the ANADP conference is Inge <span class="Apple-style-span">Angevaare</span><span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-size: 13px; font-weight: bold; ">'s</span> <a href="http://digitaalduurzaam.blogspot.com/">Long-term Access</a> blog (this portion of the blog is in English -- sometimes it is also in Dutch).</div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-84966832094608965002011-05-23T22:51:00.000-07:002011-05-24T13:02:10.395-07:00ANADP in Tallinn - day 2<p style="margin-bottom: 0cm; font-style: normal; widows: 2; orphans: 2" align="LEFT"> <span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;"><b>Keynote</b></span></span></span></p> <p style="margin-bottom: 0cm; widows: 2; orphans: 2" align="LEFT"><span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;"><span style="font-style: normal"><span style="font-weight: normal">The keynote speaker at the <a href="http://www.educopia.org/events/ANADP">ANADP</a> conference for the second day was Gunnar Sahlin from the National Library of Sweden. One of the National Library's explicit tasks is to support university libraries. Open access and e-publishing are key initiatives together with the other 4 Nordic countries. Linked open data is a more problematic topic because of resistance by publishers, but the National Library strongly supports Europeana's efforts in this area. There is a close cooperation with the public sector, especially Swedish radio and television. 
The Swedish Parliament is considering a new copyright law that may clarify some issues.</span></span></span></span></span></p> <p style="margin-bottom: 0cm; font-style: normal; widows: 2; orphans: 2" align="LEFT"> <span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;"><b>Standards panel</b></span></span></span></p> <p style="margin-bottom: 0cm; font-style: normal; font-weight: normal; widows: 2; orphans: 2;" align="LEFT"> <span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;">The standards panel began with the idea that we have both too many standards and too few. Standards can be seen as a sign of maturity in a field. Digital preservation has not only its own standards, but many from other areas -- a Chinese menu of choices. Information security standards are for preserving confidentiality, integrity, and the availability of information. Many memory institutions have to comply with these standards. The issue was especially important for Estonia because of internet attacks, especially denial of service attacks. In general information security is well integrated into plans in the Baltic countries, but long term digital preservation is not. Only 12% have an offsite disaster recovery plan.<br /></span></span></span></p><p style="margin-bottom: 0cm; font-style: normal; font-weight: normal; widows: 2; orphans: 2" align="LEFT"><span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;">The UK Data Archive has been an archive for social science and humanities data since 1967. "A standard is an agreed and repeatable way of doing something -- a specification of precise criteria designed to be used consistently and appropriately." In fact many standards are impractical, with unnecessary detail (8 [?] pages to explain options for gender in humans). 
Cal Lee spoke about 10 fundamental assertions, including that no particular level of preservation is canonically correct. Context is the set of symbolic and social relationships. With best practices and standards, trust is a key issue. PLANETS is concerned about quality standards and such standards begin with testing. Trust consists of audits, peer-reviewing, self-assessment, and certification. The process moves from awareness to evidence to learning. The biggest technology challenge comes from de facto standards from industry, and we have little control there. Good standards have metrics and measurement systems. Within our lifetime everything that we have as a preservation standard now will be superseded, but the principles will remain.</span></span></span></p> <p style="margin-bottom: 0cm; font-style: normal; widows: 2; orphans: 2" align="LEFT"> <span style="color:#000000;"><span style="font-family:Georgia, serif;"><span style="font-size:100%;"><b>Copyright panel</b></span></span></span></p> <p style="margin-bottom: 0cm;" align="JUSTIFY"><a name="internal-source-marker_0.09694895357824862"></a> <span style="color:#000000;"><span style="text-decoration: none;"><span style="font-family:Arial;"><span style="font-size: 11pt;font-size:85%;" ><span style="font-style: normal;"><span style="font-weight: normal;"><span style="background: none repeat scroll 0% 0% transparent;">Digital legal deposit is a key element, but not a form of alignment. In the UK, for example, legal deposit is still just for print material. In the Netherlands there is a voluntary agreement that works well. Territoriality is a problem – how to define the venue in which publishing takes place in the digital world, what is unlawful, what is protected, etc. The variance in legal deposit between countries leads to gaps. The rules for diligent search for orphan works are so complicated that they are too expensive to use. 
Even within the context of Europeana cross border access to orphan works is a problem. In US law contract law takes precedence over copyright law. Too many licenses could undermine the ability to preserve materials. <br /></span></span></span></span></span></span></span></p><p style="margin-bottom: 0cm;" align="JUSTIFY"><span style="color:#000000;"><span style="text-decoration: none;"><span style="font-family:Arial;"><span style="font-size: 11pt;font-size:85%;" ><span style="font-style: normal;"><span style="font-weight: normal;"><span style="background: none repeat scroll 0% 0% transparent;">To a question about Google a speaker said that Google's original defense of the scanning project was "fair use" (17 USC 107) and they had a good chance there. It changed to a class-action suit, which is more complicated. The breakout session on copyright went into further depth about what problems exist in dealing with copyright across national borders. Apparently a feature of Irish copyright law is that the copyright law takes precedence over private contracts. Generally contracts take priority.</span></span></span></span></span></span></span></p><p style="margin-bottom: 0cm;" align="JUSTIFY"><br /></p><span style="font-weight: bold;">Summary of the sessions</span><br /><br /><p style="margin-bottom: 0cm" align="JUSTIFY">Panel chairs gave a summary of their sessions and breakout sessions. 
For the technical group I spoke about the need for testing, trust (or distrust) and metrics and argued that we are really just beginning to address these issues.<br /></p> <p style="margin-bottom: 0cm" align="JUSTIFY"><br /></p>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0tag:blogger.com,1999:blog-6621717127249325094.post-27969082411019905672011-05-23T00:05:00.000-07:002011-05-25T06:56:41.532-07:00ANADP in Tallinn - day 1<div><b>Opening</b></div><div><br /></div>This blog is beginning with a very international conference called Aligning National Approaches to Digital Preservation (<a href="http://www.educopia.org/events/ANADP">ANADP</a>) that is taking place today in Tallinn, Estonia. <div><br /></div><div>The President of Estonia opened the conference. He emphasized how technologically and digitally aware both the country and its national library are. 10 million Estonian books were destroyed during the Soviet occupation as part of an effort to erase Estonian identity. Further destruction took place when Estonia became free and people wanted to cover up their past. Digitization allows the country to preserve materials and make them accessible. The president closed by saying: "Digitizing our national memory is a cornerstone of liberty."</div><div><br /></div><div>Laura Campbell (Library of Congress) gave the keynote address. NDIIPP (National Digital Information Infrastructure and Preservation Program) has the goal of preserving digital materials. Congress provided $100 million for this effort. LoC has worked on a distributed network to carry out this mission. The program model was to learn by doing. There was no clear pathway forward. She cited WARC development as one of the key technological components and argued that secrecy through proprietary systems does not lead to success in digital archiving. 
As an example she told the story of Goldcorp, a Canadian gold mining company that put its proprietary software online and offered a prize for the best recommendations on what to do. The company grew significantly as a result of the crowdsourced suggestions. She recommended planning broad goals for collaboration for digital preservation and expanding national digital collections into international ones. LoC has a strong outreach program with classroom teachers to push discussion out to younger people.</div><div><br /></div><div><b>Technical Alignment</b></div><div><br /></div><div>The technical alignment panel looked at two issues: infrastructure and testing. Presentations on infrastructure included kopal, nestor, LuKII, and the UK LOCKSS Alliance. The presentations about testing called for benchmarking, public tests, and metrics that librarians can use when making decisions, rather than just believing vendor claims. A vendor raised questions about this, but admitted that they were not willing to share their test data, except among customers. (Note: I was panel chair and could not make detailed notes during this session.)</div><div><br /></div><div>The panel on organizational alignment looked at long term commitment, the scale necessary to make the work cost-efficient, and effective interaction with vendors. While the EU funds many projects that promise to continue when the funding ends, most do not. TRAC fostered the Audit and Certification of Trustworthy Digital Repositories, which is now an ISO Standard. Social collaboration is a necessary element of infrastructure and the National Digital Stewardship Alliance is an attempt to address this. Distributed digital preservation is an idea as old as monastic copying. MetaArchive is a distributed digital preservation initiative that began with NDIIPP funding. 
MetaArchive now also has European members and has been experimenting with cross-deposit with IRODS.</div><div><br /></div><div><b>Later</b></div><div><br /></div><div>The day ended with a reception. </div><div><br /></div><div><br /></div>Michael Seadlehttp://www.blogger.com/profile/10797656360572968065noreply@blogger.com0