Welcome to the ECS Blogging system
Posted by admin in Uncategorized on January 6, 2010
Welcome to the community blog pages for ECS-Electronics and Computer Science at the University of Southampton.
These blogs are written by academic and research staff and students in ECS on a mixture of topics, ranging from progress on undergraduate projects to reflections on current research issues and postgraduate progress notes.
ECS staff and students can register for a blog by filling out the Blog request form.
A collection of latest posts is below – the full list of blogs is on the right.
User testing: design of the test procedure
Posted by Steve Hitchcock in Uncategorized on January 27, 2012
We began this current series of posts with a promise to present the results of user tests of the new deposit tools developed in the DepositMO project. Via a convoluted route we are finally ready to present those results.
In this series of posts we have described the target of the user tests: the DepositMO tools for Word (Add-in) and for the file manager (Watch Folder), and the extended SWORD infrastructure, to deposit in-progress works from the desktop to the user’s specified repository. To design and perform the tests, however, we need to be more precise about what is being tested and how. That is what we will map out in this post, covering purpose of the test, materials to be used, target users, and measures and outcomes.
Purpose. With any user test it is vital to be clear what you are, and are not, testing. In this case we are testing the process of repository deposit using two new tools, that is, it is a usability test.
Ultimately, if these tools are to succeed then they will have to be available to general users in ways that enable them to be downloaded and installed by those users. Here there are particular limitations and difficulties with installing each tool, given the early stage of development reached. So in these tests we will not be testing the users’ capacity to install the tools themselves, but their capacity to use tools already set up for use. This is thus a controlled test environment: a pair of Web-connected laptops running Windows 7, connecting to a demonstrator EPrints repository running the SWORDv2 extensions.
Materials. If this is a test solely of repository deposit, what is to be deposited? The aim is to provide users with content suitable for use with the new deposit tools, and to be used as a basis, initially, for exploring the features of the tools. Beyond that we don’t want to lead the user in terms of choice of deposit tool. Also, for comparative purposes, we introduced a task that involves using the native repository deposit interface. For subsequent deposit tasks the choice of deposit tool is left open for the user.
Given that we have a Word deposit tool, we provide Word documents for deposit. The file manager deposit tool is intended to be a simple and general-purpose tool that might be used to deposit materials considered unconventional in current repository collections. Images are perhaps the simplest case here and were provided in the test corpus. Also, as part of the test we invited users to bring their own content, as a way of motivating interest in particular features that users might have identified during the set test.
By the end of the test users would have deposited quite a large and diverse set of materials in the demonstrator repository. This is not a typical deposit scenario for current repositories, but the test seeks to anticipate the new ways of working and deposit made possible by the tools, when such scenarios may no longer be unexpected. Even so, the extent of deposit in the time available for the test may be ambitious and complex, and in this sense could be seen as subjecting the tools to maximum stress.
Users. Who are the target users? They could be any members of an institution with a repository and who might need to deposit content. Since the new tools are intended to open new deposit scenarios, the users could be new or experienced repository users.
In the case of the DepositMO project we have a number of interested groups participating in the project committee, representing different repositories hoping to implement the tools, and academic subject groups representing authors with expanding needs for repository deposit. To be specific, our users were invited and coordinated by representatives for the e-Prints Soton and EdShare repositories, and the archaeology and chemistry departments at the University of Southampton. In addition, we had interest in the tools from the Kultivate project, a companion JISC repository deposit project, and we worked with two representatives of that project as well.
Since the number of users is small, 13 users in total across four user groups, the tests are sufficient to reveal usability issues with the tools, but to draw any further conclusions about repository deposit would be speculation.
Measures and outcomes. A user test needs to have a way to measure the outcomes. From a general perspective the purpose of repository deposit is to transfer a copy of the designated content to the repository in the most timely and efficient manner, to ensure that the integrity of the content is maintained through copying and transfer, and that it is sufficiently described to aid retrieval by a variety of queries.
In this test we focus on the first issue, the timeliness of deposit, to some extent on the description issue, and not at all on the integrity issue as that is not a function of the new tools but is left to existing repository processes.
In most use cases involving new technology or services it is better to measure what the user does rather than what the user thinks, although the latter can be helpful and revealing in the context of the former. So here we have a record of what the user does, in the form of the repository history for that user’s account. Alongside the history timings we independently, and quite roughly, timed how long it took to complete each task in the specified test process.
To discover what users thought of the tools we included short questionnaires in the test document, to be completed just before and immediately after the test process. The first questionnaire inquired about the user’s relevant experience with repositories, if any, and the second prompted their reactions to the test and the tools. During the test observers recorded more instinctive and spontaneous, perhaps less considered, reactions. All users but one worked in pairs to provide the opportunity for these reactions to be vocalised, since otherwise the observers were not permitted to be directly involved in the test or to influence it in any way.
The test. Taking all the above factors into account we produced the test specification document. Users were invited, by the coordinators, to a test session at a specified time and place, as local to the users as possible. Four sessions were held, allowing up to 1.5 hours for the test each time. As already mentioned, users were encouraged to bring some of their own content for deposit. Other than that, users did not have to bring any equipment. Pre-installed laptops were provided for each user or pair, and they were not briefed in advance on the form of the test.
Apart from some introductory words by the test moderator, the test is intended to be self-contained and self-explanatory. The purpose of the introduction was to put users at ease, and to ensure that the setup and arrangements were sufficient to allow them to focus on the test and not be distracted.
In the next post we will begin to set out the instructions for performing the test as presented to the test users.
David Barron: Visionary, mentor and friend
Posted by dwb memories blog in Contributions on January 27, 2012
(From Professor Dame Wendy Hall)
I owe my career in computing to David
He didn’t really remember me as an undergraduate mathematics student at Southampton. He taught me computing – Fortran programming to be exact. I hated it. Not because of his teaching – he was an excellent teacher – but because I couldn’t see the point of hanging around the computing laboratory for hours to get the results of a three line programme. At the time I really loved the abstract thinking required for pure mathematics and went on to do a PhD in algebraic topology. I got to know David then as one of the professors in the maths department at Southampton. But I didn’t really socialise with the computer scientists and never dreamt the subject would come to mean so much to me.
I left Southampton in 1977 looking for a career in academia but I couldn’t get a job as a pure maths lecturer. I managed to get a job lecturing in maths but it was maths for engineers and then maths for trainee teachers. During this time the first personal computers started to emerge. I was asked to set up a course to teach teachers BASIC programming. I taught myself BASIC one summer holiday and never looked back, although I was of course mentally mutilated for life as described by Dijkstra – I never really mastered the art of programming! But I got more and more interested in the use of computers in education. I took a part-time MSc at City University in computer science and applied for a job as a lecturer in computer science back at Southampton.
I had little experience teaching computer science and even less research experience. I didn’t think I stood a chance at the interview panel but for some reason best known to him David decided to take a bet on me. After all I had a PhD from Southampton, so I had shown I could do research – just not in computing!
At that time the CS Group, which David led at Southampton, had just become a Department and had taken its first intake of BSc Computer Science students. We’ve included some of the early CSD photos in this blog. It is amazing to think how small we were then.
Pascal was the first teaching language in those days. We taught it to our own students and as a service course across the University. It was a baptism of fire for me but David was so incredibly supportive.
He encouraged us to use new technology in our teaching – Gillian Lovegrove and I were reminiscing at David’s funeral about the videotape programme we used to teach first year Pascal – and dictated in the mid-80s that email would be used as the main method of communication in the Department from that point on. I remember thinking – surely we’re not going to send emails to people in the office next door – but of course David with his amazing ability to see into the future when it came to computing technology was right. We all took to it like ducks to water, as did the rest of the research community and eventually the rest of the world.
I was still very interested in the use of computers in education and began exploring the possibility of using the latest interactive videodiscs for teaching programming and other subject disciplines instead of videotape. David enthusiastically encouraged me to do this and even bought me a videodisc player to experiment with. It cost about £1K, which in those days was a large chunk of the Department’s equipment budget. He also paid for me to go to multimedia conferences in the US, which I could never have funded on my own. Others in the Department were not so happy. Was this real computer science? David stood his ground. This was the future and he was encouraging me to be a pioneer. If he hadn’t done this things would have been very different. I would never have had the career I had in computer science and I might even have left the subject behind altogether as the negative comments about the direction my research was taking me were very hard to deal with.
The rest as they say is history. We became the first group in the UK to develop videodisc drivers for the Apple Mac. We pioneered the development of multimedia and hypermedia information management systems and we became one of the few computer science research groups to take Web research seriously. I would argue David’s hunch paid off and I think so would he.
He introduced my inaugural lecture. “Watch my lips” he said. “When I first knew Wendy she was shy and retiring”. With his help I had become so much more confident. I secured an EPSRC Research Fellowship and a personal chair. David was so proud and I loved him for that.
He had of course had the foresight to lead us into the merger of Electronics and Computer Science at Southampton in 1987. He became the second Head of ECS and the meteoric climb of CS at Southampton began – from a small group that did virtually no research to the world-class research group it is today. When he stepped down as Head of ECS he was a bit lost as to where he belonged. I was so pleased when he agreed to join the Multimedia research group and spend the last years of his working life associated with our research effort.
At his father’s funeral, Nick talked about David’s approach to diversity. When David was asked what he was doing about women in computing at Southampton he replied “We employ them as academics”. I will be forever grateful that included me.
INBOX d/dt
Posted by Christopher Gutteridge in Data on January 26, 2012
So I’ve been looking at my inbox. I felt like I’ve been getting more email than usual so I’ve put that theory to the test:
I’ve been collecting my INBOX size every hour for the past year, which makes it easier.
Data as CSV [CC-BY]
First a basic graph of time vs INBOX size is a start… you can see the peak where I went to the sea-side and didn’t answer my email!
But to work out how loaded I am, a differential is more useful: Hourly change in INBOX. That is unreadable, so let’s add a 168 hour (one week) rolling average: Hourly change in INBOX, Weekly rolling average.
Nearly there, but it’s only increases that I’m interested in, so let’s make that: Hourly increases in INBOX, weekly rolling average.
So I can clearly see from this that my email inbox increases remain at an average of 2-3 per hour. Obviously any hour where I answer email will be ignored, so it’s imperfect, and I get far more email on week days so the actual number is probably higher, but it shows that the steady rate isn’t unusually high right now.
As a final note, the level of decreases is very different to that of increases: Hourly decreases in INBOX, weekly rolling average, which you’d expect as it’s when I’ve been clearing more/less email.
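For anyone wanting to try the same thing, here is a minimal sketch of the sums behind those graphs, assuming the CSV has already been read into a plain list of hourly INBOX sizes. The function names are made up for illustration, and this is not necessarily how the graphs above were produced. One assumption to flag: hours with a net decrease are counted as zero increase rather than dropped from the average, and the first week is averaged over however many hours are available so far.

```python
from collections import deque


def hourly_changes(sizes):
    """Difference between each hourly INBOX size and the one before it."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


def increases_only(changes):
    """Keep net increases; hours with a net decrease count as zero."""
    return [max(change, 0) for change in changes]


def decreases_only(changes):
    """Keep net decreases (as positive numbers); other hours count as zero."""
    return [max(-change, 0) for change in changes]


def rolling_mean(values, window=168):
    """Trailing rolling average; 168 hours is one week.

    Until a full window is available, the average is taken over the
    values seen so far (an assumption of this sketch).
    """
    recent = deque()
    total = 0
    averages = []
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()
        averages.append(total / len(recent))
    return averages


# Tiny made-up example with a 2-hour window instead of a week.
sizes = [10, 12, 11, 15, 15, 9]
changes = hourly_changes(sizes)
smoothed_increases = rolling_mean(increases_only(changes), window=2)
smoothed_decreases = rolling_mean(decreases_only(changes), window=2)
```

Because an hour in which mail both arrives and gets cleared only shows its net change, this undercounts incoming mail in the same way as described above.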
OK… I should probably get back to my INBOX.
Repository usability review 3: new deposit protocols
Posted by Steve Hitchcock in Uncategorized on January 26, 2012
This is one of a series of posts building towards a full paper on the use and testing of the repository deposit tools, specifically for deposit of in-progress work, developed in the JISC DepositMO project. In this post we continue and complete our review of repository usability studies, here considering repository deposit processes and protocols that increase the utility of deposit. If we have missed any relevant work that should be included, please leave a comment.
Looking forward, there have been glimpses of repository interfaces that might play a role for in-progress file deposit and management, where in-progress means as the work is written and created. Runner-up in the Developer Challenge at Open Repositories 2009 was a prototype called FedoraFS by Rebecca Koesar, who exposed Fedora as a desktop filestore using Fuse: “while only a command line prototype at this point the ease of overlaying a Graphical User Interface with file-folder icons is all but a done deal.”
At Open Repositories 2011 the same Developer Challenge was won by Repository as a Service (RaaS), based on the idea that, with standard repository interfaces to get data in and out, the repository becomes a commodity that can be swapped. The entry demonstrated an Android mobile app that used SWORD to deposit photos into both DSpace and EPrints. A common interface was provided to access the items in the repository. Both EPrints and DSpace provided identical experiences because of the common interfaces.
Which brings us to SWORD (Simple Web service Offering Repository Deposit). The idea that repository deposit might be abstracted from repository software began in practice with SWORD in 2007. At the core of its proposition was, given the plethora of repositories and user interfaces, that users could choose a single interface for deposit in multiple repositories. It may thus be the single most important development in repository user interfaces since the original repository softwares, but its impact is still more in principle than practice in terms of alternative interfaces that have attracted wide use. This may be due to a lack of interfaces, or lack of uptake, but it has not limited the number of potential use cases, as shown by Lewis, et al. (2012).
Or it could be due to limited functionality, which will be fixed by SWORD version 2, released in December 2011. We have written on this blog before about the key developments in SWORD v2, taking it from a ‘fire and forget’ deposit method that is limited in terms of what authors can do with their content after depositing in a repository, to a model where authors retain more control over their content and items can be created, updated, replaced, or deleted (CRUD) in the repository. SWORD v2 is the basis of DepositMO’s vision for in-progress deposit.
SWORD builds on AtomPub, the Atom Publishing Protocol. Many may be more familiar with Atom in the form of an RSS-like syndication protocol; AtomPub is the flip side, about pushing content into a dissemination resource rather than its receipt by readers (Snell 2006). SWORD 2.0 continues to build on AtomPub through the inclusion of the CRUD operations of AtomPub to enable the following kinds of interactions, described by the SWORD 2.0 Profile:
- Clients may create resources by sending compound resources, such as archive files (tar, zip).
- Workflows, which may or may not include manual stages before deposited resources become available as web resources, are acknowledged and supported.
- Clients may update or delete the compound resources or associated metadata
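As a rough illustration of how these interactions map onto HTTP, the sketch below builds, but does not send, the three kinds of request a client might make. It is a sketch under stated assumptions rather than a reference client: the header names and the SimpleZip packaging identifier follow our reading of the SWORD 2.0 Profile and should be checked against it, the function names are hypothetical, and the example.org addresses are placeholders, not a real repository.

```python
import hashlib

# Packaging identifier for a plain zip of files, as we read the SWORD 2.0
# Profile (assumption: verify against the profile before relying on it).
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def create_request(collection_iri, filename, payload, in_progress=True):
    """Describe a deposit of a zip package to a collection (create)."""
    return {
        "method": "POST",
        "url": collection_iri,
        "headers": {
            "Content-Type": "application/zip",
            "Content-Disposition": f"attachment; filename={filename}",
            # Hex-encoded MD5 so the server can check the transfer.
            "Content-MD5": hashlib.md5(payload).hexdigest(),
            "Packaging": SIMPLE_ZIP,
            # "true" signals that the depositor has not finished yet.
            "In-Progress": "true" if in_progress else "false",
        },
        "body": payload,
    }


def replace_request(edit_media_iri, filename, payload):
    """Describe replacing the deposited files with a new package (update)."""
    request = create_request(edit_media_iri, filename, payload)
    request["method"] = "PUT"
    return request


def delete_request(edit_iri):
    """Describe removing the deposited item altogether (delete)."""
    return {"method": "DELETE", "url": edit_iri, "headers": {}, "body": b""}


# Placeholder addresses for illustration only.
first = create_request("http://example.org/sword/collection", "draft.zip", b"abc")
later = replace_request("http://example.org/sword/item/1/media", "draft.zip", b"abcd")
gone = delete_request("http://example.org/sword/item/1")
```

The point of the sketch is the contrast with ‘fire and forget’: after the initial POST the client keeps the addresses returned by the repository and can come back to the same item with PUT or DELETE.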
In the progression to profile 2.0, SWORD becomes more explicitly a deposit lifecycle standard: “Most of the enhancements in SWORD 2.0 are around closing the deposit loop. This deposit process is only a portion of the full content lifecycle and does not attempt to provide support for collaborative or distributed authoring environments or policy management; it is focused entirely on the process of moving content from one location to another.”
During consultation over the new profile the adoption of more powerful publication protocols was advocated, but these have not so far been included, leaving the profile specification to note: “For standards to integrate detailed content management operations it is recommended to look at OASIS CMIS (Organization for the Advancement of Structured Information Standards – Content Management Interoperability Services), or the Google Documents (GData) List API.” It was this aim of bringing more powerful content management features to repositories that led to the DepositMO extensions using the platform and flexibility that SWORDv2 created with reference to these other publication protocols.
A similar publishing protocol connecting content-producing tools and repositories for learning resources and metadata is the Simple Publishing Interface (SPI), which also uses AtomPub. The main difference between SWORD and SPI, according to Ternier, et al. (2010), is in the submission of metadata. “In SWORD, metadata is available in a package and thus submitted to the repository as a part of the resource. In contrast, the SPI model makes a clear distinction between metadata and content at submission time. The rationale for this strict separation comes from the SPI application domain. Where SWORD is primarily concerned with depositing data, SPI is also intended for application scenarios where only metadata is considered.”
We have presented in this series of three posts a sweeping review of practical work on the usability of repository interfaces, most notably the deposit interface. Broadly these studies have found little wrong with current ‘fire and forget’ deposit interfaces while in some cases making adjustments for the structure and volume of metadata collected at deposit. Detailed and critical evidence on whether deposit interfaces make a substantive difference to levels of deposit, however, is hard to find. If we are to move towards more sophisticated Web 2.0-like interfaces for repositories, providing more interaction with authors using new protocols such as SWORD v2 to capture work at an earlier stage of creation, then users and usability will have to become much more central to repository testing.
This concludes our three-part review of repository usability focussing on user deposit. The next post will start to report the results of user testing of the repository deposit tools developed in DepositMO.
Repository usability review 2: user deposit interfaces
Posted by Steve Hitchcock in Uncategorized on January 25, 2012
This is one of a series of posts building towards a full paper on the use and testing of the repository deposit tools, specifically for deposit of in-progress work, developed in the JISC DepositMO project. In this post we continue our review of repository usability studies, here with an emphasis on repository interfaces. If we have missed any relevant work that should be included, please leave a comment.
If the deposit projects described in part 1 of this review were more concerned with metadata than the usability of repository interfaces, it may be because, unlike the Southampton projects that have close links with the development of EPrints, most of these projects do not have the keys to develop one of the main repository softwares. In response to its repository user survey, however, University of Rochester went further: it built its own repository software, known as ‘IR+’. This began with studies of faculty work practices and “resulted in modifications to the University of Rochester’s implementation of the DSpace code to better align the repository with the existing work practices of faculty.” (Foster and Gibbons 2005)
This included the introduction of Researcher and Researcher Tools pages. Further studies of work practices, of graduate students, followed at Rochester (Randall, et al. 2008). “The results of our work-practice study pointed to several enhancements, including personal showcase pages for faculty members and researchers, download statistics, and a checksum tool to support long-term preservation of files. We added these features to our IR and observed an increase in repository use as a result, but it was not a dramatic increase.” Note that the interest here was on authoring and information management practices used by the students: “Specifically, we wanted to build an authoring environment on our IR platform, while also integrating traditional and digital library functions and services. The end product is to be one interface for a wide range of research, writing, and archiving activities.”
The result, in 2010, was IR+, which “focuses on giving researchers an online ‘workspace’ within the repository where they can upload and preserve different versions of an article they are working on.” (Kolowich 2010)
An animated video on authoring support in IR+ shows that this private user workspace provides a file manager interface for sharing and collaboration, versioning and publication (Bell and Sarr 2010). In this respect IR+ can be compared with document services such as Google Docs and file management services such as Microsoft Sharepoint, rather than conventional repositories. DepositMO has worked on enabling deposit from similar applications – word processing and file management – but these applications are external to the repository.
There have been a number of user studies of repositories focussed on the content submission process. A user evaluation of the DSpace multimedia repository B@bele (Caccialupi, et al. 2009) covered a variety of user issues including ‘Upload’, concluding: “The problem with submitting a new document depends on the layout of the upload page. Finding the upload link on that page is not simple since it is not visually recognizable as the link is inserted in the middle of the page. This task becomes especially difficult if other processes are not yet concluded and are therefore still active.” The method used here proceeded in a similar form to the DepositMO user tests.
Silva, et al. (2007, not OA) measured the ability of segregated groups of users to submit metadata to a repository, in this case the Brazilian Digital Library of Computing (BDBComp). Users had to access the repository, register, login, submit and check data. Results were claimed to show BDBComp to be an easy, comfortable, and useful self-archiving service, without indicating the motivations to use the system.
A model user study – simple, fast, focused – tested the submission process for a DSpace repository for electronic theses and dissertations (Boock 2005). The test included registration as well as deposit: “Usability testing proved we were on the right track and was well worth the two hours invested. Try it; it’s easy.”
While there are other usability studies of repositories and software, not all are from the author deposit viewpoint. McKay (2007) is concerned with repository users and usability, but not authors because they are “better studied than any other users of IRs.” Many such studies are to do with author motivations, or the lack of them, to use repositories (Mark and Shearer 2006, Davis and Connolly 2007) rather than with Web user interfaces.
Where usability studies get closer to repositories and software, they tend to be interested in functional issues, installation and configuration (Ottaviani 2006, Körber and Suleman 2008). Ottaviani considers deposit usability, but in practice proposes functional changes rather than a test report.
McKay and Burriss (2008) perform usability testing of VTLS Vital, one of the Fedora user interfaces, but in this case they test the information-seeking interfaces for end-users rather than deposit interfaces. Kim (J) (2005) compared the search user interfaces of DSpace and EPrints.
Kim (HH) and Kim (YH) (2008, not OA) provide suggestions that could be adapted to improve the usability of institutional repository systems, and to establish a usability evaluation framework. They seem principally interested in how to search for documents, improving visual appearance, clustering and display.
Similarly, although Feng and Huang (2008, not OA) claim to have evaluated the usability of three discipline repositories – arXiv, PubMed Central and E-LIS – their framework, or criteria, for evaluation are really features and usage data taken from the repositories rather than direct involvement with users.
Also based on E-LIS, a subject-focussed digital repository, Tsakonas and Papatheodorou (2008) explored usefulness, usability and performance of an open access digital library based on a statistical analysis of a user questionnaire. Using a theoretical framework rather than observations of practical use, the work revealed that repositories would need to be more closely linked with users’ work tasks.
The next post will continue this review of studies of repository usability by looking at repository deposit processes and protocols that increase the utility of deposit, which we expect will receive greater attention and use going forward.
Repository usability review 1: designing for metadata deposit
Posted by Steve Hitchcock in Uncategorized on January 24, 2012
This is one of a series of posts building towards a full paper on the use and testing of the repository deposit tools, specifically for deposit of in-progress work, developed in the JISC DepositMO project. In this and the next two posts we shall explore earlier attempts to document and test repository usability, particularly from the author viewpoint, and try to show how it relates to our work in this area. This is an area that spans both formally reported work and practical work that may have been reported informally. If we have missed any relevant work that should be included, please leave a comment.
Digital repositories essentially provide a series of interfaces to get things done, such as deposit content in the repository, manage content (usually an administrator, but can involve others) and access content (users can browse or search, or machines can harvest content using OAI-PMH). It’s curious, then, that there have been few published studies of repository deposit or user testing of deposit interfaces.
In this work we are concerned with one of the repository interfaces, for deposit – that is, how new content is added to the repository. Both EPrints and DSpace provide native deposit interfaces, configurable by repository administrators; another popular repository software, Fedora, requires third-party interfaces. Basically, these deposit interfaces consist of Web forms that collect information, or metadata, describing the item or digital object being deposited, and a button to start the download of the selected item from a specified Web location.
These forms have been criticised for being too long and taking too much time to complete, even if the claim may be shown to be exaggerated (Carr and Harnad 2005). There is a perennial trade-off between collecting sufficient information from an author or creator of an object at the point of deposit to ensure it can be categorised, differentiated and fully searchable, and minimising inconvenience and time taken by the depositor.
Thus for digital repositories, user-facing issues in supporting deposit include design and requirements for metadata as well as more general Web usability.
This is not a new issue for a repository deposit interface. EPrints launched as the first institutional repository software in 2000, and by 2003 its deposit interface was being reviewed by the JISC TARDis project: “A focal point of the Southampton-TARDis re-design has been to simplify and assist user input, which was tackled in two principal ways. A facility for mediated data input is provided. Also, data input pages were structured so that authors, or mediators, need see only those input fields required for the type of document to be deposited.” (Gutteridge, et al. 2003) In fact, this structure is still clearly evident in the native EPrints deposit interface today.
It is notable that the TARDis report refers to “testing by local Southampton users and by administrators of EPrints.org software elsewhere”, but does not provide detail of the tests, just the outcomes. Later, around 2006, an independent research group did some user testing of EPrints, but its report was only briefly available online, was incomplete, and now seems to have been removed entirely. EPrints may have been a little coy about revealing the results of past user tests.
Taking the minimal metadata approach to its extreme, one repository project, EdShare, was to develop “a closer, integrated deposit mechanism such that with a single deposit ‘click’, resources will be made visible within the institutional learning and teaching repository” (Morris 2009).
Together with Language Box, another teaching and learning repository project at the University of Southampton, EdShare aimed to raise use of such repositories but had recognized that traditional repositories ‘fell flat’ for this audience. They were inspired by Web 2.0 sharing sites such as YouTube and Flickr that allow users to deposit content with the minimum of overhead.
“Both repositories have simplicity at their heart. We used a minimum manual set of metadata, requiring only that users name their deposits and provide a minimum set of automatic metadata such as time and date of deposit, attribution, etc. The few optional metadata fields are based on well-understood terms (such as language) or are nonrestrictive (such as a description and tag fields).” (Davis, et al. 2009)
All the projects described so far have been funded by JISC, which has recently invested in a series of projects that have investigated aspects of repository deposit, culminating in its most substantial deposit strand within the Information Environment Programme 2009-11.
Earlier projects were concerned with metadata requirements and automated metadata extraction from source documents (Deposit Plait 2008-9), and making deposit more convenient for users by supporting batch deposit of documents (EM-Loader, 2008-9).
Motivations for many of the tools being developed in current projects, including DepositMO, can be seen in the JISC Deposit Tool Show & Tell Meeting (October 2009).
Among current JISC projects there is an emphasis on improving and supplementing repository metadata, and thereby reducing the information required from depositors, by linking information with research information systems and other institutional publications management systems. Enrich, DURA and RePosit are among the projects connecting repositories with systems such as Mendeley and Symplectic.
Another concern for repository deposit, given the different types of repository available to authors (subject-based repositories such as arXiv and PubMed Central, as well as institutional repositories), is that research funders and institutional open access policies may place differing requirements on authors to deposit in one or more repositories. Based on such factors, the likely information flows between repository types were examined by Jones, et al. (2008).
Similarly, multi-authored papers may need to be deposited in multiple repositories.
Given these competing or complementary requirements placed on authors for repository deposit, Open Access Repository Junction is producing a broker tool to assist authors to deposit “in all the appropriate locations”, and to make the whole deposit process as simple as possible: “Deposit once, send to many”.
The next post will continue this review of studies of repository usability and user interfaces, with more emphasis on the interfaces.
DepositMO and SWORD at Repository Fringe 2011
Posted by Steve Hitchcock in Uncategorized on January 23, 2012
Seven posts in the last week is a lot to read, so today we invite you to stop reading, relax and watch – should you still be interested in the repository deposit tools produced by DepositMO, and in SWORDv2. If you have been reading this far you will know about these already, but you might be wondering if they work for real. YouTube videos of Dave Tarrant and Richard Jones demonstrating these tools at Repository Fringe 2011 in Edinburgh last August might persuade you.
Or you might wait for the results of user tests performed with these tools, but since we began this journey towards user testing we have taken such a diversion that you might wonder whether we will get there. We are still on course, so stick with us.
What follows is the @depositMO Twitter commentary on the videos at the time of release in October 2011. For shorter, edited versions of the first video presented below, scroll further down.
Dave Tarrant demos MS Kinect & SWORD v2 deposit
- Dave T has been accused of hand waving. As you will see from the video, it’s true
- To be clear, file manager drag-drop tool seen in the video is tech from DepositMO (Kinect hand waving tech courtesy someone else) #jiscdepo
- It is clearer that the Word deposit tool seen in 2nd half of the video is from DepositMO-so that’s both tools from the project #jiscdepo
- Good to see both DepositMO tools got a spontaneous round of applause during Dave’s show. So that’s a tick for flashiness #jiscdepo
Deposit with MS Word. For a heavily edited version of just the Word part of the demo, see this alt. video (4min11)
- From Word video: “Can hold process of ‘publishing’ (to repository) open. One of the key features of SWORDv2, build object before publish”
- From Word video: “DepositMO has been focussing on building communication between repository and user”
- From Word video: “Telling users where their content has gone so they can access it, add to it, edit it”
- From Word video: “Build this conversation between the Web and whatever system you are using to produce (content)”
- Why demo using Kinect? “Need to think carefully how we are going to get all these things in the repository-make it so intuitive, so easy”
Drag-drop Kinect deposit demo (Watch Folder). For a heavily edited version of just the drag-drop part of the demo, see this alt. video (2min40)
- From Drag-Drop video: “1st object going in EPrints, 2nd in DSpace, going into both. All underlying technology is SWORD v2”
Richard Jones on SWORD2
Since “all underlying technology” in DepMO is SWORDv2, here is Richard Jones on SWORD2 in another Repo Fringe video
- RJ on SWORDv2: “awesome community process”
- RJ on SWORDv2: “going to investigate data deposit aspects that SWORD might be able to support. We’ve also got money for client developments”
DepositMO for EPrints
Posted by davetaz in Uncategorized on January 20, 2012
With SWORD version 1 it became simple to deposit into many digital repositories using any number of clients which all connected to a common Application Program Interface (API) at the repository. However, it was not specified how resources can be managed and enhanced subsequent to initial deposit. The SWORDv2 and DepositMO projects have been focussed on enabling this enhanced interaction, and also making it incredibly easy for users.
Where the SWORDv1 specification allowed resources to be Created in a repository, SWORDv2 has focussed on adding Retrieve, Update and Delete functions, thus providing full CRUD support. DepositMO focussed on use cases requiring extra functionality beyond basic CRUD but which could still use standard, already available specifications.
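As a rough illustration of the split described above, the four CRUD operations can be lined up against the HTTP methods that AtomPub-style interfaces conventionally use for them. The pairing shown here, and the names `Crud`, `HTTP_METHOD`, `SWORD_V1` and `SWORD_V2`, are assumptions made for this sketch rather than identifiers taken from either specification.

```python
from enum import Enum


class Crud(Enum):
    CREATE = "create"
    RETRIEVE = "retrieve"
    UPDATE = "update"
    DELETE = "delete"


# Conventional (assumed) pairing of CRUD operations with HTTP methods.
HTTP_METHOD = {
    Crud.CREATE: "POST",
    Crud.RETRIEVE: "GET",
    Crud.UPDATE: "PUT",
    Crud.DELETE: "DELETE",
}

# Operations covered by each protocol version, as described in the post.
SWORD_V1 = frozenset({Crud.CREATE})
SWORD_V2 = SWORD_V1 | {Crud.RETRIEVE, Crud.UPDATE, Crud.DELETE}


def methods_added_in_v2():
    """Return the HTTP methods for the operations v2 adds over v1, sorted."""
    return sorted(HTTP_METHOD[op] for op in SWORD_V2 - SWORD_V1)


print(methods_added_in_v2())  # ['DELETE', 'GET', 'PUT']
```

The sketch only records which operations each version covers; it says nothing about the DepositMO use cases that go beyond basic CRUD.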
Running the DepositMO and SWORDv2 projects in parallel produced a clear set of requirements for complete interactive control of repository resources via a set of abstracted interfaces. Both projects build upon existing, well-developed and widely implemented standards (including HTTP and AtomPub) to enhance digital repositories.
As both projects essentially outline the ideal RESTful implementation for discovery and control of resources in a digital repository, the specifications of the SWORDv2 and DepositMO projects have been integrated into the core of EPrints (v3.3). As a result, the RESTful interface to EPrints is as powerful and flexible as the built-in web interface, providing the following features:
1. All first-class objects can be managed via the REST interface. All EPrints, Documents, Files, Users and anything else identifiable by a URI can be created, updated and deleted via the REST interface.
2. All import plug-ins are now generic and can be utilised via the REST interface, doubling the number of supported package formats which can be deposited via SWORD. Additional plug-ins, which can be installed in a single click, are also available via the EPrints Bazaar app store.
3. The same permissions model is applied via the SWORD/REST interface as via the normal EPrints interface. The permissions system provides a granular model via which users can be granted permissions based on item status, item type and even an individual item itself. Basically, if editing is possible via the web interface, then it is possible via SWORD.
4. The SWORDv2 specification supports the notion of “in-progress” items, so it has been possible to reduce the number of SWORD deposit endpoints in EPrints to ONE. This has many benefits:
- This endpoint represents a user’s deposits (or repository contents), closely aligning it with the Google Docs implementation and extensions of the AtomPub specification on which SWORDv2 is based.
- SWORD clients are easier to write.
- The user requires less context about which “collection” their content belongs to – it belongs to their collection.
In EPrints, the collection in which an item belongs is represented as a piece of metadata, a significant difference from DSpace. Rather than managing and connecting with several collections, a user can deposit and discover all of their resources via ONE URI, which is the same URI for every user (typically http://myrepo.org/contents). While the SWORDv2 specification outlines how this URI can be used for deposit, DepositMO mandates that a client should also be able to perform the inverse operation and request a complete list of deposit contents via this same URI, regardless of the status of the object.
5. A unique identifier (in the form of a URI) is assigned to every object (including EPrints, Documents, Files, Users etc…) as soon as it is created regardless of the stage in the workflow. This URI will NEVER be reused or overwritten, and can thus be used to reference an object throughout its entire lifecycle in the repository.
6. All URIs support all REST operations: PUT/POST, HEAD, GET and DELETE. The DepositMO profile of SWORDv2 adds the requirement that each URI must accept a HTTP HEAD request. Using this HEAD request, a client is able to request information about an object, such as last changed date, without needing to download the whole object to find this out. This enables each object to be synchronised with a local copy, allowing two-way update with clients that are keeping a local up-to-date copy, an important requirement in distributed systems.
7. All URIs can be content negotiated, meaning that you can get an RDF/XML, Atom, csv… serialisation of every object in the repository. These plug-ins are the inverse of the import plug-ins, so not only can you ingest items in any format, you can also export them in that same format. Again, further plug-ins can be installed in one click from the EPrints Bazaar.
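The HEAD-based synchronisation and content negotiation described in the list above can be sketched from the client side using only the Python standard library. This is a minimal sketch under stated assumptions: the object URI is hypothetical, the repository is assumed to report the last changed date in a standard HTTP `Last-Modified` header, and the function names are invented for illustration. The requests are prepared but never sent.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical object URI, used for illustration only.
ITEM_URI = "http://myrepo.org/id/eprint/42"


def head_request(uri):
    """Prepare (without sending) a HEAD request for an object's metadata."""
    return urllib.request.Request(uri, method="HEAD")


def negotiated_get(uri, media_type):
    """Prepare (without sending) a GET asking for one serialisation."""
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")


def sync_action(last_modified, local_changed):
    """Compare a Last-Modified header value with a local timestamp.

    Returns "download" if the repository copy is newer, "upload" if the
    local copy is newer, and "in-sync" if the two timestamps are equal.
    """
    remote_changed = parsedate_to_datetime(last_modified)
    if remote_changed > local_changed:
        return "download"
    if remote_changed < local_changed:
        return "upload"
    return "in-sync"


# Usage: the repository copy changed an hour after the local copy.
local_copy_changed = datetime(2012, 1, 18, 9, 0, tzinfo=timezone.utc)
print(sync_action("Wed, 18 Jan 2012 10:00:00 GMT", local_copy_changed))  # download
```

Comparing timestamps alone is the simplest possible policy. A real client would also need to handle the case where both copies have changed since the last synchronisation.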
In summary, SWORDv2 and DepositMO on EPrints represent a major leap forward in repository flexibility. By utilising the built-in power of the EPrints identifiers, permissions model and Bazaar Store for plug-ins (all of which have been key parts of the EPrints 3.3 development), the HTTP CRUD interface supports SWORDv2 and DepositMO specifications as core functionality, replicating the functionality available via the web interface. This gives users the power to interact fully with their repository and content via an interface or client which suits their way of working, in their environment!
DepositMO and the future of SWORD
Posted by rjones in Uncategorized on September 27, 2011
This post explores the relationship between the SWORDv2 project and the DepositMO project, and how they have influenced each other.
SWORDv2 officially began in late 2010, and DepositMO started at around the same time, alongside a number of other JISC deposit related projects including DURA and RePosit. When SWORDv2 set up the Technical Advisory Panel in early 2011, representatives from these projects were invited to join to share their deposit scenarios and technical expertise. Of the 3 projects, DepositMO was by far the most closely aligned to the goals of SWORDv2 and also the most technical project.
The SWORDv2 mission was to support the full range of CRUD (Create, Retrieve (Read), Update, Delete) operations for scholarly systems, and to maintain the use-cases which had driven SWORDv1, including mediated deposit and most importantly of all Packaging. DepositMO, meanwhile, was focussing on desktop-to-repository deposit and two-way synchronisation.
This meant that DepositMO would need to take full advantage of the CRUD protocol operations offered by SWORD, although neither mediated deposit nor packaging was so relevant. The project would therefore be a valuable resource for a number of critical aspects of SWORDv2:
- A sounding board for core profile developments: DepositMO had a vested interest in the CRUD protocol operations and had explicitly no interest in the packaging aspects. This meant that the protocol operations would go through thorough review while the need for every packaging related concept would be questioned as to its necessity. As such, the SWORDv2 profile received extensive and sustained review throughout the project which has hardened it against many counter-arguments.
- A testing base for software development: DepositMO is being implemented against DSpace and EPrints, as is SWORDv2. Since it was clear from the outset that DepositMO is providing some extensions to SWORDv2, the majority of the codebase for both systems is the same. Extensions to SWORDv2 have been developed for both repositories by the project though. This means that not only has the core software been through some important testing, but its capacity to be extended has also been examined.
- Representation of a critical use case: the desktop-to-repository use case is one of the most under-developed in our community, so it was very interesting to have a project focussing on it to represent that use case throughout discussions. With services such as Dropbox and Ubuntu-One becoming common, the deposit applications developed by DepositMO will no doubt be an important demonstrator as to the way academics will interact with repositories in the future.
In addition to the base operations provided by SWORDv2, the DepositMO project has also specified the following extensions:
- That Collections be able to list the content items which belong to the authenticated user.
- That individual files behave RESTfully and in line with the rest of the SWORDv2 specification. This means that replace and delete operations can be carried out on the individual files in an object.
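The two extensions listed above can be pictured with a toy in-memory model. This is only a sketch of the behaviour being asked for, not of how either repository implements it: the `Collection` class, its method names and the sample data are all hypothetical, and no HTTP is involved.

```python
class Collection:
    """Toy in-memory stand-in for a repository collection."""

    def __init__(self):
        # item id -> {"owner": str, "files": {file name: bytes}}
        self._items = {}

    def deposit(self, item_id, owner, files):
        """Create an item owned by one user, holding the given files."""
        self._items[item_id] = {"owner": owner, "files": dict(files)}

    def list_for(self, user):
        """First extension: list only the items belonging to this user."""
        return sorted(i for i, item in self._items.items() if item["owner"] == user)

    def replace_file(self, item_id, name, content):
        """Second extension: replace one file, leaving the others untouched."""
        files = self._items[item_id]["files"]
        if name not in files:
            raise KeyError(name)
        files[name] = content

    def delete_file(self, item_id, name):
        """Second extension: delete one file from within an item."""
        del self._items[item_id]["files"][name]

    def files_of(self, item_id):
        """Return the names of the files currently held in an item."""
        return sorted(self._items[item_id]["files"])


# Usage with invented sample data.
collection = Collection()
collection.deposit("item-1", "alice", {"paper.docx": b"v1", "data.csv": b"1,2"})
collection.deposit("item-2", "bob", {"notes.txt": b"draft"})
collection.replace_file("item-1", "paper.docx", b"v2")
collection.delete_file("item-1", "data.csv")
print(collection.list_for("alice"), collection.files_of("item-1"))
```

In the protocol itself each of these operations would be an HTTP request against the collection or file URI, subject to the repository's permissions.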
The SWORDv2 project therefore considers it to be a realistic possibility that a SWORD 2.1 specification may be produced in the future incorporating the DepositMO extensions in addition to any suitable extensions from other projects using the protocol.
Repository deposit turns to CRUD
Posted by Steve Hitchcock in Uncategorized on October 18, 2010
There’s no more elegant way of putting it really. What’s at the heart of DepositMO? CRUD
Create, Retrieve (or Read), Update and Delete (CRUD), the four basic functions of persistent storage. This is what differentiates the project from current capabilities for remote deposit of content to repository services. I’m using CRUD to write this blog post in WordPress. As I write I have two action buttons to the right of the content pane: Save Draft, and Publish. Both allow me to update the content to the storage server, the difference being whether the post is made public or not.
So there is nothing new about CRUD, except that it is not yet directly applicable to many of today’s digital repositories, which tend to support single publishable item deposit with subsequent versioning should changes be needed or if updated versions are produced. In other words, there is no concept of a repository workspace – or a connected workspace – that allows the simple incremental updates widely supported by other computer authoring services and described by CRUD, or applications that go beyond this.
It became clear that we should do more to emphasise the role of CRUD in this project following a short, branched exchange on the American Scientist Open Access Forum mail list. A recurrent theme on the list had returned – the apparent tension between deposit in central, subject-based repository collections such as arXiv and distributed institutional repositories (IRs). The question is where to deposit; the aim to maximise the volume of open access content.
Currently the answer probably depends on which subject area the research to be deposited covers, e.g. authors in the physical or biomedical sciences will most likely deposit centrally, due to the strength of the repositories serving these areas. Authors in other disciplines will deposit institutionally, but on a much lower scale. There is the crux of the open access problem.
The answer proposed, notably by Stevan Harnad on the AmSci list, is for institutions to mandate deposit of published research papers in the local repository. That is, all papers, not just those not already deposited in a subject-based open access repository elsewhere.
For authors it is often suggested that content might be deposited once by filling data in a Web form, but that too much effort is involved for the process to be repeated for another repository. One approach to reduce the perceived workload, given that all these repositories are open and allow open harvesting of data using OAI-PMH, is to deposit once and then harvest the content to other repositories as required.
Another approach might be multiple simultaneous deposit. To save authors effort, data for deposit is entered into a form once, and then copied to the designated repository destinations. One tentative suggestion to emerge in the latest round of list discussion was that deposit to an IR be accompanied with login details for a central subject repository for subsequent deposit. This is fraught with security problems, as we pointed out to the list.
Enter SWORD, for it was suggested that this be the mechanism for sharing deposit and logins in this case. It turns out that the organisation developing SWORD has a case study that looks quite like that proposed.
Separately, this is what arXiv says about using SWORD for deposit:
“This interface is primarily intended for use by conference organizers, proceedings and journal editors, etc. for programmatic bulk upload of pre-vetted material to arXiv for long term archival and dissemination. It is assumed that this is done with the (implied or explicit) approval of the authors of individual contributions or on their behalf.
“Individual authors may prefer arXiv’s interactive web upload for personal use, because it provides better feedback mechanisms, but in principle the deposit API can be used for one-at-a-time deposit to arXiv by individual authors, too. We envision integration of the deposit process into authoring tools for efficient upload from the desktop.”
So third-party deposit is just about acceptable, perhaps, without being wholly endorsed. The last sentence points indirectly towards the work of DepositMO, and Simeon Warner of arXiv was a co-author of the project’s short debut paper at the Open Repositories 2010 international conference (OR10).
As this paper shows, better than deposit-once and subsequent deposit elsewhere by another agent is multiple simultaneous deposit under the control of the author. It turns out that SWORD has this covered as well.
In fact, there are quite a few SWORD implementations connecting different applications (sources) and repositories (destinations). If you look closely, one of those implementations listed is Microsoft Article Authoring Add-in for Word 2007/2010 – allows repository deposit direct from Word. Within DepositMO we have made some claims about enabling repository deposit from popular applications such as MS Office, and in the project we shall be working with Microsoft to enhance this tool.
Among this welter of deposit applications, you are probably asking what exactly will be DepositMO’s unique contributions? No. Well I was. At least, I was beginning to wonder if we had made our USP clear in the documentation to date. It’s not SWORD, or deposit or even multiple deposit, or deposit from specified applications. The answer, as we have already indicated, begins with CRUD.
The project proposal talked of ‘an effective culture change mechanism’. That’s a wider issue for another time. On more technical issues the proposal describes the aim to ‘extend the capabilities of repositories to exploit desktop and authoring environments’. More specifically it refers to components for the Microsoft Office authoring environments and enhanced SWORD interaction.
No reference to CRUD-like features here. Nor in the OR10 paper – at least, not using these terms – but the direction is clearer. The paper starts by specifying the motivations for multiple deposit.
Today the use case for repository deposit is to write the content with a typical computer desktop application and save it somewhere, but not in the repository yet – the equivalent of the blog Save Draft button. When the work is complete it can be packaged and delivered to the repository using SWORD, the same as the Publish function in the blog. The OR10 paper puts it like this:
“Currently SWORD is a one-way protocol, meaning that a repository can either accept a record, or reject it; there is no middle ground. Adding a lightweight mechanism to desktop applications to enable negotiation on what is sent in a SWORD package would go some way to bridging this gap.”
This facility should become available in SWORD v2.0, and developers from the project are contributing directly to this activity since there is a vested interest in the outcome.
It would open new deposit possibilities. An admittedly complex and possibly unusual, but nevertheless feasible, case is suggested in the OR10 paper where an author of a research paper pulls information from other linked sources, such as a contacts list and a citation manager:
“At the point the document is submitted all this valuable information (such as author identities disambiguated by email address and structured citation listing) is lost.”
All this is more sophisticated than CRUD and points the way forward, but first implementing CRUD features using SWORD as a mediator between applications and multiple repositories would represent serious progress.
What might follow from this is ‘culture change’ or, more immediately, dialogue between author and repository. The OR10 paper puts it more prosaically:
“Our proposal is to enable a simple yet powerful set of negotiations to occur between the desktop application and multiple repositories such that a single familiar submission workflow (in the style of the author’s application) can be presented to the user.”
So as a starting point the aim in DepositMO is to activate the repository as a storage service for the iterative Save Draft action in an authoring application.
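To make the Save Draft analogy concrete, here is a minimal sketch of how an authoring client might plan its requests. It rests on several assumptions: the URIs are hypothetical, the `In-Progress` header name follows our reading of the later SWORDv2 profile and should be checked against the specification, and the single-URI update shown is a simplification of how an item is actually addressed. Nothing is sent over the network.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedRequest:
    """A description of an HTTP request that has not been sent."""
    method: str
    uri: str
    headers: dict = field(default_factory=dict)


# Hypothetical deposit URI, used for illustration only.
CONTENTS_URI = "http://myrepo.org/contents"


def plan_save(action, item_uri=None):
    """Plan the request for a "save_draft" or "publish" action.

    The first save creates the item (POST to the deposit URI). Later
    saves update the existing item (PUT to the item's own URI). A draft
    is flagged as still in progress; publishing clears that flag.
    """
    if action not in ("save_draft", "publish"):
        raise ValueError("unknown action: " + action)
    if item_uri is None:
        method, uri = "POST", CONTENTS_URI
    else:
        method, uri = "PUT", item_uri
    in_progress = "true" if action == "save_draft" else "false"
    return PlannedRequest(method, uri, {"In-Progress": in_progress})


# Usage: first draft, a later draft, then publication of the same item.
first = plan_save("save_draft")
later = plan_save("save_draft", "http://myrepo.org/id/eprint/42")
final = plan_save("publish", "http://myrepo.org/id/eprint/42")
print(first.method, later.method, final.headers["In-Progress"])  # POST PUT false
```

The point of the sketch is that Save Draft and Publish differ only in a flag, just as they do in the blog editor, while the content travels to the repository in both cases.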
In the next post we will consider the practical implications of this approach and look at an early sketch of a possible interface design.
Just as long as we all understand and can share in what is new and where we are heading. Otherwise we may just find ourselves talking about a more familiar form of crud.
