Procedia Computer Science 4 (2011) 658–667

International Conference on Computational Science, ICCS 2011

Paper Mâché: Creating Dynamic Reproducible Science
Grant R. Brammer, Ralph W. Crosby, Suzanne J. Matthews, and Tiffani L. Williams
[grb,rwc,sjm,tlw]@cse.tamu.edu Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843-3112

Abstract
For centuries, the research paper has been the main vehicle for scientific progress. From the paper, readers
in the scientiﬁc community are expected to extract all the relevant information necessary to reproduce and validate
the results presented by the paper’s authors. However, the increased use of computer software in science makes
reproducing scientiﬁc results increasingly diﬃcult. The research paper in its current state is no longer suﬃcient to
fully reproduce, validate, or review a paper’s experimental results and conclusions. This impedes scientiﬁc progress.
To remedy these concerns, we introduce Paper Mâché, a new system for creating dynamic, executable research papers.
The key novelty of Paper Mâché is its use of virtual machines, which lets readers and reviewers easily view and interact
with a paper, and reproduce key experimental results. For authors, the Paper Mâché workbench provides an easy-to-use
interface to build an executable paper. By transforming the static research paper into a dynamic and interactive
entity, Paper Mâché brings the presentation of scientific results into the 21st century. We believe that Paper Mâché
will become indispensable to the scientific process, and increase the visibility of key findings among members and
non-members of the scientific community.
Keywords: executable paper, virtual machines, scientiﬁc reproducibility, abstract management, reviewing

1. Introduction
Scientiﬁc progress depends on the eﬀective dissemination and reproducibility of existing research. For centuries,
scientiﬁc papers (as well as scientiﬁc books) have been the primary mechanism for disseminating scientiﬁc results.
However, such a mechanism is based on the reader having access to the materials needed to validate the results
discussed in the scientiﬁc paper. The increased use of computer software in science makes reproducibility of results
quite difficult, especially since many scientists publish neither the source code nor the data needed to reproduce their
results. As a result, the hypotheses and results discussed in a scientiﬁc paper are not validated since it is too diﬃcult for
the reader to recreate the authors’ experimental environment. Given that our current dissemination practices impede
scientiﬁc progress, how can we make scientiﬁc contributions more easily accessible (or executable) for the scientiﬁc
community and public at large?
We introduce Paper Mâché, a novel paper management system under development that allows users to explore
research papers interactively, reproduce results, and test hypotheses of their own. That is, Paper Mâché supports
the notion of an executable paper. Our expertise in high-performance computing and bioinformatics has provided
the motivation for developing our system for a wide variety of users. More specifically, Paper Mâché divides the
scientific community into three roles: authors, reviewers, and readers. For authors, our Paper Mâché
system offers a simple interface to build an interactive and executable paper. The novel use of virtual machines in
Paper Mâché allows authors to easily reconstruct their experimental environment, which, in turn, allows reviewers and
readers to explore a paper interactively, reproduce and validate experimental results, and test their own hypotheses.

1877–0509 © 2011 Published by Elsevier Ltd. Selection and/or peer-review
under responsibility of Prof. Mitsuhisa Sato and Prof. Satoshi Matsuoka
doi:10.1016/j.procs.2011.04.069

1.1. Existing paper management systems
Current systems for author-reviewer interaction involve the ability for authors to submit static, scientiﬁc papers to
a server, which are then assigned to reviewers. The reviewers then review the paper by submitting their comments
back to the server. This is usually done with the assistance of abstract management software (usually bundled with a
conference management system). The purpose of these systems is to lessen the administrative workload of handling
the large volumes of submitted scientiﬁc abstracts and articles. Within the computer science community, EasyChair [1]
is by far the most popular conference management system because it is easy to use and free. In 2010, there were
3,306 computer science conferences managed using EasyChair [1]. Popular commercial options include START [2]
and Linklings [3], the latter of which is used by computing conferences such as SuperComputing (SC) and Grace Hopper
Celebration of Women in Computing (GHC). Professional organizations for computing such as ACM and IEEE use
ScholarOne Manuscripts [4] (formerly Manuscript Central) as their abstract management software. However, the key
limitation to these systems is that they only permit the inclusion of the research paper itself. Experimental data and
source code are not uploaded to these systems. If an author wishes to have these elements available for reviewers and
readers, the author must ﬁnd a way to host the source code and data during the review process. The reviewer then has
to download the source code and then spend time ﬁguring out how to execute the source, which can be a nontrivial
process. As a result, the current techniques for managing papers make it very hard to reproduce the experimental
results in the paper, which is key to validating the claims the authors make in their scientiﬁc paper.
In addition, the forums in which authors can interact with readers are quite limited. At conference talks, audience
members have a limited time span in which to directly interact with the author. In some scientiﬁc journals (e.g. Systematic Biology), readers can respond to authors and their work in the form of a “Points of View” section. However,
this form of communication between authors and readers is not immediate since it can take months before the point
of view appears in the journal—assuming the reviewers reading the point of view feel that the viewpoint is worthy
of publication. Web-based systems like CiteULike [5] promote reader interaction within the scientiﬁc community
through “social bookmarking”. Here, users share references to papers they enjoy, and they can also see who likes
the same paper they do. However, there does not seem to be a way for these readers to interact directly with the
authors themselves. Such interaction would be very valuable since authors can use the feedback as a way to improve
their work.
1.2. Summary and software availability
We believe that Paper Mâché is indispensable for future scientific research, and provides a mechanism for increasing the visibility and accessibility of scientific findings to everyone. Currently, our proposed system is under
development and is a finalist in the Executable Paper Grand Challenge sponsored by Elsevier. For more information regarding this challenge, please visit http://www.executablepapers.com/index.html. A demonstration
of our Paper Mâché system will take place at the International Conference on Computational Science (ICCS’11)
in June 2011. This paper discusses the underlying design of Paper Mâché, considers sample use cases,
discusses how our system will be evaluated, and explores the future capabilities of Paper Mâché.
2. Background: Virtual Machines
The novelty of Paper Mâché is the use of virtual machines (VMs) [6] to implement an executable paper that allows
authors, reviewers, and readers to interact. A virtual machine is a system that performs software-level emulation of
a system different from the host machine. While virtual machines have existed since the 1960s, they have been used
in recent years as a mechanism to evaluate new operating systems, act as test environments for software, back up a
system, and run code on virtualized “clones” of legacy machines. A virtual machine consists of two main components:
the hypervisor and (one or more) guest operating systems. The hypervisor is installed on the native operating system
of the host machine, and is used to run one or more guest operating systems. The guest operating system is stored
on the host machine in the form of a logical image, which is a disk file representing the one or more physical disks
that the guest requires to execute. This guest image contains the full installation of the guest operating system and
software applications. When instantiated by the hypervisor, the guest operating system executes as if it is running
directly on the host machine’s hardware. In addition to commercial virtual machine software such as VMware [7] and
Parallels [8], fully functioning open-source alternatives such as VirtualBox [9] and Xen [10] are available.

Figure 1: A diagram showing how authors, reviewers, and readers interact in Paper Mâché during the life cycle of a research paper. Before
publication, authors submit their paper to the reviewers for review. After discussing the paper amongst themselves, the reviewers submit their
reviews to the author. If requested, the authors can then resubmit the revised paper, and receive further revisions from the reviewers. After
publication, the research paper is available to the community of readers. These readers can discuss the paper amongst themselves as well as
comment on the paper. The Paper Mâché system will facilitate reader comments on a paper and expedite author responses.

For Paper Mâché, the advantages of virtual machines that have allowed for their prolific use are now being applied
to improve the quality of research papers and the interactions between authors, reviewers and readers. Virtual
machines allow authors to create a snapshot in time of their experimental system, allowing them to easily package
their results and data with their research paper in a single entity. Thus, the results of the executable paper can easily
be reproduced in the future as a virtualized clone of the original experimental platform. Another added benefit of
creating executable papers as virtual machine images is that it is easy to enforce added security levels. For example,
during the reviewing process, the VM image of the paper can be locked as “read-only”. This allows reviewers to
view unpublished source code and data, while preventing them from separating unpublished material
from the package and co-opting it for personal benefit. Security controls will increase author confidence in the
reviewing process, and help prevent plagiarism. Once the paper becomes public, some of these restrictions may
be lifted. Thus, the use of virtual machines in the creation of the executable paper within Paper Mâché simultaneously
allows readers and reviewers to interact with a research paper and its experimental results, while protecting the
author’s sensitive data and source code during the pre-publication phase of the paper.
3. The Paper Mâché System
Paper Mâché is designed to support the requirements of the authors, reviewers and readers that comprise the
scientific community. As illustrated in Figure 1, Paper Mâché is intended to support the needs of the full life cycle of
a research paper. Moreover, Paper Mâché does not replace, but in fact augments, the capabilities of scientists working
to create new research. Before a paper is published, authors are responsible for creating the paper and submitting the
paper to reviewers. At the core of this process is the Paper Mâché package (.pm) file. This file is the artifact that
represents the “executable paper” and is the container for all the various elements of the paper. For a particular paper,
a single .pm file is created. Users interact primarily with the Paper Mâché Workbench, which allows the user to create,
update, manage and access the .pm files. While authors use the Paper Mâché Workbench to create, update and manage
the .pm files, readers and reviewers use the workbench to view and access the .pm files. During the pre-publication
phase, reviewers evaluate the paper and communicate with the author. Once the paper has been published, readers
discuss the paper and send their comments to the authors via the Paper Mâché Workbench. The author, in turn, can
respond to the comments. These interactions are explored in more detail in Sections 3.1 and 3.2.

[Figure 2 diagram: the Author creates and edits the Paper Mâché package, which holds three sections: VM (source code, executables, data, libraries, dependencies), Comments (ratings, reviews, discussion), and Paper (text, figures, audio, video). The Reader/Reviewer can execute, comment, and read.]
Figure 2: An overview of the Paper Mâché system showing the interaction between the actors and the major components of the system. The author
is responsible for creating the paper, virtual machine (VM), and all underlying content. These sections are uploaded through the Paper Mâché
workbench, which creates a .pm file. Once the components are uploaded, a web-based comments section is made available to readers or reviewers.
Online users can download and execute the VM, comment on different aspects of the system, and read the paper.

As mentioned earlier, Paper Mâché uses virtual machines to replicate the environment in which code and scientific
experiments were run. Image files containing virtual machines (.vm files) will be packaged within the .pm
Paper Mâché package. In order to work with Paper Mâché virtual image files, all participants in the process (authors,
reviewers and readers) will need to download a Paper Mâché hypervisor appropriate to their environment. Once
downloaded, the hypervisor will be usable with any .vm files contained in .pm Paper Mâché packages. Authors will
also need to do a one-time download of the Paper Mâché image tool that will help automate the process of creating
virtual images for their research.
3.1. Creating an executable paper or .pm ﬁle
Here, we describe how an author (or authors) of a scientific paper uses Paper Mâché to create an executable version
(or .pm file) of the paper. We note that Paper Mâché does not replace conventional research leading to publishable
scientific results. We assume that, prior to working with Paper Mâché, the author(s) of the paper have performed all
of the necessary research and written the paper.
First, the primary author logs into the Paper Mâché workbench, creates an empty .pm file (the Paper Mâché wrapper
in Figure 2) and identifies any additional authors authorized to edit the .pm file. Additional metadata (description,
keywords, etc.) may be entered at this time or at any future point in the process. At this point, the empty .pm file will
be in an “editable” status indicating that the contents may be freely changed by the authors. To create the contents of
the .pm file, the authors will upload the text of the paper (e.g., LaTeX or .doc(x) format) and also upload associated
files (e.g., audio/video, graphics, figures) in their native formats. This is represented by the Paper section in Figure 2.
While there will be no particular order required by the upload process, dependency checking (e.g., the existence of
referenced figures in the document) may be requested and will be required prior to review.
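As a minimal sketch of the dependency checking mentioned above, the following Python fragment reports figures that a LaTeX source references but that have not yet been uploaded. It is an illustration only: the function name missing_figures, the flat file-name layout, and the restriction to \includegraphics commands are assumptions made here, not part of the Paper Mâché design.

```python
import re
from typing import List, Set

# Matches \includegraphics{name} and \includegraphics[options]{name}.
INCLUDE_RE = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")


def missing_figures(latex_source: str, uploaded_files: Set[str]) -> List[str]:
    """Return referenced figure names that match no uploaded file."""
    # LaTeX allows the extension to be omitted, so also compare by stem.
    uploaded_stems = {name.rsplit(".", 1)[0] for name in uploaded_files}
    missing = []
    for ref in INCLUDE_RE.findall(latex_source):
        stem = ref.rsplit(".", 1)[0]
        if ref not in uploaded_files and stem not in uploaded_stems:
            missing.append(ref)
    return missing
```

A workbench could run such a check on request while the package is editable, and refuse the transition to review while the returned list is non-empty.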
To create the virtual machine (.vm) files associated with the paper, the virtual machine image tool will be run on a test
machine (or machines) to create one or more .vm files for machines that host any applications referenced in the text.
As part of the virtual image, the authors will define the scripts or commands necessary for a reader to recreate the tests
referenced in the text. These instructions will be packaged as metadata associated with the .vm file. Authors will be
responsible for testing and adjusting the generated .vm files using a previously downloaded Paper Mâché hypervisor
appropriate to their environment. When completed, the .vm files will be uploaded into the .pm file package, creating
the VM section shown in Figure 2. We note that in Table 2, many of the steps listed under the “Traditional” column will
need to be performed by the author. However, the use of virtual machines will eliminate this process for reviewers
and readers. Once satisfied with the contents of the .pm file, the authors will transition the file to a “submitted” status.
In this state, the .pm file will be locked, preventing updates. Additionally, the file will only be visible to the authors
and reviewers, who are assigned by a conference program chair or journal editor. For simplicity, it is best to think
of journal editors and conference program chairs as “super reviewers” within Paper Mâché. As super reviewers, they
have the power to assign papers to reviewers as well as make decisions as to whether a paper has been accepted for
publication.
Once the reviewing process has ended, authors will receive their reviews as well as the decision whether their
paper has been accepted for publication. If changes are required prior to publication, the journal editor or program
chair will change the .pm ﬁle state to allow the authors to make any changes requested. When the package is approved
for publication, the editor or program chair will change the status to “published” and the .pm ﬁle becomes publicly
available. In this state, the base .pm file will be locked against changes. However, the authors will still be able to submit updates
(e.g., to source code). All updates will be tracked separately from the base .pm file so that
readers will easily be able to view the original contents of the file. Finally, we note that there are unfortunate situations
where a published paper has to be retracted or formally corrected. Paper Mâché will be able to change the status of
such published papers and make the new status clearly visible to readers.
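The status changes described in this section can be pictured as a small state machine. The sketch below is one possible reading, not the Paper Mâché implementation: the Package class, the role names "author" and "editor" (standing for a journal editor or program chair), and the "retracted" status name are all assumptions introduced for illustration.

```python
from enum import Enum


class Status(Enum):
    EDITABLE = "editable"
    SUBMITTED = "submitted"
    PUBLISHED = "published"
    RETRACTED = "retracted"  # assumed name; the text only says the status can change


# (current status, requested status) -> role permitted to make the change.
ALLOWED_TRANSITIONS = {
    (Status.EDITABLE, Status.SUBMITTED): "author",
    (Status.SUBMITTED, Status.EDITABLE): "editor",   # changes requested
    (Status.SUBMITTED, Status.PUBLISHED): "editor",  # approved for publication
    (Status.PUBLISHED, Status.RETRACTED): "editor",  # assumption: editor retracts
}


class Package:
    """Hypothetical in-memory stand-in for a .pm file."""

    def __init__(self, name):
        self.name = name
        self.status = Status.EDITABLE
        self.base = {}     # contents as reviewed and published
        self.updates = []  # post-publication updates, kept apart from the base

    def transition(self, new_status, role):
        if ALLOWED_TRANSITIONS.get((self.status, new_status)) != role:
            raise PermissionError(
                f"{role} may not move {self.name} from "
                f"{self.status.value} to {new_status.value}"
            )
        self.status = new_status

    def put(self, path, content):
        """Store a file according to the current status."""
        if self.status is Status.EDITABLE:
            self.base[path] = content
        elif self.status is Status.PUBLISHED:
            self.updates.append((path, content))  # base stays untouched
        else:
            raise PermissionError(f"{self.name} is locked while {self.status.value}")
```

Keeping post-publication updates in a separate list mirrors the requirement that readers can always view the original contents of the file.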
Once completed, the authoring process will have generated a .pm file package containing everything necessary
not only to understand the research but also to duplicate it and experiment with it further, far into the future.
3.2. Reading an executable paper or .pm ﬁle
The following discussion is focused on the readers of the executable paper, but all operations are equally appropriate to reviewers. The only diﬀerence between reviewers and readers is the visibility of the materials. During
pre-publication, only reviewers and the paper authors will be able to access the paper. Comments entered by the
reviewers (and authors) will only be visible to the authors and reviewers during this period.
A reader starts their interaction with the Paper Mâché Workbench by logging into the web application and searching
for a paper of interest. They will be able to read the abstract as part of the search results. If they decide to study
the paper further, they will be able to click on the paper and open the .pm ﬁle in a web page. After opening the .pm
package, the reader will be able to read the paper as well as view multimedia, ﬁgures and other content available (again
represented by the Paper section of Figure 2). Hyperlinks help users navigate and view relevant portions of the paper,
ﬁgures, charts and graphics associated with the package. At any point, the reader will be able to enter comments and
ratings for the paper or speciﬁc elements of the paper. Comments will be available to the authors in the .pm package
itself (see Figure 2), and shown on the web page associated with the .pm ﬁle. Authors will also be able to review and
respond to comments from readers through the web page.
Clicking on the .vm ﬁle name on the web page will initiate the download of the virtual machine image onto the
reader’s computer. Once downloaded, the .vm file is executed on the previously downloaded Paper Mâché hypervisor.
The reader is able to sign into the virtual machine and recreate the authors’ experiments using the scripts and
commands packaged within the .vm ﬁle. Since the .vm ﬁle is a fully functioning virtual machine, readers can easily
adjust parameters and try running the source with diﬀerent data.
An example. Figure 3 shows a sample prototype of a Paper Mâché virtual machine executing a published paper
describing a MapReduce-inspired algorithm called MrsRF [11]. Here, the executable paper (or Matthews2010.pm)
is executing on a reader’s desktop. Within the virtual machine window, the reader has executed an application from
the command line and generated a graph. The source code and build ﬁles will be packaged within the .vm ﬁle and
the reader is able to view and modify the source code for the application to further experiment with the application.
Such changes may be saved in the reader’s local copy of the virtual machine but will not be saved in the web based
package.

Figure 3: A prototype of the Paper Mâché system. Readers and reviewers download the Matthews2010.pm file to their desktop. Using a hypervisor
to execute the .pm file opens up a new window, displaying the guest operating system (in this case, Ubuntu). In the context of this guest operating
system, readers and reviewers can reproduce the experimental results discussed in the paper. When readers and reviewers are done interacting with
the experimental environment, they can simply close the window.

Executing the paper within Paper Mâché is much easier than having each reviewer and reader set up the experimental
environment on their own. In this example, the source code for MrsRF is available publicly from the web.
The MrsRF source code takes advantage of Phoenix [12], an underlying MapReduce framework which was originally
designed for the Solaris operating system. We then modified Phoenix to get it working on some versions of Linux
(e.g., Ubuntu and CentOS). However, if the readers and reviewers do not have access to those versions of Linux (or
the correct version of the GNU C Library [13]), then they cannot run our software properly. GNU C Library (glibc)
incompatibilities are especially difficult to deal with, since making haphazard updates to a system’s glibc installation
can lead to disastrous results. In the past, we had a real-life case involving a reader at another university who had
Linux and glibc incompatibility issues with MrsRF. Despite the code being freely available on the web, and open,
continuous communication between the authors and the user, the situation was only fully resolved when the reader
ended up using a virtual machine to execute the MrsRF code. This finally allowed her to reproduce the results found
in the paper. As a result, we believe that the integration of virtual machine files in Paper Mâché will quickly and easily
allow users to recreate a paper’s experimental framework and reproduce results.
Paper Mâché readers and reviewers will be able to easily interact with research. As a result of the .vm file, all
components of the research (source code, libraries, etc.) will be exposed. Thus, if the reader desires to create an
executable version outside of the virtual image (e.g., execute the paper on their operating system of choice), it will be
far easier to construct such an environment with the working model that is available from within Paper Mâché.
3.3. The Paper Mâché file (.pm) and Workbench
An executable paper will be represented as a single .pm file within the Paper Mâché system. This file will be
structured as a set of subdirectories as shown in Table 1, similar to the structure of a Java .jar file. The file will
be built, updated and maintained by the Paper Mâché Workbench. It is not intended that the authors be directly
responsible for updating the .pm files. Within the .pm file, there will be a set of directories corresponding to the
sections of the executable paper as shown in Figure 2. The paper subdirectory holds the text of the paper as well as any
figures referenced and any additional multimedia files. The comments subdirectory contains the comments and ratings
entered by readers and reviewers as well as responses of the authors. The metadata subdirectory holds additional
information (authors’ names and contact information, dates updated, etc.) associated with the paper. The .vm file will
also be contained within the .pm file.
While readers and reviewers will primarily interact with the paper using the web-based Paper Mâché workbench,
it will be possible to download the entire .pm file from the web if desired. The .pm file may then be expanded into a
set of directories on the user’s machine. In this mode, any changes to the contents of the package will not be recorded
in the web-based copy of the .pm file.

\Matthews2010.pm
    \paper          Files associated with the paper portion of the executable document
        \text       Text files such as html and pdf representations of the paper
        \figures    Figures and data referenced in the paper
        \media      Multimedia content (e.g. video, audio)
    \comments       Comments and ratings for the paper
    \metadata       Metadata associated with the package
    MyPaper.vm      Virtual machine image

Table 1: Contents of the Matthews2010.pm file shown in Figure 3. The .pm file will have a standard directory structure similar to Java .jar
files. The paper, comments and .vm sections correspond to sections in Figure 2. The metadata section contains various information about the .pm
package and its contents.

The only required elements of the .vm ﬁle (in addition to an operating system) will be the source code to the
applications, any dependencies required to build and run the applications, and scripts or instructions for reproducing
the experiments referenced in the paper. Inclusion of other portions of the paper (e.g. pdf ﬁles) in the .vm ﬁle already
contained in the .pm package will be optional.
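To make the package layout concrete, the sketch below assembles an archive with the directory structure of Table 1. The text only says the structure is similar to a Java .jar file; treating the .pm file as a zip archive is an assumption made here, and build_package and list_package are hypothetical helper names.

```python
import io
import zipfile

# Directory skeleton taken from Table 1.
LAYOUT = ["paper/text/", "paper/figures/", "paper/media/", "comments/", "metadata/"]


def build_package(files):
    """Return the bytes of a zip archive holding the skeleton plus `files`.

    `files` maps archive paths (e.g. "paper/text/paper.pdf") to bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for directory in LAYOUT:
            archive.writestr(directory, "")  # explicit, empty directory entry
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def list_package(data):
    """List the entry names stored in a package built by build_package."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

For example, build_package({"paper/text/paper.pdf": pdf_bytes, "MyPaper.vm": image_bytes}) would yield a single file that expands into the directories of Table 1 on the user's machine.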
The web-based Paper Mâché Workbench will act as the point of contact for all users and provide robust capabilities
for the management of .pm packages. The workbench will use contemporary web design techniques
(CSS, AJAX, etc.) and will be developed using Ruby on Rails hosted on an Apache web server and
interfacing with a MySQL database. The workbench is organized around the actions associated with the various roles
(author, reviewer, reader) an individual performs in a scientific community. For example, a casual reader just browsing
a set of papers will only be able to view those elements that the author (and publisher) have enabled for viewing.
Similarly, only authors can view the comments left to them by reviewers. Individual elements within the .pm file will
also be secured. For example, to protect intellectual property, essential data may not be viewable until the paper is
actually published.
All operations and functions within the workbench will be secure. For example, a standard role-based security
model (e.g. author, reviewer) will be used in conjunction with an overall state for the .pm ﬁle (e.g. under construction,
in review, published) to allow security to be varied depending on where the paper is in the publishing cycle. Each
element within the .pm ﬁle will have an Access Control List (ACL) to provide highly granular control over security.
Additionally, the workbench provides revision control on the contents of the package to allow those with appropriate
permissions to view and revert changes to elements within the package. Whenever possible, elements within the package
will be watermarked with codes that would allow for tracking of the elements back to the original package to reduce
plagiarism.
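The combination of roles, package state, and per-element ACLs can be sketched as follows. This is an illustrative model under stated assumptions, not the workbench's actual security code: the visibility table, the Element class, the can_view function, and the example paths are all hypothetical.

```python
from dataclasses import dataclass, field

# Roles that may open a package at all in each state (assumed mapping).
PACKAGE_VISIBILITY = {
    "under construction": {"author"},
    "in review": {"author", "reviewer"},
    "published": {"author", "reviewer", "reader"},
}


@dataclass
class Element:
    """One item inside a .pm file with an optional per-state ACL."""
    path: str
    # state -> roles allowed to view; a state that is not listed falls back
    # to the package-level rule above.
    acl: dict = field(default_factory=dict)


def can_view(role, state, element):
    """Apply the package-level rule first, then the element's own ACL."""
    if role not in PACKAGE_VISIBILITY.get(state, set()):
        return False
    allowed = element.acl.get(state)
    return True if allowed is None else role in allowed


# Essential data hidden from reviewers until publication (hypothetical path).
raw_data = Element("vm/data/raw.csv", acl={"in review": {"author"}})
# Reviewer comments that stay private after publication (hypothetical path).
reviews = Element("comments/reviews.txt", acl={"published": {"author", "reviewer"}})
```

Checking the package state before the element ACL keeps the common case simple: most elements need no ACL entry and simply follow the publishing cycle.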
To create and execute virtual machines, the workbench will provide a downloadable tool that will assist the author
in creating a virtual image from an existing machine. This image may then be uploaded. Moreover, the workbench
will provide the ability to execute the virtual image, providing a remote desktop to the user (e.g., VNC or Windows Remote
Desktop). Through the workbench, the reader will be able to interact with the paper, and participate in public discussions (see Figure 2). By taking advantage of the security features allowed by the Paper Mâché hypervisor, authors
will be able to “lock down” portions of the virtual machine to prevent copying of unpublished research. Authors using
the workbench can create new packages and modify packages they own. They will also be able to view and respond
to comments, whether from readers (public), or from reviewers (private). Lastly, reviewers can use the workbench to
interact with the paper and leave comments to the authors.
4. Evaluation of Paper Mâché
We will evaluate the performance of our Paper Mâché system based on two metrics: speed and understanding.
For example, our first experiment will measure the time it takes for a reader/reviewer to interact with the science
described in a paper. Consider Table 2. The traditional approach requires seven steps to interact with the
science in a paper. Of course, this assumes that the software is available publicly or directly from the author,
that readers and reviewers can get the source code compiled on their experimental platform, and that the data is also
available for experimentation. For Paper Mâché, Table 2 shows that three steps are needed to recreate
the experiments that are discussed in the scientific paper of interest. For our performance evaluation, we will select
a set of papers from various conference and journal publications, and measure the average amount of time that the
traditional approach requires for a reader/reviewer to interact with the science discussed in a paper. We will compare that
number to the time required under Paper Mâché. Given that Paper Mâché is a new system, we will ask for volunteers
to store their scientific results on the system so that we have a large enough sample for comparison with the traditional
approach.
Secondly, we will measure the amount of time required by an author to use Paper Mâché to package the executable
portion of their paper. The traditional approach places little overhead on the author in terms of making the executable
portion of their paper available. Instead, the overhead is placed on each reviewer and reader to spend the time required
to recreate the experimental environment used by the author (as shown in the Traditional column of Table 2). Clearly,
there will be overhead placed on the author in order to use Paper Mâché. However, the time spent by the author is
time saved for each reviewer and reader that accesses the paper. We are interested in the amount of time required
by the author to share their executable environment in Paper Mâché. Our hope is that the additional time required
by the author is minimal when compared to the savings gained by reviewers and readers to interact with the science
described in a scientific paper.
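The trade-off between author overhead and reader savings can be expressed as a simple break-even calculation. The sketch below is only an illustration of that arithmetic: the function name break_even_readers is hypothetical, and the numbers in the example are invented, not measurements from any evaluation.

```python
import math
from statistics import mean


def break_even_readers(author_overhead, traditional_times, pm_times):
    """Smallest number of readers whose combined savings cover the author's overhead.

    All arguments use the same time unit. Returns None when the average
    per-reader saving is not positive.
    """
    saving_per_reader = mean(traditional_times) - mean(pm_times)
    if saving_per_reader <= 0:
        return None
    return math.ceil(author_overhead / saving_per_reader)


# Invented numbers (minutes), purely for illustration.
example = break_even_readers(
    author_overhead=120,
    traditional_times=[90, 150, 120],
    pm_times=[15, 25, 20],
)
```

With measured times in place of the invented ones, the same calculation would indicate how many readers or reviewers a paper needs before packaging it pays off in aggregate.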
Finally, we will experiment with assessing the improvement in understanding a scientific paper as a result of
interacting with it through Paper Mâché. Improved understanding means a better experience interacting with a paper
for readers and reviewers. Certainly, user studies will be conducted as a way to measure understanding. However,
we will also consider other types of experiments. For example, one way to measure understanding is to measure the
level of engagement with a scientific paper. We could measure the amount of time spent reading a traditional paper
compared to the amount of time spent with a Paper Mâché package. Our hypothesis is that engaged readers will have
better comprehension of the scientific content. We could compare the frequency of comments on papers with and without
an attached virtual machine as one metric for assessing interest and, to some degree, understanding. Furthermore, it
will be interesting to measure whether techniques (such as Paper Mâché) that increase participants’ engagement in a
paper positively affect the impact factor of scientific journals.
5. Extending the Capabilities of Paper Mâché
Section 3 describes the primary features that are the focus of the current development of our
Paper Mâché system. However, Paper Mâché can be extended in many ways in order to improve the experience of
authors, readers, and reviewers. For example, cloud computing has become a major topic of interest and one that
could provide further utility to Paper Mâché. If the hypervisor is hosted in the cloud, users could execute papers from
a web browser. This would offload the computational requirements from the user to the host server. With enough
computing power on the server side, we could enable users to test and interact with supercomputing-scale research from
their commodity hardware.
Community features can significantly add to the reader's experience. Imagine two copies of the same paper:
one fresh off the printer and the other annotated by a graduate student who has pored over the research. Wouldn't
both copies be helpful in understanding the work? While it might be preferable to first read the paper as it
was published before turning to an annotated copy, those notes and comments could be invaluable for understanding
complicated passages, figures, or algorithms. One of the goals of Paper Mâché is to allow readers to pick up right
where the authors left off. Furthermore, the ability to share detailed comments and annotations allows readers to
experience the paper from different perspectives.
Science does not remain static. New experiments are performed, tweaks to code are made, and different data
sets are tested. Since research does not stop once a paper is published, it would be useful if these advances were
represented in the executable paper. By combining Paper Mâché with version control systems, executable papers
could be made to reflect the ever-advancing nature of research. The beauty of version control systems is that they
track changes. Hence, the original work can always remain available while the most recent
version of the software stays easy to obtain. Small changes to projects do not often warrant a new publication. However, even something
as simple as a bug fix could have great impact on readers and researchers interested in the work. Thus, version
control systems are an important step in keeping the community up to date on the most recent iteration of a project.
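As a minimal sketch of this idea, the Python fragment below keeps an append-only revision history for an executable paper package, so that the originally published revision stays retrievable while the latest one is easy to obtain. The class and method names (Revision, PackageHistory, commit, original, latest) are hypothetical and are not part of Paper Mâché; a real deployment would presumably rely on an existing version control system rather than this toy stand-in.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Revision:
    """One immutable snapshot of a (hypothetical) executable paper package."""
    number: int
    note: str
    digest: str  # SHA-256 hex digest of the package contents


@dataclass
class PackageHistory:
    """Append-only list of revisions; earlier revisions are never rewritten."""
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, contents: bytes, note: str) -> Revision:
        # Record a new revision identified by a hash of its contents.
        digest = hashlib.sha256(contents).hexdigest()
        revision = Revision(len(self.revisions) + 1, note, digest)
        self.revisions.append(revision)
        return revision

    def original(self) -> Revision:
        # The revision that accompanied the published paper.
        return self.revisions[0]

    def latest(self) -> Revision:
        # The most recent revision, e.g. after a bug fix.
        return self.revisions[-1]
```

In this sketch, committing the published package and then a bug-fixed package yields two revisions: original() still returns the first, and latest() returns the fix, mirroring how a version control system preserves the published state alongside later updates.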


Traditional
1. Obtain source code
2. Resolve OS / platform dependencies (32-bit vs. 64-bit)
3. Resolve library dependencies
4. Compile with proper flags
5. Obtain data set used in paper
6. Obtain the commands used to run experiments in the paper
7. Run experiments

Paper Mâché
1. Install VM framework∗
2. Download VM
3. Run experiments

Table 2: A comparison of the steps required for executing the science in a scientific paper using the traditional approach and Paper Mâché.
The ∗ denotes that this step is only required the first time a user uses Paper Mâché.

6. Conclusions
In this paper, we introduce Paper Mâché, a novel system for creating dynamic, executable research papers. While
virtual machines are widely used to maintain controlled, reproducible environments for software development and
testing, Paper Mâché extends the use of virtual machines to facilitate the reproduction of scientific research. By
allowing authors, reviewers, and readers to interact with not just the text but also its programs and data in a virtual machine
environment, Paper Mâché turns the scientific paper into a dynamic, executable entity. Short- and long-term compatibility is assured
through the use of virtual machines: the programs and data associated with the paper will be runnable even if the
actual source code no longer compiles in modern environments.
Virtual machines allow for easy, instant execution, providing a quick method for validating the programs and
data. The robust security model associated with Paper Mâché packages provides an ideal method for managing
copyright and licensing issues. The capabilities of any user (or role) can be managed based on the licensing requirements of operating systems and applications. The cloud environment and the ability to work seamlessly with the
author's host environment will make it possible to work with large-scale systems and large files. A
single point of management (the Paper Mâché Workbench) makes it possible to track the provenance of individual
elements within the papers.
However, the benefits of Paper Mâché extend beyond the scientific community. The interactive aspects of our
Paper Mâché system encourage interest in science and the scientific process among the general public, thanks
to an increase in the visibility and accessibility of current research. The increase in accessibility of current findings
changes the way that scientific research is performed and communicated. We believe paper management systems
such as Paper Mâché have the ability to pave the way for more scientific collaborations, increase the communication
and understanding of core concepts, and consequently allow for earlier adoption of critical findings into existing research. Thus, our Paper Mâché system provides a bridge that allows everyone to actively participate in the
scientific process.
7. Acknowledgements
This publication is based in part on work supported by Award No. KUS-C1-016-04, made by King Abdullah
University of Science and Technology (KAUST). This work was also supported by the National Science Foundation under
grants DEB-0629849, IIS-0713618, and IIS-1018785.
References
[1]
[2]
[3]
[4]

A. Voronkov, Easy chair conference system, Internet Website, last accessed, March 2011., available from http://www.easychair.org.
Sofconf.com, START v2, Internet Website, last accessed, March 2011., available from http://www.softconf.com/about/.
L. LLC, Linklings, Internet Website, last accessed, March 2011., available from http://www.linklings.com/.
T. Reuters, ScholarOne manuscripts, Internet Website, last accessed, March 2011., available from http://scholarone.com/products/
manuscript/.
[5] Springer, Citeulike: Everyone’s library, Internet Website, last accessed, March 2011., available from http://www.citeulike.org/.
[6] R. Figueiredo, P. Dinda, J. Fortes, A case for grid computing on virtual machines, in: Distributed Computing Systems, 2003. Proceedings.
23rd International Conference on, 2003, pp. 550 – 559. doi:10.1109/ICDCS.2003.1203506.

Grant R. Brammer et al. / Procedia Computer Science 4 (2011) 658–667

667

[7] B. Walters, Vmware virtual platform, Linux Journal 1999.
URL http://portal.acm.org/citation.cfm?id=327906.327912
[8] P. H. LTD, Virtualization and automation solutions for desktops, servers, hosting, saas - parallels optimized computing, Internet Website, last
accessed, March 2011., available from http://www.parallels.com/.
[9] J. Watson, Virtualbox: bits and bytes masquerading as machines, Linux Journal 2008.
URL http://portal.acm.org/citation.cfm?id=1344209.1344210
[10] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warﬁeld, Xen and the art of virtualization, SIGOPS
Oper. Syst. Rev. 37 (2003) 164–177. doi:http://doi.acm.org/10.1145/1165389.945462.
URL http://doi.acm.org/10.1145/1165389.945462
[11] S. Matthews, T. Williams, MrsRF: an eﬃcient mapreduce algorithm for analyzing large collections of evolutionary trees, BMC Bioinformatics
11 (Suppl 1) (2010) S15. doi:10.1186/1471-2105-11-S1-S15.
URL http://www.biomedcentral.com/1471-2105/11/S1/S15
[12] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, C. Kozyrakis, Evaluating mapreduce for multi-core and multiprocessor systems, in:
High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on, 2007, pp. 13–24. doi:10.1109/
HPCA.2007.346181.
[13] S. Loosemore, R. Stallman, R. McGrath, A. Oram, The GNU C Library: Reference Manual, Free software foundation, 1996.

10