Reading list
Sorted by Date
2007
- Roberto Lucchi and Manuel Mazzara. A pi-calculus based semantics for WS-BPEL. In Journal of Logic and Algebraic
Programming, January 2007.
Note: interprets BPEL in the pi-calculus, mainly covering exception
and compensation handler processing
Download:
(unavailable)
- Morris Matsa, Eric Perkins, Abraham Heifets, Margaret Gaitatzes Kostoulas, Daniel Silva, Noah Mendelsohn, and Michelle Leger.
A high-performance interpretive approach to schema-directed parsing. In WWW '07: Proceedings of the 16th
international conference on World Wide Web, 2007.
Download:
(unavailable)
- Yinfei Pan, Wei Lu, Ying Zhang, and Kenneth Chiu. A Static Load-Balancing Scheme for Parallel XML
Parsing on Multicore CPUs. In IEEE International Symposium on Cluster Computing and the Grid, Rio
de Janeiro, 2007.
Download:
(unavailable)
- Philipp Haller and Martin Odersky. Actors that Unify Threads and Events. In International Conference on Coordination
Models and Languages, 2007.
Download:
(unavailable)
- A. Slominski. Workflow for e-Sciences, Springer, 2007.
Download:
(unavailable)
- Tian Tian. Effective Use of the Shared Cache in Multi-core Architectures. Dr. Dobb's Portal, 2007.
Note: general programming hints for shared-cache-aware programming
Download:
[HTML]
2006
- Ali-Reza Adl-Tabatabai, Brian T. Lewis, Vijay Menon, Brian R. Murphy, Bratin Saha, and Tatiana Shpeisman.
Compiler and runtime support for efficient software transactional memory. SIGPLAN Not., 41(6):26–37,
ACM Press, New York, NY, USA, 2006.
Download:
(unavailable)
- Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer,
David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, and Katherine
Yelick. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report UCB/EECS-2006-183,
EECS Department, University of California, Berkeley, 2006.
Note: a good overview of possible
research paths for multicore programming
Download:
(unavailable)
- Brendan Burns, Kevin Grimaldi, Alex Kostadinov, and Mark Corner. Flux: A Language for Programming High-Performance
Servers. In USENIX, 2006.
Note: a language for describing the stage/pipeline-based
pattern of server processing
Download:
(unavailable)
- Gary Carleton and Walter Shands. Performance Analysis and Multicore Processors. DDJ, May 2006.
Note: introduces the use of Intel VTune for identifying parallelization in an application; it mentions
Amdahl's law for Hyper-Threading.
Download:
[HTML]
- Brian D. Carlstrom, Austen McDonald, Hassan Chafi, JaeWoong Chung, Chi Cao Minh, Christos Kozyrakis, and
Kunle Olukotun. The Atomos transactional programming language. SIGPLAN Not., 41(6), ACM Press, New York, NY, USA, 2006.
Download:
(unavailable)
- Francisco Curbera, Rania Khalaf, William A. Nagy, and Sanjiva Weerawarana. Implementing BPEL4WS: the architecture
of a BPEL4WS implementation: Research Articles. Concurr. Comput.: Pract. Exper., 18(10), John Wiley
and Sons Ltd., Chichester, UK, 2006.
Note: touches briefly on BPEL implementation
Download:
(unavailable)
- Matthew Drake, David Zhang, Michael Gordon, Janis Sermulins, William Thies, Allyn Dimock, Rodric Rabbah, and Saman Amarasinghe.
Ubiquitous Stream Programming to Facilitate Migration to Multicore Architectures. In STMCS: First Workshop
on Software Tools for Multi-Core Systems (STMCS), 2006.
Note: stream on multicore?
Download:
(unavailable)
- Joe Duffy. Using concurrency for scalability. 2006.
Note: performance of locks (!); GC is a per-thread
structure
Download:
[HTML]
- Greg Eisenhauer, Karsten Schwan, and Fabian Bustamante. Publish-Subscribe for High-Performance Computing.
IEEE Internet Computing, 10(1):40–47, IEEE Computer Society, Los Alamitos, CA, USA, 2006.
Download:
(unavailable)
- W. Emmerich, B. Butchart, L. Chen, B. Wassermann, and S. L. Price. Grid Service Orchestration using the Business
Process Execution Language (BPEL). Journal of Grid Computing, 2006.
Note: touches on the
scalability of BPEL workflows
Download:
(unavailable)
- Byoungro So, Anwar Ghuloum and Youfeng Wu. Optimizing Data Parallel Operations on Many-Core Platforms. In
STMCS: First Workshop on Software Tools for Multi-Core Systems (STMCS), 2006.
Note: removes redundant barriers
Download:
(unavailable)
- Philipp Haller and Martin Odersky. Event-Based Programming without Inversion of Control. In Proc. Joint
Modular Languages Conference, 2006.
Download:
(unavailable)
- Tim Harris, Mark Plesko, Avraham Shinnar, and David Tarditi. Optimizing memory transactions. SIGPLAN Not., 41(6):14–25,
ACM Press, New York, NY, USA, 2006.
Download:
(unavailable)
- Danny Hendler, Yossi Lev, Mark Moir, and Nir Shavit. A dynamic-sized nonblocking work stealing deque. Distrib.
Comput., 18(3):189–207, Springer-Verlag, London, UK, 2006.
Note: a work-stealing queue built on a
dynamically growing deque
Download:
(unavailable)
- Intel. Tutorial of Intel Thread Building Block. 2006.
Note: a good example of a parallel interface
for possible XML processing, particularly the task scheduler part. The task scheduling is essentially
equivalent to that in parallel tree traversal; the tutorial calls it
"breadth-first theft and depth-first work". The breadth-first theft rule raises parallelism
sufficiently to keep threads busy. The depth-first work rule keeps each thread operating
efficiently (due to the cache).
Download:
[HTML]
- Intel. Supra-linear Packet Processing Performance with Intel® Multi-core Processors. Intel, 2006.
Note: how Intel uses multicore for the pipelining pattern and spinflowing to speed up
network protocol processing
Download:
[HTML]
- Intel. Accelerating Security Applications with Intel Multi-core Processors. Intel, 2006.
Download:
(unavailable)
- Wei Lu, Kenneth Chiu, and Yinfei Pan. A Parallel Approach to XML Parsing. In The 7th IEEE/ACM International Conference
on Grid Computing, Barcelona, September 2006.
Download:
(unavailable)
- Jeffrey Richter. Concurrency and Coordination Runtime. 2006.
Note: introduction of CCR
Download:
[HTML]
- Jeffrey Richter. CLR via C#, Microsoft, 2006.
Download:
(unavailable)
- Jeffrey Richter. Reader/Writer Locks and the ResourceLock Library. 2006.
Download:
[HTML]
- Jeffrey Richter. Build a Richer Thread Synchronization Lock. 2006.
Note: addresses the convoy
problem for mutual-exclusion locks. General solutions emphasize fairness (FIFO) in their
implementation, but that means that whenever a thread leaves the critical section, it enters
the kernel to wake up another thread. If a thread tries to get the lock at that moment, it fails and
enters the sleep queue even though it could have taken the lock without any kernel involvement. To recover the
performance lost to fairness, the article designs a solution in which a thread leaving the
critical section can transfer the lock to a thread that is currently acquiring it, rather than waking
up a thread in the sleep queue. It is a tradeoff between performance and fairness.
Download:
[HTML]
- Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Chi Cao Minh, and Benjamin Hertzberg. McRT-STM: a
high performance software transactional memory system for a multi-core runtime. In PPoPP '06: Proceedings
of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming,
pp. 187–197, ACM Press, New York, NY, USA, 2006.
Download:
(unavailable)
- Tian Tian and Chiu Pi Shih. Software Techniques for Shared Cache Multi-Core Systems. Intel, 2006.
Note: more detailed programming hints on how to leverage the cache shared between the
cores
Download:
[HTML]
2005
- Joseph Albahari. Threading in C#. 2005.
Note: this article gives a basic but non-trivial
introduction to threads in C#; it is the only article here that explicitly states that a .NET thread is delegated to
a system thread
Download:
[HTML]
- David Chase and Yossi Lev. Dynamic circular work-stealing deque. In SPAA '05: Proceedings of the seventeenth annual ACM
symposium on Parallelism in algorithms and architectures, pp. 21–28, ACM Press, New York, NY, USA,
2005.
Download:
(unavailable)
- Georgios Chrysanthakopoulos and Satnam Singh. An Asynchronous Messaging Library for C#. In SCOOL Conference Proceedings,
2005.
Note: for distributed memory model
Download:
(unavailable)
- Andreas Gustafsson. Threads Without the Pain. ACM Queue, 2005.
Note: the article argues
that a cooperative thread implementation should be better than the event-driven model; a hybrid of preemptive threads with
cooperative threads is promising, but does Microsoft support this hybrid through fibers?
Download:
(unavailable)
- Matthew Hertz and Emery D. Berger. Quantifying the performance of garbage collection vs. explicit memory
management. In OOPSLA '05: Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming,
systems, languages, and applications, pp. 313–326, ACM Press, New York, NY, USA, 2005.
Note: compares garbage collection with the regular (explicit) memory allocation scheme
Download:
(unavailable)
- G. Stewart von Itzstein. Introduction of High Level Concurrency Semantics in Object Oriented Languages.
Ph.D. Thesis, University of South Australia, 2005.
Note: the implementation of Join Java
Download:
(unavailable)
- Phil Kerly. Improve .NET* Performance: Detecting and Reducing Thread Imbalance. 2005.
Download:
[HTML]
- Poonacha Kongetira, Kathirgamar Aingaran, and Kunle Olukotun. Niagara: A 32-Way Multithreaded Sparc Processor. IEEE Micro,
25(2):21–29, IEEE Computer Society, Los Alamitos, CA, USA, 2005.
Download:
(unavailable)
- Richard McDougall. Extreme software scaling. Queue, 3(7):36–46, ACM Press, New York, NY, USA, 2005.
Note: an overview of scalability issues in programming for multicore systems
Download:
(unavailable)
- Vance Morrison. Understand the Impact of Low-Lock Techniques in Multithreaded Apps. MSDN, 2005.
Note: introduces the basics of memory models, which I don't quite understand.
The paper presents four techniques: lock-free reads (in some cases a read operation
doesn't need the lock); spin locks; direct interlocked updates (like transactional memory; in
some cases test_and_set can make an update lock-free); and lazy initialization (in some
cases, as with the double check, the lock can be avoided, but it is very subtle to determine whether the double check
is safe).
Download:
[HTML]
- Jeffrey Richter. Performance-Conscious Thread Synchronization. MSDN, 2005.
Note: an implementation
of a spin lock in C#
Download:
[HTML]
- Yaoping Ruan, Vivek S. Pai, Erich Nahum, and John M. Tracey. Evaluating the impact of simultaneous multithreading
on network servers using real hardware. SIGMETRICS Perform. Eval. Rev., 33(1):315–326, ACM Press, New York, NY,
USA, 2005.
Download:
(unavailable)
- Herb Sutter and James Larus. Software and the concurrency revolution. Queue, 3(7):54–62, ACM Press, New York,
NY, USA, 2005.
Note: a general introduction to and overview of the multicore software
programming model
Download:
(unavailable)
- Brain J Welch. Impact of Load Imbalance on Processors with Hyper-Threading Technology. 2005.
Download:
[HTML]
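The lazy initialization technique from the note on Morrison's article above (check, lock, check again) can be sketched roughly as follows. This is only an illustration in Python, not code from the article: `LazyValue` and `factory` are hypothetical names, and the sketch assumes CPython, where publishing a reference through an attribute assignment is atomic. As the note says, whether the unlocked first check is safe depends on the memory model of the platform.

```python
import threading


class LazyValue:
    """Hypothetical helper: builds a value at most once, on first use."""

    _UNSET = object()  # sentinel meaning "not initialized yet"

    def __init__(self, factory):
        self._factory = factory
        self._value = LazyValue._UNSET
        self._lock = threading.Lock()

    def get(self):
        value = self._value                  # first check, without the lock
        if value is LazyValue._UNSET:
            with self._lock:
                value = self._value          # second check, under the lock
                if value is LazyValue._UNSET:
                    value = self._factory()  # runs at most once
                    self._value = value      # publish only when fully built
        return value
```

After the first call, readers take the fast path and never touch the lock; only threads that race on the very first access pay for synchronization.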
2004
- Nick Benton, Luca Cardelli, and Cédric Fournet. Modern Concurrency Abstractions for C#. In ACM Transactions on Programming
Languages and Systems, September 2004.
Download:
(unavailable)
- Suren A. Chilingaryan. XML Benchmark Project. http://xmlbench.sourceforge.net/,
2004.
Download:
(unavailable)
- Matthew Wilson. Imperfect C++, Addison Wesley, 2004.
Download:
[pdf]
2003
- Rob von Behren, Jeremy Condit, and Eric Brewer. Why Events Are A Bad Idea. In Proceedings of HotOS IX, May 2003.
Note: the paper argues that a thread library based on cooperative scheduling could provide better
performance than the event-driven style
Download:
(unavailable)
- Dino Esposito. Applied XML Programming For Microsoft .NET, Microsoft, 2003.
Download:
(unavailable)
- Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. The many faces of publish/subscribe.
ACM Comput. Surv., 35(2):114–131, ACM Press, New York, NY, USA, 2003.
Download:
(unavailable)
- Georg Gottlob, Christoph Koch, and Reinhard Pichler. XPath processing in a nutshell. SIGMOD Rec.,
32(1):12–19, ACM Press, New York, NY, USA, 2003.
Note: a formal definition of
XPath
Download:
(unavailable)
- Tim Harris and Keir Fraser. Language support for lightweight transactions. In OOPSLA '03: Proceedings of the 18th annual
ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications,
pp. 388–402, ACM Press, New York, NY, USA, 2003.
Download:
(unavailable)
- Maurice Herlihy, Victor Luchangco, Mark Moir, and William N. Scherer, III. Software transactional memory for dynamic-sized
data structures. In PODC '03: Proceedings of the twenty-second annual symposium on Principles
of distributed computing, pp. 92–101, ACM Press, New York, NY, USA, 2003.
Download:
(unavailable)
- Rico Mariani. Garbage Collector Basics and Performance Hints. MSDN, 2003.
Download:
[HTML]
- Gregor Noriskin. Writing High-Performance Managed Applications : A Primer. MSDN, 2003.
Download:
[HTML]
- Nickolai Zeldovich, Alexander Yip, Frank Dabek, Robert Morris, David Mazières, and Frans Kaashoek. Multiprocessor Support
for Event-Driven Programs. In Proceedings of the 2003 USENIX Annual Technical Conference (USENIX '03), San Antonio,
Texas, June 2003.
Note: the only paper here that discusses the event-driven style on multiple processors.
It uses colors to identify callbacks that must be coordinated; the scheduler will not assign callbacks with the same color to different
CPUs at the same time. It is like the interleaving construct in CCR. It also gave me some thoughts on the BPEL engine: a BPEL engine
doesn't have dependent requests the way a caching web server has, but when a workflow instance is running, ensuring
mutually exclusive access to shared variables across concurrent activities is a challenge; we may use the color scheme (locks
don't help here).
Download:
(unavailable)
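The coloring rule from the note above (callbacks with the same color never run on different CPUs at the same time) can be sketched as follows. This is a simplified illustration, not the paper's implementation: `ColorScheduler` is a hypothetical name, colors are assumed to be integers, and mapping each color to one fixed worker queue by modulo is an assumption made here to keep the sketch short.

```python
import queue
import threading


class ColorScheduler:
    """Hypothetical sketch: same-colored callbacks always run on one worker."""

    def __init__(self, workers=4):
        self._queues = [queue.Queue() for _ in range(workers)]
        self._threads = [
            threading.Thread(target=self._run, args=(q,), daemon=True)
            for q in self._queues
        ]
        for thread in self._threads:
            thread.start()

    def _run(self, q):
        while True:
            item = q.get()
            if item is None:          # shutdown marker
                break
            callback, args = item
            callback(*args)

    def submit(self, color, callback, *args):
        # Assumption: a color is an int; equal colors share one FIFO queue,
        # so they are serialized without the callbacks taking any lock.
        self._queues[color % len(self._queues)].put((callback, args))

    def shutdown(self):
        for q in self._queues:
            q.put(None)
        for thread in self._threads:
            thread.join()
```

Callbacks that touch the same shared variable would be submitted with the same color, while unrelated callbacks get different colors and may run in parallel on other workers.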
2002
- Atul Adya, Jon Howell, Marvin Theimer, William J. Bolosky, and John R. Douceur. Cooperative Task Management Without Manual
Stack Management. In Proceedings of the General Track: 2002 USENIX Annual Technical Conference, USENIX Association,
Berkeley, CA, USA, 2002.
Note: the paper discusses the stack ripping problem in the event-driven style
and identifies a new design point that combines cooperative task management with automatic stack management. This is mainly done via fibers
in Win32, and the paper gives a scheme for how automatic stack management code and manual stack management code interoperate (basically
by wrapping the fiber)
Download:
(unavailable)
- Nick Benton, Luca Cardelli, and Cédric Fournet. Modern Concurrency Abstractions for C#. In ECOOP '02: Proceedings of the
16th European Conference on Object-Oriented Programming, Springer-Verlag, London, UK, 2002.
Download:
(unavailable)
- Frank Dabek, Nickolai Zeldovich, Frans Kaashoek, David Mazières, and Robert Morris. Event-driven programming for robust
software. In EW10: Proceedings of the 10th workshop on ACM SIGOPS European workshop: beyond the PC, ACM Press, New
York, NY, USA, 2002.
Note: a short version of Zeldovich's paper
Download:
(unavailable)
- Danny Hendler and Nir Shavit. Non-blocking steal-half work queues. In PODC '02: Proceedings of the twenty-first annual
symposium on Principles of distributed computing, pp. 280–289, ACM Press, New York, NY, USA, 2002.
Note: a non-blocking implementation of a steal-half queue
Download:
(unavailable)
- David Lifka, Lucia Walle, Veaceslav Zaloj, and John Zollweg. Parallel Processing with .NET. In unknown, 2002.
Note: for distributed memory model
Download:
(unavailable)
- Kevin Lu, Yuanling Zhu, Wenjun Sun, Shouxun Lin, and Jianping Fan. Parallel Processing XML Documents. In IDEAS,
IEEE Computer Society, Los Alamitos, CA, USA, 2002.
Download:
(unavailable)
2001
- Ping An, Alin Jula, Silvius Rus, Steven Saunders, Tim Smith, Gabriel Tanase, Nathan Thomas,
Nancy Amato, and Lawrence Rauchwerger. STAPL: A Standard Template Adaptive Parallel C++ Library. In Int.
Wkshp on Adv. Compiler Technology for High Perf. and Embedded Processors, 2001.
Note: it has parallel tree support, which may be helpful for my parallel XML work
Download:
[pdf]
- Christine Flood, Dave Detlefs, Nir Shavit, and Catherine Zhang. Parallel Garbage Collection for Shared Memory
Multiprocessors. In Usenix Java Virtual Machine Research and Technology Symposium (JVM '01), Monterey,
CA, 2001.
Note: this paper describes a simple (but I am sure it is correct) termination
detection algorithm for the work-stealing scheme designed for the Java garbage collector
Download:
(unavailable)
- James R. Larus and Michael Parkes. Using Cohort Scheduling to Enhance Server Performance (Extended Abstract).
In LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded
systems, 2001.
Note: uses cohorts to improve cache behavior in event/stage-based
server architectures
Download:
(unavailable)
- A. R. Schmidt, F. Waas, M. L. Kersten, D. Florescu, I. Manolescu, M. J. Carey, and R. Busse. The XML Benchmark Project. Technical
Report INS-R0103, CWI, 2001.
Download:
(unavailable)
- Carlos A. Varela and Gul Agha. Programming Dynamically Reconfigurable Open Systems with SALSA. In 16th
Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications:
Intriguing Technology Track, Tampa, FL, October, 2001.
Download:
(unavailable)
- Matt Welsh, David E. Culler, and Eric A. Brewer. SEDA: An Architecture for Well-Conditioned, Scalable Internet
Services. In Symposium on Operating Systems Principles, pp. 230–243, 2001.
Download:
(unavailable)
2000
- Emery Berger, Kathryn McKinley, Robert Blumofe, and Paul Wilson. Hoard: A Scalable Memory Allocator for Multithreaded
Applications. University of Texas at Austin, 2000.
Download:
(unavailable)
- C. Fournet and G. Gonthier. The join calculus: a language for distributed mobile programming. 2000.
Download:
(unavailable)
1999
- Gul Agha and WooYoung Kim. Unifying Parallel and Distributed Computation: An Actor-Based Approach. In Journal
of Systems Architecture, 1999.
Download:
(unavailable)
- Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. J.
ACM, 46(5):720–748, ACM Press, New York, NY, USA, 1999.
Note: this paper introduces
the concept of stealing in the Cilk project; I didn't follow the theoretical proofs.
Download:
(unavailable)
- Ananth Y. Grama and Vipin Kumar. State of the Art in Parallel Search Techniques for Discrete Optimization
Problems. IEEE Transactions on Knowledge and Data Engineering, 11, 1999.
Note: a good
overview of load balancing technologies for parallel tree traversal (including DFS and
BFS)
Download:
[ps]
- J. Hu and D. Schmidt. JAWS: A Framework for High Performance Web Servers. 1999.
Download:
(unavailable)
- Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel. Flash: An efficient and portable Web server. In Proceedings
of the USENIX 1999 Annual Technical Conference, 1999.
Download:
(unavailable)
1998
- Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. Thread scheduling for multiprogrammed
multiprocessors. In SPAA '98: Proceedings of the tenth annual ACM symposium on Parallel algorithms and
architectures, pp. 119–129, ACM Press, New York, NY, USA, 1998.
Note: derived from
Cilk, this paper focuses on the implementation of a lock-free task queue in which
tasks can be stolen from the top and pushed onto the bottom. The lock-free deque is known as the ABP
queue.
Download:
(unavailable)
- Robert D. Blumofe and Dionisios Papadopoulos. The performance of work stealing in multiprogrammed environments
(extended abstract). SIGMETRICS Perform. Eval. Rev., 26(1), ACM Press, New York, NY, USA, 1998.
Download:
(unavailable)
- James C. Hu, Sumedh Mungee, and Douglas C. Schmidt. Techniques for Developing and Measuring High
Performance Web Servers over High Speed ATM Networks. In INFOCOM (3), pp. 1222–1231, 1998.
Download:
(unavailable)
- Maged M. Michael and Michael L. Scott. Nonblocking algorithms and preemption-safe locking on multiprogrammed
shared memory multiprocessors. J. Parallel Distrib. Comput., 51(1):1–26, Academic Press, Inc., Orlando, FL, USA,
1998.
Download:
(unavailable)
- Fabian Zabatta and Kevin Ying. A Thread Performance Comparison: Windows NT and Solaris on a Symmetric Multiprocessor.
In USENIX, pp. 57–66, 1998.
Download:
(unavailable)
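The access pattern of the ABP queue from the Arora, Blumofe, and Plaxton note above (the owner works at the bottom, thieves steal from the top) can be sketched as follows. Note that this sketch guards the deque with an ordinary lock, so it shows only the interface and the ordering, not the lock-free algorithm that the paper contributes. `WorkStealingDeque` and its method names are hypothetical.

```python
import threading
from collections import deque


class WorkStealingDeque:
    """Hypothetical lock-based stand-in for a work-stealing deque."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()  # the real ABP deque avoids this lock

    def push_bottom(self, task):
        """Owner adds newly spawned work at the bottom."""
        with self._lock:
            self._items.append(task)

    def pop_bottom(self):
        """Owner takes the newest task; returns None when empty."""
        with self._lock:
            return self._items.pop() if self._items else None

    def steal_top(self):
        """Thief takes the oldest task; returns None when empty."""
        with self._lock:
            return self._items.popleft() if self._items else None


tasks = WorkStealingDeque()
for name in ("root", "child", "grandchild"):
    tasks.push_bottom(name)
newest = tasks.pop_bottom()  # owner continues with the most recent task
oldest = tasks.steal_top()   # thief removes the oldest task
```

The owner popping the newest task and the thief taking the oldest one is the same ordering that the Intel tutorial note in the 2006 section describes as depth-first work and breadth-first theft.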
1997
- Mark Moir. Practical implementations of non-blocking synchronization primitives. In PODC '97: Proceedings
of the sixteenth annual ACM symposium on Principles of distributed computing, pp. 219–228, ACM
Press, New York, NY, USA, 1997.
Download:
(unavailable)
1996
- Sarita V. Adve and Kourosh Gharachorloo. Shared Memory Consistency Models: A Tutorial. Computer, 29(12):66–76,
IEEE Computer Society Press, Los Alamitos, CA, USA, 1996.
Download:
(unavailable)
- George Karypis and Vipin Kumar. Parallel Multilevel k-way Partitioning Scheme for Irregular Graphs. In Supercomputing,
1996.
Download:
(unavailable)
- Maged M. Michael and Michael L. Scott. Simple, fast, and practical non-blocking and blocking concurrent queue
algorithms. In PODC '96: Proceedings of the fifteenth annual ACM symposium on Principles of distributed
computing, pp. 267–275, ACM Press, New York, NY, USA, 1996.
Note: well known
for its correct implementation of a nonblocking queue as well as a two-lock queue algorithm.
It is sometimes called the MS queue, and the Java 5 concurrent queue (ConcurrentLinkedQueue) is based on this paper!
Download:
(unavailable)
- Erich Nahum, David J. Yates, Sean O'Malley, Hilarie Orman, and Richard Schroeppel. Parallelized Network Security Protocols.
In SNDSS '96: Proceedings of the 1996 Symposium on Network and Distributed System Security (SNDSS '96),
pp. 145, IEEE Computer Society, Washington, DC, USA, 1996.
Download:
(unavailable)
- John Ousterhout. Why Threads Are A Bad Idea. In unknown, January 1996.
Note: a good set of slides
on the shortcomings of the thread model, claiming that the event-driven model has multiple advantages
Download:
(unavailable)
1995
- Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk:
an efficient multithreaded runtime system. In PPOPP '95: Proceedings of the fifth ACM SIGPLAN symposium
on Principles and practice of parallel programming, ACM Press, New York, NY, USA, 1995.
Note: Cilk introduces the stealing-based load balancing scheme
Download:
(unavailable)
- Ananth Grama and Vipin Kumar. Parallel Processing of Combinatorial Optimization Problems. ORSA Journal
of Computing, 1995.
Note: a good survey of parallel solutions for combinatorial
optimization problems; it says that an atomic operation may make global round
robin better than random polling.
Download:
(unavailable)
- A. Reinefeld. Scalability of Massively Parallel Depth-First Search. In Parallel Processing of Discrete
Optimization Problems, pp. 305–322, 1995.
Note: introduces the
search-frontier algorithm, which is similar to static pxp; the performance results
are also consistent with those of static pxp
Download:
(unavailable)
- D. C. Schmidt and T. Suda. Measuring the performance of parallel message-based process architectures. In
INFOCOM '95: Proceedings of the Fourteenth Annual Joint Conference of the IEEE Computer and
Communication Societies (Vol. 2), pp. 624, IEEE Computer Society, Washington, DC, USA, 1995.
Download:
(unavailable)
- Douglas C. Schmidt and Tatsuya Suda. Measuring the impact of alternative parallel process architecture on
communication subsystem performance. In Protocols for High-Speed Networks IV, pp. 123–138, Chapman & Hall, Ltd.,
London, UK, 1995.
Download:
(unavailable)
- Nir Shavit and Dan Touitou. Software Transactional Memory. In Symposium on Principles of Distributed Computing, pp.
204–213, 1995.
Download:
(unavailable)
- John D. Valois. Lock-free linked lists using compare-and-swap. In PODC '95: Proceedings of the fourteenth annual ACM
symposium on Principles of distributed computing, pp. 214–222, ACM Press, New York, NY, USA, 1995.
Download:
(unavailable)
1994
- G. Karypis and V. Kumar. Unstructured Tree Search on SIMD Parallel Computers. IEEE Trans. Parallel Distrib. Syst.,
5(10):1057–1072, IEEE Press, Piscataway, NJ, USA, 1994.
Note: a parallel tree search solution
for SIMD machines, in which the algorithm consists of two stages. The first stage is parallel
search, and the second stage is load balancing; all processors perform the two
stages in lock-step fashion. The hard problem is how to reassign jobs among the
processors to keep the workload balanced.
Download:
(unavailable)
- Vipin Kumar, Ananth Y. Grama, and Nageshwara Rao Vempaty. Scalable load balancing techniques for parallel
computers. J. Parallel Distrib. Comput., 22(1):60–79, Academic Press, Inc., Orlando, FL, USA, 1994.
Note: the paper uses the isoefficiency function to analyze various load balancing schemes
(like global round robin, random polling, or ann)
Download:
(unavailable)
- Erich M. Nahum, David J. Yates, James F. Kurose, and Donald F. Towsley. Performance Issues in Parallelized
Network Protocols. In Operating Systems Design and Implementation, pp. 125–137, 1994.
Download:
(unavailable)
- A. Reinefeld. Effective parallel backtracking methods for Operations Research applications. In Proc. EUROSIM
Intl. Conf. Massively Par., Delft, June 1994.
Note: introduces the search-frontier algorithm
and AIDA. More importantly, the recursive algorithm is faster than the explicit-stack
algorithm.
Download:
(unavailable)
- J. D. Valois. Implementing Lock-Free Queues. In Proceedings of the Seventh International Conference on
Parallel and Distributed Computing Systems, pp. 64–69, Las Vegas, NV, 1994.
Note: Valois's
implementation of lock-free queues
Download:
(unavailable)
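Two of the victim-selection schemes named in the Kumar, Grama, and Vempaty note above, global round robin and random polling, can be sketched as follows. This is an illustrative sketch only: `GlobalRoundRobin` and `random_polling` are hypothetical names, and skipping the requester itself is an assumption made here. The shared target counter is the point of contention that the 1995 Grama and Kumar note refers to when it mentions an atomic operation.

```python
import random
import threading


class GlobalRoundRobin:
    """Hypothetical sketch: one shared target counter for all processors."""

    def __init__(self, processors):
        if processors < 2:
            raise ValueError("need at least two processors")
        self._n = processors
        self._target = 0
        self._lock = threading.Lock()  # stands in for an atomic fetch-and-add

    def next_victim(self, requester):
        with self._lock:
            victim = self._target
            if victim == requester:    # assumption: never ask yourself for work
                victim = (victim + 1) % self._n
            self._target = (victim + 1) % self._n
            return victim


def random_polling(requester, processors, rng=random):
    """Pick a victim uniformly among the other processors."""
    victim = rng.randrange(processors - 1)
    return victim if victim < requester else victim + 1
```

Global round robin spreads requests evenly over the processors but serializes every request on the shared counter, while random polling needs no shared state at all.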
1993
- Mats Bjorkman and Per Gunningberg. Locking effects in multiprocessor implementations of protocols. In SIGCOMM
'93: Conference proceedings on Communications architectures, protocols and applications,
ACM Press, New York, NY, USA, 1993.
Download:
(unavailable)
- Maurice Herlihy and J. Eliot B. Moss. Transactional memory: architectural support for lock-free data structures.
In ISCA '93: Proceedings of the 20th annual international symposium on Computer architecture, pp.
289–300, ACM Press, New York, NY, USA, 1993.
Note: the first paper about the hardware TM
Download:
(unavailable)
1991
- Maurice Herlihy. Wait-free synchronization. ACM Trans. Program. Lang. Syst., 13(1):124–149, ACM Press, New York,
NY, USA, 1991.
Note: this paper provides the fundamental theory for non-blocking synchronization (NBS)
Download:
(unavailable)
1990
- T. E. Anderson. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors. IEEE Trans.
Parallel Distrib. Syst., 1(1), IEEE Press, Piscataway, NJ, USA, 1990.
Note: a classic article
about spin locks
Download:
(unavailable)
- M. Furuichi, K. Taki, and N. Ichiyoshi. A multi-level load balancing scheme for OR-parallel exhaustive search
programs on the multi-PSI. In PPOPP '90: Proceedings of the second ACM SIGPLAN symposium on Principles
& practice of parallel programming, pp. 50–59, ACM Press, New York, NY, USA, 1990.
Note: an example of static partitioning; it is sender-initiated task partitioning with receiver-initiated
task delivery, and it can have multiple levels.
Download:
(unavailable)
1987
- Vipin Kumar and V. Nageshwara Rao. Parallel depth first search. Part II. analysis. Int. J. Parallel Program., 16(6):501–519,
Kluwer Academic Publishers, Norwell, MA, USA, 1987.
Download:
(unavailable)
1979
- Hugh C. Lauer and Roger M. Needham. On the duality of operating system structures. SIGOPS Oper. Syst. Rev., ACM Press,
New York, NY, USA, 1979.
Note: a very early paper about the equivalence between the event-driven
model and the thread model
Download:
(unavailable)
1967
- Gene M. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities.
In AFIPS Conference Proceedings, 1967.
Download:
(unavailable)
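Amdahl's law, the result of the 1967 entry above and mentioned in the Carleton and Shands note in the 2006 section, bounds the speedup of a program when only part of it can run in parallel. A minimal sketch of the standard formula follows; the function and parameter names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

For example, a program that is 90% parallel gains a speedup of only about 4.7 on 8 processors, and can never exceed 10 however many processors are added.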
Generated by bib2html.pl (written by Patrick Riley) on Wed Aug 08, 2007 18:44:15