The NYU Ultracomputer project was largely documented in a long series of "Ultracomputer Reports"; this overview, circa 1986, collects fragments from them. It is a good reference for fetch-and-op.
@techreport{ gottlieb87overview, author = "Allan Gottlieb", title = "An Overview of the {NYU} Ultracomputer Project", number = "100", note = "1986. In {"}Experimental Parallel Computing Architectures{"}, J. J. Dongarra, editor. Elsevier Science Publishers B.V. (North Holland)", year = "1987", url = "citeseer.ist.psu.edu/gottlieb86overview.html" }
A nice paper with lots of circuit detail about fetch-and-op implementations.
@inproceedings{52443, author = {G. J. Lipovski and P. Vaughan}, title = {A fetch-and-op implementation for parallel computers}, booktitle = {ISCA '88: Proceedings of the 15th Annual International Symposium on Computer Architecture}, year = {1988}, isbn = {0-8186-0861-1}, pages = {384--392}, location = {Honolulu, Hawaii, United States}, publisher = {IEEE Computer Society Press}, address = {Los Alamitos, CA, USA} }
A nice overview of basic memory consistency issues.
@article{ adve96shared, author = "S. V. Adve and K. Gharachorloo", title = "Shared Memory Consistency Models: {A} Tutorial", journal = "IEEE Computer", volume = "29", number = "12", pages = "66--76", year = "1996", url = "citeseer.ist.psu.edu/adve95shared.html" }
One of the MANY papers about TreadMarks, which is probably the best known page-fault DSM system.
@inproceedings{treadmarks, author="P. Keleher and S. Dwarkadas and A.L. Cox and W. Zwaenepoel", title="TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems", booktitle="Proceedings of the Winter 94 Usenix Conference", pages="115--131", month="January", year="1994" }
A short paper about an alternative implementation of what appears to be shared memory but is really just remote memory access via function calls, hidden behind a few clever C++ templates.
@article{dietzrsm97, author="H. G. Dietz and T. I. Mattox", title="Managing Polyatomic Coherence and Races with Replicated Shared Memory", journal="Technical Committee on Computer Architecture (TCCA) Newsletter, Special Issue on Distributed Shared Memory and Related Issues", publisher="IEEE Computer Society", month="March", year="1997", url="http://tab.computer.org/tcca/NEWS/mar97/" }