
Gerard Salton: Difference between revisions

No edit summary
No edit summary
Line 2: Line 2:


== Life ==
== Life ==
'''Gerard A. "Gerry" Salton''' was born on March 8, 1927 in Nuremberg, Germany and named Gerhard Anton Sahlmann. He came to the United States in 1947 and was naturalized in 1952. He received Bachelor's (1950) and Master's (1952) degrees in mathematics from Brooklyn College. he earned a Ph.D. from Harvard in applied mathematics in 1958 and taught there until 1965, when he joined Cornell University and co-founded its department of Computer Science.  
'''Gerard A. "Gerry" Salton''' was born on March 8, 1927, in Nuremberg, Germany, and named Gerhard Anton Sahlmann. He came to the United States in 1947 and was naturalized in 1952. He received Bachelor's (1950) and Master's (1952) degrees in mathematics from Brooklyn College. He earned a Ph.D. in applied mathematics from Harvard in 1958 and taught there until 1965, when he joined Cornell University and co-founded its Department of Computer Science.  
He died on August 28, 1995 in Ithaca, NY.
He died on August 28, 1995 in Ithaca, NY.


== Contributions ==  
== Contributions ==  
His group at Cornell developed the SMART (System for the Mechanical Analysis and Retrieval of Text) information retrieval system, which he had initiated when he was at Harvard. SMART used automatic indexing of the full text of documents and comparison based of similarity in word frequency. Each document in a collection is a vector; each word is a dimension; and each document's position on each dimension is determined by the relative frequency of use of that word in that document.[https://en.wikipedia.org/wiki/Vector_space_model]
Salton's team at Cornell developed the influential SMART (System for the Mechanical Analysis and Retrieval of Text) system, which he had initiated while at Harvard. SMART was an environment for evaluating retrieval systems.  


Relative frequency was commonly calculated using term frequency–inverse document frequency (tf-idf) wherein
SMART popularized the vector space model, which uses automatic indexing of the words in the text of each document in a collection, excluding a stop list of very common or insignificant words. Each word is treated as a dimension in a very high-dimensional space. Each document in the collection is treated as a vector whose position on each word dimension is determined by the relative frequency of that word in that document. Each query is expressed as a vector in the same space, and each document's relevance to the query is assessed by its proximity to the query vector. [https://en.wikipedia.org/wiki/Vector_space_model]
the relative frequency is weighted by the inverse relative frequency in the collection so that rare words counted for more than common words.[https://en.wikipedia.org/wiki/Tf%E2%80%93idf]  


Each query would be expressed in a vector.comparable graph.  
Relative frequency was commonly calculated using term frequency–inverse document frequency (tf-idf), whereby the relative frequency of a word in a document is weighted by the inverse of its relative frequency in the collection as a whole, so that rare words count for more than common words.[https://en.wikipedia.org/wiki/Tf%E2%80%93idf]


According to Bellardo and Bourne, Salton's retrieval experiments of the 1980's "greatly contributed to the knowledge base of computerized information indexing, storage and retrieval."
Salton was editor-in-chief of the ''Communications of the ACM'' and the ''Journal of the ACM'' (1969 – 1972). He was an associate editor of the ''ACM Transactions on Information Systems'' and chaired the ACM Special Interest Group on Information Retrieval (SIGIR).
 
Salton was editor-in-chief of the ''Communications of the ACM'' and the ''Journal of the ACM'', and chaired Special Interest Group on Information Retrieval (SIGIR). He was an associate editor of the ACM Transactions on Information Systems.


== Publications ==
== Publications ==
*''Automatic Information Organization and Retrieval''. 1968.
*''Automatic Information Organization and Retrieval''. New York: McGraw-Hill, [1968].
*''A Theory of Indexing.'' Society for Industrial and Applied Mathematics. p. 56. ISBN 9780898710151. 1975.
*''The SMART retrieval system; experiments in automatic document processing''. Comp. by G. Salton. Englewood Cliffs, N.J., Prentice-Hall [1971].
*''Introduction to modern Information Retrieval.'' With Michael J. McGill. 1983. ISBN 0-07-054484-0
*"A vector space model for automatic indexing." With Anita Wong & Chung-Shu Yang. ''Communications of the ACM'' 18, no. 11 (1975): 613-620. [https://ecommons.cornell.edu/bitstream/handle/1813/6057/74-218.pdf?sequence=1]
*''Automatic Text Processing''. Addison-Wesley Publishing Company. 1989. 530. ISBN 978-0-201-12227-5.
*"A Vector Space Model for Automatic Indexing," With A. Wong & C. S. Yang. ''Communications of the ACM'' 18, no. 11 (1975): 613–620.
*"Toward a dynamic library." In: F. Wilfrid Lancaster, ed. ''The Role of the Library in an Electronic Society: Clinic on Library Applications of Data Processing''. Urbana-Champaign, IL: University of Illinois Graduate School of Library Science, 1980.
*"Toward a dynamic library." In: F. Wilfrid Lancaster, ed. ''The Role of the Library in an Electronic Society: Clinic on Library Applications of Data Processing''. Urbana-Champaign, IL: University of Illinois Graduate School of Library Science, 1980.
*"Extended boolean information retrieval." With E. A. Fox & H. Wu. ''Communications of the ACM'' 26, no. 11 (1983): 1022-1036. [https://ecommons.cornell.edu/bitstream/1813/6351/1/82-511.pdf]
*''Introduction to modern information retrieval''. With Michael J. McGill. New York: McGraw-Hill, 1983.
*"Term-weighting approaches in automatic text retrieval." With C. Buckley. ''Information processing & management'' 24, no. 5 (1988): 513-523. [https://ecommons.cornell.edu/bitstream/1813/6721/1/87-881.pdf]
*''Automatic text processing: the transformation, analysis, and retrieval of information by computer''. Reading, MA: Addison-Wesley Publishing Company, 1989.
*"Improving retrieval performance by relevance feedback." With Chris Buckley. ''Journal of the American society for information science'' 41, no. 4 (1990): 288-297. [https://courses.cs.umbc.edu/graduate/CMSC676/umbconly/SaltonBuckleyJASIS90.pdf]


== Offices ==
== Offices ==
Line 31: Line 31:
== Awards ==
== Awards ==
*Guggenheim Fellowship, 1962.
*Guggenheim Fellowship, 1962.
*Alexander von Humboldt Senior Science Award, 1988.
*[[ASIST|American Society for Information Science]]. Best JASIS Paper Award, 1970; best book award, 1975; Award of Merit, 1989.
*[[ASIST|American Society for Information Science]]. Best JASIS Paper Award, 1970; best book award, 1975; Award of Merit, 1989.
*Alexander von Humboldt Foundation. Senior Science Award, 1988.
*Association for Computing Machinery. Fellow, 1995.
*Association for Computing Machinery. Fellow, 1995.
*The Association for Computing Machinery Special Interest Group on Information Retrieval (ACM SIGIR) confers the Gerard Salton Award every three years for contributions to information retrieval. Salton was the first recipient, 1983.
*The Association for Computing Machinery Special Interest Group on Information Retrieval (ACM SIGIR) confers the Gerard Salton Award every three years for contributions to information retrieval. Salton was the first recipient, 1983.
Line 41: Line 41:
*"The father of information retrieval." p 25. [https://www.cs.cornell.edu/gries/40brochure/pg24_25.pdf]
*"The father of information retrieval." p 25. [https://www.cs.cornell.edu/gries/40brochure/pg24_25.pdf]
*Evslin, Tom. "Search Down Memory Lane." ''Fractals of change'' (Jan 19, 2006) [https://blog.tomevslin.com/2006/01/search_down_mem.html]
*Evslin, Tom. "Search Down Memory Lane." ''Fractals of change'' (Jan 19, 2006) [https://blog.tomevslin.com/2006/01/search_down_mem.html]
*Stock, Wolfgang G. & Mechtild Stock. In ''Handbook of information science.'' Berlin ; Boston : De Gruyter Saur, [2013], pp 289-300, "E.2 Vector Space Model."


== Papers ==
== Papers ==
*Cornell University Library. Division of Rare and Manuscript Collections: 16-13-2908. Gerard Salton papers. [http://rmc.library.cornell.edu/EAD/htmldocs/RMA02908.html]
*Cornell University Library. Division of Rare and Manuscript Collections: 16-13-2908. Gerard Salton papers. [http://rmc.library.cornell.edu/EAD/htmldocs/RMA02908.html]

Revision as of 18:47, 25 March 2025

Gerard Salton (1927-1995) was a German-American computer scientist and information retrieval specialist.

Life

Gerard A. "Gerry" Salton was born on March 8, 1927, in Nuremberg, Germany, and named Gerhard Anton Sahlmann. He came to the United States in 1947 and was naturalized in 1952. He received Bachelor's (1950) and Master's (1952) degrees in mathematics from Brooklyn College. He earned a Ph.D. in applied mathematics from Harvard in 1958 and taught there until 1965, when he joined Cornell University and co-founded its Department of Computer Science. He died on August 28, 1995, in Ithaca, NY.

Contributions

Salton's team at Cornell developed the influential SMART (System for the Mechanical Analysis and Retrieval of Text) system, which he had initiated while at Harvard. SMART was an environment for evaluating retrieval systems.

SMART popularized the vector space model, which uses automatic indexing of the words in the text of each document in a collection, excluding a stop list of very common or insignificant words. Each word is treated as a dimension in a very high-dimensional space. Each document in the collection is treated as a vector whose position on each word dimension is determined by the relative frequency of that word in that document. Each query is expressed as a vector in the same space, and each document's relevance to the query is assessed by its proximity to the query vector. [1]
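
The vector space idea can be illustrated with a minimal Python sketch. It is not SMART's actual code: the stop list, the whitespace tokenizer, the function names, and the sample documents are all hypothetical, and cosine similarity is assumed as the proximity measure because the description above does not name one.

```python
import math
from collections import Counter

# Hypothetical stop list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in", "is"}

def term_vector(text):
    """Map a text to relative term frequencies, skipping stop-list words."""
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, documents):
    """Return (score, index) pairs, most similar document first."""
    q = term_vector(query)
    scores = [(cosine(q, term_vector(d)), i) for i, d in enumerate(documents)]
    return sorted(scores, reverse=True)

# Made-up sample collection.
docs = [
    "retrieval of text documents",
    "the history of the library",
    "automatic indexing and retrieval",
]
ranking = rank("text retrieval", docs)
```

In this sample the first document shares both query words and ranks highest, while the second shares none and scores zero.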

Relative frequency was commonly calculated using term frequency–inverse document frequency (tf-idf), whereby the relative frequency of a word in a document is weighted by the inverse of its relative frequency in the collection as a whole, so that rare words count for more than common words.[2]
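
As a rough illustration of this weighting, the Python sketch below multiplies each word's relative frequency in a document by log(N / df), where N is the number of documents and df is the number of documents containing the word. That logarithmic form is only one common variant and is an assumption here, as are the function name, the whitespace tokenizer, and the sample documents.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative frequency in the document times the assumed idf.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up sample collection.
docs = ["indexing text retrieval", "text retrieval systems", "text vector"]
weights = tf_idf(docs)
```

Here "text" occurs in every document and receives weight zero, while "vector", which occurs in only one document, receives the largest weight.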

Salton was editor-in-chief of the Communications of the ACM and the Journal of the ACM (1969 – 1972). He was an associate editor of the ACM Transactions on Information Systems and chaired the ACM Special Interest Group on Information Retrieval (SIGIR).

Publications

  • Automatic Information Organization and Retrieval. New York: McGraw-Hill, [1968].
  • The SMART retrieval system; experiments in automatic document processing. Comp. by G. Salton. Englewood Cliffs, N.J., Prentice-Hall [1971].
  • "A vector space model for automatic indexing." With Anita Wong & Chung-Shu Yang. Communications of the ACM 18, no. 11 (1975): 613-620. [3]
  • "Toward a dynamic library." In: F. Wilfrid Lancaster, ed. The Role of the Library in an Electronic Society: Clinic on Library Applications of Data Processing. Urbana-Champaign, IL: University of Illinois Graduate School of Library Science, 1980.
  • "Extended boolean information retrieval." With E. A. Fox & H. Wu. Communications of the ACM 26, no. 11 (1983): 1022-1036. [4]
  • Introduction to modern information retrieval. With Michael J. McGill. New York: McGraw-Hill, 1983.
  • "Term-weighting approaches in automatic text retrieval." With C. Buckley. Information processing & management 24, no. 5 (1988): 513-523. [5]
  • Automatic text processing: the transformation, analysis, and retrieval of information by computer. Reading, MA: Addison-Wesley Publishing Company, 1989.
  • "Improving retrieval performance by relevance feedback." With Chris Buckley. Journal of the American society for information science 41, no. 4 (1990): 288-297. [6]

Offices

Awards

  • Guggenheim Fellowship, 1962.
  • American Society for Information Science. Best JASIS Paper Award, 1970; best book award, 1975; Award of Merit, 1989.
  • Alexander von Humboldt Foundation. Senior Science Award, 1988.
  • Association for Computing Machinery. Fellow, 1995.
  • The Association for Computing Machinery Special Interest Group on Information Retrieval (ACM SIGIR) confers the Gerard Salton Award every three years for contributions to information retrieval. Salton was the first recipient, 1983.

Further reading

  • "Gerard Salton." Wikipedia [7]
  • Dubin, David. "The Most Influential Paper Gerard Salton Never Wrote." Library Trends 52, no 4 (Spring 2004): 748–764. [8] Revisionist history of the vector space model and Salton's work.
  • "The father of information retrieval." p 25. [9]
  • Evslin, Tom. "Search Down Memory Lane." Fractals of change (Jan 19, 2006) [10]
  • Stock, Wolfgang G. & Mechtild Stock. In Handbook of information science. Berlin ; Boston : De Gruyter Saur, [2013], pp 289-300, "E.2 Vector Space Model."

Papers

  • Cornell University Library. Division of Rare and Manuscript Collections: 16-13-2908. Gerard Salton papers. [11]