Complex Social Science (CoSSci) Supercomputer project
CoSSci Project
The CoSSci project has established the Complex Social Science (CoSSci) Supercomputer Gateway, which aims to paradigmatically advance the use, status and role of comparative ethnographic and ethnological research in Anthropology and the Social Sciences, and to expand the relevance of comparative research to a wide range of theoretical, academic and applied settings. The R software used in the CoSSci Galaxy is by Malcolm Dow and Anthon Eff (e.g., Eff and Dow 2009, "How to Deal with Missing Data and Galton’s Problem in Cross-Cultural Survey Research: A Primer for R," Structure and Dynamics: eJournal of Anthropological and Related Sciences).
The Complex Social Science (CoSSci) Supercomputer Gateway (portal implemented in 2013 at UCI/SDSC@UCSD) gives researchers, classrooms and online classes remote access to advanced computing for social science and environmental comparative studies of human societies. Four major comparative databases are available to date, listed here with N = cases and V = variables: Standard Cross-Cultural Sample (N=186, V=2800); Binford Foragers (N=339, V=1800); Ethnographic Atlas (N=1270, V=399); Western Indians: Comparative Environments, Languages and Cultures of Western American Indian Tribes (N=172, V=496). A new database corrects postcolonial Ethnographic Atlas coding biases in gendered subsistence contributions by comparing them against coded archaeological data for the same societies.
For each model (a set of dependent and independent variables), CoSSci takes into account the evolutionary and environmental context of the societies in the sample and adds a “context” variable, “Wy,” to the regression results. The DEf R methodology used by CoSSci thus treats each specific set of variables in a model as posing its own complex problem of nonindependence. Modeling includes autocorrelation controls, imputation of missing data in the independent variables, Hausman tests for exogeneity, many other inferential statistical tests, and world and detailed mapping of variables.
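As a rough illustration of what a “Wy” context term can look like, the sketch below computes a spatial lag: each society's value of Wy is a weighted average of the dependent variable y over the other societies, using a row-normalized proximity matrix W. This is not the DEf code (which is written in R). The function names, the four-society proximity matrix and the y values are all hypothetical, and the actual CoSSci weight matrices and estimation steps may differ.

```python
def row_normalize(weights):
    """Scale each row of a square proximity matrix so that it sums to 1."""
    normalized = []
    for row in weights:
        total = sum(row)
        # A society with no neighbors keeps an all-zero row.
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized


def spatial_lag(weights, y):
    """Return Wy: for each society, the weighted mean of y over the others."""
    w = row_normalize(weights)
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in w]


# Hypothetical proximity scores among four societies. The diagonal is zero,
# so a society never counts as its own neighbor.
proximity = [
    [0.0, 2.0, 1.0, 1.0],
    [2.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 2.0, 0.0],
]
y = [4.0, 2.0, 1.0, 3.0]  # hypothetical dependent-variable values

wy = spatial_lag(proximity, y)
print(wy)  # [2.0, 3.0, 3.0, 2.0]
```

In a model of this kind, the resulting Wy column would enter the regression alongside the independent variables, so that similarity among neighboring or related societies is modeled explicitly rather than left in the error term.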
Gateway
CoSSci has developed a Galaxy gateway site at UCI (a virtual machine, VM), with its analytic R software duplicated at Comet, an HPC (high-performance, parallel) computer that replaced Trestles in 2015 and allows many classrooms or individuals worldwide to compute at the same time. Run time for a single variable with the Standard Cross-Cultural Sample (N=186, V=2800) is 2 minutes on the VM and 15 minutes at Comet because of queue time. The Comet time will decrease to 2 minutes for one or multiple classes once we specify that models run on a single compute node. Modeling projects on Comet (or in the R GUI, with matching CoSSci results) output a CSV file and world maps of key variables. Results have been outstanding and represent a major accomplishment that enhances research and classroom learning, and a major advance for publishable results in cross-cultural research. The Galaxy site is easier for students to use than R GUI modeling, whether at work, at home or on classroom computers. Downloads of working DEf R GUI scripts with model output from Anthon Eff provide a learning ramp for students and researchers wanting to use or learn R.
Work for 2015-16, using parallel computing and with the help of Paul Rodriguez at SDSC, will implement R libraries (bnlearn, bootstrap) for Bayesian causal networks of variables, taking as input subsets of variables whose missing data have been fully imputed. Systemfit modeling may yield path analyses and temporal panel effects for networks of variables.
Wiley Companion
To support the CoSSci Gateway project, a Wiley-Blackwell Companion is being published in which some 30 authors contribute worked examples using the Gateway and discuss its impact on theory and practice and how a wide range of researchers can use and update comparative research to augment their own work and expand the boundaries of current knowledge.
Online access to CoSSci is provided in 2014-16 to chapter authors of the Wiley Companion to Cross-Cultural Research (Editors White, Eff and Gray), in which CoSSci modeling facilities are highlighted and explained for teaching purposes. These and other authors are potential contributors to future courses (online and off) that use the CoSSci portal for their students, and others are invited to contribute to current or subsequent US eScholarship publications.
Teaching
The first online classroom startups, for “Advances in Sociology” taught by Ren Feng, a former TA of Doug White who taught DEf courses, ran for 15 weeks in fall 2013 and fall 2014 and will run again at Xiamen University in 2015. Some fundamental research questions have already been addressed by Ren Feng’s students, by chapter authors of the Wiley Companion to Cross-Cultural Research, and in conference presentations by the core researchers (White, Eff, Oztan and chapter authors). Online coursework includes distribution through C-Commons to new instructors, which will greatly boost usage. Word has spread about the new statistical modeling and datasets now widely available through the CoSSci project. We at CSAC are working on eventual servicing to expand usage, communities of software use, and courseware, with the help of the Apache Software Foundation and Suresh Marru (Education and Science Communities), whose current research focus is to investigate next-generation science gateway and workflow systems that support the interactive, adaptive and dynamic needs of exploratory science.
The CoSSci Gateway is growing to include Complex Network Analysis and Simulation, with modeling capabilities for evolutionary aspects of human complexity. Gateway projects will have access to large-network software for measuring cohesive subgroups and the effects of multiconnectivity on Gordon, the HPC parallel computer at SDSC. Datasets include environmental and climatic data and will grow to include disease and genetic data at the population level, as well as historical data on the growth of cities and trade routes, historical empires, and complex economies, while also modeling interfaces between ethnographic and historical data and archaeology. A related historical project (“Seshat,” named after the Egyptian scribe goddess of writing and measurement) is developing interfaces with texts, codes, and data for the comparative study of historical empires. As computational power grows for managing networked data (limited causal graph explorations, but also larger networks of observed-data path analysis and panel analysis of temporal sequences), larger-scale supercomputer modeling can address more complex questions in the social, economic and historical sciences.
CSAC
In addition to updated analytic software contributions from Fischer's group at CSAC, University of Kent (UK), Co-PI Fischer will provide the resource services framework for people to integrate summaries of ethnographic information relevant to coded data variables and provide modeling examples and discussions of statistical inferences and problems of interpretation and validation. He and UK’s Janet Bagg have created a summarizing algorithm for ethnographic literature that can link specific categories in coded data, through Murdock's Outline of Cultural Materials (OCM), to deliver summarized content from ethnography page references, a tremendous boon for students, coders, and analysts. Virtual servers at Kent (UK) will link to CoSSci.
CoSSci Contributors
Founded by CoSSci PI Doug White, the project has support from NSF; Argonne Labs; UCI’s Tolga Oztan and Social Science Computing; UCSD’s San Diego Supercomputer Center (Co-PI Paul Rodriguez, Co-PI Robert Sinkovits, Co-PI Michael Fischer, and Kepler Science project director Ilkay Altintas); and sociologist Ren Feng, who is running the testbed classes that use CoSSci with the SCCS dataset to teach cross-cultural research in Xiamen University’s “Advances in Sociology” courses.