Conference Agenda

August 30th (Friday)

8:00 - 8:30 Check-in and Guest Reception
8:30 - 8:40 Opening Ceremony
8:40 - 9:10 Simulating Subjects: The Promise and Peril of AI Stand-ins for Social Agents and Interactions
James Evans (Department of Sociology, University of Chicago, USA)
Abstract: Large Language Models (LLMs), through their exposure to massive collections of online text, learn the ability to reproduce the perspectives and linguistic styles of diverse social and cultural groups. This capability suggests a powerful social scientific application – the simulation of empirically realistic, culturally situated human subjects. Synthesizing recent research in artificial intelligence and computational social science, we outline a methodological foundation for simulating human subjects and their social interactions. We then identify nine characteristics of current models that are likely to impair realistic simulation of human subjects, including atemporality, social acceptability bias, uniformity, and poverty of sensory experience. For each of these areas, we discuss promising approaches for overcoming their associated shortcomings. Given the rate of change of these models, we advocate for an ongoing methodological program on the simulation of human subjects that keeps pace with rapid technical progress.
9:10 - 9:40 IIDS: Intelligent Innovation Dataset for Scientific and Technological Knowledge Discovery and Advanced Data Analytics
Xiaoming Fu (Institute of Computer Science, University of Goettingen, Germany)
Abstract: To be Announced
9:40 - 10:10 An Exploration Study on the Interactive Effect of Diversity and Density on Innovation--An Interpretation from Yin-Yang Perspective
Jar Der Luo 罗家德 (School of Social Sciences, Tsinghua University)
Abstract: To be Announced
10:10 - 10:25 Tea Break
10:25 - 10:55 Modeling and Understanding Decision-Making in Societies of Humans and Artificial Agents
Mirco Musolesi (Department of Computer Science, University College London (UCL), UK)
Abstract: To be Announced
10:55 - 11:25 Investigating Fairness of Decision Making
Steffen Staab (Department of Analytic Computing, University of Stuttgart, Germany)
Abstract: To be Announced
11:25 - 11:55 Experiences Collaborating with Social Scientists
Dah Ming Chiu (Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, China; Data Science Research Centre at Saint Francis University, Hong Kong, China)
Abstract: To be Announced
11:55 - 12:25 Urban Resilience and Equity in the Digital Age: Leveraging Social Computing for Crisis Management and Resource Allocation
Pan Hui 许彬 (Computational Media and Arts Thrust, The Hong Kong University of Science and Technology (Guangzhou))
Abstract: To be Announced
12:25 - 14:00 Lunch
14:00 - 14:30 Challenges in Content Moderation within Decentralized Social Networks
Gareth Tyson (Internet of Things Thrust, The Hong Kong University of Science and Technology (Guangzhou))
Abstract: Decentralized Social Networking has recently seen a renewed momentum, with a number of platforms like Mastodon gaining increasing traction (partly due to the controversial acquisition of Twitter by Elon Musk). These platforms offer alternatives to traditional 'centralized' web platforms like Twitter and YouTube by enabling the peer-to-peer operation of social network infrastructure without centralised ownership or control. They do, however, raise several key systems challenges, particularly related to scaling-up common machine learning driven features. One particularly challenging machine learning task is content moderation, i.e. the ability to detect and remove content that breaches the policies of the social networking platform (e.g. hate speech). In traditional platforms (e.g. Twitter), this is done by training large machine learning models that can automatically detect harmful content. However, in Decentralized Social Networks (like Mastodon), this is impossible as there is no single point that can aggregate data and train models. This talk will discuss recent work evaluating and building Decentralized Social Networking moderation tools, focusing on the challenges encountered in-the-wild. The presentation is based on several papers recently published at SIGCOMM, SIGMETRICS, IMC, and WWW.
14:30 - 15:00 AI+ECON for Market Power Regulation in Complex Markets
Yang Yu 于洋 (School of Economics and Management, China University of Petroleum (Beijing))
Abstract: To be Announced
15:00 - 15:30 Data Organization Limits the Predictability of Binary Classification
Zike Zhang 张子柯 (College of Media and International Culture, Zhejiang University)
Abstract: The structure of data organization is widely recognized as having a substantial influence on the efficacy of machine learning algorithms, particularly in binary classification tasks. Our research provides a theoretical framework suggesting that the maximum potential of binary classifiers on a given dataset is primarily constrained by the inherent qualities of the data. Through both theoretical reasoning and empirical examination, we employed standard objective functions, evaluative metrics, and binary classifiers to arrive at two principal conclusions. Firstly, we show that the theoretical upper bound of binary classification performance on actual datasets can be attained. This upper bound represents a calculable equilibrium between the learning loss and the evaluation metric. Secondly, we have computed the precise upper bounds for three commonly used evaluation metrics, uncovering a fundamental consistency with our overarching thesis: the upper bound is intricately linked to the dataset's characteristics, independent of the classifier in use. Additionally, our subsequent analysis uncovers a detailed relationship between the upper limit of performance and the level of class overlap within the binary classification data. This relationship is instrumental for pinpointing the most effective feature subsets for use in feature engineering. This work has potential applications in data-driven research for quantitatively evaluating the dilemma between promoting algorithm performance and improving data quality.
15:30 - 16:00 EasyGraph 2.0: Towards a Better Interdisciplinary Network Analysis
Yang Chen 陈阳 (School of Computer Science and Technology, Fudan University)
Abstract: To be Announced
16:00 - 16:15 Tea Break
16:15 - 16:45 Knowledge-Enhanced Graph Neural Network for Corporate Fraud Detection
Wenzhong Li 李文中 (School of Computer Science, Nanjing University)
Abstract: Corporate fraud detection aims to automatically recognize companies that conduct wrongful activities such as fraudulent financial statements or illegal insider trading. Previous learning-based methods fail to effectively integrate rich interactions in the company network. To close this gap, we collect 18 years of financial records in China to form three graph datasets with fraud labels. We analyze the characteristics of the financial graphs, highlighting two pronounced issues: (1) information overload: the dominance of (noisy) non-company nodes over company nodes hinders the message-passing process in Graph Convolution Networks (GCN); and (2) hidden fraud: there exists a large percentage of possible undetected violations in the collected data. The hidden fraud problem introduces noisy labels into the training dataset and compromises fraud detection results. To handle such challenges, we introduce a novel graph-based method, namely, Knowledge-enhanced GCN with Robust Two-stage Learning (KeGCNR), which leverages Knowledge Graph Embeddings to mitigate the information overload and effectively learns rich representations.
16:45 - 17:15 Environmental Planning and Design with AI: New Trends and Opportunities
Steven Jige Quan (Department of Environmental Planning / Urban and Regional Planning, Seoul National University, Korea)
Abstract: To be Announced
17:15 - 17:45 Toward a Trustworthy Digital Future: Robust (Online) Learning and Fact-Checking
Yupeng Li 李钰鹏 (Department of Interactive Media, Hong Kong Baptist University, Hong Kong, China)
Abstract: To be Announced
17:45 - 18:15 Information Consumption and Firm Size
Eddie Lee (Complexity Science Hub, Austria)
Abstract: Social and biological collectives exchange information through internal networks to function. Less studied is the quantity and variety of information transmitted. We shed light on this aspect by characterizing the information flow into organizations, primarily business firms. We measure online reading using a large data set of articles accessed by employees across millions of firms. We measure and relate quantitatively three aspects: reading volume, variety, and firm size. We compare volume with size, showing that firm sizes grow sublinearly with reading volume. This is like an economy of scale in information consumption that exaggerates the classic Zipf's law inequality for firm economics. We connect variety and volume to show that reading variety is limited. Firms above a threshold size read repetitively, consistent with the onset of a coordination problem between teams of employees in a simple model. Finally, we relate reading variety to size. The relationship is consistent with large firms that accumulate interests as they grow. We argue that this reflects structural constraints. Taking the scaling relations as a baseline, we show that excess reading is strongly correlated with returns and valuations. The results indicate how information consumption reflects internal structure, beyond individual employees, as is important for collective information processing in other systems.
18:15 - 20:00 Poster Exhibition & Banquet

August 31st (Saturday)

8:00 - 8:20 Mapping Urban Villages in China: Progress and Challenges
Rui Cao 曹瑞 (Urban Governance and Design Thrust, The Hong Kong University of Science and Technology (Guangzhou))
Abstract: To be Announced
8:20 - 8:40 Urban Intelligence for Addressing Future Human Habitat Challenges
Yuan Lai 来源 (School of Architecture, Tsinghua University)
Abstract: As technology continues to penetrate living spaces and urban life, the physical, social, and cyber dimensions of cities are increasingly integrated. This has led to the emergence of various intelligent applications, which, through continuous interaction with people, have created numerous new use cases. Meanwhile, it has introduced new problems, risks, and challenges for the future development of human settlements. This report explores the definition and fundamental functions of urban intelligence, and, in light of research topics, introduces typical use cases and future trends of urban intelligence. The growing availability of information offers new opportunities for scientifically understanding complex urban systems. Urban intelligence, by integrating data resources and artificial intelligence, fully utilizes multi-source heterogeneous data across different spatial scales and temporal frequencies to promote synergy between various subsystems and domains, aiming to support more responsive, precise, and forward-looking planning and operation. Urban intelligence will also help scientifically understand the nonlinear relationships between ecological, economic, and social factors related to human habitation, providing support for coordinated digital transformation and urban management. In response to the challenges posed by the rapid emergence of information technology, such as data silos, information overload, disorderly development, and technological ethics, the report further discusses the guiding significance of planning in the development and utilization of urban information technology, as well as the basic principles that urban intelligence should follow. Looking forward, urban intelligence will provide scientific approaches and technical solutions for sustainable cities, and it will also become an important driving force for new productive capacities and high-quality urban development.
8:40 - 9:00 Urban Segregation Going beyond Residential Neighbourhoods
Xiaowen Dong (Department of Engineering, University of Oxford, UK)
Abstract: Urban income segregation is a widespread phenomenon that challenges societies across the globe. Classical studies on segregation have largely focused on the geographic distribution of residential neighbourhoods. In this talk, I will present some of our recent work on understanding segregation from a behavioural perspective, that is, segregation measured based on patterns of individual behaviours and social interactions. These results provide novel insights into a new form of urban segregation with socioeconomic implications.
9:00 - 9:20 Sociocognitive Observatories: Discovering and Augmenting Cognitive Structure Using Large-Scale Social Data and Artificial Intelligence
Douglas Guilbeault (Stanford University, USA)
Abstract: What can we learn about the structure of the mind, both human and artificial, using large-scale social data, such as the textual and visual data flowing through search engines and social media platforms? In this keynote, I present a diverse range of studies showing how large-scale social data online can serve as a sociocognitive observatory for measuring the structure of both individual and collective minds, ranging from the structure of embodied cognition to the psychological biases that drive the formation of stereotypes. I will give special attention to presenting the results of a study we recently published in Nature which demonstrates how combining large-scale image and text data from online sources, analyzed via artificial intelligence, can reveal the multimodal structure of gender stereotypes. I will then show how these methods can further identify the cognitive structure of multidimensional stereotypes (e.g., gendered ageism) not only in human minds, but also in the representations and judgments formed by generative AI. Throughout, I will emphasize that sociocognitive observatories are useful not only for testing existing theories, but also for enabling cognitive and cultural discoveries. As an example, I will discuss ongoing work that harnesses sociocognitive observatories to unveil hidden connections between the cognitive structure of social types (e.g., gender) and the concreteness and abstractness of concepts across domains, using both textual and visual data online, as well as large-scale human crowdsourcing and the representational embeddings of large language models. The ability to measure and predict the evolution of such cognitive structure using real-time online data streams naturally raises the question of how AI can leverage such insights to augment collective intelligence and creativity. Opportunities for augmentation, which become increasingly accessible through the integration of computer science, cognitive science, and cultural sociology, will be discussed.
9:20 - 9:40 The Social Psychology of Large-Scale Societies
Joshua Jackson (Booth School of Business, University of Chicago, USA)
Abstract: Most modern humans live in large-scale societies filled with strangers. How can we navigate these societies without social life descending into conflict and chaos? Scholars since antiquity have primarily pointed to societal institutions like legal codes and moralizing religions for enforcing and coordinating large-scale cooperation. Here, I will focus on the overlooked role of social psychological heuristics that help people infer their partners’ cooperative intent, priorities, and abilities. Concepts like “morality,” traits like “agreeableness,” and social identities like “Canadian” are all inference heuristics that help us use known information to predict the contents of unknown minds. I suggest that these heuristics are more widespread than previously acknowledged, and became especially pervasive as societies became larger and more diverse. I will present converging research streams supporting the rise of these social heuristics in large-scale societies. First, field studies of Hadza hunter-gatherers, cross-cultural survey data, and natural language processing (NLP) analyses of historical English literature document the emergence of broad, generalized traits connoting morality and warmth in modern language and social cognition. Second, NLP analyses of social media communities, ethnographic analyses of non-industrial societies, and large-scale NLP analyses of Chinese historical records show that social identities become more salient and affectively polarized as groups become larger. I will conclude by suggesting that social psychological phenomena such as traits, social identities, and even values function as linguistically embedded and transmitted heuristics that act as inference tools for navigating partner selection in large-scale groups.
9:40 - 10:00 Enhancing Return Forecasting Using LSTM with Agent-based Synthetic Data
Lijian Wei 韦立坚 (School of Business, Sun Yat-sen University)
Abstract: To be Announced
10:00 - 10:15 Tea Break
10:15 - 10:35 SCHOLAT: a Scholar-Centered Social Network
Yong Tang 汤庸 (School of Computer Science, South China Normal University)
Abstract: Social networks are changing our daily lives. In order to meet the needs of research and teaching, we designed a social network named SCHOLAT, which provides a platform for scholars to cooperate in research and teaching. In this talk, I will briefly introduce the usage of SCHOLAT through real examples, analyze the big data and knowledge in SCHOLAT, and propose an application mode of SCHOLAT+. Finally, I'll introduce several applications based on SCHOLAT.
10:35 - 10:55 On the Responsible Use of Pseudo-Random Number Generators in Applied Scientific Research
Charles Rahal (Leverhulme Centre for Demographic Science, University of Oxford, UK)
Abstract: The current best practice in 'Open', 'Reproducible', and 'Responsible' research is to hide the variation caused by pseudo-random number generators (PRNGs) through the arbitrary use of a 'seed' or 'random state' in algorithmic pipelines. However, in the process of doing so, we argue that researchers index into the scientific record an insurmountable number of outcomes with seeming certainty, when they are in fact anything but certain: eliminating this variation is the opposite of what responsible researchers and practitioners should be doing. PRNGs are almost ubiquitous in some research areas -- occurring in a large proportion of quantitative and computational research designs -- and the potential variation in the estimand or outcome of interest is hitherto significantly under-appreciated. We present a substantial replication project involving highly published work, primarily in the form of Monte Carlo simulations, Machine Learning, and many more traditional inferential designs across the quantitative and computational health and social sciences. We show just how large the variation caused by the instantiation of PRNGs can be. We conclude with recommendations on how to embrace this variation for the betterment of scientific society, and how to responsibly design research which legitimately reduces and visualizes it where possible.
10:55 - 11:15 Identifying Disinformation from Online Social Media via Dynamic Modeling across Propagation Stages
Shuai Xu 胥帅 (School of Computer Science, Nanjing University of Aeronautics and Astronautics)
Abstract: Identifying disinformation from online social media is crucial for maintaining a credible cyberspace. Although features from the content and propagation topology are widely exploited by existing studies to distinguish disinformation from normal content, they are becoming less effective, as content can be intentionally written to mislead readers and topological features are difficult to extract due to the high variance and diversity of reposting trees. Moreover, related works mainly focus on modeling the complete information propagation event, ignoring the staged evolution patterns along with propagation, which may also degrade the detection performance. In this paper, we conceive and implement a novel framework called DMPS for identifying disinformation, which dynamically models diverse topological structures of reposting trees as well as the textual content streams across different propagation stages. In particular, DMPS learns expressive representations of the structural features via meta-trees and extracts sequential features of the content for intra-stage modeling, and then captures temporal dependencies for inter-stage modeling. The whole framework is optimized in a binary classification manner. Experiments based on multilingual social media datasets validate the effectiveness and superiority of DMPS over state-of-the-art models. We believe that this study can provide insights for crisis management in response to disinformation in social network campaigns.
11:15 - 11:35 A Theory-Driven Deep Learning Model for Predicting Co-Investment Partner Selection
Hu Yang 杨虎 (School of Information, Central University of Finance and Economics)
Abstract: To be Announced
11:55 - 12:15 ChatGPT vs Social Survey: Understanding the Objective and Subjective Society
Muzhi Zhou 周穆之 (Urban Governance and Design Thrust, The Hong Kong University of Science and Technology (Guangzhou))
Abstract: To be Announced
12:15 - 14:00 Award Ceremony & Lunch

Workshop

14:00 - 15:00 Community Governance: Intelligent Grassroots Governance and Community Construction
Jar Der Luo 罗家德 (School of Sociology, Tsinghua University)
15:10 - 16:40 Large Language Models for Urban Studies (Using R)
Chaosu Li 李超骕 (Urban Governance and Design Thrust, The Hong Kong University of Science and Technology (Guangzhou))
15:10 - 17:40 Geospatial Data Analysis (Using GeoDa)
Ge Lin Kan 阚林戈 (Urban Governance and Design Thrust, The Hong Kong University of Science and Technology (Guangzhou))

End of Conference