S9-AI for Social Work Research, Data Science, and Methodology

Session 9

AI for Social Work Research, Data Science, and Methodology

Moderator: Dr. Wenjin Wang (Assistant Professor, City University of Macau, China)

O9.1 Examining the Boundaries of LLM-Generated Silicon Samples as Substitutes for Human Respondents in Social Surveys: A Validation Study of Synthetic Data Based on the KAP Framework

*Wenjie Duan¹, Yao Tang¹

¹Department of Social Work, East China University of Science and Technology, Shanghai, China

Abstract

Background and Purpose: Traditional social science and social work surveys are often constrained by high costs, long timelines, and limited representativeness. With advances in large language models (LLMs) for content generation and role simulation, synthetic surveys using silicon samples to emulate human respondents have gained traction. However, their capacity to replace real participants in evidence-based social work research remains debated. Prior work has typically lacked a clear psychological and behavioral framework and relied on fragmented indicators, leaving the limits of substitutability insufficiently tested. This study empirically delineates when LLM-generated silicon samples may fail or remain defensible as substitutes for human respondents in social work research.

Method: This study adopts the Knowledge–Attitude–Practice (KAP) framework and draws on data from the 2022 China Social Work Longitudinal Survey (CSWLS 2022). Silicon samples were generated by assigning unified demographic characteristics and contextual prompts to the LLM. Using correlation analysis, paired-sample t-tests, and distributional comparisons, we systematically evaluate replication performance at both the item level and the aggregate level, comparing response patterns, mean differences, and distributional properties across knowledge, attitude, and behavior dimensions.
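The item-level comparison described above can be illustrated with a minimal sketch. It assumes each silicon respondent is paired with the human respondent whose demographic profile was used in the prompt. The scores and the helper names `pearson_r` and `paired_t` are hypothetical and are not taken from CSWLS 2022 or from the authors' analysis code.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for x minus y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical scores on one attitude item (1-5 scale), one pair per profile.
human = [3.0, 4.0, 2.0, 5.0, 4.0]
silicon = [4.0, 4.0, 3.0, 5.0, 5.0]

r = pearson_r(human, silicon)
t, df = paired_t(human, silicon)
print(f"r = {r:.3f}, t({df}) = {t:.3f}")
```

In this toy data the two series track each other closely while the silicon scores sit higher on average, which is the kind of pattern a high correlation combined with a significant paired difference would signal.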

Results: Grounded in the Knowledge–Attitude–Practice (KAP) framework, this study finds that LLM replication performance varies by construct due to differences in generative mechanisms. At the attitude level, attitudes are linguistically mediated evaluative structures reflecting semantic associations and value orientations. These features are readily captured by probabilistic text generation, so silicon samples perform best, closely matching human response directions and distributions, albeit with systematically more positive evaluations. At the knowledge level, human cognition is constrained by information asymmetries, whereas LLMs draw on extensive training data and strong retrieval capacity. Consequently, knowledge scores are markedly inflated and highly concentrated, producing the largest divergence from real samples and masking genuine individual differences. At the behavioral level, real actions are shaped by contextual and material constraints. LLMs, lacking embodied experience, tend to generate normative and idealized responses, failing to capture the variability and situational contingency of actual human behavior.

Conclusions and Implications: In social work research and education, silicon samples cannot yet directly replace real respondents, and their use in surveys and evaluations has clear limits. They are unsuitable for practice-based assessments that seek to capture clients’ real needs, identify knowledge gaps, or observe behavior in complex contexts. Instead, large language models are better used as auxiliary tools for questionnaire pretesting and optimization, for exploring macro-level attitudinal patterns, and for benchmarking professional ethics and behavioral norms in social work. Rather than pursuing ever higher-fidelity substitution, future work should shift toward a human–AI collaborative survey model. Silicon samples can assist with semantic refinement and logical checks during instrument development, whereas formal data collection and practice interventions must rely on human participants to reflect group heterogeneity and lived experience. The fidelity of silicon samples may be further improved by specifying finer demographic attributes in prompts and incorporating explicit ethical scenarios or resource constraints.

O9.2 From Digital Operation to Evaluative Agency: Developing a Cognitively Calibrated eHealth Literacy Scale for Urban Older Adults

*Yajun SONG¹, Yuling ZHANG², Zeyi QI¹, Xuesong HE¹

¹Department of Social Work, East China University of Science and Technology, Shanghai, China

Abstract

Background and Objectives: Electronic health literacy (eHL) remains operationally ambiguous in geriatric research and clinical practice. While community programmes frequently target eHL improvements, frontline practitioners lack calibrated instruments to identify specific ability levels or justify intervention intensity. Furthermore, older adults’ digital engagement is mediated by contextual factors such as trust and family support, necessitating age-appropriate measurement. This study developed a practice-facing eHL scale for older adults featuring evidence-based content validity and an explicit difficulty-grading framework to support clinical triage and programme matching. The research investigated which items best capture geriatric eHL and whether expert consensus can establish difficulty levels that reflect practical utility rather than abstract cognitive taxonomies.

Methods: A three-stage consensus study was conducted, comprising two Delphi rounds and an intervening online expert panel. In Round 1, 19 experts evaluated a 34-item pool for relevance and clarity using 4-point scales and assessed proposed difficulty levels (L1–L6). Item-level content validity indices (I-CVI) were calculated, with retention requiring a relevance I-CVI of at least 0.78 and a clarity I-CVI of at least 0.90. Difficulty consensus was defined by a support rate of 75 percent or higher, a median of 4 or higher, and an interquartile range (IQR) of 1 or less. Round 2 focus groups (the online expert panel) refined construct boundaries and item wording, while Round 3 re-assessed the revised items to finalise the instrument.
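The retention and consensus rules above can be sketched as follows. The I-CVI is computed in the usual way, as the share of experts rating an item 3 or 4 on the 4-point scale. The abstract does not state how the support rate was defined or which scale the difficulty judgements used, so the sketch assumes a 5-point agreement scale with support meaning a rating of 4 or 5. All ratings and function names are hypothetical.

```python
from statistics import median, quantiles

def i_cvi(ratings):
    """Share of experts rating the item 3 or 4 on a 4-point scale."""
    return sum(r >= 3 for r in ratings) / len(ratings)

def retain_item(relevance, clarity, rel_min=0.78, cla_min=0.90):
    """Apply the relevance and clarity I-CVI thresholds from the abstract."""
    return i_cvi(relevance) >= rel_min and i_cvi(clarity) >= cla_min

def difficulty_consensus(agreement, support_cut=4):
    """Support rate >= 75%, median >= 4, IQR <= 1.

    Assumption: agreement is rated 1-5 and 'support' means >= support_cut.
    """
    support = sum(a >= support_cut for a in agreement) / len(agreement)
    q1, _, q3 = quantiles(agreement, n=4, method="inclusive")
    return support >= 0.75 and median(agreement) >= 4 and (q3 - q1) <= 1

# Hypothetical ratings from a 19-member panel for a single item.
relevance = [4] * 12 + [3] * 4 + [2] * 3   # 16 of 19 rate it 3 or 4
clarity = [4] * 10 + [3] * 5 + [2] * 4     # 15 of 19 rate it 3 or 4
agreement = [5] * 6 + [4] * 9 + [3] * 4    # agreement with proposed level

print(round(i_cvi(relevance), 3), round(i_cvi(clarity), 3))
print(retain_item(relevance, clarity), difficulty_consensus(agreement))
```

In this example the item clears the relevance threshold but not the clarity threshold, so it would be flagged for rewording rather than retained as written.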

Results: Initial results indicated high perceived importance but limited age-appropriateness in phrasing, with most items failing clarity thresholds. Experts eliminated four items that measured general digital operations rather than specific eHL. Difficulty grading achieved partial convergence on 13 items, revealing that judgements were sensitive to how older adults enact tasks in real-world contexts. Qualitative synthesis from Round 2 yielded five themes guiding revision: construct boundary focus, multi-step behaviours, ability versus habit, linguistic accessibility, and contextual realism. The final 21-item scale demonstrated reduced redundancy, with difficulty levels primarily clustering at Levels 3 and 4.

Conclusions and Implications: This study produced a concise 21-item eHL scale tailored for older adults and graded for practice-facing assessment. The concentration of items at Levels 3 and 4 suggests that the most actionable distinctions for social work practice lie in mid-level competencies, such as navigating services and managing digital risks, rather than basic awareness. These findings highlight a tension between cognitive classifications and real-world performance, reinforcing the necessity of empirically validated grading. Future research should examine the psychometric structure and criterion validity of the scale, particularly regarding service navigation outcomes, while acknowledging that creative digital outputs may not be a definitive indicator of high eHL in later life.

O9.3 Groundbreaking Connections Across Generations Through Artificial Intelligence

*Alan Lai¹, Ricky Hou¹

¹Beijing Normal-Hong Kong Baptist University, China

Abstract

This presentation examines how artificial intelligence (AI) is revolutionizing intergenerational engagement by introducing the Scale of Intergenerational Engagement — AI version (SIE-AI), a seven-level framework that maps the progression from basic AI-powered sentiment analysis to the integration of AI companions and smart communities. Drawing on real-world examples such as AI-driven sentiment monitoring to combat ageism, therapeutic robots that facilitate cross-generational interaction, digital storytelling platforms, and smart village initiatives, we demonstrate that AI technologies are already enabling new forms of connection and collaboration across age groups. The discussion highlights both the opportunities and challenges of leveraging AI to address issues like social isolation, digital literacy gaps, and generational fragmentation. Ultimately, the presentation argues that while foundational tools and programs are in place, realizing the full potential of AI for social innovation requires intentional collaboration and a commitment to taking the “one more step” toward inclusive, resilient communities where all generations can thrive together.

O9.4 Measurement and Influence Mechanism of Digital Security among Older People

*Nan Qin¹, Xuan Gu², Qianyu Wen¹, Sasa Song¹, Yixuan Liu¹, Yuan Liu¹, Yashi Li¹, Yuyu Zhang¹, Fulin Wang¹

¹Dept. of Applied Sociology, Guangdong University of Finance and Economics, Guangzhou, China; ²Dept. of Social Work, School of Philosophy and Social Development, Shandong University, Jinan, China

Abstract

Background and Objectives: The advancement of digital intelligence technology presents a critical opportunity for China to proactively address population aging. However, owing to barriers such as the "digital divide," the vast majority of older adults have very limited access to and use of such technologies. In this context, subjective psychological factors exert a more significant influence on their willingness to adopt digital intelligence tools, and the role of digital security among these factors cannot be overlooked. Currently, there is a lack of specialized measurement tools for digital security among older adults and of research on the mechanisms that influence it. This study aims to explore the structure of this construct and its influencing factors based on Bandura's triadic reciprocal determinism.

Methods: A 19-item initial scale was developed through literature review, expert consultation, and interviews with older participants. The scale was refined using exploratory factor analysis (EFA) based on data from 412 structured interviews conducted in Foshan, Guangdong in 2025. Structural equation modeling (SEM) was employed to test the hypothesized relationships between environmental, cognitive, and behavioral factors and digital security.

Results: EFA identified three dimensions of digital security: threat perception (α=0.858), sense of control (α=0.745), and belongingness (α=0.801), which together explained 60.14% of the variance. SEM results revealed that: (1) the three-dimension factor structure showed good model fit; (2) digital risk environment directly influenced threat perception (β=-0.153, p<0.05) and belongingness (β=-0.171, p<0.01); (3) risk perception positively predicted threat perception (β=0.348, p<0.001); and (4) risk coping significantly enhanced sense of control (β=0.183, p<0.01) and belongingness (β=0.341, p<0.001).
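For readers unfamiliar with the reliability coefficients reported above, the following minimal sketch shows how Cronbach's α is computed for one subscale from a respondents-by-items score matrix. The data and the function name `cronbach_alpha` are hypothetical and unrelated to the Foshan sample.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(rows[0])                                  # number of items
    items = list(zip(*rows))                          # transpose to item columns
    item_var = sum(variance(col) for col in items)    # sum of item variances
    total_var = variance([sum(r) for r in rows])      # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical responses: five respondents, three items on a 1-5 scale.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items in a subscale vary together. The toy matrix here is deliberately consistent and so yields a higher α than the coefficients reported in the abstract.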

Conclusions and Implications: This study proposes a reliable scale for assessing digital security among older adults and clarifies its underlying mechanism in terms of the interactions among individual cognition, behavior, and environment. The findings point to targeted interventions such as anti-fraud education, age-friendly digital skills training, and the promotion of intergenerational digital companionship.