Three meeting points between CA and AI – Saul Albert


I gave this keynote at the first European Conference on Conversation Analysis (ECCA 2020), which, due to C-19, had to be delivered as a video rather than as a stand-up talk.

I tried to combine a film essay with a research presentation of work in progress, so it didn’t always work to include references on every slide. I’ve added them below, with links to the data used where available.

Abstract

Sacks’ (1963) first published paper, on ‘sociological description’, uses the metaphor of a mysterious ‘talking and doing’ machine whose function researchers from different disciplines describe in incompatible, contradictory ways. We may soon find ourselves in a situation similar to the one Sacks describes as AI continues to permeate the social sciences, and CA begins to encounter AI either as an object of research, as a research tool, or more likely as a pervasive feature of both.

There is currently a thriving industry in ‘Conversational AI’ and AI-based tools that claim to emulate or analyze talk, but both the study and the use of AI within CA are still uncommon. Although a growing body of literature uses CA to study social robotics, voice interfaces, and conversational user experience design (Pelikan & Broth, 2016; Porcheron et al., 2018), few conversation analysts even use digital tools, let alone the statistical and computational methods that underpin conversational AI. Likewise, conversational AI researchers and developers rarely cite CA research and have only recently become interested in CA as a possible solution to hard problems in natural language processing (NLP). This situation presents an opportunity for mutual engagement between conversational AI and CA (Housley et al., 2019). To spark debate on this issue, I will present three projects that combine AI and CA in very different ways and discuss the implications and possibilities for combined research programs.

The first project uses a series of single-case analyses to explore recordings in which an advanced conversational AI successfully makes an appointment over the phone with a human call-taker. The second revisits the debate over the use of automatic speech recognition for CA transcription (Moore, 2015) in light of recent significant advances in AI-based speech-to-text, and includes a live demo of ‘Gailbot’, a Jeffersonian automated transcription system. The third project both uses and studies AI in an applied CA context. Using video analysis, this research asks how a disabled person and their care worker interact when using an AI-based voice interface and a co-designed ‘home automation’ system as part of the household routine of waking up, eating and performing personal care. Data are drawn from a ~500-hour video dataset recorded by the participants using a voice-controlled, AI-based ‘smart security camera’ system.

These three examples of CA’s potential interpretations and uses of AI’s ‘talking and doing’ machines provide material for debate about how CA research programs might conceptualize AI, and use or combine it with CA in mutually informative ways.

Videos (in order of appearance)

The Senster. (2007, March 29).

MIT AI Lab. (2011, September 25).

Keynote (Google I/O ’18). (2018, May 9).

Online Data

Linguistic Data Consortium. (2013). CABank CallHome English Corpus [Data set]. TalkBank.

Jefferson, G. (2007). CABank English Jefferson NB Corpus [Data set]. TalkBank.

Bibliography

Agre, P. (1997). Toward a critical technical practice: Lessons learned in trying to reform AI. In Social Science, Technical Systems, and Cooperative Work: Beyond the Great Divide. Erlbaum.

Alač, M., Gluzman, Y., Aflatoun, T., Bari, A., Jing, B., & Mozqueda, G. (2020). How Everyday Interactions with Digital Voice Assistants Resist a Return to the Individual. Evental Aesthetics, 9(1), 51.

Berger, I., Viney, R., & Rae, J. P. (2016). Do continuing states of incipient talk exist? Journal of Pragmatics, 91, 29–44.

Bolden, G. B. (2015). Transcribing as Research: “Manual” Transcription and Conversation Analysis. Research on Language and Social Interaction, 48(3), 276–280.

Brooker, P., Dutton, W., & Mair, M. (2019). The new ghosts in the machine: “Pragmatist” AI and the conceptual perils of anthropomorphic description. Ethnographic Studies, 16, 272–298.

Button, G. (1990). Going Up a Blind Alley: Conflating Conversation Analysis and Computational Modelling. In P. Luff, N. Gilbert, & D. Frohlich (Eds.), Computers and Conversation (pp. 67–90). Academic Press.

Button, G., & Dourish, P. (1996). Technomethodology: Paradoxes and possibilities. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.

Button, G., & Sharrock, W. (1996). Project work: The organisation of collaborative design and development in software engineering. Computer Supported Cooperative Work (CSCW), 5(4), 369–386.

Casino, T., & Freenor, M. (2018). An introduction to Google Duplex and natural conversations. WillowTree.

Duca, D. (2019). Who’s disrupting transcription in academia? SAGE Ocean.

Fischer, J. E., Reeves, S., Porcheron, M., & Sikveland, R. O. (2019). Progressivity for voice interface design. Proceedings of the 1st International Conference on Conversational User Interfaces – CUI ’19, 1–8.

Garfinkel, H. (1967). Studies in ethnomethodology. Prentice-Hall.

Goodwin, C. (1996). Transparent vision. In E. Ochs, E. A. Schegloff, & S. A. Thompson (Eds.), Interaction and Grammar (pp. 370–404). Cambridge University Press.

Heath, C., & Luff, P. (1992). Collaboration and control: Crisis management and multimedia technology in London Underground Line Control Rooms. Computer Supported Cooperative Work (CSCW), 1(1–2), 69–94.

Heritage, J. (1984). Garfinkel and ethnomethodology. Polity Press.

Heritage, J. (1988). Explanations as accounts: A conversation analytic perspective. In C. Antaki (Ed.), Analysing Everyday Explanation: A Casebook of Methods (pp. 127–144). Sage Publications.

Hoey, E. M. (2017). Lapse organization in interaction [PhD Thesis, Max Planck Institute for Psycholinguistics, Radboud University, Nijmegen].

Housley, W., Albert, S., & Stokoe, E. (2019). Natural Action Processing. In J. E. Fischer, S. Martindale, M. Porcheron, S. Reeves, & J. Spence (Eds.), Proceedings of the 2019 Halfway to the Future Symposium (pp. 1–4). Association for Computing Machinery.

Kendrick, K. H. (2017). Using Conversation Analysis in the Lab. Research on Language and Social Interaction, 50(1), 1–11.

Lee, S.-H. (2006). Second summonings in Korean telephone conversation openings. Language in Society, 35(02).

Leviathan, Y., & Matias, Y. (2018). Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone [Blog]. Google AI Blog.

Local, J., & Walker, G. (2005). Methodological Imperatives for Investigating the Phonetic Organization and Phonological Structures of Spontaneous Speech. Phonetica, 62(2–4), 120–130.

Luff, P., Gilbert, N., & Frohlich, D. (Eds.). (1990). Computers and Conversation. Academic Press.

Moore, R. J. (2015). Automated Transcription and Conversation Analysis. Research on Language and Social Interaction, 48(3), 253–270.

Ogden, R. (2015). Data Always Invite Us to Listen Again: Arguments for Mixing Our Methods. Research on Language and Social Interaction, 48(3), 271–275.

O’Leary, D. E. (2019). Google’s Duplex: Pretending to be human. Intelligent Systems in Accounting, Finance and Management, 26(1), 46–53.

Pelikan, H. R. M., & Broth, M. (2016). Why That Nao? Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems – CHI ’16.

Pelikan, H. R. M., Broth, M., & Keevallik, L. (2020). “Are You Sad, Cozmo?”: How Humans Make Sense of a Home Robot’s Emotion Displays. Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 461–470.

Porcheron, M., Fischer, J. E., Reeves, S., & Sharples, S. (2018). Voice Interfaces in Everyday Life. Proceedings of the 2018 ACM Conference on Human Factors in Computing Systems (CHI ’18).

Reeves, S. (2017). Some conversational challenges of talking with machines. Talking with Conversational Agents in Collaborative Action, Workshop at the 20th ACM Conference on Computer-Supported Cooperative Work and Social Computing.

Relieu, M., Sahin, M., & Francillon, A. (2019). Lenny the bot as a resource for sequential analysis: Exploring the treatment of Next Turn Repair Initiation in the beginnings of unsolicited calls.

Robles, J. S., DiDomenico, S., & Raclaw, J. (2018). Doing being an ordinary technology and social media user. Language & Communication, 60, 150–167.

Sacks, H. (1984). On doing “being ordinary”. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in conversation analysis (pp. 413–429). Cambridge University Press.

Sacks, H. (1987). On the preferences for agreement and contiguity in sequences in conversation. In G. Button & J. R. E. Lee (Eds.), Talk and social organisation (pp. 54–69). Multilingual Matters.

Sacks, H. (1995a). Lectures on conversation: Vol. II (G. Jefferson, Ed.). Wiley-Blackwell.

Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.

Sahin, M., Relieu, M., & Francillon, A. (2017). Using chatbots against voice spam: Analyzing Lenny’s effectiveness. Proceedings of the Thirteenth Symposium on Usable Privacy and Security, 319–337.

Schegloff, E. A. (1988). On an Actual Virtual Servo-Mechanism for Guessing Bad News: A Single Case Conjecture. Social Problems, 35(4), 442–457.

Schegloff, E. A. (1993). Reflections on Quantification in the Study of Conversation. Research on Language and Social Interaction, 26(1), 99–128.

Schegloff, E. A. (2004). Answering the Phone. In G. H. Lerner (Ed.), Conversation Analysis: Studies from the First Generation (pp. 63–109). John Benjamins Publishing Company.

Schegloff, E. A. (2010). Some Other “Uh(m)”s. Discourse Processes, 47(2), 130–174.

Soltau, H., Saon, G., & Kingsbury, B. (2010). The IBM Attila speech recognition toolkit. 2010 IEEE Spoken Language Technology Workshop, 97–102.

Stivers, T. (2015). Coding Social Interaction: A Heretical Approach in Conversation Analysis? Research on Language and Social Interaction, 48(1), 1–19.

Stokoe, E. (2011). Simulated Interaction and Communication Skills Training: The ‘Conversation-Analytic Role-Play Method’. In Applied Conversation Analysis (pp. 119–139). Palgrave Macmillan UK.

Stokoe, E. (2013). The (In)Authenticity of Simulated Talk: Comparing Role-Played and Actual Interaction and the Implications for Communication Training. Research on Language and Social Interaction, 46(2), 165–185.

Stokoe, E. (2014). The Conversation Analytic Role-play Method (CARM): A Method for Training Communication Skills as an Alternative to Simulated Role-play. Research on Language and Social Interaction, 47(3), 255–265.

Stokoe, E., Sikveland, R. O., Albert, S., Hamann, M., & Housley, W. (2020). Can humans simulate talking like other humans? Comparing simulated clients to real customers in service inquiries. Discourse Studies, 22(1), 87–109.

Turing, A. (1950). Computing machinery and intelligence. Mind, 59, 433–460.

Walker, G. (2017). Pitch and the Projection of More Talk. Research on Language and Social Interaction, 50(2), 206–225.

Wong, J. C. (2019, May 29). “A white-collar sweatshop”: Google Assistant contractors allege wage theft. The Guardian.


