Comparar texto

Encontre a diferença entre dois arquivos de texto

Real-time diff

Unified diff

Collapse lines

Highlight change

Syntax highlighting

Ferramentas

Diffchecker Desktop The most secure way to run Diffchecker. Get the Diffchecker Desktop app: your diffs never leave your computer!Get Desktop

Ted Underwood Podcast

Created 4 years agoDiff never expires

601 removals

33 lines

579 additions

68 lines

Hi, Ben. It's a pleasure to be here,

Ted Underwood: Hi Ben, it’s a pleasure to be here.

So I want to get started by, uh, asking, uh, how you got started integrating digital methods into your research. Since as I understand it, your formal academic background is in English literature.

Ben: I wanted to get started by asking how you got started integrating digital methods into your research, since as I understand it your formal academic background is in English literature.

right? It's, um it seems like a bit of an unusual turn, but it actually has a long history. Back in the 19 nineties, when I was in grad school, it was already beginning to be clear that there were going to be opportunities as our digital libraries got bigger to pose questions about. Oh, you know, the evolution of ideas, development of literary form. And I tried to do that a bit in the nineties, using the very limited collections of texts we had then and I published an article. But, you know, um, I didn't go much further with it because the collections were very limited. And also it wasn't easy to do things then. So, you know, fast forward to like 2000 and nine and John Unsworth, then dean of what was called the Graduate School of Library Information Sciences here at Illinois, um, got in touch with me and drew me into a project I discovered while we have Google Books now, first of all and then secondarily, and I think just as importantly, it's easy to learn stuff now, like you can just go on the Web and and, you know, search how to do something Mhm. and sort of teach yourself. It did help that I have a little bit of programming background from the eighties, but that was pretty dusty by that point. But, you know, it's just things have gotten to where we have the resources, and it was easy to to teach yourself how to do stuff. Yeah. Yeah.

Ted: Right, it seems like a bit of an unusual turn, but it actually has a long history. Back in the 1990s when I was in grad school, it was already beginning to be clear that there were going to be opportunities, as our digital libraries got bigger, to pose questions about, you know, the evolution of ideas, development of literary form, and I tried to do that a bit in the ’90s using the very limited collections of texts we had then, and I published an article. But, you know, I didn’t go much further with it, because the collections were very limited, and also it wasn’t easy to do things then, so, you know, fast forward to, like 2009, and John Unsworth, then Dean of what was called the Graduate School of Library and Information Sciences here at Illinois, got in touch with me and drew me into a project, and I discovered, “Wow, we have Google Books now,” first of all, and then secondarily, and I think just as importantly, it’s easy to learn stuff now, like you can just go on the web and search how to do something, and sort of teach yourself. It did help that I had a little bit of programming background from the ’80s, but that was pretty dusty by that point. But, you know, things had just gotten to the point where we had the resources, and it was easy to teach yourself how to do stuff.

Great. So, um, I'm gonna ask you possibly an annoying question. Um, which I think every person who works in digital humanities inevitably has to answer at some point. Um, but, uh,

Ben: Great, so, I’m going to ask you possibly an annoying question…

Go for it.

Ted: Go for it.

it's a question that often comes up, which is how you define the digital humanities. And, um, to what extent does that definition matter? Because, um, everybody seems to have their own definition. And, um, inevitably, it leads to perhaps interesting conversation. So I'm wondering what your thoughts are about that conversation.

Ben: …which I think every person who works in the Digital Humanities inevitably has to answer at some point, but, it’s a question that comes up, which is how you define the Digital Humanities, and to what extent does that definition matter, because, everybody seems to have their own definition, and inevitably it leads to, perhaps, interesting conversations. So I’m wondering what your thoughts are about that conversation?

Yeah, yeah, thank you. It's not an annoying question. It's an It's an inevitable, um, an important question, I think, actually, even though it's true that people try to avoid it because, as you say, it's a complex term, a term that people use in different ways, and the slipperiness of the term does generate sometimes friction. And but I don't think that that is an accident or something that we can sort of step around. Its is inherent to the term, and it's not. Actually, it's not accidental that the term is deliberately vague digital humanities. It could encompass, say, you know, using humanistic methods to study podcasts or blog posts or digital media. Generally, um, using traditional humanistic methods, say, to study those things, or it could mean using digital methods whatever those are. Maybe computational methods. Statistics? What have you to study? Well, podcast. But perhaps also printed books, perhaps movies from the 19 fifties. So digital methods to study more traditional media. Or, you know, it could be it could be the way scholarship is produced. It could be digital. Humanities is whatever you're doing. If you put it on the Web, it's digital humanities, and that's that's a valid way of using the term to. So there's a lot of looseness there, and I don't think it's an accident. I think it is deliberate vagueness that's constructed in order to create a concept that is loose enough to be welcoming to welcome lots of different people. Because there's a real danger that there's a lot of tension at this at this intersection between, Mhm. um the traditions of the humanities and computational media computational methods. So if there's a lot of risk that if you say, say, specifically computational humanities or humanities using numbers, some people will be like, Whoa, that is not what I signed up for. Get out of here. You know, that's a that's a real risk. Conversely, if you if you say okay, we're gonna study digital media. Some people will say, Well, I'm actually more interested in the 20th century Mhm. Yeah. more the 19th century, and I'm I'm interested in history. That's what the humanities mean to me. So there's there's these tensions there, and we tried to bridge them by constructing a term that is deliberately baggy and that works somewhat. It's worked to sort of create and a broad community of people and a lively conversation. But we shouldn't be surprised when then that dissolves and breaks apart. It was Mhm. the instability was built into that term from the beginning, so it does matter. It doesn't matter that we understand the term, but we shouldn't be surprised that it it doesn't come down to a crisp definition.

Ted: Yeah, thank you, it’s not an annoying question. It’s an inevitable and important question, I think, actually, even though it’s true that people try to avoid it, because, as you say, it’s a complex term, a term that people use in different ways, and the slipperiness of the term does generate sometimes friction. But I don’t think that that is an accident, you know, or something that we can sort of step around. It is inherent to the term, and it’s not accidental. The term is deliberately vague, digital humanities, it could encompass, say, you know, using humanistic methods to study podcasts or blog posts, or digital media generally using traditional humanistic methods to study those things. Or it could mean using digital methods, whatever those are, maybe computational methods, statistics, what have you, to study, well, podcasts, but perhaps also printed books, perhaps movies from the 1950s. So, digital methods to study more traditional media, or it could be the way scholarship is produced. It could be digital humanities is whatever you’re doing if you put it on the web, it’s digital humanities, and that’s a valid way of using the term too. So there’s a lot of looseness there, and I don’t think it’s an accident. I think it is deliberate vagueness that’s constructed in order to create a concept that is loose enough to be welcoming, to welcome lots of different people, because there’s a real danger, there’s a lot of tension at this intersection between the traditions of the humanities and computational media, computational methods, so there’s a lot of risk that if you say, specifically, computational humanities, or humanities using numbers, some people will be like “Woah, that is not what I signed up, get out of here,” you know, that’s a real risk. Conversely, if you say, okay we’re going to study digital media, some people will say, “Well, I’m actually more interested in the 20th century, or the 19th century, and I’m interested in history, that’s what the humanities mean to me.” So there’s these tensions there, and we’ve tried to bridge them by constructing a term that is deliberately baggy, and that works, somewhat, it’s worked to sort of create a broad community of people and a lively conversation, but we shouldn’t be surprised when that dissolves and breaks apart. It was, the instability was built into that term from the beginning. So it does matter, it does matter that we understand the term, but we shouldn’t be surprised that it doesn’t come down to a crisp definition.

Yeah. Great. Uh, and that kind of leads me to my next question because on our previous episode, we had Spencer Corrales, who is the digital humanities librarian here at the University of Illinois. Um, and they talked about, uh, some of this tension, particularly between research that is published an interactive digital formats, such as on a platform like scalar or America. Um, versus, uh, using digital methods in support of a more traditional scarlet communication, like a journal, article or monograph.

Ben: Yeah, great, and that kind of leads me to my next question, because on our previous episode we had Spencer Keralis, who is the digital humanities librarian here at the University of Illinois, and they talked about some of this tension, particularly between research that is published in an interactive digital format, such as on a platform like Scalar or Omeka, versus using digital methods in support of a more traditional communication, like a journal article or a monograph.

Yeah,

Ted: Yep

Um, do you agree that such a divide exists and what your thoughts are about that, And are there ways that perhaps they could work better in tandem together? Or is it perhaps to be expected, at least, that there might be somewhat of a divide there within digital humanities umbrella?

Ben: Do you agree that such a divide exists and what your thoughts are about that and are there ways that perhaps they could work better in tandem together or is it perhaps to be expected, at least, that there would be somewhat of a divide there within the digital humanities umbrella?

yeah, I mean, I do agree that the divide exists there. It's not a it's not a crisp one. Um, because I mean, even if even if your research is setting out to produce basically articles and books, if you're if you're doing that using digital methods, you're going to have data or code that needs to be preserved. Um, probably online, and you're probably also gonna visualizations. So the lines between sort of traditional scholarly formats and new platforms do get blurry. But I do agree that there, um, there are some people in digital humanities for whom stretching the boundaries of the publication format and of what counts a scholarship to include maybe, you know, digital editing, for instance, rather than thesis driven argument. There are people for whom that's central. And then there are other people who may be, you know, welcome digital editing. But they're primarily interested in producing new arguments, argument driven scholarship. And, um, it's not. There's no necessary conflict between those things, but they rub up against the external world in different places. They conflict with existing institutions in different places. So, for instance, if you're doing, um, you know, uh, if you're building digital exhibitions on a Mecca, then one of the it's very important to redefine what counts a scholarship in terms of sort of promotion and tenure review, because the role of editing and of building collections, um, is often ambiguous, at least at research universities, um, on the other, you know. So that's then becomes the point of friction between digital humanities and the rest of the world. Um, whereas if you're doing, say, like quantitative scholarship, but scholarship that's ultimately going to produce an article, Then the point of friction might be say, like, how do we How do we go about training students to do this? Because it's not in the curriculum. So and frankly, ideally, it would require a sequence of three or four courses, really, To prepare students to do that. Statistics, programming, you know, it can easily be done in one or two courses. So there there's It's not that those two things are in conflict, but that they have sort of different battles to fight. And I do think that that sometimes produces a conflict in the sense that, um, you know, people are like, Hey, we need some help over here. You know, that's that's where the conflict comes from. Mhm.

Ted: Yeah, I mean I do agree that a divide exists there, it’s not a crisp one, um, because, I mean, even if your research is setting out to produce basically articles and books, if you’re doing that using digital methods, you’re going to have data or code that needs to be preserved probably online, and you’re probably also, you know, visualizations. So the lines sort of between traditional scholarly formats and new platforms do get blurry, but I do agree that there are some people in digital humanities for whom stretching the boundaries of the publication format counts as scholarship, to include maybe digital editing, for instance, rather than thesis-driven argument. There are people for whom that’s central. And, then there are people for whom may welcome digital editing, but they’re primarily interested in producing new arguments, argument-driven scholarship. And, there’s no necessary conflict between those things, but they rub against the external world in different places. The conflict with existing institutions in different places. So for instance, if you’re doing, you know, if you’re building digital exhibitions on Omeka, then it’s very important to redefine what counts as scholarship in terms of promotion and tenure review, because the role of editing and building collections is often ambiguous, at least at research universities. That then becomes the point of friction between digital humanities and the rest of the world, whereas if you’re doing, say like, quantitative scholarship but scholarship that’s ultimately going to produce an article, then the point of friction might be, say, how do we go about training students to do this, because it’s not in the curriculum. So, and, frankly, ideally it would require a sequence of three or four courses, really to prepare students to do that, statistics programming, you know, it can’t easily be done in one or two courses. So, it’s not that two things are in conflict, but that they have sort of different battles to fight, and I do think that that produces a conflict in the sense that, you know, people are like, “Hey we need some help over here,” you know, that’s where the conflict comes from.

Yeah. And that leads me to a follow up question, which is, um, as a you know, a professor who teaches courses. Uh, in my experience, there is definitely a certain challenge in digital humanities of training students. Um, in digital methods that, uh, typically it seems like digital humanity. Students come from a humanities background, as opposed to from a computer science background. Typically anyways, um, and so oftentimes there can be quite a learning curve for students. Um, Yep, yep, yep, and often times they can be scared away by the

Ben: Yeah, and that leads me to a follow up question, which is, as a professor who teaches courses, in my experience there is definitely a certain challenges in digital humanities of training students in digital methods that, typically it seems like digital humanities students come from a humanities background as opposed to, from a computer science background, typically anyways.

yep.

Ted: Yeah

challenges involved. So how do you deal with that? And what What are ways we could perhaps do better at that.

Ben: And so oftentimes there can be quite a learning curve for students…

Yeah. I don't think we have a good solution there yet. Actually, um, that is definitely the case. And the, um I've been involving, and I'll tell you what the direction I've been involving on that and I think I'm still evolving. Um, so, like, 10 years ago or nine years ago, back in 2011 2012. Um uh, I had the idea that it would be possible to do all of this and of course, in the English department, which is where I was located then. And maybe we'd have a graduate course in the English department where it would be something like digital humanities or digital methods in literary study. And we'd we'd explore the controversies about the nature of digital humanities and then maybe along the way, introduce students to some programming and statistics and that that is so impossible that that's now Ludacris, right? Like that's actually like five courses that you're trying to compress into one. Mhm. Um, but it it seemed necessary, and in some ways it was necessary at the time because you couldn't assume that students say in an English department, we're going to expect to have to take three or four courses in this area or be willing to, because it was still very new and controversial. So you're going to get one course real realistically that was going to be the curriculum. So there were there were just a limit on what you could actually do. You could not really teach computational methods and, of course, like that. And, um so you know, I've dealt with that partly by expanding my role in university, where now I'm teaching in, um, School of Information Sciences, where there is a bigger D H curriculum. And there are more students who are likely to have taken courses in programming or data science and be able to maybe take a a more advanced course where we apply that to, say, Look at unstructured data, Look at text or images. Um, but it's still it is still definitely a challenge because the real realistically, like I say, it actually would be a three or four course sequence, not a two course sequence. And so, um, there's still considerable risk of of Russian things, and I don't think that I'm avoiding that successfully yet. To be honest with you, Mhm. it's It's going to be, uh, it's sort of like a co evolution between the way we teach this and the way that the the curricular institutions around us sort of mhm frame the topic and what they suggest is possible. There's it's, there's a There's another approach. I should say there isn't. There is another way to go about this, which is very popular, and I just I just don't think it works, either, which is to try to to try to fit it all into one course by basically ditching the programming part and say there are some user friendly tools out there. We can use those. Like, Um, um, buoyant is one tool. Very good. It's about as good as can be done in that space of doing sort of text analysis in your browser on the web. Um, there are There are some other similar tools that promise to be user friendly, and they are to a certain point. But then you rapidly will run up against the limits of what you can actually do in those to graphical user interfaces. So if we do it that way, we can squeeze it into one course or two courses. Maybe, But then then where do students go from? There is what I'm not certain. So it's that is a really big challenge. But I think ideally, um, uh, my my view would be it means what we need to do is maybe define a little better, say, three course sequence in this space.

Ted: Yep [laughs]

Yeah, that's definitely something that I've had to deal with. Um, in my experience, because, um, for listeners who don't know me, uh, was previously, uh, the technician for the Irish Centre, uh, at Southern Illinois University, Edwardsville, which that's digital humanities center there. And, um, we have a digital humanities minor there, and I was actually the first student to receive that minor. Um, but we in working on the curriculum after I received that minor, a big challenge we've had to deal with is, uh Yeah, do students actually need to have programming experience to receive that minor? And I did. But many students, our, um, have some hesitation about that, and it seems like to a certain extent, your there's a I think there's a fear, at least of those who design curriculum, that if you have too much programming involved, you're gonna scare students away. And yeah, yeah, I don't really have a good answer for that, but it's a definite challenge.

Ben: …and oftentimes they can be scared away by the challenges involved, so how do you deal with that, and what are ways we can perhaps do better at that?

I don't either. I mean, the the but here's here's what I see is likely to happen is so we can build programs in the humanities that don't have programming programming as part of them, and that's that can be a valid thing. I'm not against doing that, but there will. If we do that, there will also be programs that arise in the social sciences and in information science and in computer science, for that matter. They're beginning to happen in Department of Computer Science already that do that, use more flexible and adventurous kinds of computation. Then you can easily fit into a graphical tool because, you know, social scientists are used to using statistics and departments of information science. Computer science exists also, so it's like if we don't do it, they definitely will, because they they can. They, you know, and and they and humanities materials, movies, Mhm books, um, are are fascinating. They're you know, they're really appealing, so Department of Computer Science will definitely go for that. They're not going to wait for us to do it, so it's, you know, it's fine that we could do it both ways, but mhm. the computational way is going to is going to happen somewhere. It's just a question of where

Ted: Yeah I don’t think we have a good solution there yet, actually. That is definitely the case, and I’ve been evolving in, I’ll tell you the direction I’ve been evolving on that, and I think I’m still evolving, so like ten years ago, or nine years ago, back in 2011, 2012, I had the idea that it would be possible to do all of this in a course in the English department, which is where I was located then. And maybe we’d have a graduate course in the English department where it would be something like “Digital Humanities” or “Digital Methods in Literary Study,” and we’d, oh we’d explore the controversies about the nature of digital humanities and maybe along the way introduce students to some programming and statistics, and that is so impossible. [Both laugh] That’s now ludicrous, right, that’s actually like five course that you’re trying to compress into one, but it seemed necessary, and in some ways it was necessary at the time, because you couldn’t assume that students, say, in an English department were going to expect to have to take three or four courses in this area, or be willing to, because it was still very new and controversial, so you were going to get one course, realistically that was going to be the curriculum. So there was just a limit on what you could actually do. You could not really teach computational methods in a course like that. And so, you know, I’ve dealt with that partly by expanding my role in the university where now I’m teaching in the School of Information Sciences, where there is a bigger DH curriculum and there are more students who are likely to have taken courses in programming or data science and be able to maybe take a more advanced course to be able to apply that to say look at unstructured data, look at text or images, but it’s still definitely a challenge, because realistically, like I say, it actually would be a three or four course sequence, not a two course sequence. And so, there’s still considerable risk of rushing things, and I don’t think I’m avoiding that successfully yet, to be honest with you. It’s sort of like a coevolution between the way we teach this and the way that the curricular institutions around us sort of frame the topic, and what they suggest is possible. There’s another approach, I should say, there is another way to go about this, which is very popular, and I just don’t think it works either, which is to try to fit it all into one course by basically ditching the programming part, and say, “there are some user friendly tools out there, we can use those.” Like Voyant is one tool, a very good, it’s about as good as can be done in that space of sort of text analysis in your browser on the web. There are some other similar tools that promise to be user friendly, and they are to a certain point, but then you rapidly will run up against the limits of what you can actually do in those, say, graphical user interfaces. So, if we do it that way, we can squeeze it into one course or two courses, maybe, but then where do students go from there, is what I’m not certain. So that is a really big challenge, but I think ideally, my view would be, it means what we need to do is maybe define a little better, say a three course sequence in this space.

Um, and do you think it's better off happening from, like, the end of the humanity is going to the computer or the other way? Or

Ben: Yeah, that’s definitely something that I’ve had to deal with in my experience, because, for listeners who don’t know me, I was previously the technician for the IRIS Center at Southern Illinois University Edwardsville, which that’s the digital humanities center there, and we have a digital humanities minor there, and I was actually the first student to receive that minor, but…

I would love them to be a bridge. I would love it to be a bridge, and I think it can be. I mean, in some ways, that's the That's the promise of information. Science as a as a place is that, do you think that met? Yeah. Mhm. um you can you can have a single institution where really both of those perspectives are represented and are joining hands and collaborating. It can work, you know, It can work across campus to if you've got a humanist in a English department or history department collaborating with someone in computer science that can that can also work. But I like the the, um, feeling of, uh, school, where you've got people from a lot of different disciplinary backgrounds collaborating. So, yeah, I hope it, I hope, were able to hold that all together. But it's, you know, just as with the term digital humanities itself, you end up describing a very big arch or bridge that's fragile at lots of points, and Mhm. it's hard to hold together.

Ted: Okay

Yeah. And, uh, as we've been talking, I realized we haven't really had a whole lot of opportunity for you to talk about, um, very particular research interests. I wanted to give you the opportunity to talk about, um, some of what the computational work you've done in your research.

Ben: …in working on the curriculum after I received that minor, a big challenge we’ve had to deal with is, do students actually need to have programming experience to receive that minor, and I did, but many students are… have some hesitation about that, and it seems like, to a certain extent, I think there’s a fear at least of those that design curriculum that if you have too much programming involved you’re going to scare students away, and…

Yeah. Oh, sure.

Ted: Yeah

I mean, I mean, that's a big question. So yeah,

Ben: …I don’t really have a good answer for that [laughs], but it’s a definite challenge.

I mean, it varies. I'll organize it into had sort of things I've done and things of that, sort of maybe looking forward things that I think are getting exciting. yeah, But, um, a lot of what I've done is to use large digital libraries to pose questions about long timelines in literary history. So you know, how has the pace of narration changed? How much time passes in a typical page of a of a novel? Are we talking about a week of fictional time that passes in each page of reading Or is it the day, or is it sometimes increasingly, in recent years? It's more like two minutes per page. The pace has slowed down, and when that slowdown happened is not something. We had a good picture of a lot of literary critics, for instance, thought that that happened at the beginning of the 20th century with modernism, and now we have a big picture we can see. It was actually much more gradual and similar similar sorts of questions about concreteness, the development of concreteness in fiction. But really, it's, um you know, all these things come together in a way to to make a bigger story about literary history and the study of literature, which is to a large extent, um, our idea of what literature is for has been shaped by certain aspects of literature that only developed very recently like this emphasis on concrete particulars and brief sort of fragmentary moments that we now think is sort of that experiential vividness and particular charities is crucial to the mission of literature, so much so that it shaped the way we we think we ought to be reading and interpreting literature. But it's actually if you look at the big picture, you can see where we got that. It's a long story, a gradual story, Um, and in fact, sort of our idea that you can't use big, big numbers and, you know, panoramas to understand literature is a product of a history that if you back up, we can we can actually see the panorama that generated that. So that's that's the story I've been telling. But I think, um, in years to come, I'm interested in looking at not just at sort of, well, you know, big digital libraries and long surveys across long timelines. But I'm getting increasingly interested in understanding literature in detail, like how plot works, how suspense works, and I think it's going the the nature of machine learning and computation is evolving so rapidly that it's going to become increasingly possible for us to pose some questions that seem like less social science. The more interpretive, um, using machine learning.

Ted: I don’t either, but here’s what I see as likely to happen. We can build programs in the humanities that don’t have programming as part of them, and that can be a valid thing. I’m not against doing that. But if we do that, there will also be programs that arise in the social sciences, and in information science, and in computer science for that matter. They’re beginning to happen in departments of computer science already that use more flexible and adventurous kinds of computation than you can easily fit into a graphical tool, because, you know, social scientists are used to using statistics, and departments of information science and computer science exist also. So, it’s like if we don’t do it, they definitely will, because they can, you know [both laugh]. And humanities materials, movies, books, art, are fascinating, they’re really appealing, so departments of computer science will definitely go for that, they’re not gonna wait for us to do it, so it’s fine that, we could do it both ways, but the computational way is gonna happen somewhere, it’s just a question of where.

yeah, I've seen, um, arguments, and this is largely in a non academic context. So take this kind of with a grain of salt, but that, like, plots are becoming increasingly complex over time in media in general. Um, and, uh, Yeah. Mm mm. at least I think the argument was particularly for TV shows that, like it used to be TV shows were just linear, like, Yeah, beginning to end. And nowadays you have a lot of, like, not just time travel, but flashbacks and nonlinear storytelling. So I'd be interested in questions like that about, like, yeah, yeah, ways of seeing how, um the way narrative is constructed is yeah, changing because the conventional wisdom, oftentimes, like people are getting dumb and like which, which I don't agree with, but yeah. yeah,

Ben: And do you think it’s better off happening from the end of the humanities going to the computer or the other way, do you think…

yeah. No, no, no. It's pretty clear. It's pretty clear, I think. I think there's actually some agreement that TV has gotten more adventurous for a lot. I mean, one thing. There's a There's a classic thesis, and I don't know who to attribute it to, but just the development of the VCR or DVR meant that you could. You could return and pay attention to the texture of television. You could slow down and get the joke, whereas if you're if it's broadcast television in the 19 seventies, it's gone. You know, you you can't you can't return to it. So, yeah, um, that I think we know. But there are going to be all kinds of other questions. Um, there's, um there's some good work that's come out recently in the journal Cultural Analytics, studying the sitcom Uh huh. using face recognition and then, like which characters get attention. Um, when When in the plot, when in the Ark of the sitcom do they get attention? And we're going to be able to to study that kind of formal the formal architecture of TV genres as well as literary genres in, um, in new ways. I think, yeah, Yeah, that's exciting.

Ted: I would love them to be a bridge, I would love it to be a bridge, and I think it can be, I mean in some ways that’s the promise of information science as a place, is that you can have a single institution where really both of those perspectives are represented are joining hands and collaborating. It can work across campus too, if you’ve got a humanist in a[n] English department or history department collaborating with someone in computer science, that can also work. But I like the feeling of a school where you’ve got people from a lot of different disciplinary backgrounds collaborating. So, yeah I hope we’re able to hold that all together, but you know, just as with the term digital humanities itself, you end up describing a very big arc or bridge that’s fragile at lots of points, and it’s hard to hold together.

And, um, that leads me to the last formal question I had prepared, which is, um, and you address this already, But maybe, uh perhaps the big picture, I'm sure. I'm sure there's always more to say. Uh huh. And what? The question is, what do you see as the future of data science and the humanities?

Ben: Yeah, as we’ve been talking, I realize we haven’t really had a whole lot of opportunity for you to talk about your particular research interests, so I wanted to give you the opportunity to talk about some of the computational work you’ve done in your research.

there's more to say Yeah, yeah, yeah, I see it as being really, um capacious, Really? So you know, the kind of thing I've done, like, where it's where it's explicitly quantitative and it's about big historical stories. That's that's never going to be everything humanities are about because we're also about individuals and that's that's valid. We're about individual stories, but I I think on what's happening now, the line between data science and machine learning, or to use the term but sometimes used artificial intelligence. I kind of prefer machine learning, but it's different sides of the same coin that is a very blurry continuum. So, um, and that means that we are not just going to be studying culture in a kind of large scale social science the way, but we're going to be able to, for instance, do the things that, um, large language models do where you can give them the beginning of the story, and then they can continue it like Okay, that's how the story begins in that style, then this would be a plausible next paragraph, or they can do something somewhere to that with images. And that means we can begin to pose questions about the sort of the frame to frame movement of a video or the paragraph to paragraph movement of a story where we start to ask questions about, for instance, what makes some stories more predictable than others like, Is it easier to predict where this plot is going to go? Or we could pose questions about, like, where could this story have alternatively gone? Where could it plausibly have gone? Suppose it's written in the 18 seventies? You know what? What are the plausible alternative endings for this story? What about if we move it forward a decade? You can. We're getting models that are good enough to be able to pose that kind of question, like Mhm. what could have happened in the story under alternate circumstances and that opened. That's going to open up just a huge range of questions that are not limited to the kinds of questions we think of as appropriate for data science or social science. There are more. To be honest, they're akin to creative questions that one of the things about these generative models, generative models of images or of text is they're basically they're doing something like generating the artwork. And I don't think that that means that we're going to, like, you know, have robots, right? All our novels for us, they're they're not that good. Um, and the arc of a whole plot is something they still struggle with. Um, and I'm not sure we want that anyway. Actually, we I think it's more fun. People enjoy fan fiction because they enjoy going back and forth with the story world, right? They enjoy participation. But it does mean that the line between what we think of as analytical or critical tasks and what we think of as creative play could get really blurry. And that, to me suggests that the future of data science and humanities is not like not something that's going to want to keep walled out because it's too much like social science. It actually could be really, um, central to what we think of as things like play that are central to the purpose of the humanities. But for that to happen, it's going to need to become kind of like a lingua franca that we're more comfortable with than we are right now. Um and I think that will happen, actually, because the opportunities are just too huge. How it will happen, you know, where that happens in the curriculum is remains a bit of a mystery.

Ted: Oh sure, I mean…

Yeah. And I think and correct me if I'm wrong. But there is a certain level of, I think, like fear out in the world about, like, artificial intelligence or machine learning. Yeah. I mean, certainly more in the context of facial recognition in the use of like, uh, surveillance or what have you and

Ben: That’s a big question so… [Both laugh]

You are not wrong. You are not wrong.

Ted: …it varies, I’ll organize it under two heads, things I’ve done and thing’s, sort of, maybe looking forward, things that I think are getting exciting. But a lot of what I’ve done is use large digital libraries to pose questions about long timelines in literary history. So, you know, how has the pace of narration changed, how much time passes in a typical page of a novel? Are we talking about a week of fictional time that passes in each page of reading, or is it a day, or is it sometimes increasingly in recent years, it’s more like two minutes per page. The pace has slowed down. And when that slowdown happened, is not something we had a good picture of. A lot of literary critics, for instance, thought that that happened at the beginning of the 20th century with modernism, now that we have a big picture we can see it was much more gradual. And similar sorts of questions about concreteness, the development of concreteness in fiction. And really it’s, you know all of these things come together in a way to make a bigger story about literary history and the study of literature, which is to a large extent, our idea of what literature is for has been shaped by certain aspects of literature that only developed very recently, like this emphasis on concrete particulars and brief, sort of, fragmentary moments that we now think is sort of, that experiential vividness and particularities is crucial to the mission of literature, so much so that it’s shaped the way we think we ought to be reading and interpreting literature. But it’s actually, if you look at the big picture, you can see where we got that. It’s a long story, a gradual story, and in fact, sort of our idea that you can’t use big numbers and panoramas to understand literature is a product of a history that, if you back up, we can actually the panorama that generated that. So, that’s the story I’ve been telling. But I think in years to come, I’m interested in looking at, not just at, sort of, big digital libraries and long surveys across long timelines. But I’m getting increasingly interested in understanding literature in detail, like how plot works, how suspense works, and I think it’s going, the nature of machine learning and of computation is evolving so rapidly that is gonna become increasingly possible for us to pose some questions that seems like less social sciency, more interpretive, using machine learning.

Yes, right. And social media and yeah, yeah, big tech corporations, these kinds of fear are all interwoven. Yes. yeah.

Ben: Yeah, I’ve seen arguments, and this was largely in a non-academic context..

So I imagine that bleeds into the, uh, the academy and hesitation people might have about what you do,

Ted: Yeah

Oh, yes. Oh, yes. It doesn't just bleed into the academy a lot. A lot of that is sort of generated in the academy and and its I mean, if I'm if I'm going to be completely candid about that, partly that's a

Ben: …so take this kind of with a grain of salt, but that, like, plots are becoming increasingly complex over time in media…

Ted: Mmm

Ben: …in general, and, at least I think the argument was particularly for TV shows…

Ted: Yeah

Ben: …that there used to be TV shows were just linear, beginning to end, and nowadays you have a lot like, not just time travel, but flashbacks…

Ted: Yeah

Ben: …and non-linear storytelling, so I’d be interested in questions like that about like, ways of seeing the way how narrative is constructed…

Ted: Yeah

Ben: …is changing, because the conventional wisdom oftentimes is like, “People are getting dumb,” and like…

Ted: [Laughs] Yeah, no, no

Ben: …which, I don’t agree with, but yeah.

Ted: It’s pretty clear, I think there’s actually some agreement that TV has gotten more adventurous, for a lot of, I mean one thing, there’s a classic thesis, and I don’t know who to attribute it to, but just the development of the VCR or DVR meant that you could return and pay attention to the texture of of television, you could slow down and get the joke, whereas if it’s broadcast television in the 1970s…

Ben: [Laughs]

Ted: …it’s gone, you know, you can’t return to it. So yeah, that I think we know, but I think there are gonna be all kinds of other questions. There’s some good work that’s come out recently in the Journal of Cultural Analytics studying the sitcom using face recognition, and they’re like, “Which characters get attention, when in the plot, when in the arc of the sitcom do they get attention,” and we’re gonna be able to study that kind of formal, the formal architecture of TV genres, as well as literary genres, in new ways, I think, yeah.

Ben: Yeah, that’s exciting, and that leads me to the last formal question I had prepared, which is, and you addressed it already, but perhaps big picture

Ted: There’s more to say.

Ben: I’m sure, there’s always more to say.

Ted: Yeah.

Ben: And the question is, what do you see as the future of data science in the humanities?

Ted: Yeah, I see it as being really capacious, really, so, you know, the kind of thing I’ve done, like, where it’s explicitly quantitative and it’s about big, historical stories, that’s never going to be everything the humanities are about, because we’re also about individuals, and that’s valid. We’re about individual stories. But I think, what’s happening now, the line between data science and machine learning, or, to use a term that’s sometimes used, artificial intelligence. I kind of prefer machine learning, but it’s different sides of the same coin, that is a very blurry continuum. And that means that we are not just going to be studying culture in a kind of large-scale social sciency way, but we’re going to be able to, for instance, do the things that large language models do, where you can give them the beginning of a story, and then they continue it. Like okay, if that’s how the story begins in that style, then this would be a plausible next paragraph. Or they can do something similar to that with images, and that means we can begin to pose questions about the sort of, the frame-to-frame movement of a video, or the paragraph-to-paragraph movement of a story, where we start to ask questions about, for instance, what makes some stories more predictable than others? Like is it easier to predict where this plot is gonna go? Or we can pose questions about, like, where could this story have alternatively gone? Where could it plausibly have gone, suppose it’s written in the 1870s, you know, what are the plausible alternative endings for this story. What about if we move it forward a decade? You can, we’re getting models that are good enough to be able to pose that kind of question, like what could have happened in the story under alternate circumstances? And that’s gonna open up just a huge range of questions that are not limited to the kinds of questions we think of as appropriate for data science or social science. They’re more, to be honest, they’re akin to creative questions. One of the things about these generative models, generative models of images or of text, is they’re basically, they’re doing something like generating the artwork. Now I don’t think that that means that we’re gonna, like, you know, have robots write all our novels for us. [Ben laughs] They’re not that good, and the arc of a whole plot is something they still struggle with, and I’m not sure we want that anyway, actually. I think it’s more fun, people enjoy fan fiction because they enjoy going back and forth with a story world, right. They enjoy participation. But it does mean that the line between, what we think of as analytical or critical tasks, and what we think of as creative play, could get really blurry, and that to me suggests that the future of data science in the humanities is not something we’re gonna want to keep walled out because it’s too much like social science, it actually could be really central to what we think of as things like play, that are central to the purpose of the humanities. But for that to happen, it’s gonna need to become kind of like a lingua franca that we’re comfortable than we are right now. And I think that will happen actually because the opportunities are just too huge, but how it will happen, you know, where that happens in the curriculum, remains a bit of a mystery.

Ben: Yeah, and I think, and.. correct me if I’m wrong, but there is certain level of I think fear out in the world about like artificial intelligence or machine learning…

Ted: You are not wrong. [Both laugh]

Ben: …I mean certainly more in like the context of facial recognition.

Ted: Yes

Ben: …and the use of like surveillance or what have you…

Ted: Right and social media…

Ben: Yeah

Ted: …big tech corporations. These kinds of fear are all interwoven, yes.

Ben: Yeah so, I imagine that bleeds into the academy…

Ted: Oh yes

Ben: …in hesitations people might have about um…

Ted: Oh yes

Ben: …what you do.

Ted: It doesn’t just bleed into the academy, a lot of that is sort of generated in the academy…

Ben: Right

Ted: …and it’s, I mean if I’m gonna be completely candid about that, partly that’s a reflection of an emerging competition between universities and tech companies, which are both like, there’s this space which is like, intellectual institutions and society, which universities have had kind of a monopoly there, like oh there’s gonna be research in computer science, we’re going to be doing it. Now the tech companies are kind of claiming to be the leading edge of CS research, which does not make universities comfortable, and so that’s, part of the anxiety is fueled by just sort of general social things with social media and fears of surveillance. But within the academic context, we also have to be candid that Google is a competitor for universities, and it’s not surprising that university professors like myself are real… wary of it. So, you know, but I think it’s also valid, that, to be sure, you know the steam engine was socially disruptive and was a problem and magnified social problems and was not handled well, and is machine learning going to do all those things too? Magnify social problems, increase concentration of power, not be handled well, need kinds of regulation that we don’t yet have, yes. I mean all of that will be true, for sure. So it’s going to be, at the same time, I also think everything

Diffs salvos

Texto original

Abrir arquivo

Hello and welcome back to another episode of it takes a campus. My name is Ben, and I am currently a graduate assistant at Scali Commons. And today I'm joined with Dr Ted Underwood, who is a professor at the high school here at the University of Illinois. Dr. Underwood, welcome to the podcast. Thank you for taking time to talk to me today. 
Hi, Ben. It's a pleasure to be here,
So I want to get started by, uh, asking, uh, how you got started integrating digital methods into your research. Since as I understand it, your formal academic background is in English literature. 
right? It's, um it seems like a bit of an unusual turn, but it actually has a long history. Back in the 19 nineties, when I was in grad school, it was already beginning to be clear that there were going to be opportunities as our digital libraries got bigger to pose questions about. Oh, you know, the evolution of ideas, development of literary form. And I tried to do that a bit in the nineties, using the very limited collections of texts we had then and I published an article. But, you know, um, I didn't go much further with it because the collections were very limited. And also it wasn't easy to do things then. So, you know, fast forward to like 2000 and nine and John Unsworth, then dean of what was called the Graduate School of Library Information Sciences here at Illinois, um, got in touch with me and drew me into a project I discovered while we have Google Books now, first of all and then secondarily, and I think just as importantly, it's easy to learn stuff now, like you can just go on the Web and and, you know, search how to do something Mhm. and sort of teach yourself. It did help that I have a little bit of programming background from the eighties, but that was pretty dusty by that point. But, you know, it's just things have gotten to where we have the resources, and it was easy to to teach yourself how to do stuff. Yeah. Yeah.
Great. So, um, I'm gonna ask you possibly an annoying question. Um, which I think every person who works in digital humanities inevitably has to answer at some point. Um, but, uh,
Go for it.
it's a question that often comes up, which is how you define the digital humanities. And, um, to what extent does that definition matter? Because, um, everybody seems to have their own definition. And, um, inevitably, it leads to perhaps interesting conversation. So I'm wondering what your thoughts are about that conversation.
Yeah, yeah, thank you. It's not an annoying question. It's an It's an inevitable, um, an important question, I think, actually, even though it's true that people try to avoid it because, as you say, it's a complex term, a term that people use in different ways, and the slipperiness of the term does generate sometimes friction. And but I don't think that that is an accident or something that we can sort of step around. Its is inherent to the term, and it's not. Actually, it's not accidental that the term is deliberately vague digital humanities. It could encompass, say, you know, using humanistic methods to study podcasts or blog posts or digital media. Generally, um, using traditional humanistic methods, say, to study those things, or it could mean using digital methods whatever those are. Maybe computational methods. Statistics? What have you to study? Well, podcast. But perhaps also printed books, perhaps movies from the 19 fifties. So digital methods to study more traditional media. Or, you know, it could be it could be the way scholarship is produced. It could be digital. Humanities is whatever you're doing. If you put it on the Web, it's digital humanities, and that's that's a valid way of using the term to. So there's a lot of looseness there, and I don't think it's an accident. I think it is deliberate vagueness that's constructed in order to create a concept that is loose enough to be welcoming to welcome lots of different people. Because there's a real danger that there's a lot of tension at this at this intersection between, Mhm. um the traditions of the humanities and computational media computational methods. So if there's a lot of risk that if you say, say, specifically computational humanities or humanities using numbers, some people will be like, Whoa, that is not what I signed up for. Get out of here. You know, that's a that's a real risk. Conversely, if you if you say okay, we're gonna study digital media. Some people will say, Well, I'm actually more interested in the 20th century Mhm. Yeah. more the 19th century, and I'm I'm interested in history. That's what the humanities mean to me. So there's there's these tensions there, and we tried to bridge them by constructing a term that is deliberately baggy and that works somewhat. It's worked to sort of create and a broad community of people and a lively conversation. But we shouldn't be surprised when then that dissolves and breaks apart. It was Mhm. the instability was built into that term from the beginning, so it does matter. It doesn't matter that we understand the term, but we shouldn't be surprised that it it doesn't come down to a crisp definition.
Yeah. Great. Uh, and that kind of leads me to my next question because on our previous episode, we had Spencer Corrales, who is the digital humanities librarian here at the University of Illinois. Um, and they talked about, uh, some of this tension, particularly between research that is published an interactive digital formats, such as on a platform like scalar or America. Um, versus, uh, using digital methods in support of a more traditional scarlet communication, like a journal, article or monograph.
Yeah,
Um, do you agree that such a divide exists and what your thoughts are about that, And are there ways that perhaps they could work better in tandem together? Or is it perhaps to be expected, at least, that there might be somewhat of a divide there within digital humanities umbrella?
yeah, I mean, I do agree that the divide exists there. It's not a it's not a crisp one. Um, because I mean, even if even if your research is setting out to produce basically articles and books, if you're if you're doing that using digital methods, you're going to have data or code that needs to be preserved. Um, probably online, and you're probably also gonna visualizations. So the lines between sort of traditional scholarly formats and new platforms do get blurry. But I do agree that there, um, there are some people in digital humanities for whom stretching the boundaries of the publication format and of what counts a scholarship to include maybe, you know, digital editing, for instance, rather than thesis driven argument. There are people for whom that's central. And then there are other people who may be, you know, welcome digital editing. But they're primarily interested in producing new arguments, argument driven scholarship. And, um, it's not. There's no necessary conflict between those things, but they rub up against the external world in different places. They conflict with existing institutions in different places. So, for instance, if you're doing, um, you know, uh, if you're building digital exhibitions on a Mecca, then one of the it's very important to redefine what counts a scholarship in terms of sort of promotion and tenure review, because the role of editing and of building collections, um, is often ambiguous, at least at research universities, um, on the other, you know. So that's then becomes the point of friction between digital humanities and the rest of the world. Um, whereas if you're doing, say, like quantitative scholarship, but scholarship that's ultimately going to produce an article, Then the point of friction might be say, like, how do we How do we go about training students to do this? Because it's not in the curriculum. So and frankly, ideally, it would require a sequence of three or four courses, really, To prepare students to do that. Statistics, programming, you know, it can easily be done in one or two courses. So there there's It's not that those two things are in conflict, but that they have sort of different battles to fight. And I do think that that sometimes produces a conflict in the sense that, um, you know, people are like, Hey, we need some help over here. You know, that's that's where the conflict comes from. Mhm.
Yeah. And that leads me to a follow up question, which is, um, as a you know, a professor who teaches courses. Uh, in my experience, there is definitely a certain challenge in digital humanities of training students. Um, in digital methods that, uh, typically it seems like digital humanity. Students come from a humanities background, as opposed to from a computer science background. Typically anyways, um, and so oftentimes there can be quite a learning curve for students. Um, Yep, yep, yep, and often times they can be scared away by the
yep.
challenges involved. So how do you deal with that? And what What are ways we could perhaps do better at that.
Yeah. I don't think we have a good solution there yet. Actually, um, that is definitely the case. And the, um I've been involving, and I'll tell you what the direction I've been involving on that and I think I'm still evolving. Um, so, like, 10 years ago or nine years ago, back in 2011 2012. Um uh, I had the idea that it would be possible to do all of this and of course, in the English department, which is where I was located then. And maybe we'd have a graduate course in the English department where it would be something like digital humanities or digital methods in literary study. And we'd we'd explore the controversies about the nature of digital humanities and then maybe along the way, introduce students to some programming and statistics and that that is so impossible that that's now Ludacris, right? Like that's actually like five courses that you're trying to compress into one. Mhm. Um, but it it seemed necessary, and in some ways it was necessary at the time because you couldn't assume that students say in an English department, we're going to expect to have to take three or four courses in this area or be willing to, because it was still very new and controversial. So you're going to get one course real realistically that was going to be the curriculum. So there were there were just a limit on what you could actually do. You could not really teach computational methods and, of course, like that. And, um so you know, I've dealt with that partly by expanding my role in university, where now I'm teaching in, um, School of Information Sciences, where there is a bigger D H curriculum. And there are more students who are likely to have taken courses in programming or data science and be able to maybe take a a more advanced course where we apply that to, say, Look at unstructured data, Look at text or images. Um, but it's still it is still definitely a challenge because the real realistically, like I say, it actually would be a three or four course sequence, not a two course sequence. And so, um, there's still considerable risk of of Russian things, and I don't think that I'm avoiding that successfully yet. To be honest with you, Mhm. it's It's going to be, uh, it's sort of like a co evolution between the way we teach this and the way that the the curricular institutions around us sort of mhm frame the topic and what they suggest is possible. There's it's, there's a There's another approach. I should say there isn't. There is another way to go about this, which is very popular, and I just I just don't think it works, either, which is to try to to try to fit it all into one course by basically ditching the programming part and say there are some user friendly tools out there. We can use those. Like, Um, um, buoyant is one tool. Very good. It's about as good as can be done in that space of doing sort of text analysis in your browser on the web. Um, there are There are some other similar tools that promise to be user friendly, and they are to a certain point. But then you rapidly will run up against the limits of what you can actually do in those to graphical user interfaces. So if we do it that way, we can squeeze it into one course or two courses. Maybe, But then then where do students go from? There is what I'm not certain. So it's that is a really big challenge. But I think ideally, um, uh, my my view would be it means what we need to do is maybe define a little better, say, three course sequence in this space.
Yeah, that's definitely something that I've had to deal with. Um, in my experience, because, um, for listeners who don't know me, uh, was previously, uh, the technician for the Irish Centre, uh, at Southern Illinois University, Edwardsville, which that's digital humanities center there. And, um, we have a digital humanities minor there, and I was actually the first student to receive that minor. Um, but we in working on the curriculum after I received that minor, a big challenge we've had to deal with is, uh Yeah, do students actually need to have programming experience to receive that minor? And I did. But many students, our, um, have some hesitation about that, and it seems like to a certain extent, your there's a I think there's a fear, at least of those who design curriculum, that if you have too much programming involved, you're gonna scare students away. And yeah, yeah, I don't really have a good answer for that, but it's a definite challenge.
I don't either. I mean, the the but here's here's what I see is likely to happen is so we can build programs in the humanities that don't have programming programming as part of them, and that's that can be a valid thing. I'm not against doing that, but there will. If we do that, there will also be programs that arise in the social sciences and in information science and in computer science, for that matter. They're beginning to happen in Department of Computer Science already that do that, use more flexible and adventurous kinds of computation. Then you can easily fit into a graphical tool because, you know, social scientists are used to using statistics and departments of information science. Computer science exists also, so it's like if we don't do it, they definitely will, because they they can. They, you know, and and they and humanities materials, movies, Mhm books, um, are are fascinating. They're you know, they're really appealing, so Department of Computer Science will definitely go for that. They're not going to wait for us to do it, so it's, you know, it's fine that we could do it both ways, but mhm. the computational way is going to is going to happen somewhere. It's just a question of where
Um, and do you think it's better off happening from, like, the end of the humanity is going to the computer or the other way? Or
I would love them to be a bridge. I would love it to be a bridge, and I think it can be. I mean, in some ways, that's the That's the promise of information. Science as a as a place is that, do you think that met? Yeah. Mhm. um you can you can have a single institution where really both of those perspectives are represented and are joining hands and collaborating. It can work, you know, It can work across campus to if you've got a humanist in a English department or history department collaborating with someone in computer science that can that can also work. But I like the the, um, feeling of, uh, school, where you've got people from a lot of different disciplinary backgrounds collaborating. So, yeah, I hope it, I hope, were able to hold that all together. But it's, you know, just as with the term digital humanities itself, you end up describing a very big arch or bridge that's fragile at lots of points, and Mhm. it's hard to hold together.
Yeah. And, uh, as we've been talking, I realized we haven't really had a whole lot of opportunity for you to talk about, um, very particular research interests. I wanted to give you the opportunity to talk about, um, some of what the computational work you've done in your research.
Yeah. Oh, sure.
I mean, I mean, that's a big question. So yeah,
I mean, it varies. I'll organize it into had sort of things I've done and things of that, sort of maybe looking forward things that I think are getting exciting. yeah, But, um, a lot of what I've done is to use large digital libraries to pose questions about long timelines in literary history. So you know, how has the pace of narration changed? How much time passes in a typical page of a of a novel? Are we talking about a week of fictional time that passes in each page of reading Or is it the day, or is it sometimes increasingly, in recent years? It's more like two minutes per page. The pace has slowed down, and when that slowdown happened is not something. We had a good picture of a lot of literary critics, for instance, thought that that happened at the beginning of the 20th century with modernism, and now we have a big picture we can see. It was actually much more gradual and similar similar sorts of questions about concreteness, the development of concreteness in fiction. But really, it's, um you know, all these things come together in a way to to make a bigger story about literary history and the study of literature, which is to a large extent, um, our idea of what literature is for has been shaped by certain aspects of literature that only developed very recently like this emphasis on concrete particulars and brief sort of fragmentary moments that we now think is sort of that experiential vividness and particular charities is crucial to the mission of literature, so much so that it shaped the way we we think we ought to be reading and interpreting literature. But it's actually if you look at the big picture, you can see where we got that. It's a long story, a gradual story, Um, and in fact, sort of our idea that you can't use big, big numbers and, you know, panoramas to understand literature is a product of a history that if you back up, we can we can actually see the panorama that generated that. So that's that's the story I've been telling. But I think, um, in years to come, I'm interested in looking at not just at sort of, well, you know, big digital libraries and long surveys across long timelines. But I'm getting increasingly interested in understanding literature in detail, like how plot works, how suspense works, and I think it's going the the nature of machine learning and computation is evolving so rapidly that it's going to become increasingly possible for us to pose some questions that seem like less social science. The more interpretive, um, using machine learning.
yeah, I've seen, um, arguments, and this is largely in a non academic context. So take this kind of with a grain of salt, but that, like, plots are becoming increasingly complex over time in media in general. Um, and, uh, Yeah. Mm mm. at least I think the argument was particularly for TV shows that, like it used to be TV shows were just linear, like, Yeah, beginning to end. And nowadays you have a lot of, like, not just time travel, but flashbacks and nonlinear storytelling. So I'd be interested in questions like that about, like, yeah, yeah, ways of seeing how, um the way narrative is constructed is yeah, changing because the conventional wisdom, oftentimes, like people are getting dumb and like which, which I don't agree with, but yeah. yeah,
yeah. No, no, no. It's pretty clear. It's pretty clear, I think. I think there's actually some agreement that TV has gotten more adventurous for a lot. I mean, one thing. There's a There's a classic thesis, and I don't know who to attribute it to, but just the development of the VCR or DVR meant that you could. You could return and pay attention to the texture of television. You could slow down and get the joke, whereas if you're if it's broadcast television in the 19 seventies, it's gone. You know, you you can't you can't return to it. So, yeah, um, that I think we know. But there are going to be all kinds of other questions. Um, there's, um there's some good work that's come out recently in the journal Cultural Analytics, studying the sitcom Uh huh. using face recognition and then, like which characters get attention. Um, when When in the plot, when in the Ark of the sitcom do they get attention? And we're going to be able to to study that kind of formal the formal architecture of TV genres as well as literary genres in, um, in new ways. I think, yeah, Yeah, that's exciting.
And, um, that leads me to the last formal question I had prepared, which is, um, and you address this already, But maybe, uh perhaps the big picture, I'm sure. I'm sure there's always more to say. Uh huh. And what? The question is, what do you see as the future of data science and the humanities?
there's more to say Yeah, yeah, yeah, I see it as being really, um capacious, Really? So you know, the kind of thing I've done, like, where it's where it's explicitly quantitative and it's about big historical stories. That's that's never going to be everything humanities are about because we're also about individuals and that's that's valid. We're about individual stories, but I I think on what's happening now, the line between data science and machine learning, or to use the term but sometimes used artificial intelligence. I kind of prefer machine learning, but it's different sides of the same coin that is a very blurry continuum. So, um, and that means that we are not just going to be studying culture in a kind of large scale social science the way, but we're going to be able to, for instance, do the things that, um, large language models do where you can give them the beginning of the story, and then they can continue it like Okay, that's how the story begins in that style, then this would be a plausible next paragraph, or they can do something somewhere to that with images. And that means we can begin to pose questions about the sort of the frame to frame movement of a video or the paragraph to paragraph movement of a story where we start to ask questions about, for instance, what makes some stories more predictable than others like, Is it easier to predict where this plot is going to go? Or we could pose questions about, like, where could this story have alternatively gone? Where could it plausibly have gone? Suppose it's written in the 18 seventies? You know what? What are the plausible alternative endings for this story? What about if we move it forward a decade? You can. We're getting models that are good enough to be able to pose that kind of question, like Mhm. what could have happened in the story under alternate circumstances and that opened. That's going to open up just a huge range of questions that are not limited to the kinds of questions we think of as appropriate for data science or social science. There are more. To be honest, they're akin to creative questions that one of the things about these generative models, generative models of images or of text is they're basically they're doing something like generating the artwork. And I don't think that that means that we're going to, like, you know, have robots, right? All our novels for us, they're they're not that good. Um, and the arc of a whole plot is something they still struggle with. Um, and I'm not sure we want that anyway. Actually, we I think it's more fun. People enjoy fan fiction because they enjoy going back and forth with the story world, right? They enjoy participation. But it does mean that the line between what we think of as analytical or critical tasks and what we think of as creative play could get really blurry. And that, to me suggests that the future of data science and humanities is not like not something that's going to want to keep walled out because it's too much like social science. It actually could be really, um, central to what we think of as things like play that are central to the purpose of the humanities. But for that to happen, it's going to need to become kind of like a lingua franca that we're more comfortable with than we are right now. Um and I think that will happen, actually, because the opportunities are just too huge. How it will happen, you know, where that happens in the curriculum is remains a bit of a mystery.
Yeah. And I think and correct me if I'm wrong. But there is a certain level of, I think, like fear out in the world about, like, artificial intelligence or machine learning. Yeah. I mean, certainly more in the context of facial recognition in the use of like, uh, surveillance or what have you and
You are not wrong. You are not wrong.
Yes, right. And social media and yeah, yeah, big tech corporations, these kinds of fear are all interwoven. Yes. yeah.
So I imagine that bleeds into the, uh, the academy and hesitation people might have about what you do,
Oh, yes. Oh, yes. It doesn't just bleed into the academy a lot. A lot of that is sort of generated in the academy and and its I mean, if I'm if I'm going to be completely candid about that, partly that's a reflection of a emerging competition between universities and tech companies, right? which are both like there's this space which is like intellectual institutions in society, which universities had kind of a monopoly. They're like, Oh, there's going to be research in computer science. We're gonna be doing it now. The tech companies are kind of claiming to be the leading edge of CS research, which does not make universities comfortable and So that's part of the anxiety is fueled by just sort of general social things, with social media and fears of surveillance. But within the academic context, we also have to be candid that, like Google, is a competitor for universities. And it's not surprising that university professors like myself are real, Um, Yeah. wary of of it. So you know. But I think it's also it's also valid, right that that should be sure. Like, um, if, you know, the steam engine was socially disruptive and was a problem and magnified social problems and was not handled well and his machine learning going to do all those things to magnify social problems, increased concentration of power, um, not be handled well, need need kinds of regulation that we don't yet have. Yes, I mean, all of that will be true, for sure. So it's, um it's going to be at the same time. I also think everything that I said earlier is true, that it it can like magnified human creativity and become a way of we understand human creativity and be a kind of collaborative space for human creativity. So it's just it. It's not an either or but it means that there's a really interesting, complex, struggling conversation that plays out there. Yeah.
It almost seems like perhaps the attitude is that if we don't use it or don't acknowledge it, it doesn't exist, or we don't have to worry about it then. Or maybe that's oversimplifying it. But Yeah, we're better off at least like our don't know, claiming it or taking ownership of it. Or yeah, I think I think I mean, yeah, at least figuring out how it works. So, like, 
Yeah, that for sure. Like I mean, I think everyone would agree about that. Okay, Yeah. Um, it's a real bad idea for people not to understand how machine learning works that I think we can all agree about that. Where to go next is sort of, um, is where things get a little complicated. But I'm I'm not really sure yet that we have, like that we have a policy disagreement about, like, how should machine learning be regulated? I think generally speaking, a lot of when, when the when the conversation actually gets that concrete. There's often a lot of agreement, but it's before we get to that level. When we're talking about like, what attitude should we adopt a machine learning before it really gets to the concrete level of What should we do? Then? There's a lot of tension because people have very different attitudes and very different. Honestly, it's an emotional thing, like, how do we I have when I look at sort of a big new language model I have a feeling of of, of excitement and joy and like it's spring and there's gonna be new flowers coming up and I don't know what they are. But people I know that people do not have that feeling and they look at the, uh, Uh huh. a large language model, and I understand that. But, um, it's not. I don't think that's purely like an intellectual debate where it's before. It's even sort of like before we've gotten to the stage of framing an intellectual debate about it. It's that people just have at this stage very different. Um, kind of I guess the technical term would be prior Zor instincts. Um, yeah.
Perhaps a gut reaction of some sort or
Gut reaction. Yeah, Yeah, yeah, yeah. And yeah, and I understand I do understand where the other gut reaction is coming from because big tech companies are scary, legit, scary and also ways governments can use machine learning and will use machine learning. I think are highly scary. Um, so that's all true? Um, it's it's just, uh, holding. You know, where we how we hold those things. Intention, right, Because it's there's always a dark and a bright side to everything, including, like, you know, the human body, right? If it has problems, so, um But it's how you hold those things. Intention? That remains to be seen. Mhm.
Right. Um, the title of our podcast is it takes a campus. So I feel like I should ask something about, like, the, um I guess the campus environment in which you operate because at least come someone speaking as someone who came from at the end of the university that had D. H in various forms. Um Yeah, uh, University of Illinois is so big that it almost inevitably becomes, in some sense, siloed, um, black A better way of putting it. Uh, and maybe that's not accurate. I don't know. I'm still fairly new. Yeah, um, but yeah, it feels like, um, that's a definite issue for digital. Humanities is there's a certain level of styling that occurs. And how do we deal with that?
no jacket. Yeah, So they're smiling within with sort of a university from other universities. And then there's different communities within the campus. The the The thing I like to say about the University of Illinois is that it's like a Kafka novel in that or a short story by Kostka in that somewhere on campus, there's an amazing resource meant for you and you alone. But you may or may not ever discover the door where it's hidden behind. So there's so much going on here, but that that means that people are not necessarily always in communication. I think that that is a that is a challenge for th it's a you know there are. There are connections between In particular, I would say the high school and um, H R. I, which has its own humanities Research Institute. If I'm getting the acronym right used to be I P R H um, which is sort of in a different part of campus. But we're both doing digital humanities in different ways and in some communication with each other. And then there's communication with the outside world, which I think is also important. And I think I think this program training in digital methods for humanists, that Senator H. R. I has done a bit to encourage people to go out to. You know, summer institutes say where they can be in communication with people at other universities. And I do think that's important because otherwise you get you fall behind honestly. And it's particularly challenging for all universities, not just for Illinois right now, because there's not a lot of hiring happening in the humanity. So it is easy to fall behind. Actually, if we don't consciously refresh our experience, Um, I do think that's a challenge, But I'm optimistic that who will build the needed bridges?
Yeah, well, great. Um well, that's largely what I had prepared today. Um, but I want to give you the opportunity to speak to anything. You feel like we should have talked about. I mean, that's a pretty open door. So, uh,
No, those were Those were great questions, actually. And I got a chance to go off on how large language models are, like the approach of spraying with unknown flowers coming up. And that that's what I wanted to say. So yeah, if yeah, well, I'm glad you got the opportunity to say that.
Yeah, it's not. It's not, uh, army from Mordor. It's, uh yeah, mhm,
Yeah, yeah. I mean, you know, it's also an army from Mordor, but, you know, it's both both things. yeah. Uh huh, Yeah,
Well, the world is complex. yeah, Um, but well, thank you so much for taking the time to talk to me today. I really enjoyed our conversation. Um, and, uh yeah, oh. I look forward to, uh, talking to you more in class later. Uh, full disclosure to the podcast. Dr. Underwood is currently my professor for data science in the communities Course. Um, mhm.
It's fair to disclose. yeah.
Yes, Thanks.
It's it's It's been a It's been a pleasure talking to you, too.
yes. Yep. And I we'll talk again later.

Texto alterado

Abrir arquivo

Ben Ostermeier: Hello and welcome back to another episode of “It Takes a Campus.” My name is Ben, and I am currently a graduate assistant at the Scholarly Commons, and today I am joined with Dr. Ted Underwood, who is a professor at the iSchool here at the University of Illinois. Dr. Underwood, welcome to the podcast and thank you for taking time to talk to me today.
Ted Underwood: Hi Ben, it’s a pleasure to be here.
Ben: I wanted to get started by asking how you got started integrating digital methods into your research, since as I understand it your formal academic background is in English literature.
Ted: Right, it seems like a bit of an unusual turn, but it actually has a long history. Back in the 1990s when I was in grad school, it was already beginning to be clear that there were going to be opportunities, as our digital libraries got bigger, to pose questions about, you know, the evolution of ideas, development of literary form, and I tried to do that a bit in the ’90s using the very limited collections of texts we had then, and I published an article. But, you know, I didn’t go much further with it, because the collections were very limited, and also it wasn’t easy to do things then, so, you know, fast forward to, like 2009, and John Unsworth, then Dean of what was called the Graduate School of Library and Information Sciences here at Illinois, got in touch with me and drew me into a project, and I discovered, “Wow, we have Google Books now,” first of all, and then secondarily, and I think just as importantly, it’s easy to learn stuff now, like you can just go on the web and search how to do something, and sort of teach yourself. It did help that I had a little bit of programming background from the ’80s, but that was pretty dusty by that point. But, you know, things had just gotten to the point where we had the resources, and it was easy to teach yourself how to do stuff.
Ben: Great, so, I’m going to ask you possibly an annoying question…
Ted: Go for it.
Ben: …which I think every person who works in the Digital Humanities inevitably has to answer at some point, but, it’s a question that comes up, which is how you define the Digital Humanities, and to what extent does that definition matter, because, everybody seems to have their own definition, and inevitably it leads to, perhaps, interesting conversations. So I’m wondering what your thoughts are about that conversation?
Ted: Yeah, thank you, it’s not an annoying question. It’s an inevitable and important question, I think, actually, even though it’s true that people try to avoid it, because, as you say, it’s a complex term, a term that people use in different ways, and the slipperiness of the term does generate sometimes friction. But I don’t think that that is an accident, you know, or something that we can sort of step around. It is inherent to the term, and it’s not accidental. The term is deliberately vague, digital humanities, it could encompass, say, you know, using humanistic methods to study podcasts or blog posts, or digital media generally using traditional humanistic methods to study those things. Or it could mean using digital methods, whatever those are, maybe computational methods, statistics, what have you, to study, well, podcasts, but perhaps also printed books, perhaps movies from the 1950s. So, digital methods to study more traditional media, or it could be the way scholarship is produced. It could be digital humanities is whatever you’re doing if you put it on the web, it’s digital humanities, and that’s a valid way of using the term too. So there’s a lot of looseness there, and I don’t think it’s an accident. I think it is deliberate vagueness that’s constructed in order to create a concept that is loose enough to be welcoming, to welcome lots of different people, because there’s a real danger, there’s a lot of tension at this intersection between the traditions of the humanities and computational media, computational methods, so there’s a lot of risk that if you say, specifically, computational humanities, or humanities using numbers, some people will be like “Woah, that is not what I signed up, get out of here,” you know, that’s a real risk. Conversely, if you say, okay we’re going to study digital media, some people will say, “Well, I’m actually more interested in the 20th century, or the 19th century, and I’m interested in history, that’s what the humanities mean to me.” So there’s these tensions there, and we’ve tried to bridge them by constructing a term that is deliberately baggy, and that works, somewhat, it’s worked to sort of create a broad community of people and a lively conversation, but we shouldn’t be surprised when that dissolves and breaks apart. It was, the instability was built into that term from the beginning. So it does matter, it does matter that we understand the term, but we shouldn’t be surprised that it doesn’t come down to a crisp definition.
Ben: Yeah, great, and that kind of leads me to my next question, because on our previous episode we had Spencer Keralis, who is the digital humanities librarian here at the University of Illinois, and they talked about some of this tension, particularly between research that is published in an interactive digital format, such as on a platform like Scalar or Omeka, versus using digital methods in support of a more traditional communication, like a journal article or a monograph.
Ted: Yep
Ben: Do you agree that such a divide exists and what your thoughts are about that and are there ways that perhaps they could work better in tandem together or is it perhaps to be expected, at least, that there would be somewhat of a divide there within the digital humanities umbrella?
Ted: Yeah, I mean I do agree that a divide exists there, it’s not a crisp one, um, because, I mean, even if your research is setting out to produce basically articles and books, if you’re doing that using digital methods, you’re going to have data or code that needs to be preserved probably online, and you’re probably also, you know, visualizations. So the lines sort of between traditional scholarly formats and new platforms do get blurry, but I do agree that there are some people in digital humanities for whom stretching the boundaries of the publication format counts as scholarship, to include maybe digital editing, for instance, rather than thesis-driven argument. There are people for whom that’s central. And, then there are people for whom may welcome digital editing, but they’re primarily interested in producing new arguments, argument-driven scholarship. And, there’s no necessary conflict between those things, but they rub against the external world in different places. The conflict with existing institutions in different places. So for instance, if you’re doing, you know, if you’re building digital exhibitions on Omeka, then it’s very important to redefine what counts as scholarship in terms of promotion and tenure review, because the role of editing and building collections is often ambiguous, at least at research universities. That then becomes the point of friction between digital humanities and the rest of the world, whereas if you’re doing, say like, quantitative scholarship but scholarship that’s ultimately going to produce an article, then the point of friction might be, say, how do we go about training students to do this, because it’s not in the curriculum. So, and, frankly, ideally it would require a sequence of three or four courses, really to prepare students to do that, statistics programming, you know, it can’t easily be done in one or two courses. So, it’s not that two things are in conflict, but that they have sort of different battles to fight, and I do think that that produces a conflict in the sense that, you know, people are like, “Hey we need some help over here,” you know, that’s where the conflict comes from.
Ben: Yeah, and that leads me to a follow up question, which is, as a professor who teaches courses, in my experience there is definitely a certain challenges in digital humanities of training students in digital methods that, typically it seems like digital humanities students come from a humanities background as opposed to, from a computer science background, typically anyways.
Ted: Yeah
Ben: And so oftentimes there can be quite a learning curve for students…
Ted: Yep [laughs]
Ben: …and oftentimes they can be scared away by the challenges involved, so how do you deal with that, and what are ways we can perhaps do better at that?
Ted: Yeah I don’t think we have a good solution there yet, actually. That is definitely the case, and I’ve been evolving in, I’ll tell you the direction I’ve been evolving on that, and I think I’m still evolving, so like ten years ago, or nine years ago, back in 2011, 2012, I had the idea that it would be possible to do all of this in a course in the English department, which is where I was located then. And maybe we’d have a graduate course in the English department where it would be something like “Digital Humanities” or “Digital Methods in Literary Study,” and we’d, oh we’d explore the controversies about the nature of digital humanities and maybe along the way introduce students to some programming and statistics, and that is so impossible. [Both laugh] That’s now ludicrous, right, that’s actually like five course that you’re trying to compress into one, but it seemed necessary, and in some ways it was necessary at the time, because you couldn’t assume that students, say, in an English department were going to expect to have to take three or four courses in this area, or be willing to, because it was still very new and controversial, so you were going to get one course, realistically that was going to be the curriculum. So there was just a limit on what you could actually do. You could not really teach computational methods in a course like that. And so, you know, I’ve dealt with that partly by expanding my role in the university where now I’m teaching in the School of Information Sciences, where there is a bigger DH curriculum and there are more students who are likely to have taken courses in programming or data science and be able to maybe take a more advanced course to be able to apply that to say look at unstructured data, look at text or images, but it’s still definitely a challenge, because realistically, like I say, it actually would be a three or four course sequence, not a two course sequence. And so, there’s still considerable risk of rushing things, and I don’t think I’m avoiding that successfully yet, to be honest with you. It’s sort of like a coevolution between the way we teach this and the way that the curricular institutions around us sort of frame the topic, and what they suggest is possible. There’s another approach, I should say, there is another way to go about this, which is very popular, and I just don’t think it works either, which is to try to fit it all into one course by basically ditching the programming part, and say, “there are some user friendly tools out there, we can use those.” Like Voyant is one tool, a very good, it’s about as good as can be done in that space of sort of text analysis in your browser on the web. There are some other similar tools that promise to be user friendly, and they are to a certain point, but then you rapidly will run up against the limits of what you can actually do in those, say, graphical user interfaces. So, if we do it that way, we can squeeze it into one course or two courses, maybe, but then where do students go from there, is what I’m not certain. So that is a really big challenge, but I think ideally, my view would be, it means what we need to do is maybe define a little better, say a three course sequence in this space.
Ben: Yeah, that’s definitely something that I’ve had to deal with in my experience, because, for listeners who don’t know me, I was previously the technician for the IRIS Center at Southern Illinois University Edwardsville, which that’s the digital humanities center there, and we have a digital humanities minor there, and I was actually the first student to receive that minor, but…
Ted: Okay
Ben: …in working on the curriculum after I received that minor, a big challenge we’ve had to deal with is, do students actually need to have programming experience to receive that minor, and I did, but many students are… have some hesitation about that, and it seems like, to a certain extent, I think there’s a fear at least of those that design curriculum that if you have too much programming involved you’re going to scare students away, and…
Ted: Yeah
Ben: …I don’t really have a good answer for that [laughs], but it’s a definite challenge.
Ted: I don’t either, but here’s what I see as likely to happen. We can build programs in the humanities that don’t have programming as part of them, and that can be a valid thing. I’m not against doing that. But if we do that, there will also be programs that arise in the social sciences, and in information science, and in computer science for that matter. They’re beginning to happen in departments of computer science already that use more flexible and adventurous kinds of computation than you can easily fit into a graphical tool, because, you know, social scientists are used to using statistics, and departments of information science and computer science exist also. So, it’s like if we don’t do it, they definitely will, because they can, you know [both laugh]. And humanities materials, movies, books, art, are fascinating, they’re really appealing, so departments of computer science will definitely go for that, they’re not gonna wait for us to do it, so it’s fine that, we could do it both ways, but the computational way is gonna happen somewhere, it’s just a question of where.
Ben: And do you think it’s better off happening from the end of the humanities going to the computer or the other way, do you think…
Ted: I would love them to be a bridge, I would love it to be a bridge, and I think it can be, I mean in some ways that’s the promise of information science as a place, is that you can have a single institution where really both of those perspectives are represented are joining hands and collaborating. It can work across campus too, if you’ve got a humanist in a[n] English department or history department collaborating with someone in computer science, that can also work. But I like the feeling of a school where you’ve got people from a lot of different disciplinary backgrounds collaborating. So, yeah I hope we’re able to hold that all together, but you know, just as with the term digital humanities itself, you end up describing a very big arc or bridge that’s fragile at lots of points, and it’s hard to hold together.
Ben: Yeah, as we’ve been talking, I realize we haven’t really had a whole lot of opportunity for you to talk about your particular research interests, so I wanted to give you the opportunity to talk about some of the computational work you’ve done in your research.
Ted: Oh sure, I mean…
Ben: That’s a big question so… [Both laugh]
Ted: …it varies, I’ll organize it under two heads, things I’ve done and thing’s, sort of, maybe looking forward, things that I think are getting exciting. But a lot of what I’ve done is use large digital libraries to pose questions about long timelines in literary history. So, you know, how has the pace of narration changed, how much time passes in a typical page of a novel? Are we talking about a week of fictional time that passes in each page of reading, or is it a day, or is it sometimes increasingly in recent years, it’s more like two minutes per page. The pace has slowed down. And when that slowdown happened, is not something we had a good picture of. A lot of literary critics, for instance, thought that that happened at the beginning of the 20th century with modernism, now that we have a big picture we can see it was much more gradual. And similar sorts of questions about concreteness, the development of concreteness in fiction. And really it’s, you know all of these things come together in a way to make a bigger story about literary history and the study of literature, which is to a large extent, our idea of what literature is for has been shaped by certain aspects of literature that only developed very recently, like this emphasis on concrete particulars and brief, sort of, fragmentary moments that we now think is sort of, that experiential vividness and particularities is crucial to the mission of literature, so much so that it’s shaped the way we think we ought to be reading and interpreting literature. But it’s actually, if you look at the big picture, you can see where we got that. It’s a long story, a gradual story, and in fact, sort of our idea that you can’t use big numbers and panoramas to understand literature is a product of a history that, if you back up, we can actually the panorama that generated that. So, that’s the story I’ve been telling. But I think in years to come, I’m interested in looking at, not just at, sort of, big digital libraries and long surveys across long timelines. But I’m getting increasingly interested in understanding literature in detail, like how plot works, how suspense works, and I think it’s going, the nature of machine learning and of computation is evolving so rapidly that is gonna become increasingly possible for us to pose some questions that seems like less social sciency, more interpretive, using machine learning.
Ben: Yeah, I’ve seen arguments, and this was largely in a non-academic context..
Ted: Yeah
Ben: …so take this kind of with a grain of salt, but that, like, plots are becoming increasingly complex over time in media…
Ted: Mmm
Ben: …in general, and, at least I think the argument was particularly for TV shows…
Ted: Yeah
Ben: …that there used to be TV shows were just linear, beginning to end, and nowadays you have a lot like, not just time travel, but flashbacks…
Ted: Yeah
Ben: …and non-linear storytelling, so I’d be interested in questions like that about like, ways of seeing the way how narrative is constructed…
Ted: Yeah
Ben: …is changing, because the conventional wisdom oftentimes is like, “People are getting dumb,” and like…
Ted: [Laughs] Yeah, no, no
Ben: …which, I don’t agree with, but yeah.
Ted: It’s pretty clear, I think there’s actually some agreement that TV has gotten more adventurous, for a lot of, I mean one thing, there’s a classic thesis, and I don’t know who to attribute it to, but just the development of the VCR or DVR meant that you could return and pay attention to the texture of of television, you could slow down and get the joke, whereas if it’s broadcast television in the 1970s…
Ben: [Laughs]
Ted: …it’s gone, you know, you can’t return to it. So yeah, that I think we know, but I think there are gonna be all kinds of other questions. There’s some good work that’s come out recently in the Journal of Cultural Analytics studying the sitcom using face recognition, and they’re like, “Which characters get attention, when in the plot, when in the arc of the sitcom do they get attention,” and we’re gonna be able to study that kind of formal, the formal architecture of TV genres, as well as literary genres, in new ways, I think, yeah.
Ben: Yeah, that’s exciting, and that leads me to the last formal question I had prepared, which is, and you addressed it already, but perhaps big picture
Ted: There’s more to say.
Ben: I’m sure, there’s always more to say.
Ted: Yeah.
Ben: And the question is, what do you see as the future of data science in the humanities?
Ted: Yeah, I see it as being really capacious, really, so, you know, the kind of thing I’ve done, like, where it’s explicitly quantitative and it’s about big, historical stories, that’s never going to be everything the humanities are about, because we’re also about individuals, and that’s valid. We’re about individual stories. But I think, what’s happening now, the line between data science and machine learning, or, to use a term that’s sometimes used, artificial intelligence. I kind of prefer machine learning, but it’s different sides of the same coin, that is a very blurry continuum. And that means that we are not just going to be studying culture in a kind of large-scale social sciency way, but we’re going to be able to, for instance, do the things that large language models do, where you can give them the beginning of a story, and then they continue it. Like okay, if that’s how the story begins in that style, then this would be a plausible next paragraph. Or they can do something similar to that with images, and that means we can begin to pose questions about the sort of, the frame-to-frame movement of a video, or the paragraph-to-paragraph movement of a story, where we start to ask questions about, for instance, what makes some stories more predictable than others? Like is it easier to predict where this plot is gonna go? Or we can pose questions about, like, where could this story have alternatively gone? Where could it plausibly have gone, suppose it’s written in the 1870s, you know, what are the plausible alternative endings for this story. What about if we move it forward a decade? You can, we’re getting models that are good enough to be able to pose that kind of question, like what could have happened in the story under alternate circumstances? And that’s gonna open up just a huge range of questions that are not limited to the kinds of questions we think of as appropriate for data science or social science. They’re more, to be honest, they’re akin to creative questions. One of the things about these generative models, generative models of images or of text, is they’re basically, they’re doing something like generating the artwork. Now I don’t think that that means that we’re gonna, like, you know, have robots write all our novels for us. [Ben laughs] They’re not that good, and the arc of a whole plot is something they still struggle with, and I’m not sure we want that anyway, actually. I think it’s more fun, people enjoy fan fiction because they enjoy going back and forth with a story world, right. They enjoy participation. But it does mean that the line between, what we think of as analytical or critical tasks, and what we think of as creative play, could get really blurry, and that to me suggests that the future of data science in the humanities is not something we’re gonna want to keep walled out because it’s too much like social science, it actually could be really central to what we think of as things like play, that are central to the purpose of the humanities. But for that to happen, it’s gonna need to become kind of like a lingua franca that we’re comfortable than we are right now. And I think that will happen actually because the opportunities are just too huge, but how it will happen, you know, where that happens in the curriculum, remains a bit of a mystery.
Ben: Yeah, and I think, and.. correct me if I’m wrong, but there is certain level of I think fear out in the world about like artificial intelligence or machine learning…
Ted: You are not wrong. [Both laugh]
Ben: …I mean certainly more in like the context of facial recognition.
Ted: Yes
Ben: …and the use of like surveillance or what have you…
Ted: Right and social media…
Ben: Yeah
Ted: …big tech corporations. These kinds of fear are all interwoven, yes.
Ben: Yeah so, I imagine that bleeds into the academy…
Ted: Oh yes
Ben: …in hesitations people might have about um…
Ted: Oh yes
Ben: …what you do.
Ted: It doesn’t just bleed into the academy, a lot of that is sort of generated in the academy…
Ben: Right
Ted: …and it’s, I mean if I’m gonna be completely candid about that, partly that’s a reflection of an emerging competition between universities and tech companies, which are both like, there’s this space which is like, intellectual institutions and society, which universities have had kind of a monopoly there, like oh there’s gonna be research in computer science, we’re going to be doing it. Now the tech companies are kind of claiming to be the leading edge of CS research, which does not make universities comfortable, and so that’s, part of the anxiety is fueled by just sort of general social things with social media and fears of surveillance. But within the academic context, we also have to be candid that Google is a competitor for universities, and it’s not surprising that university professors like myself are real… wary of it. So, you know, but I think it’s also valid, that, to be sure, you know the steam engine was socially disruptive and was a problem and magnified social problems and was not handled well, and is machine learning going to do all those things too? Magnify social problems, increase concentration of power, not be handled well, need kinds of regulation that we don’t yet have, yes. I mean all of that will be true, for sure. So it’s going to be, at the same time, I also think everything that I said earlier is true, it can magnify human creativity and become a way we understand human creativity and be a kind of collaborative space for human creativity. So, it’s not an either or, but it means that there’s a really interesting, complex struggle and conversation that plays out there.
Ben: Yeah it almost seems like, perhaps, the attitude is that if we don’t use it or don’t acknowledge it, it doesn’t exist, or we don’t have to worry about it then.
Ted: Yeah
Ben: Or maybe that’s oversimplifying it, but we’re better off at least, like, I dunno, claiming it, or taking ownership, or…
Ted: Yeah, I mean, yeah
Ben: …at least figuring out how it works, so like…
Ted: Yeah, that for sure. I mean we, everyone would agree about that, that it’s a real bad idea for people not to understand how machine learning works, I think we can all agree about that. Where to do next is where things get a little complicated, but I’m not really sure yet that we have a policy disagreement about like, how should machine learning be regulated? I think, generally speaking, a lot of, when the conversation actually gets that concrete, there’s often a lot of agreement. But it’s before we get to that level, when we’re talking about like, what attitude should we adopt to machine learning, before it really gets to the concrete level of “what should we do?” then there’s a lot of tension, because people have very different attitudes and very different, I mean honestly it’s an emotional thing, like how do we, I have, when I look at, sort of, a big new language model, I have feeling of excitement and joy and like it’s spring and there’s gonna be new [Ben laughs] flowers coming up and I don’t what they are. But I know that people do not have that feeling [Both laugh] when they look at a large language model, and I understand that, but it’s not, I don’t think that’s purely like an intellectual debate. It’s even sort of like before we’ve gotten to the stage of framing an intellectual debate about it, it’s that people just have, at this stage, very different kind of, I guess a technical term would be priors or instincts, um yeah.
Ben: Perhaps a gut reaction of some sort or?
Ted: Gut reaction, yeah. Yeah.
Ben: Yeah
Ted: And, I do understand where the other gut reaction is coming from because big tech companies are scary, legit scary, and also ways governments can use machine learning and will use machine learning I think are highly scary. So that’s all true. It’s just, how we hold those things in tension, right, ’cause there’s always a dark and a bright side to everything, including like, you know, the human body, right, it has problems. So but it’s how you hold those things in tension that remains to be seen.
Ben: Right, um the title of our podcast is “It Takes a Campus,” so I feel like I should ask something about the, I guess the campus environment in which you operate, because, speaking as someone who came from another university that had DH in various forms…
Ted: Yeah
Ben: …um, University of Illinois is so big that it almost inevitably becomes, in some sense siloed, for lack of a better way of putting it. And maybe that’s not accurate, I don’t know I’m still fairly new here
Ted: No, it’s accurate [Both laugh]
Ben: But, yeah, it feels like that’s a definite issue for digital humanities, is there’s a certain level of siloing that occurs, and how do we deal with that?
Ted: Yeah, so there’s siloing with, sort of a university from other universities, and then there’s different communities within the campus. The thing I like to say about the University of Illinois is that it’s like a Kafka novel in that, or a short story by Kafka, in that somewhere on campus there’s an amazing resource meant for you and you alone, but you may or may not ever discover the door where it’s hidden behind, so there’s so much going on here, that means that people are not necessarily always in communication. I think that that is a challenge for DH, you know, there are connections between, in particular I would say the iSchool and HRI, which has its own, Humanities Research Institute if I’m getting the acronym right, used to be IPRH, which is sort of on a different part of campus, but we’re both doing digital humanities in different ways in some communication with each other, and then there’s communication with the outside world, which I think is also important, and I think this program Training in Digital Methods for Humanists that’s centered at HRI has done a bit to encourage people to go out to, you know, summer institutes, say, where they can be in communication with people at other universities, and I do think that’s important, because otherwise you fall behind, honestly. And it’s particularly challenging for all universities, not just for Illinois right now, because there’s not a lot of hiring happening in the humanities, so it is easy to fall behind, actually, if we don’t consciously refresh our experience. I do think that’s a challenge, but I’m optimistic that we’ll build the needed bridges.
Ben: Yeah, well great, well that’s largely what I had prepared today, but I wanted to give you the opportunity to speak to anything you feel like we should have talked about. I mean, that’s a pretty open door, so…
Ted: No, those were great questions actually, and I got a chance to go off on how large language models are like the approach of spring with unknown flowers coming up…
Ben: [Ben laughs]
Ted: …that’s what I wanted to say, so, yeah.
Ben: Well I’m glad you got the opportunity to say that, yeah. It’s not an army from Mordor…
Ted: Yeah, or I mean, you know, it’s also an army from Mordor, but yeah, it’s both, both things. [Laughs]
Ben: Yeah, well, the world is complex.
Ted: Yeah
Ben: Well, thank you so much for taking the time to talk to me today. I really enjoyed our conversation, and I look forward to talking to you more in class later. Full disclosure to the podcast: Dr. Underwood is currently my professor for Data Science in the Humanities course.
Ted: It’s fair to disclose.
Ben: Yes
Ted: It’s been a please talking to you too.
Ben: Yes, yep. And I will talk to you again later.