Crowdsourcing, with Dr Mia Ridge

[Intro Music]

Clare: Hello and welcome to Making Tech Better, Made Tech’s fortnightly podcast, bringing you content from all over the world on how to improve your software delivery.
My name is Clare Sudbery. My pronouns are she and her, and I am a Lead Engineer at Made Tech.

We’ve talked a lot on this podcast about the building of software but not so much about other public digital resources. On Tuesday 27th July 2021, I spoke to Mia Ridge about how the British Library have enabled the general public to contribute to their online resources via crowdsourcing.

Clare: Hello Mia!

Mia: Hi Clare!

Clare: Hello. Mia is a Digital Curator at the British Library. I think that means that she is curating digital content. I think that curating means choosing, sourcing, displaying content but in the digital realm. Have I got that right?

Mia: Sort of. Curators used to be called Keepers, which is a terrible term because on the one hand, it suggests the fact that they are looking after things, and keeping them in their care, but it also gives a sense of a certain reluctance in terms of sharing them. Actually, my job is all about sharing resources and collections.

My job is really to make sure that people can access and use our digital and digitised collections in scholarship, research, and creative endeavours. I look after things, but my job is to make sure that as many people as possible can use them in exciting ways.

Clare: Fantastic, I love that. What we’re going to be talking about today is crowdsourcing. Before we get to that, a question that I ask everybody is who in the digital realm are you inspired by?

Mia: I have two answers. I feel very lucky to be part of a broad online community. We used to meet at conferences in the before times. We share lessons learned, we inspire each other, we share the hard times, the barriers. That’s probably my Twitter network and various mailing lists and conference groups. It’s very hard to define. At the moment, I’m working particularly closely with Meghan Ferriter from the Library of Congress and Sam Blickhan from Zooniverse and the Adler Planetarium in Chicago. We’ve found a way of working together that is really grounded and honest and strong and incredibly efficient, because we approach the world in similar ways but have very different backgrounds. There’s a kind of diversity in how we think about things but a common core of shared values that makes them just a joy to collaborate with. They are both really inspiring. I think unofficially they are my squad, but I haven’t told them that.

Clare: Fantastic. We’ll put links in the description so that people can find out more. So, crowdsourcing. What is it?

Mia: There are a few ways of thinking about what crowdsourcing is. In the commercial sector, it’s really drawing on the sourcing part of the phrase. It’s different ways of thinking about how you attract talent or people willing to contribute ideas or work. The challenge that we take on in cultural heritage is thinking of it as a form of volunteering. There’s no financial recompense, usually. In the commercial sector or academic sector, people might be used to things like Mechanical Turk, where people are paid a couple of pennies or parts of a penny for a microtask. It might be labelling some of the commercial services that pretend to use AI to do your receipts or something. Actually, someone is probably sitting there looking at a line of text and transcribing it or assigning a value to it in some way.

Clare: Wow.

Mia: There are these massive systems that break up the task of looking at a whole receipt into smaller parts, assigning them to people and verifying the results. Crowdsourcing in cultural heritage draws on some of that. Some of the same technologies and the thinking about workflows underlying the process. It also takes on the idea that it’s a form of engagement with cultural heritage collections. One of the projects that I work on is looking at a collection of playbills that is up to 300 years old. They are really fragile because they were never meant to be long-lasting. They were just pasted up on a wall to let people know what was playing in a theatre that week.

You can’t really access them in the reading rooms because they literally crumble. Nobody wants to see an historical record crumble. So, some of them were microfilmed and then digitised but a lot of detail was lost.

Computational techniques like optical character recognition that does automatic text transcription can’t read them, so we made a crowdsourcing project to ask people to transcribe the titles and the dates of these playbills. The secret message is that we want people to experience these playbills. We want them to understand what entertainment was like in the late 18th or 19th centuries.

We’ve set the challenge of both being productive – we’ve got over a million contributions from a couple of thousand volunteers – but that also means a couple of thousand people who probably wouldn’t have in any other circumstances experienced those playbills and thought about entertainment culture in Great Britain in the 19th century or in the late 18th century. So, it’s a form of productive engagement with the work of cultural institutions, in making our collections more accessible.

Clare: That fascinates me. The idea that you are asking people to volunteer and help you in this job of digital curation but also in the process, you want them to become consumers as well. You’re almost sneakily asking people to do work but using it as a way of getting them to be entertained. You would think it would be the other way round. You might ask people to be entertained and then find a way of sneakily turning it into work but it’s almost like you are doing the opposite of that.

Mia: Yes, I’ve never thought of people who access our collections as consumers before. We really talk about participants, or in the British Library’s case, readers. I suppose they are consuming entertainment, but there is never a commercial transaction.

One of the lessons of Covid, which really reinforced what the scholarship and practical experience have told us over the past few decades, is that people value something constructive to do.

Cultural organisations have a really long history of volunteering. Particularly in the US, if you have a tour of an art gallery or museum, it might be done by a volunteer docent. People volunteer in their local history societies. They volunteer at local archives. Crowdsourcing means that anyone can volunteer from anywhere.

It works on the assumption that people want to volunteer, and it’s about creating a framework in which they can do that. It’s about removing barriers to participation. We know that people feel quite intimidated by cultural institutions, particularly the British Library where there is a kind of – you must be this high to get a reader’s ticket. You need forms of ID, you used to need a letter of recommendation. People don’t necessarily feel that a big research library or a fancy institution wants to hear from them.

In some ways it’s about saying we do want to hear from you. The ordinary words that you use to describe works of art or photographs actually help other people find those things. We might do something where people tag images to create a kind of folksonomy, which is a very early 2000s term. This is the idea that if you get enough tags, they form a loose pattern or hierarchy. It’s about creating a space in which we are saying we not only need your help in making these really vast collections more accessible, we also value your unique perspective.

Curatorial perspectives are one thing; they are deeply informed by years of research and practice and academic study. That’s only one way of looking at collections. Increasingly we are looking at community perspectives that are unique to the communities where things came from. That might be working with people who have worked in the theatre, to understand how they view playbills. It might be working with indigenous communities to understand how they talk about their collections.

I’ve never really thought about them as consumers before. I just don’t work in the space where we think about consumptive relationships or that kind of thing.

Clare: Is that because it’s much more interactive to you? You’re thinking of the materials that you create as being things that people interact with rather than consume?

Mia: Yes, very much so because the point is that you want as many people as possible to have access to things. It’s fantastic that people watch documentaries and read popular history books and watch BBC 4 or whatever, but there’s nothing like actually seeing for yourself what happened in the past: reading old newspaper articles or looking at old paintings. Feeling like you have the intellectual capacity to understand them, even if it’s not within the same research framework that others might view them.

Clare: Yes. A little way back you said that people have to be this high to get into the British Library. You weren’t talking about actual physical stature, were you?

Mia: Not actual physical height, no.

Clare: I was imagining like Alton Towers, you know, you have to be this high to get on this ride.

Mia: No, just the kinds of intellectual barriers that people have. There’s a mystique around getting a reader’s ticket. Obviously, the library has really popular exhibitions and public displays. You can go into the library right now and stand in quite a cool, calm, dark area. There’s Beethoven’s original music scores, the Beatles’ original lyric sheets, along with material that is over 3,000 years old. It’s a fantastic experience, but for people, moving from that quite curated public experience into actually using the reading rooms can feel like quite a barrier. People don’t necessarily think that it is something that is accessible to them, unless they are in an academic or research post.

Clare: Yes. Also, for those that haven’t seen the building that the British Library occupies, it’s extremely impressive, imposing and also potentially intimidating. It looks like it’s the place where the very clever people might hang out. I can imagine that people might feel like they don’t belong somewhere like that. I can see how that could be an issue.

Mia: Yes. There’s a concept in museum studies called threshold fears. You can imagine walking into something like the British Museum, which has this big courtyard and these porticos at the end and these grand steps. You have to pass through security, you go through, and you are in this grand space. You pause, you don’t know exactly where to go, you don’t know how to behave. You don’t know if you are welcome there.

For me, online participation has always been part of reducing threshold fear. It’s about going into spaces where people are already online and saying you can have an experience of these grand cultural institutions. You can do it in your pyjamas on the sofa if you like. You don’t need to be fancy; you don’t need to have gone to a museum or library as a kid, you are welcome here as well.

Clare: Yes. It was interesting to me when you talked about language, as well. It occurred to me that because people are entering this rarefied atmosphere, albeit digitally, they feel like they have to obscure their language and start using lots of long words. Do you sometimes have to ask people to dial it back a bit?

Mia: Not so much, that’s partly because we tell and show people what kinds of words we are looking for. One of the joys of crowdsourcing is that if you have a lot of images online, someone will come along who can apply these amazingly specialised words. If there is a picture of a ship, they can probably tell you when it was made, what kind of ship it is, how much it could carry, what kinds of cargo it might have had, and name all the different specialist parts of the rigging. It’s incredibly handy to have someone apply that really specialist knowledge. It’s about crafting the invitation and making sure people know what kind of input is welcome.

I think that’s also really important in terms of reducing barriers. People feel quite worried that they will look stupid. That’s the last thing that you want anyone to feel in that space, so making it really clear what kind of input is wanted at a particular point is really part of the design process.

Clare: Yes. I’ve had a very small experience recently that gives me some feel for this, I guess. I was doing a little bit of research into some family history. I was sent by a librarian an article which had been scanned. I think it must have been a digitised copy of a microfiche. It was quite low-res. I had to concentrate very hard to be able to read it. Some of the time I was guessing. I was just working out that that word must be that word because of the context.

I was transcribing it for my parents because I knew that they would struggle to read this. It was quite a long article, so it took me at least half an hour. I was thinking, am I doing a job that I don’t need to do? Is there some bit of software out there that will do this for me? A quick Google didn’t reveal anything. Where are we up to in terms of the ability to scan badly copied low-res newspaper articles from 1932, in this case?

Mia: This is a very live question in my field. Part of my work at the moment is that I am a co-investigator on a really big data science and digital history project called Living with Machines. The project was in part conceived out of the fact that the Alan Turing Institute was physically based in the British Library. If we had all these amazing data scientists in the building and we had these extensive digitised collections of newspapers, how could we combine them? We are working a lot with digitised newspapers.

There is broadly automatic text transcription for printed text. It’s called Optical Character Recognition. Most modern typed text is pretty easy for computers to recognise. If you upload them into Google Docs, it might automatically transcribe the text from an image. Instagram certainly seems to recognise anything that is about Covid. It will give you a warning about how you talk about Covid. That’s a combination of recognising the entities, the things in the image, as well as any text in the image.

It’s much trickier for older material. That’s why we have the playbills project, In the Spotlight. The Victorians were incredibly creative in terms of their typography and how they used type, so computers really struggle to understand the playbills. Humans are fantastic at recognising things. That’s why crowdsourcing is really powerful. We are combining the things that computers are really good at with the things that people are really good at.

One of my favourite projects and one of the earliest crowdsourcing projects in cultural heritage in the modern era is from the National Library of Australia, where they have a project called Trove. It’s a treasure trove of Australian history and culture. One of the parts of that is a newspaper collection. Australia invested in digitising newspapers quite early on. They did it as a publicly funded project, so they are all freely available.

They realised that optical character recognition at the time was really quite bad. Rather than waiting for OCR technologies to get better, they decided to publish them as they were, so you could view the images, but you could also search the text to find particular articles.

Once you had viewed an image, you could also see the OCR, the automatically transcribed text alongside it. If there was an error, then you could fix it. This was really revolutionary because previously, libraries hadn’t been about letting other people affect what was in their records.

Clare: Right.

Mia: So, Trove did this. It was really aimed at people who were using the newspapers for family history research or whatever they were researching. If you noticed an error, then you could fix it. Because the task itself was so satisfying, people started doing it as an end in itself. That’s really the joy of crowdsourcing, finding a task that in itself is intrinsically satisfying, so it doesn’t feel like you are volunteering, it feels like a hobby.

Clare: Yes.

Mia: You are doing something that feels personally satisfying but you are also contributing to something that is bigger than yourself. In the last few years, that has been an incredibly powerful draw because there was so much bad news. Hanging on to the fact that other people are doing something positive online, that’s the draw, I think, of this kind of crowdsourcing. You can improve the life of a stranger who you will never meet.

Clare: I love that. I love that idea that you might think, I just fancy a good old correcting session. That’s what I need right now.

Mia: It’s one of the problems with research in this area. Sometimes I would look at a project to analyse it and then it would be 2am, because I had been completely caught up in the enjoyment of the moment.

[Music sting]

Clare: While I’ve got your attention, let me tell you a bit about Made Tech. After 21 years in the industry, I am quite choosy about who I will work for. Made Tech are software delivery experts with high technical standards. We work almost exclusively with the public sector. We have an open source employee handbook on GitHub which I love. We have unlimited annual leave. What I love most about Made Tech is the people. They have such passion for making a difference, and they really care for each other.

Our Twitter handle is @madetech. That’s M A D E T E C H. We have free books available on our website at madetech.com/resources/books, and we are currently recruiting in London, Bristol, South Wales, and the North of England, via our Manchester office. If you go to madetech.com/careers, you can find out more about that.

[Music sting]

Clare: Before we return to the interview, just a quick reminder that before the break, we were talking about the enjoyment that people get from correcting transcribed historical data.

Clare: You mentioned AI briefly before. I know that what can happen when you work with human interactions is that they can help to train models for AI. For instance, particularly with things like categorisation or recognition. If you’ve got a bunch of humans interacting and saying this thing is this and this thing is this, then over time, AI can take over. Effectively you have trained a model. Is that something that also gets used in crowdsourcing?

Mia: It does. There’s a lot of work in human computation. There are different ways of describing it but basically, the idea that you are working in a cycle of creating some ground truth data, using that to train machine learning and then verifying the results of that machine learning by showing people the image that was classified by the AI. I shouldn’t say AI. It’s not really AI, it’s basically machine learning, or various statistical models in data science.

Clare: Yes, thank you because there is a difference, and it is important to highlight that difference. Thank you.

Mia: Yes. I think AI has become a handy shorthand. It’s very short, it’s quite handy but it is a lie. We are seeing a lot more of those systems where people and computers work in conjunction to improve machine learning models. One reason that libraries and cultural institutions need to be in this space is to make sure that any work that is done with cultural heritage collections is done with our values. Your mission might be to create enjoyment, inspiration, and learning. You want to make sure that any work done with machine learning or data science also has those values. It’s not just a mechanical transaction.

That’s where the difference with commercial work comes in, as well. We’re not paying people, so it has to be an engaging experience and it has to be a rewarding experience.

Clare: Yes. So that thing of it being enjoyable. People who enjoy the stuff that they do tend to create better quality, I think.

Mia: It’s a cliché in the field, that if you give a talk on crowdsourcing and cultural heritage or digital scholarship, digital humanities, the first question is always about data quality. The answer is always that it depends. We know that with things like Mechanical Turk, there’s a lot of work that happens in the commercial and academic computer-supported cooperative work, human computation, and other fields, to try and understand the factors that affect worker quality, the quality of the results that you get from workers.

You can imagine the difference between doing something because you want to do it, versus reluctantly doing something because you need the money. You might go to the extra amount of work versus just the bare minimum that you can. We see people doing all kinds of really interesting extra work.

The Old Weather project, which is a Zooniverse project, which is a really popular citizen science platform – Old Weather asked people to transcribe ships’ logs. The idea was that they would also get the readings of temperature and pressure, to inform climate science models. But because the ships stopped at all kinds of obscure places, people would get intrigued by the names of the Antarctic research stations or the fuelling stations or whatever. They would go and look them up and then they would go down a Wikipedia hole. Half an hour later they would come back and actually type in the name of the station, having reassured themselves that they had correctly understood the word and were able to type it correctly.

They had had this amazing learning experience and this delving into 18th century whaling history or whatever it is. For me, that’s a real – a marker of success isn’t how many words people type into your project. I want people to have those experiences where they go off and spend half an hour learning about the history of rigging in theatres, or how people got paid or how roles were divided. Or the history of stage management or stage prop design.

Then they come back, and they finish typing in words. That, to me, is a marker of success.

Clare: Yes, that’s fantastic. You touched on it, and I was going to ask about the potential negative side of things. Somebody might come along and see that something has been categorised in a particular way and say no that’s wrong. Actually, the original might have been right, and they might be wrong.

I can also imagine that there might be malicious individuals out there who might deliberately type in wrong information. I also can imagine people who are just a little bit mischievous who are just a little bit bored and might just insert a rude word or a reference to Mickey Mouse in a place where it wasn’t relevant, just for fun. Is there a lot of that?

Mia: There is not. There are so many more interesting places to vandalise. When I worked in the Science Museum, there was a Wiki that was available in-gallery and the creativity of kids on school excursions in trying to get around various filters and traps. They thought they could get words displayed in public somewhere in the museum, they thought it would turn up.

Incredibly creative, multi-lingual, phonetic, slang, I learned a lot about youth culture from just keeping an eye on what they were doing. Yes, there are so many places to go and vandalise if what you want to do is be a bit mischievous. There are millions of Wikipedia entries that have odd facts because someone has done that.

Generally, the platforms in crowdsourcing require, say, three people to agree on a classification. In some projects on Zooniverse you can go up to 15 people. So, if you are doing something with climate modelling, you really want to be sure that the data is solid and robust.

Unfortunately, if you work in contentious areas like climate change, there are people with a vested interest. Most people don’t have a vested interest in changing the history of regional theatre in Britain in the 19th century. We do ask for classifications to agree but we also allow people to tag. The tags are creative expressions of what people are interested in in those playbills. They don’t have to agree.
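
As a rough illustration of the agreement rule described here, the following minimal Python sketch accepts a transcription only once enough volunteers have independently entered the same answer. The function name `consensus`, the default threshold of three, and the sample playbill titles are assumptions made for illustration; this is not how Zooniverse or the British Library actually implement it.

```python
from collections import Counter


def consensus(labels, required=3):
    """Return the answer that at least `required` volunteers agree on.

    `labels` is a list of the answers entered by different volunteers for
    the same item. Returns None while there is not yet enough agreement.
    (Hypothetical helper, for illustration only.)
    """
    if not labels:
        return None
    # Find the most frequently entered answer and how many times it appears.
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= required else None


# Three of four volunteers agree, so the answer is accepted.
accepted = consensus(["Hamlet", "Hamlet", "Hamlat", "Hamlet"])

# Only two answers so far, and they disagree, so the item stays open.
pending = consensus(["Hamlet", "Hamlat"])
```

Raising `required` (for example to 15) trades volunteer effort for more confidence in the result, which is the choice described for climate data. Free-form tags would skip this check entirely, since they do not have to agree.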

Some people get fascinated by claims that a performance is the first time it has been played in a regional theatre, so they tag ‘first time here’. They might be interested in a particular actor. It’s about a robust understanding of the value of different kinds of data. If you are collecting data for a statistical model, then it has to be accurate. If you are collecting data that might make things more discoverable in search terms, then creativity is actually to be rewarded.

I think typos are great, because if someone makes a common typo, they are actually helping someone else who makes that common typo. Lots of people can’t spell museum because the U and the E confuse them. Library can be hard for people to spell. So, typos have value in reaching other people who make the same mistakes. That’s because in that particular case, we want as many people as possible to be able to find things using a search term.

Clare: Does that mean you deliberately leave typos in?

Mia: In some cases, yes. Not everyone has the same education, not everyone has the same training. A couple of my good friends are dyslexic. They shouldn’t be shut out of accessing historical materials or cultural materials because they can’t spell.

Clare: That has now sent my mind down a whole rabbit hole. I’m thinking that you should deliberately create all the possible misspellings of all the words in your database, but then that would explode. I imagine that would get rather unwieldy.

Mia: Yes, and in most cases the search engines that get them to your catalogue or your page probably correct for typos. Traditionally, fuzzy search or mapping common typos doesn’t happen in cultural heritage catalogues because they are designed with that very specific, highly trained user in mind.
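
To make the idea of fuzzy search concrete, here is a minimal sketch using Python's standard `difflib` module to map a misspelt query onto the closest known search term. The term list, the `suggest` function, and the similarity cutoff of 0.75 are all assumptions for illustration; real search engines use more sophisticated methods.

```python
import difflib

# Hypothetical list of terms that a catalogue search already knows about.
CATALOGUE_TERMS = ["museum", "library", "playbill", "theatre"]


def suggest(query, terms=CATALOGUE_TERMS, cutoff=0.75):
    """Return the known term most similar to `query`, or None if nothing is close.

    `cutoff` is the minimum similarity ratio (0 to 1) for a term to count
    as a plausible match.
    """
    matches = difflib.get_close_matches(query.lower(), terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


museum_guess = suggest("musuem")    # the U and the E swapped
library_guess = suggest("libary")   # a dropped letter
no_guess = suggest("zzz")           # nothing similar enough
```

With a lookup like this, someone who types a common misspelling can still be pointed at the right records without every possible misspelling being stored in the catalogue.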

Clare: People who listen to the podcast regularly will know that I always do a bit at the end where I tell people how to spell my name. In the olden days, when the way that you got search engines to land on your site – I think people still do this but search engines are much more sophisticated now – you would put links in your site that were labelled with the thing that you wanted people to search for. So, at the bottom of my old blog, maybe 15 years ago, were all the possible misspellings of my name, as links linking back to the blog. Just so that people who typed my name incorrectly into Google would still find my blog.

Mia: Yes.

Clare: I had fun thinking of all the possible ways in which people might misspell my name. Okay, so what is your favourite example of crowdsourcing in the public domain? It doesn’t necessarily have to be one you’ve been involved with.

Mia: Trove is one of the favourites, because they took advantage of the fact that people were going to be looking at these newspaper collections. The interactions were so well designed. They minimised the amount of sign off, they maximised the reassurance. It all happened at the point where you were looking at an article and noticed an error. You’d go, ‘Agh, typo.’ Then there would be a thing saying, ‘Fix it.’ Perfect. That little halo would appear, and an angel would ring its bell. I love that.

It’s actually not that easy to access as a crowdsourcing project because it’s not set out to be a crowdsourcing project. It’s set out to be a project where you can correct as you use it. You have to go searching, find a typo and then correct it.

Flickr Commons is another one that isn’t really a crowdsourcing project but is another one that does say while you’re here, you can add tags to describe things.

Clare: What did you just say? Something Commons?

Mia: Flickr Commons. Flickr was that 2000s photo sharing site.

Clare: Yes, Flickr with no E.

Mia: Flickr with no E. It does exist still. A lot of cultural institutions – in part through the tireless advocacy of a woman called George Oates – put any photographic collections that they had with no known copyright restrictions (which is a whole other minefield because copyright very much affects what we can do), put them on Flickr Commons. Because Flickr Commons was already set up for commenting and tagging, and you could put things in albums, people went to town on these gorgeous old photographs and tagged them and added comments.

In some cases they identified people and places in those photographs. They provided information that the institutions didn’t previously have. The National Library of Ireland used to do a fantastic job of that. They had someone who monitored and commented and really encouraged people to add information about parts of rural Ireland in photographs, that really needed people with local, living memory of those places to identify them.

They are not strictly speaking crowdsourcing projects, but they had an element of participation and a reason to participate, that puts them in the same category. I think for libraries and cultural institutions, the fact that they asked unknown people for input was at that point a big cultural shift.

Now it’s the norm, where we know that people outside of the institution, people who might not normally be those contributing to records, have something useful to say. That shift is in part due to projects like Flickr Commons and Trove, and a lot of the earlier crowdsourcing projects.

I am currently working on asking people to look for industrial accidents in 19th century newspaper articles. I love that project partly because it’s really gory. I get these comments saying, ‘This is disgusting but it’s also addictive.’

I think there was something, particularly in the early stages of lockdown, where we were all feeling quite sorry for ourselves. It was fantastic to actually be reminded that the world has always been a bit crap. There are always horrible ways to die. It gives you a sense of perspective. For most of us, the pandemic was a horrible experience but we’ve been through worse in the past, and we will again, I guess.

I just enjoy any opportunity to get people past the storytelling about the past and actually show them material, so they can have a more informed view themselves. And really get the texture of what life was like in different parts of the world at different points in time.

Clare: Yes. You talked about minimising sign-off. Do you mean minimising the gateways that people have to travel through in order to get approval?

Mia: Yes. One of the genius things that Trove did was it displayed the original text and the corrected text. That did two things. If you had corrected text, you could immediately see the difference that you had made because the typos looked like line noise, and now it looks like beautiful, clean text. If you were a sceptic, you could see what had been changed. You could see that no naughty schoolkid had come along and added swearwords.

Actually, that level of transparency was quite difficult for institutions to manage. It used to be that things would go away; they would be verified manually by a quite expert user. Then you would eventually see the results ten years down the line. So, that sort of immediacy was really important.
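
The transparency described here, showing the original text next to the corrected text, can be sketched with Python's standard `difflib` module. The function `changed_words` and the sample OCR line are invented for illustration and are not taken from Trove.

```python
import difflib


def changed_words(ocr_text, corrected_text):
    """List (before, after) pairs for each run of words a volunteer changed."""
    before, after = ocr_text.split(), corrected_text.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        # Skip runs of words that are identical in both versions.
        if op != "equal":
            changes.append((" ".join(before[i1:i2]), " ".join(after[j1:j2])))
    return changes


# Made-up example of noisy OCR output and a volunteer's correction.
changes = changed_words(
    "Tbe Theatrc Royal, Bristol",
    "The Theatre Royal, Bristol",
)
```

Displaying these pairs lets a contributor see the difference they made, and lets a sceptic check exactly what was altered.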

Clare: Yes. So simple but so powerful.

Mia: Yes.

Clare: I’ve written a quote down, here, and I don’t know where I got it from. What I’ve written down is, ‘Successful crowdsourcing projects reflect a commitment to developing effective interface and technical designs.’ That made me think about the connection between crowdsourcing and user experience design.

Can you describe in a nutshell that connection, or is it too big?

Mia: I suppose I got into UX and human computer interaction through creating systems as a software engineer that then weren’t used as I expected. That completely applies to crowdsourcing. If you don’t apply UX design and UX thinking, then you unintentionally create barriers. You use jargon. If you don’t understand why people might be hesitant, the fears that they have in contributing content, if you don’t understand what kinds of rewards people seek from voluntary activities, then you can’t design them.

So, designing with UX in mind and testing and making sure that your ego goes to one side when people tell you why something doesn’t work for them, it’s a really important part of the design process.

Clare: Is there a lot of user research?

Mia: Most of the user research is in the commercial or academic sectors. Actually, there’s a job of work to translate it into something that is accessible to practitioners. Recently, I co-wrote a book called Collective Wisdom that is about trying to match the really hard-won experiential knowledge that people who run projects and who volunteer on projects have, with academic research.

Practitioners don’t have access to journal articles and don’t have time to read and translate highly sophisticated, experimental models into practice. So, a lot more work could be done there, in bridging academia and practice. Or learning from the commercial sector and applying it in practice.

Clare: Yes. In this context, where you say practitioners, what level of abstraction are we at? Are we talking about the people who are volunteering, or the people who are coordinating those projects?

Mia: Mostly people who are coordinating the projects because they are the ones with the power to apply the lessons. I also think the volunteers often have really high levels of experience there. They have seen lots of projects. They can give you amazingly on-point critiques of projects based on how they use things. It’s mostly the stakeholders and the designers and the software engineers who implement or operationalise things, who have the power to change things.

Clare: Yes. Fantastic.

Okay, I’m going to ask you to tell me one thing that is true and one thing that is false about you. Then I’m going to have to remember to ask you to tell me which is which. The listeners won’t hear that answer, they will have to subscribe to our newsletter to find out.

Mia: Obviously, I stayed up late last night to watch the Olympics, because I love all sports, especially televised sports. My scariest moment is when I was pulled off a minibus in Moldova, crossing over the border into Transnistria. Taken into a back room and interrogated by border guards.

Clare: Okay, what’s the best thing that has happened to you in the last month? It could be either work-related or non-work related.

Mia: I went to a garden party, and I have been shielding for more than a year now, so it was so amazing. I think I started to believe that friends and colleagues only existed as little squares on a screen. So, standing in a back yard, a little bit dressed up, was just a fantastic, wonderful, ordinary thing that I had really missed.

Clare: Brilliant. Where can people find you? Do you have anything that you would like to plug?

Mia: I am on Twitter at mia_out. I blog at openobjects.org.uk and my website is miaridge.com.

Clare: If people want to get involved and help you with some crowdsourcing and are now all excited about the idea of 300-year-old playbills, how do they get involved in that?

Mia: They can look for @libcrowds, which is the Twitter account I run for work to share library crowdsourcing projects. Or just have a look on the Zooniverse. There’s everything from Ancient Greek parchment to galaxies, to counting penguins in Antarctica. There are so many different projects. Lots of them are good for kids. Also, From the Page, if you like reading old handwriting, if that kind of puzzle is enjoyable to you, then From the Page has got lots of manuscripts, which just means handwritten text.

Everything from field notes from biologists to family history recipes. Anything you can think of. Between those two sites, there are lots of different things to explore.

Clare: Fantastic. That’s Zooverse?

Mia: Zooniverse. Z O O N I V E R S E. It was originally about looking at something that they called Galaxy Zoo, which was lots of images of galaxies that needed to be categorised in terms of what kind of left or right, or what kind of arms the galaxy pictures had. From there it became the Zooniverse.

Clare: Wow, okay, fantastic. Thank you so much!

Mia: No worries.

Clare: It’s been really good to talk to you.

[Music sting]

Clare: As always, to help you digest what you’ve just heard, I’m going to attempt to summarise it.

Crowdsourcing in the public sector is a form of engagement with cultural heritage collections. For instance, transcribing the titles and the dates of playbills that are hundreds of years old.

Cultural organisations have a long history of volunteering. Digital crowdsourcing means that anyone can volunteer from anywhere.

People really enjoy engaging with these original materials. It reduces threshold fear. Some of these people will have very useful, specialised knowledge.

Optical Character Recognition is used for automatic text transcription, but it has its limitations. For instance, when trying to translate the Victorians’ creative use of typography.

Crowdsourcing is powerful because it is combining the things that computers are really good at with the things that people are really good at.

Some of the inputs that come from crowdsourcing are verified. For instance, classifications have to be verified by several different people. But tags are not, because they represent people’s personal opinions. Typos can be deliberately preserved.

A good example of crowdsourcing in the public domain is the Trove project of the National Library of Australia, where you can see the source material side-by-side with the OCR digitised versions, and correct errors instantly on the page. Something that people find really satisfying.

Other examples are the Old Weather Project from Zooniverse, which is about transcribing ships’ logs. Or Flickr Commons, which is about tagging photo archives. Or the National Library of Ireland’s project which is harnessing local knowledge of rural Ireland.

Machine learning, which is not the same as AI, can also be used in conjunction with crowdsourcing. It’s important to make sure that any work done is not just a mechanical transaction, and that it maintains the mission and the values of the organisation. For instance, creating enjoyment, inspiration, and learning.

You can apply UX thinking in the sphere of crowdsourcing, to understand how best to engage people, and what barriers might stand in the way of participation.

Okay, that’s the end of the interview section, but that’s not all. Stick around for some extra content.

[Music sting]

Clare: Every other episode, this last short segment will be devoted to hack of the month, where one of my colleagues and in the future our listeners too, will share a life or a work hack. This time we are going to hear from Chasey Davis Wrigley, who is a Lead Engineer at Made Tech.

Chasey: A great way to improve your ability to learn is to vary the way you approach learning, so you are learning by seeing, hearing, and doing. There are some useful visual ways, like creating your own mind maps, reading books and articles, or watching YouTube tutorials. You could also listen to webinars and podcasts. Then there is having a go yourself. Try doing what you want to learn. If it’s learning a new software language, try doing some code katas, or try setting yourself some useful mini-projects.

This is good for a few reasons. Not everyone has the same learning style. Some people learn better by seeing, some by listening and others by doing.

We might not always know what our learning style is. By trying to absorb new knowledge through different ways, it should help us better understand what our own learning style actually is.

Even if your best learning style is visual, by also listening and doing, we are actually reinforcing the learning we are attempting to achieve.

Finally, by varying the way we interact with the material, we are less likely to become fatigued. After all, a change is sometimes as good as a rest.

[Music sting]

Jack: Hi, I’m Jack, Made Tech’s Events Coordinator. Now, working in the public sector means that at Made Tech, we really care about making a difference. So, for this final Making Life Better segment, myself and my colleagues will be sharing small pieces of advice to make the world a better place. Today’s advice comes from Aleisha Lambie, one of our Bid Managers here at Made Tech, who has some advice on making time for social chats. Non work-related, even if remote. Aleisha, do you want to tell us a little more about that?

Aleisha: Yes. I think just with everyone moving to remote working with Covid, we have got to a situation where it’s very easy to just get in, get things done. In actual fact, we miss those watercooler moments. It’s really nice to make that time to create those relationships and just get to know people.

Sometimes you just need to remind yourselves, let’s take five minutes at the beginning of this meeting to just introduce ourselves or something that we’ve done. Do something a little bit different, build those relationships.

Jack: I couldn’t agree more. Do you have a go-to icebreaker question?

Aleisha: It depends what day the meeting is. Sometimes it is as simple as, ‘What did you get up to this weekend?’. A lot of us starting at a new company as well, there have been a lot of opportunities just to understand where people live, how they live, who they are living with. Just getting to know what their family life is like.

Jack: It’s nice to remind yourself that we are dealing with actual humans, rather than just the faces on our screen.

Aleisha: Absolutely.

Jack: That’s been absolutely brilliant. Thank you for taking the time, Aleisha.

Aleisha: Thanks, Jack.

[Music sting]

Clare: And that’s the end of another episode. If you are enjoying the podcast, please do leave us ratings and reviews. It pushes us up the directories and makes it easier for other people to find us.

Speaking of which, thank you to our latest reviewer, Mark Butcher, who enjoyed our Communities of Practice episode, with Emily Webber.

I’ve got a few talks coming up. You can see the details on my events page on Medium, which is linked to from my Twitter profile. You can find that @claresudbery, which is probably not spelled the way that you think. There is no I in Clare, and Sudbery is spelled E R Y at the end, the same as surgery or carvery.

You can find Made Tech on Twitter @madetech. Do come and say hello, we are very interested to hear your feedback, and any suggestions you have for any content for future episodes, or just to come and have a chat.

Thank you to Rose our Editor, Gina Cady our virtual assistant, Viv Andrews our transcriber, Richard Murray for the music – there’s a link in the description, and to the rest of our internal Made Tech team. Kyle Chapman, Jack Harrison, Karsyn Rob and Lara Plaga.

Also in the description is a link for subscribing to our newsletter. We publish new episodes every fortnight on Tuesday mornings. Thank you for listening and goodbye.

[Recording Ends]
