Following on from discussing the factors informing the design of our analysis, I promised to outline each phase of our analytic plan. Here’s the first.
First, though, it’s useful to illustrate the analytic plan in its entirety, because in undertaking any phase of analysis it is always crucial to build on what has gone before, and anticipate what will happen as a result. That’s what makes analysis focused.
The diagram below shows the four phases of the analytic plan as it currently stands. You’ll notice that in my last blog post I said our plan had three phases. Since then we’ve got further into the analysis and are now thinking about the phases slightly differently. That’s the nature of qualitative research design: it develops as the project proceeds.
It’s important to note that this diagram relates specifically to the qualitative analysis of the MOP writers’ narratives, but this is only one part of the Defining Mass Observation project. How the quantitative analysis of writers’ characteristics integrates with the work I’m discussing here will be the topic of a separate blog post later on.
The first phase takes the form of mapping out the content of the materials at a high level. This is necessary because we didn’t have an overview of the materials at the outset. See here for a discussion of why this is.
There are three elements to this mapping process, which are undertaken in parallel:
– indexing the narratives according to semantic content (via descriptive coding)
– capturing the emotional tone of the writings (coding for expressions)
– reflecting on and summarising each writer’s narrative in relation to the Research Questions
Indexing semantic content
Our pilot analysis identified key areas that we need to know about if we are to be able to answer our research questions. For the starting research questions for My Life Line and Social Divisions see here.
There are several things we need to do in order to be able to answer these questions. First, we need to know which events writers discuss in their responses and how they express themselves when writing about them. For example, with respect to the My Life Lines Directive, we cannot analyse how and why certain events are significant or meaningful, or how they structure writers’ lives, unless we first know what events are reported and how they are discussed.
These factors underlie the need for a multi-staged approach and our focus on mapping out semantic content at a high-level first.
Some areas that we need to index are specific to the Social Divisions and My Life Lines Directives, although as shown below, there are many overlaps. The overlaps are one of the ways that we will be able to integrate analysis of the two Directives later on.
So what do we mean by ‘semantic content’ and how do we go about ‘high-level mapping’ in MAXQDA?
To quote Braun & Clarke (2006:13): “With a semantic approach, the themes are identified within the explicit or surface meanings of the data and the analyst is not looking for anything beyond what a participant has said or what has been written.” What this means for us is that we are capturing – through coding – the content of the material in our key areas (the concepts in the diagram above) at the ‘surface level’ of what is written. This means that the codes we’re using during this stage are not ‘themes’. They are areas of interest that we need to index in order to be able to answer our research questions – during this stage their purpose is essentially descriptive: to map out the content of the data at the ‘surface’ or ‘semantic’ level.
Capturing the emotional tone of the writings
In addition to mapping out the semantic content of writers’ responses to the Directives, we also need to capture the way in which they write, the ways they present their accounts. The research team had extensive discussions about the best way to go about doing this, discussions that were driven by the overarching objectives of the study, but also informed by what we know is possible using our CAQDAS package of choice for this project, MAXQDA. See here for a discussion of the factors informing our choice of software. Any qualitative text can be read at different levels: for example, what is explicit in the text, what is implied, and what can be inferred through interpretation.
Given that this phase is generally about mapping out the semantic content of the material, our aim in capturing the emotional tone of the writings is to interrogate whether certain topics and events are written about in a generally positive, negative or neutral way. Coding for expressions beyond what is explicitly present in the writing – for example because we know from earlier statements how the writer feels about certain topics – would mean losing the fact that they have recorded an event or experience in a particular way. For example, in the My Life Line material, coding the straightforward statement “my cat died” as ‘negative’ because we know from earlier comments that the writer loved her cat, took her cat everywhere with her, perhaps partly as a result of having few close family members or friends, would lose the fact that she records the event of her cat dying in a neutral, matter-of-fact way. The decision was therefore made to index every statement to one of the following codes: positive, negative, neutral, mixed. Doing so sets up the possibility of interrogating, in the next phase of our analysis, whether certain events, experiences and topics are generally discussed in different ways.
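The interrogation this sets up is, in effect, a cross-tabulation of tone codes against topic codes (something MAXQDA itself supports through features such as Crosstabs). Purely to illustrate the logic, here is a minimal Python sketch; the segments, topic names and counts are invented, not project data:

```python
from collections import Counter

# Hypothetical coded segments: each pairs a topic code with a tone code.
# In the real project this indexing happens in MAXQDA, not by hand.
segments = [
    {"topic": "bereavement", "tone": "neutral"},
    {"topic": "bereavement", "tone": "negative"},
    {"topic": "education", "tone": "positive"},
    {"topic": "education", "tone": "mixed"},
    {"topic": "education", "tone": "positive"},
]

TONES = ["positive", "negative", "neutral", "mixed"]

def tone_by_topic(segments):
    """Cross-tabulate tone codes against topic codes."""
    table = {}
    for seg in segments:
        table.setdefault(seg["topic"], Counter())[seg["tone"]] += 1
    # Fill in zeros so every topic reports all four tone codes.
    return {topic: {t: counts.get(t, 0) for t in TONES}
            for topic, counts in table.items()}

crosstab = tone_by_topic(segments)
# crosstab["education"] → {"positive": 2, "negative": 0, "neutral": 0, "mixed": 1}
```

The point of insisting on surface-level tone codes during indexing is exactly so that a tally like this remains faithful to how events were recorded, rather than to what we infer the writer felt.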
Reflecting on and summarising each writer’s narrative in relation to the Research Questions
So, in indexing the semantic content and emotional tone of the writings, at this stage we’re not focusing on capturing our interpretations as analysts within the coding. This does, of course, need to be done, given the interpretive paradigm within which thematic analysis resides. And we’ll do it later on – but only once we’ve identified and prioritised core areas, which we cannot do until we have mapped them out.
However, whilst we go about this mapping process we do, of course, have thoughts and insights, make connections and interpretations. And we don’t want to lose them. Amongst the things that are core to what we all do in qualitative data analysis – whatever our objectives, methodologies, analytic strategies or tactics – is to reflect. We reflect on everything, all the time, we can’t help it. And nor should we. A key benefit of using a CAQDAS package, whichever one you choose, is that the thoughts you have can be captured within the software project, at the time you have them and, crucially, be linked to the data that prompted them.
We therefore developed a template for capturing these reflections during the process of indexing the semantic content and capturing the explicit emotional tone of the writings. It included the following elements: Opening, Topics, Language use, Classifications, Writing styles. As each writer’s response to the My Life Line and Social Divisions Directives was indexed, the analyst filled in one of these templates. Each analyst had the freedom to make notes about the different elements in whatever way seemed appropriate for the individual response, but we each did so in relation to the overarching starting research questions for each Directive, and the shared template ensured the focus of our reflections was consistent. In particular, it meant that we were capturing the thoughts and insights we had about each MOP writer as we were reading and indexing their responses.
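One way to picture the template is as a simple structured record, one per writer per Directive. The sketch below is hypothetical – the identifier and example note are invented – but the field names mirror the template elements listed above:

```python
from dataclasses import dataclass

@dataclass
class WriterReflection:
    """One analyst's structured reflections on a single writer's response.

    Field names follow the project's template elements; content is free text,
    written in relation to the Directive's starting research questions.
    """
    writer_id: str          # hypothetical anonymised identifier
    directive: str          # "My Life Line" or "Social Divisions"
    opening: str = ""       # how the writer opens their response
    topics: str = ""        # topics covered, relative to the research questions
    language_use: str = ""  # notable vocabulary, metaphors, evocative phrasing
    classifications: str = ""  # how the writer relates to socio-economic categories
    writing_styles: str = ""   # e.g. bullet points, discursive narrative, first person

# Example: a partially completed reflection (invented content).
note = WriterReflection(writer_id="W001", directive="My Life Line",
                        writing_styles="brief, matter-of-fact, first person")
```

The value of the fixed fields is that every analyst’s free-form notes end up comparable across writers, which is what makes the reflections usable in later phases.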
One of the issues with taking a high-level semantic indexing approach is that it can be quite difficult to think only at this level when reading Directive responses. For the reasons discussed above and in the blog post about the factors informing the design of our analytic approach, it was important to work initially at this level. However, we didn’t want to lose the more in-depth interpretive thoughts we had about writers whilst undertaking the indexing of semantic content and explicit emotional tone. The use of our structured template for reflections ensured that we could keep the coding at the right level whilst capturing our interpretive thoughts. Both will be of use in the later stages of our analysis.
My next blog post will discuss the second phase of our analysis, analytic prioritisation.
In her recent post, Rose commented on the variety in the responses to the 2008 Your Life Line Directive. This variety has shaped the way we are approaching the qualitative analysis of this and the 1990 Social Divisions Directive. So I thought I’d outline our analytic design and share how we are implementing it within MAXQDA (see here for an explanation of our choice of CAQDAS package). I’m doing that in a series of four posts, of which this is the first; check back over the next few weeks and months as our analysis proceeds for the next three, which will detail the way we are going about each analytic phase.
Framed by the project’s overall objectives, research questions and methodology, our analytic design evolved out of a pilot analysis phase in which different approaches were trialled on a sub-sample of narratives from both Directives.
The resulting design involves three phases: i) high-level mapping of semantic content, ii) thematic prioritisation, and iii) in-depth latent thematic analysis. Each phase will be the subject of a separate blog post.
Here, though, I’m briefly discussing four factors that underlie this approach:
1) the nature of the data
2) the need to both keep separate and to integrate the analysis of the two Directives
3) the practicalities of the project
4) the need to develop a transparent and transferable process
The nature of writing for the Mass Observation Project (MOP) and the data that is generated
Because the writings of Mass Observation volunteers are only loosely guided by the questions in the Directive, we did not have an overview of the general content of the material at the start. This is symptomatic of this type of secondary qualitative data analysis. The narratives were not generated for the purpose of this study and therefore we have had no influence on the nature or content of the material we are analysing.
This is very different to the type of situation where researchers design and undertake interviews or focus-groups, or observe naturally occurring settings or events. Had we been involved in designing the Directive questions for a specific substantive purpose we might have had some ‘control’ but even then, the very nature of the MOP results in very varied responses to Directives. Some are short, others much longer; some specifically seek to answer Directive questions, others attend only very loosely to the Directive questions; some are written in a quite structured form, for example using bullet points, others are longer more free-flowing, discursive-style narratives; some are written in the first person and reveal detailed insights into personal experiences and opinions, others are more cursory, brief descriptions that upon first reading appear to reveal little about the feelings and opinions of the writers.
This varied nature means we have a very rich set of materials – just what qualitative data analysts love – but when we started out we had no idea of the content of this large body of varied writings. We therefore needed to design an approach that first provides us with an overview, so that we can evaluate the extent to which our research questions are answerable by the data. We could have achieved this by first reading all the Directive responses, but with almost 600 Social Divisions and almost 200 My Life Lines responses, some of which are many pages long, and a short time-frame, we couldn’t do this. We needed to be coding whilst reading. Had the analytic team done the transcribing, we would have gained the broad content view we’re looking for from that process, but again, the project resources didn’t allow for that, and temporary typists were employed to transcribe the responses. We did, though, informally ‘interview’ the transcribers about their impressions of the MOP writers when they had finished, and this informed our thinking. But we could not rely solely on their opinions.
Analysing different sets of responses separately then integrating our analyses
Initially our intention had been to analyse responses to both Directives together, because one of our objectives is to explore the extent to which perceptions and lives have changed between 1990 and 2008 amongst writers who responded to both Directives (whom we call ‘serial responders’). However, the pilot demonstrated that despite the need to analyse the ‘serial responders’, and despite the synergies across the two Directives, starting out with all the data together would be impractical and would affect our ability to maintain focus whilst coding each Directive.
In addition, it became clear in our pilot coding that we cannot know at the outset which areas of our substantive interest offer potentials for looking across the two Directives. There are many potential synergies, and the longitudinal element of exploring the identities of writers that have responded to both Directives is an important part of our work. However, the Directives are very different and the context of the time in which these were responded to is important.
The practicalities of the project – number of coders and time-frame
Any project needs to attend to ensuring coding is consistent. The involvement of three coders and the short time-scales meant that we needed to design an approach that maximises consistency from the outset. We could have mapped out the content of the Directive responses by undertaking an ‘open’ coding exercise as a means of initial theme generation; this would have been similar to the usual first stage of Grounded Theory-informed projects. Indeed we did this in our pilot work, and it informed the focus of phase one. However, the time available and the uncertainty of content mean it is more systematic to focus first on the descriptive content and undertake more interpretive work once we have a clear idea of the content.
The need to develop a transparent and transferable process
One of the objectives of this project is to open up possibilities for using MOP as a source of secondary longitudinal qualitative data. This means that verified and assured processes for analysing MOP data, which can be adopted or adapted by other researchers, are amongst the project’s outputs. The three-phased approach not only serves our analytic purposes but also offers a method that can be easily documented and illustrated.
Like qualitative research design in general, ours is iterative and emergent – we expect to need to refine our initial research questions as we progress – in light of our growing understandings. I will outline the three phases of the design (high-level mapping of semantic content, thematic prioritisation, and in-depth latent thematic analysis) and how they are being implemented in the CAQDAS package MAXQDA, in future posts.
In this blogpost I’m sharing the process we went through in choosing to use the software package MAXQDA to analyse the writings of volunteers in relation to the MOP Directives “Social Divisions” and “Life Lines”. I hope this will help others make their own informed choice between the range of software available.
Software packages designed to allow researchers to manage and analyse qualitative data are collectively known as CAQDAS packages. CAQDAS stands for Computer Assisted Qualitative Data AnalysiS. These packages have a range of features, including data management and organisation, content searching, code-and-retrieve, writing and visualisation, querying (e.g. Boolean, proximity) and output. For more information see the CAQDAS Networking Project.
There are many different qualitative software packages to choose between (see here for some reviews). There is no “best” package; each has its pros and cons, and therefore the choice is best made in relation to the characteristics of a particular project. For some it might not be a given to use software at all. If that’s the case, I’d still recommend going through a process like the one described here, because it will help you rationalise whether using one of these packages would be useful.
How to choose
The first step is to list and prioritise the needs of the analysis, then compare products in terms of how their features can be harnessed for those needs. The aim is to choose the program that will enable you to fulfil your prioritised needs in the most streamlined way. There are usually some compromises – because not all the programs have the same features. Prioritizing the project needs will help you see which compromises you can live with and which you cannot.
That starts with being clear about the broad objectives of the study. For the Defining Mass Observation project we have a number of overarching objectives relating to the whole project which were important to consider in choosing a CAQDAS package:
Overarching project objectives
- a) to illustrate the value of MOP data for social scientific enquiry;
- b) to facilitate its opening up for the purposes of secondary analysis; and
- c) to contribute to the representativeness debate around the use of MOP data.
In addition, there are the specific research questions we are starting out with. For the purposes of choosing a CAQDAS package, the relevant questions are those relating to analysing the writings provided by volunteers in response to the two MOP Directives we are focusing on in this part of the project. There will be a blog post about the research design of the project as a whole later on.
It’s worth saying at this point that qualitative research design is emergent. That means the research questions you start out with will likely change as the analysis proceeds, which is why those that informed our choice of software are called “initial research questions”. We have three, two of which have sub-questions:
Initial research questions
1. How do writers describe their perceptions of their own and others’ identities, in relation to class, politics, race and gender?
- how do writers perceive their educational trajectory to have influenced their class status and social mobility?
- do the ways MOP writers reflect on identities reflect socio-economic classifications used by social researchers?
2. How do writers perceive the structure of their lives?
- how and why are certain events significant?
- how and why are meanings attached to events and decisions?
3. Of the writers who have responded to both Directives, how have their perceptions of their identities and their lives changed between 1990 (Social Divisions Directive) and 2008 (Life Lines Directive)?
In order to fulfil the overarching project objectives and the research questions, we listed our key analytic and practical needs:
– we need to be able to combine numeric information about MOP writers’ characteristics with their actual writings;
– we need to undertake an ‘inductive thematic analysis’ of the writings (see Braun & Clarke, 2006 for a nice overview of the different types of thematic analysis);
– we need to be able to track the contributions of writers who responded to both the Directives;
– we need to be able to access keywords and phrases used by MOP writers that indicate the concepts we are interested in;
– we need to be able to map concepts within writings in response to each Directive in isolation, as well as compare across both Directives;
– we need to be able to interrogate the ways writers discuss Directive themes according to their characteristics;
– at least 3 researchers need to be able to work with the qualitative software;
– the software needs to be intuitive and easy to learn;
– we need the software project we create to be accessible to non-expert users.
Having thought these needs through in more detail, we prioritised them by ranking them in a table in order of importance, as shown below. As a result we chose to use MAXQDA. You can find out more about MAXQDA from their website.
Note that the table shows the results of our evaluation of MAXQDA only. I’m not showing in this blog post the comparison with other CAQDAS packages, because the point is to outline the nature of our decision-making process rather than to compare products. In reality we constructed a table with several more columns – one for each of the programs we looked at. I find it useful when doing this to colour the cells to show visually on which dimensions the different programs are evaluated positively. Here I’ve made the text bold to highlight the reasons why MAXQDA was seen as a good choice. In the larger comparison chart, cells were highlighted green for positive evaluations, red where a particular program does not enable a requirement, and left white where all of the programs had the required features – resulting in a heat-map-type matrix of requirements by software features.
| Requirement (in priority order) | Detail | Evaluation of MAXQDA features in relation to requirements |
| --- | --- | --- |
| 1. Team-working | There are three researchers on the project who will contribute to the analysis and use the software. They have different responsibilities and roles, so we need the ability to isolate and integrate aspects of the work at different times. | Although MAXQDA does not allow concurrent work by multiple analysts, whereas some other packages do, its teamwork import/export features are sufficient for our needs. **Two of the researchers have prior experience of working with MAXQDA in team projects** and are therefore familiar with the protocols we need to put in place to enable systematic and streamlined team-working. |
| 2. Intuitive and easy to learn | This project runs over 15 months, so timescales are tight. When making the decision we did not know whether the researcher we were recruiting would have experience of any package, so a program that could be learned quickly was important. The overarching objectives of illustrating the value of MOP for scientific enquiry and opening it up for secondary analysis also necessitate an accessible means of communicating our analytic process and findings to other researchers who may not be conversant with CAQDAS packages. | Intuitiveness of software and the ease with which individuals learn to harness it for sophisticated analysis are subjective. We therefore used our experience of teaching CAQDAS packages as a proxy measure for this requirement: **MAXQDA is amongst the most straightforward to teach**. |
| 3. Accessible to non-expert software users | An intended output is to make the software project we develop available for other researchers to access. | That **MAXQDA has a free Reader version** that enables those without software licences to open projects and view analysis was an important factor in our choice, as it will enable us to share our database with others easily. |
| 4. Linking of quantitative information with qualitative texts | This is a key analytic priority of the project. Quantitative information about writers’ characteristics is being collated and cleaned for statistical analysis, in order to contribute to the debate around the representativeness of MOP writers and the value of these materials for social science research. | All of the leading CAQDAS packages allow numeric information to be imported and linked with the qualitative materials to which it corresponds; although this is a fundamental need of our research, it was thus not a determining factor in the choice of software. |
| 5. Data-driven qualitative analysis | We are adopting an inductive thematic analysis of MOP writers’ responses to the two Directives, and therefore need software features we can use for this purpose. | Any of the CAQDAS packages would enable in-depth qualitative analysis, so this was the least important requirement in our prioritised ranking and not a factor that impacted upon our choice. |
| 6. Track responses of writers contributing to both Directives | An important aspect of opening up MOP materials for research purposes is to illustrate how the writings of individuals can be tracked over time. | All of the leading CAQDAS packages allow tracking by combining or grouping data contributed by the same individuals, so this was not a determining factor in our choice. However, **MAXQDA’s Document Comparison Chart** provides an alternative way of comparing the application of (groups of) codes at the level of data files, not available in other packages. We consider this will be of benefit when examining whether and how individuals’ perceptions differ when responding to different MOP Directives. |
| 7. Map concepts within and between Directives | The nature of the data (generated in response to open-ended Directive questions) means we had no overview of content at the outset, as there is when generating data through customary qualitative data collection methods such as interviews or focus-group discussions. An initial analytic task is therefore to map out the content of the writings at a high level, in order to identify the prevalence of concepts so that we can focus the analysis. | All the leading CAQDAS packages enable high-level mapping of the application of codes in data files in tabular (numeric) format, with access to the corresponding texts for qualitative interpretation. **MAXQDA’s Code Matrix Browser**, however, is particularly easy to generate and provides clear and accessible visualisations. The requirements of 2) and 3) make these features attractive. |
| 8. Interrogate Directive themes by the characteristics of writers | Exploring the extent to which the perceptions and experiences of MOP writers differ according to their characteristics is one way in which we hope to contribute to the debate about their representativeness. This means we also need an accessible means of showing other researchers, who may not be familiar with the software, the results of our work. | There are several **“mixed methods” features in MAXQDA** that we can use to present joint displays of dimensions in the data. The ones we evaluated as particularly useful for this project are Crosstabs, Configuration Tables, turning codes into categorical variables, and constructing Typology Tables. The **Quote Matrix** feature enables a joint display of quantitative characteristics and qualitative texts to be outputted directly, which will allow us to share findings easily. Udo Kuckartz’s (2012) paper on mixed methods in MAXQDA gives a clear overview of these features. |
| 9. Access and auto-code for keywords and phrases | This is particularly important for the Social Divisions Directive, because we want to consider the extent to which established socio-economic classifications are understood by and reflected in writers’ accounts, and to focus on evocative language used in the texts. | All of the CAQDAS packages have word-search tools we could use for this purpose. In MAXQDA we can create and save our own dictionaries to locate multiple keywords and phrases, which is attractive but was not a major aspect of our decision. |
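The heat-map comparison described above is, at heart, a small decision matrix: requirements in priority order, a verdict per package, and more weight given to higher-priority rows. Here is a minimal Python sketch of that logic – the package name “PackageB” and all the verdicts shown are invented placeholders, not our actual ratings:

```python
# Requirements in priority order (first = most important), each with a
# verdict per package: "pos" (meets the need well), "ok" (all packages
# broadly equivalent), "neg" (does not enable the requirement).
# Verdicts are invented placeholders, not the project's real evaluations.
requirements = [
    ("Team-working",                  {"MAXQDA": "pos", "PackageB": "ok"}),
    ("Intuitive and easy to learn",   {"MAXQDA": "pos", "PackageB": "neg"}),
    ("Accessible to non-experts",     {"MAXQDA": "pos", "PackageB": "neg"}),
    ("Quantitative-qualitative link", {"MAXQDA": "ok",  "PackageB": "ok"}),
]

def score(package, requirements):
    """Weight positive verdicts by priority: earlier rows count for more."""
    n = len(requirements)
    return sum(n - rank
               for rank, (_, verdicts) in enumerate(requirements)
               if verdicts.get(package) == "pos")

packages = {p for _, verdicts in requirements for p in verdicts}
ranking = sorted(packages, key=lambda p: score(p, requirements), reverse=True)
print(ranking[0])  # → MAXQDA (the package with the most high-priority positives)
```

A weighted tally like this is only a caricature of the real judgement, of course – in practice the colour-coded chart is read holistically, and the decisive rows were the practical ones at the top of the priority order.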
Choice of software is always a balance between analytic and practical priorities. The outcome of a process like the one discussed here might prioritise analytic needs over practical ones; for Defining Mass Observation the practical needs outweighed the analytic, because of the overarching project-level objectives.
It’s important to say that we are not claiming that this project could not be undertaken using a different CAQDAS package; far from it.
But when in the position of being able to make a choice for a specific project, it is important that the choice is guided by the needs of that project. That requires both an understanding of the differences between programs and clarity about the overarching project objectives and the specific research questions.
Braun, V., Clarke, V., 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 77–101. doi:10.1191/1478088706qp063oa
Kuckartz, U., 2012. Realizing Mixed Methods Approaches with MAXQDA.