Social Network Analysis and Digital Data Analysis

Notes on a presentation by Pablo Paredes. The abstract for the seminar is:

This presentation will be about how to make social network analysis from social media services such as Facebook and Twitter. Although traditional SNA packages are able to analyse data from any source, the volume of data from these new services can make convenient the use of additional technologies. The case in the presentation will be about a study of the degrees of distance on Twitter, considering different steps as making use of streaming API, filtering and computing results.

The presentation is drawn from the paper: Fabrega, J. Paredes, P. (2013) Social Contagion and Cascade behaviours on Twitter. Information 4/2: 171-181.

These are my brief and partial notes on the seminar taken live (so “typos ahead!”).

The seminar looked at gathering data from social network sites and at a research project on contagion in digital data.

Data access requires knowledge of the API for each platform, but Apigee documents the APIs of most social networks (although, as an intermediary, it may lead to further issues in interfacing different software tools; Python toolkits, for example, may assist in accessing APIs directly rather than through Apigee). In their research, Twitter data was extracted using Python tools such as Tweepy (for calls to the Twitter API) and NetworkX (a Python library for SNA), along with additional tools including Apigee. These tools allow the investigation of different forms of SNA beyond ego-centric analysis.
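As a rough illustration of the step between collecting tweets and analysing them, here is a minimal sketch of turning tweet records into a directed retweet network. It is not the code used in the study: the records are made up, the field names "user" and "retweet_of" are my own assumptions rather than Twitter's schema, and it uses only the Python standard library instead of Tweepy or NetworkX.

```python
from collections import defaultdict

# Hypothetical, hand-made records standing in for tweets collected from an API.
# The field names ("user", "retweet_of") are assumptions, not Twitter's schema.
tweets = [
    {"user": "alice", "retweet_of": None},
    {"user": "bob", "retweet_of": "alice"},
    {"user": "carol", "retweet_of": "bob"},
    {"user": "dave", "retweet_of": "alice"},
]


def build_retweet_network(records):
    """Return a directed adjacency mapping: source user -> set of users who retweeted them."""
    network = defaultdict(set)
    for record in records:
        source = record["retweet_of"]
        if source is not None:  # original tweets add no edge
            network[source].add(record["user"])
    return dict(network)


network = build_retweet_network(tweets)

# Number of distinct retweeters per source user (out-degree in this network).
out_degree = {user: len(retweeters) for user, retweeters in network.items()}
```

An adjacency mapping like this is a simple starting point that could then be handed to an SNA library for measures that go beyond a single ego-network.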

Pablo presented a network diagram from Twitter, drawn in NodeXL as ego-networks, but direct access to the Twitter API would give more options for alternative network analyses. Analysing the diffusion of information on Twitter was not possible in NodeXL.

The research used the three degrees of influence theory from Christakis & Fowler 2008. Social influence diffuses to three degrees but not beyond, due to noisy communication and technology/time issues leading to information decay. For example, most RTs take place within 48 hrs, so diffusion tends not to extend beyond a friend’s friend’s friend! This relates to network instability and loss of interest from users beyond three degrees, alongside increasing information competition, which becomes too intense beyond three degrees, so that diffusion breaks down.

The research itself found a 3-5% RT rate in the diffusion of a single Tweet. RT rates were higher with the use of a hashtag and correlated positively with the number of followers of the originator, but correlated negatively with @_mentions in the original Tweet. This is possibly a result of @_mentions being seen as private conversations. Overall, less than 1% of RTs went beyond three degrees.
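To make the "degrees of distance" measure concrete, here is a small sketch of how the distance of each retweeter from the originator, and the share of retweets beyond three degrees, might be computed. This is my own illustration, not the method from the paper: the cascade is invented, the function names are hypothetical, and a plain breadth-first search stands in for whatever the authors actually used.

```python
from collections import deque

# Hypothetical cascade: each key is a user, each value lists the users who
# retweeted the message after seeing it from that user.
cascade = {
    "origin": ["a", "b"],
    "a": ["c"],
    "c": ["d"],
    "d": ["e"],
}


def degrees_from(source, edges):
    """Breadth-first search returning each reached user's distance from source."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for retweeter in edges.get(user, []):
            if retweeter not in distance:  # keep the shortest distance found first
                distance[retweeter] = distance[user] + 1
                queue.append(retweeter)
    return distance


def share_beyond(distance, limit=3):
    """Fraction of retweeters (excluding the source) more than `limit` degrees away."""
    retweeters = [d for d in distance.values() if d > 0]
    if not retweeters:
        return 0.0
    return sum(d > limit for d in retweeters) / len(retweeters)


distance = degrees_from("origin", cascade)
beyond = share_beyond(distance)
```

In this toy cascade only one of the five retweeters sits beyond three degrees. On real data the same calculation would be run per original Tweet and then aggregated across the sample.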

The conclusion is that diffusion in digital networks is similar to that found in physical networks, which implies that there are human barriers to communication in online spaces. But the research is limited by the restrictions on access to the Twitter API as well as by its privacy policies. Replicability becomes very difficult as a result, and this issue is compounded as API versions change, so that software libraries and tools no longer work or no longer work in the same way. It is worth noting that there is no way of knowing how Twitter samples the 1% of Tweets provided through the API. Therefore, there is a need to access 100% of the Twitter data to provide a clear baseline for understanding Twitter samples and to justify the network boundaries.

Pablo pointed to the importance of writing code, with R or Python preferable as they are easier to learn and have larger support communities.

PhD research

*warning* this is a long post that basically provides the theoretical foundations for my ongoing research. It’s taken from a paper presented to a PhD progress board at the University of Edinburgh, so it’s basically an edited version of a vast/ huge paper written a few months back. All questions/ comments/ suggestions/ criticisms are welcome – especially if you manage to read the whole thing!

It starts with my espoused theoretical stance (which I anticipate will change as the research progresses; I couldn’t really describe it as a learning experience if it didn’t change).

My initial research approach took an epistemological position that drew on differing strands of social constructivism. Social constructivism (Vygotsky 1978) has been cited as a dominant theoretical perspective in educational research (Phillips, 1995; Fox 2001) and has been seen to be making significant in-roads to management research (Alvesson & Skoldberg 2009; Cunliffe 2008; Samra-Fredericks, 2008).

However, I was also interested in learning and knowledge as a practical act, that is, learning to “do” something. This brought in pragmatism as an epistemology of action (Cook & Brown 2005), of knowing “how” rather than knowing “that” (Spender 2005; Kivinen & Ristela 2003). A similar combination of social constructivism with pragmatism can be seen to inform Engestrom’s use of Vygotsky’s theories in the development of activity systems theory (Young 2008) that also places an emphasis on knowledge and action together. Furthermore, Cook and Brown (2005) refer to the notion of knowing linking to the theoretical area of practice (Bourdieu 1977; Antonacopoulou 2006).

So, constructivism involves understanding, knowledge development and learning as active and either intentional or unconscious and habitual, which indicated that a practice-based approach to my research might prove beneficial. Given my focus on interactive digital environments that can be labelled as Web 2.0, a practice-based approach that is concerned with the complex interrelations between people, artefacts, language, collaboration and control seemed appropriate (Nicolini et al 2003; Guzman 2009; Geiger 2009).

Based on this argument, I was initially attracted to using the social constructivist based Activity Theory (Engestrom 1987; 2001) as it allows for multiple constructions of practice. As a socio-material perspective, Activity Theory suggests an individual only becomes meaningful in a social context where knowing in practice, activities and non-human materials are intertwined in a dynamic series of interactions (Tuomi 2000). The interactions of activity systems aim to highlight the tensions and contradictions that stimulate change, development and learning (Chappell et al 2009, p179). As Piaget argues, change comes not just through exposure to a ‘better’ theory, but rather through actively applying that ‘better’ theory in the world (Ackermann 2001). In other words, to practice (with) it.

Yet the experience of attempting an earlier discourse analysis of a single Twitter chat event suggested that Activity Theory was predicated on a degree of stability that did not appear to apply to the dynamic instabilities seen in the chat event. The event appeared to exaggerate many of the key problematic features of unstructured discussions identified by Belnap & Withers (2008, p8): sequences extending over many exchanges; overlapping exchanges and sequences; short sequences tending to be cut off prior to a conclusion and sequences re-emerging later in discussions. The norms of participant interactions appeared to be under almost constant negotiation and renegotiation. Also, non-human elements appeared to have an impact that suggested more than passive mediation. For example, Twitter apps such as Tweetdeck, which aggregates and organises Twitter ‘streams’, arguably shape how Twitter chats are structured and “consumed”. This combination of inherent emergence, instability and ambiguity within a socio-material framework (Fenwick & Edwards 2010) suggested that Actor Network Theory (ANT) would provide a more appropriate perspective to the research study. Indeed, Sorenson (2007) suggests that for material to be meaningful the material object must interact with “the social”, so the material object can be seen as being itself as unstable as the social context of the interactions that make it meaningful. The material, or materiality, can only be understood in terms of patterns of relations that change over time and space and not in any notion of the independent properties of that material.

Approaches to researching networked and practice-based learning and knowledge construction include a range of theories on social learning, including communities of practice, cultural historical activity theory and so on. What these approaches have in common is a rejection of the primacy of the individual person, in that the individual only becomes meaningful as a member of any number of networks where knowing in practice, activities and non-human materials are intertwined in a dynamic series of interactions (Tuomi 2000).

Actor Network Theory
Actor Network Theory (ANT) can be characterised as a “perspective” or lens of loosely combined ideas and concepts rather than a theory (Bergquist et al 2008). Latour (1999, p19) argues that in the case of ANT:
It was never a theory of what the social is made of … for us, ANT was simply another way of being faithful to the insights of ethnomethodology
At its most basic, ANT seeks to “follow the actors” (Latour 1999) by the detailed tracking of specific practices as a means to see how actors influence the world (Fenwick & Edwards 2010). ANT is focused on the study of networks through actors (Miettinen 1997) and:
Although the ‘T’ of the ANT acronym stands for ‘Theory’, it is better understood as a methodological approach. In this way, ANT can be seen as an approach to the field that offers analytical tools that can be applied to narrative knowledge, be they organizational or otherwise (Alcadipani & Hassard 2010, p423).
While there is a broad range of ANT based research approaches, they are all fundamentally socio-materialist whereby an actor pursues an interest which can be translated into both non-human and social arrangements. These arrangements can be seen as network effects and can include combinations of people, organisations, groups, equipment or objects (Law 1992). Examples of these arrangements include a professional community, an organisational routine (Bergquist et al 2008) or communities of practice (Fox 2000).

There appear to be three key elements of ANT of particular interest: symmetry of human and non-human actors; the processes of translation; and network assemblages or dynamics.

Symmetry seeks to avoid a subject/ object dualism that defines the human and non-human worlds as distinctly and qualitatively different (Miettinen 1997). ANT specifically rejects the privileging of the human as an “all powerful agent imposing an arbitrary form on shapeless matter” (Latour cited in Miettinen 1997) while at the same time rejecting technological determinism.

But in practice symmetry appears to be difficult to achieve in the research process (Fenwick & Edwards 2010). As a result, ANT research practice has been criticised for adopting a form of human asymmetry described as Machiavellian as the researcher ends up following the loudest actor (Miettinen 1997).

To some extent this Machiavellianism is difficult to avoid other than through the reflexivity of the researcher (although such reflexivity may be aided by adopting the perspective of symmetry). It should also be noted that Miettinen (1997) goes on to discuss Machiavellianism only in terms of ANT’s concern with power and thus with “the Prince” and so ANT:
ignores such phenomena as learning, development of expertise, complementarity of resources and know-how in network construction (Miettinen 1997 unpaginated).
By focusing on the interaction between power, learning and resources in the emergence of networks, this study should have a wider span of attention than solely the machinations of the loudest actor.

Fox’s (2005) analysis of the role of newspapers illustrates the role of non-human actors in the generation and maintenance of the imagined community of the nation. Specifically discussing the layout of newspaper front pages as consisting of a number of unrelated news stories, Fox asserts that (2005, p103):
The regular reader thus keeps abreast of multiple narrative threads that weave the fabric of his or her imagined world. But this is not experienced as a simulated world but as the real world … By following the threads of news over time, the reader maintains a sense of a world known in common with distant, imagined others, fellow readers, fellow citizens too numerous to know personally, participants in a regional community, with spatial as well as temporal specificity.
Fox goes on to conclude:
In terms of ‘symmetrical analysis’, the non-human elements in the networks of ‘print capitalism’ made the ‘imagined community’ of the nation … a social and cultural reality.
Similarly, in an earlier study of a Twitter based chat event undertaken as part of a research methods course, it was found that, whether using a browser or specific applications such as Tweetdeck (http://www.tweetdeck.com/) or Twhirl (http://www.twhirl.org), around 20 tweets are co-visible to the participant. Individual tweets are made visible in a single stream in time order rather than threaded by discussion theme. A result of this is that an individual is more likely to make contributions across multiple sequences rather than stay focused on a single discussion sequence (Simpson 2005).

(Screenshot of Tweetdeck)

But it was also clear that the technology of presentation required participants to focus on a few specific threads of discussion as they came up, at least partially ignoring other threads. So to ensure participants were able to re-engage with discussion sequences that they may have been ignoring, there was frequent retweeting of key tweets as well as of the agreed event questions.

In addition, other non-human actors also influence the network: the computers that people use, the technical infrastructure of the internet and World Wide Web, the use of RSS feeds and content aggregators, hyperlinks into and out of the event content, the blog site that archives the tweet chat and so on.

ANT has been described as a sociology of translation (Latour 2005). The term translation appears to be used in two key ways. Firstly, it concerns the interpretation and reinterpretation of knowledge or meaning as seen in various studies of ‘workarounds’ that emerge during the implementation of information systems or in studies of workplace safety (Gherardi & Nicolini, 2000, cited in Fenwick & Edwards 2010). For example, different actors translate changes in organisational routines in different ways. Networks evolve as actors seek the support of others by translating the interests of others and enrolling them into the network (Mitev 2009). This in turn generates ordering effects and stabilises the network (Fenwick & Edwards 2010, p9).

The processes of translation in ANT are also processes of simplification whereby an actor comes to be taken to represent a complex underlying network. This simplification is necessary to enable practical action to be taken as these translated networks become taken for granted. So translation is part of the process of generating social order and stabilising a system through ordering routines (Tuomi 2000). The process whereby the social meaning of actors becomes settled is often referred to as “black boxing”.

In the earlier study of a Twitter chat event previously mentioned, the following tweet was interpreted as an attempt to legitimate among the participants the rejection of the Kirkpatrick model of evaluation as inadequate.

8:55:29 @H Can we have another question to keep us from wasting time burying Kirkpatrick? #lrnchat

However, it could also be framed as an attempt to negotiate a stabilisation of the #LrnChat network so that the critique of Kirkpatrick can be taken for granted, ie, placed within an unexamined “black box”. Similarly, notions of workplace “performance” were treated as “givens” not to be examined, while other notions such as “business” or “learning” were treated as being far-from-stable notions and central to key discussions during the event. More broadly, the use of Twitter applications (as discussed earlier) and their wider networks of development, maintenance and dissemination, and how these might impact on how the individual may experience such chat events, was also, unsurprisingly, “black boxed”.

Network assemblages
Networks can be seen as an:
assemblage of materials brought together and linked through processes of translation that perform a particular function (Fenwick & Edwards 2010, p12)
ANT approaches are less interested in the size of a network or networks than in the dynamics of the influence in and on networks, being concerned with the ways in which influence can expand and contract those networks (Fox 2005). So ANT has a central concern with power as enacted through processes of enrolling and translation – that power can be understood as persuasion (McLean & Hassard 2004).

Networks are products of symmetrical actors linked by intermediaries (Callon 1991). An intermediary “is anything passing between actors, which defines the relationship between them” (Callon 1991, p134). These can include software, documents or human bodies (Depauw 2008). Raisanen and Linde (2004) focused on text as intermediary, finding that text played a key role in organisations in attempting to control the environment as well as “being durable and transmittable” (2004, p117). Mitev points to textual intermediaries as “reflecting earlier translations of interests” (2009, p15). Mediators, however, can transform entities and the network, and there are an endless number of potential mediators in a network. These may include a CPD plan, a strategy document (Fenwick & Edwards 2010) or a management method (Raisanen & Linde 2004).

The process by which networks evolve, grow or contract, is proposed by Callon (1986) as starting with a problematisation of specific entities. This “problem” becomes a focal point for the identity of the network via which actors seek to translate a “set of possibilities” to enrol other actors (Toennesen, et al n.d, p7).

For example, it may be argued that the #LrnChat Twitter event network had a tendency to problematise “training”. Members of the #LrnChat community sought to enrol actors by processes of negotiated translation of the possibilities of technology enhanced informal and self-directed learning. Translation and enrolment processes may further stabilise networks and sub-networks to the point that they act as a unified entity in their own right, ie, they become “punctualised” (Tuomi 2000, p9). Punctualisation is the process whereby a network becomes stabilised to the extent that it is no longer understood as a network of actors but rather is understood and “black-boxed” as a given single entity (Fox 2005, 102). In other words, the network, such as a community of practice, becomes itself a single actor in a network or collection of networks.

There is some concern among ANT scholars that the term “network” itself leads researchers towards seeking actors of authority as nodal points in the network that in turn may generate a bias towards asymmetry. Thus the language of space and flow may be adopted (Mol & Law 1994) or “action nets” (Czarniawska 2004) placing an emphasis on contextual variables and interactions of human and non-human actors. The notion of “action nets” which privileges links between actions rather than actors themselves is of particular interest if a text object is treated as an action in its own right – a speech act – as Czarniawska suggests (2004, 783):
Although actants access existing action nets, thus recreating and stabilizing these connections, they must also continually form new connections. Such connections are forged during the process of translation, in which words, numbers, objects and people are translated into one another. Like calculation, translation is dispersed: everybody translates, although some translations, like some calculations, have more currency than others.
So (2004, 782):
Action nets need therefore to be observed as they are being established and re-established, which can be done progressively, deduced speculatively or, in Foucault’s terms, studied genealogically.

ANT, knowledge and learning
From the ANT perspective, knowledge and knowing are situated, embodied and distributed in and across networks. Knowledge cannot be perceived as stable nor:
limited to subjective constructions through meaning-centred interpretations of the world, as is the case with much interpretive research (Fenwick & Edwards 2010, p24).
Learning (new ideas or changes in behaviours) can be seen as the network effects of relational interactions involving technologies, objects, people and knowledge changes occurring anywhere in a network (Fenwick & Edwards 2010, p22). Networks act to mobilise knowledge and negotiate its alignment with actor interests. Networks of actors may also operate to bound and constrain learning activities to specific sites of relational interactions, so that some interactions are only allowed to occur in specific spaces, flows and action nets. Certain discourses of learning may take place in informal contexts such as a Twitter chat event that would be suppressed within other networks or action nets. This would, to an extent, reflect Ashton’s (2004) findings that the mobilisation of more expansive learning opportunities in larger organisations was limited to the higher “management levels” while more restrictive and task-orientated learning opportunities were more widely available. So, such restrictive learning opportunities were arguably constrained to align to the interests of a specific group of actors, while opportunities for counter-discourse development were limited to actors who had already been mobilised within those distinct management networks and their related discursive genres and repertoires.

ANT as a research method
ANT has also been described as a “hybrid theoretical blend” that is contingent and unstable (Fenwick & Edwards 2010, p2) and is often used in conjunction with other theoretical perspectives and methods. For example, in a study of higher education Fox (2005) sets out to combine ANT with Communities of Practice and Benedict Anderson’s notion of imagined communities. Mitev (2009) found that ANT alone was insufficient in researching a major information system implementation and eventually combined it with Clegg’s theory of power. Raisanen and Linde (2004) combined ANT with Critical Discourse Analysis in the study of a project management methodology in a specific firm, while Czarniawska (1997) combines an ANT approach with institutional theory in studies of municipal government. However, in a study of human resource managers, Vickers and Fox (2007) successfully used an ANT approach to challenge the notion of a unified “management” within the case organisation, and to expose management practices as sites of both conformity and subversion of official policy. ANT’s focus on the micro-levels of negotiation in network formation and development – the uses of persuasion, coercion, seduction and resistance (Fenwick & Edwards 2010) – provide a mechanism for new insights in the critical dynamics of power relations.

A number of researchers have commented on how difficult it can be to operationalise ANT as a research method (Fenwick & Edwards 2010; Mitev 2009; Raisanen & Linde 2004). Mitev (2009) in particular focuses on the difficulties of deciding where to start, how to “cut the cake” of the initial problem area and then who to include as actors (and by implication, who to exclude). Such discussions are also framed by the practical issues of handling huge volumes of data and the concomitant requirement to scope the research and to exclude various actors, networks and black boxes, with this in mind.

Web 2.0
Web 2.0 has emerged as a label for the culmination of incremental developments in software and network technologies over the last twenty years or so that focus on user-generated content and interaction around that content. Whether Web 2.0 represents a paradigm shift in the World Wide Web or the outcomes of various incremental changes remains a point of contention that may be being repeated with the labelling of the semantic web as Web 3.0. Either way, Depauw (2008) makes the case that ANT is an appropriate approach to the study of Web 2.0 phenomena. For example, social software has been described as employing Web 2.0 technologies in “digital social networks” that support interactions between “social entities” (Kieslinger & Hofer 2007, p7). McAfee (2009) discusses what he terms “emergent social software platforms (ESSPs)” (2009, p69) within which content and interactions are made visible and permanent, and the structure and organisation of content and community develops over time and through interaction. McAfee (2009, p73) then defines the term “Enterprise 2.0” as the use of ESSPs by organisations to assist those organisations to be more effective. McAfee’s ESSPs suggest a perspective on social software technologies that sees such technologies as either intermediaries within fairly stable and “unproblematised” organisational networks, or as mediators that assist in the stabilisation of those networks by making permanent and visible that network as an organisational entity.

Others suggest that Web 2.0 technologies undermine distinctions between information producers, distributors and consumers, so making networks inherently less stable (Androutsopoulos 2008; Pata 2009). Within this understanding, it becomes problematic to see them as simply assisting in organisational goal achievement. This study will focus on what may be perceived as a less stable network of a Twitter based chat event and then will seek to engage with other more stable spaces of interaction such as blogs. Both such ESSPs provide data that is mainly but not exclusively text based.

Texts provide a focus on online content but such technical artefacts also act as intermediaries that coordinate networks, suggesting that the target platforms can be seen as intermediary non-human actors (Depauw 2008). The interactional bases of these social software platforms generate and reinforce the practices of social networks, so contributing to the durability of those network effects (Waldron 2010) – the sociality of such environments (Young 2006) underpins and normalises practices of digital interactions. In discussing activities in wider Web 2.0 environments, Bruns and Humphreys (2007) suggest that knowledge and content artefacts are constantly being developed and refined through social interactions and so are dynamic and fluid rather than static and solid. Furthermore, Pachler & Daly (2009) point to Web 2.0 in learning contexts in terms of “narrative trails” (p7) of social and individual sense-making activities. Narrative trails such as the tagging of virtual spaces and flows are part of the emergent and user-centric organising of ESSPs including Twitter and blogs.

Tagging in the context of folksonomies makes visible patterns of interactions (Alexander 2006) between actors as “taggers” and actants as data objects that may include both the main text and the tags used to describe and classify that text. From an ANT perspective, tagging and metadata (data about data) provide an important mediating effect on network evolution in social digital learning environments. This notion of metadata linking networks and flows of people, artefacts and traces of activities through social technologies provides a basis for a common ecological metaphor of Web 2.0 learning environments (Siemens 2006; Brown 2002; Pata 2009). The emphasis on metadata can also be found in the emerging label of “activity streams” (Boyd 2010). In both cases, it is recognised that tagging and metadata are used to identify specific spaces, flows and content, and can potentially be mobilised to direct those flows.

In summary, existing literatures suggest that what is currently labelled as Web 2.0 in general but more specifically Twitter and related social platforms is an appropriate and “rich

On the meaning of a case study

I am currently trying to draft a research framework for my PhD and especially what might be the basic unit of analysis.

Ragin (2000), in discussing case orientated research (COR), raises the key question: a case of what is being researched? In turn, this problematises the notion of a population in COR. One approach may be for case populations to be defined by the research question, which in turn highlights the interplay of population definition and causal validity. For example, using Orr’s (1996) analysis of work, there may be two approaches to a COR ‘population’: (a) work as a series of employment relations and (b) work as day-to-day activities. My study is interested in work as (b), so it is not necessarily bounded by particular organisation-specific employment relations.

Howard (2002) discusses field settings as specific organisations or physical spaces but also as ‘nodal events’ that are socially significant to a community. His research focus was on a specific professional community. Maier & Thalmann (2008) discuss the impacts of Web 2.0 for knowledge workers in terms of deinstitutionalisation through, for example, individualisation and interaction. The implication of these arguments for case study research that focuses on informal learning practices in the workplace is that a significant proportion of such learning is supported by the individual’s own networks of contacts and trusted sources. Organisational boundaries are arguably less relevant, and access to data ‘held’ by the organisation may provide only a partial picture in the area of interest. Rather, informal learning may be better understood through a focus on ‘nodal events’ that can be seen as being interactions occurring within and between communities and/ or networks.

But this raises further issues of how to enter and/ or bound a network? What is or is not a network? Employing some aspects of Actor Network Theory, how tentative, dynamic and unstable can a series of connections be while still a network?

Howard, P.N. (2002) Network ethnography and the hypermedia organization: new media, new organizations, new methods. New Media and Society. 4 (4), 550-574.
Maier, R. and Thalmann, S. (2008) Informal learner styles: Individuation, interaction, in-form-ation.
Orr, J.E. (1996) Talking About Machines: an ethnography of a modern job. New York: Cornell University Press.
Ragin, C.C. (2000) Fuzzy-set social science. Chicago: University of Chicago Press.