Tag Archives: research

IT Futures at Edinburgh

I’m attending the IT Futures conference at Edinburgh today. These notes are not intended to be a comprehensive record of the conference but to highlight points of interest to me and so will be subjective and partial.

A full recording of the conference will be available at the IT Futures website.

The conference opens with an address from the Principal, Sir Timothy O’Shea with an opening perspective:

Points to the strengths of the University in computing research, super-computing and so on, and its ‘ludicrously’ strong position in e-learning with 60-plus online postgraduate programmes. In these areas, our main competitors are in the US rather than the UK.

Begins with a history of computing from the 1940s onwards. Points to Smallwood on using computers for self-improving teaching and Papert on computing/e-learning for self-expression. In the 1980s/90s digital education was dominated by the OU. In the 1990s the rise of online collaborative learning was an unexpected development that addressed the criticism that e-learning (computer-assisted learning) lacked interactive/personalisation elements.

The 2000s saw the rise of OERs, and of MOOCs as a way of providing learning structure around OERs. Also noted the success of OLPC in Uruguay, one of the few countries to implement OLPC effectively.

Argues that the expansion of digital education has been pushed by technological change rather than pedagogical innovation. We still refer to the constructivism of Vygotsky while technology innovation has been massive.

How big is a MOOC?
– 100 MOOCs is about the equivalent in study hours of a BA Hons. A MOOC is made up of 1,000 ‘minnows’ (I think this means small units of learning). MOOCs are good for access, as tasters and to test e-learning propositions. They also contribute to the development of other learning initiatives and enhance institutional reputations, including relevance through ‘real-time MOOCs’ such as the one on the Scottish referendum. MOOCs provide a resource for learning analytics.

So e-learning is mature, not new, and blended learning is ‘the new normal’ and dominated by the leading university brands of MIT, Stanford, etc. A huge contribution of e-learning is access.

A research agenda: to include modelling individual learning, including predictive learning support; speed of feedback; effective visualisation; supporting collaboration; understanding natural language; location of the hybrid boundary (e.g., in practical tests); personal programming (coding) and how realistic it is for non-geeks to develop meaningful coding skills.

Open questions are around data integrity and ownership; issues of digital curation; integration of data sources; who owns the analysis; should all researchers be programmers?; and how to implement the concept of the learner as researcher?

Questions:

Question about artificial intelligence: Answer – Tim O’Shea’s initial research interest was in developing programmes that would teach intelligently – self-improving teachers – but using AI was too difficult, so he switched towards MIT’s focus on self-expression and on programmers understanding what their code was doing. He still thinks the AI route is too difficult to apply to educational systems.

Q: surprised by an absence of gaming for learning?

A: clearly games can be used for learning, and cites Stanford work on the influence of games on learning motivation.

Q: on academic credit and MOOCs

A: Thinks this is inevitable and points to Arizona State University, which is attempting to develop a full degree through MOOCs. Can see the inclusion of MOOCs in particular postgraduate programmes – a heuristic of about a third of a Masters delivered via (external) MOOCs – but this is more likely to be taken forward by more vocational universities in the UK, using MIT or Stanford MOOCs to replace staff!

Now moving on to Susan Halford on ‘Knowing Social Worlds in the Digital Revolution’:

Researches organisational change, work and digital innovation. Has not directly researched changes in academic work but has experienced them through digital innovation. Digital innovation has kick-started a revolution in research through the volume of data and the ability to track, analyse and visualise all sorts of data. So data is no longer just used to research something but becomes the object of social research itself.

Digital traces may tell us lots about how people live, live together, politics, attitudes, etc. Data capturing social activities in real time and over time rather than relying on the reporting of activities in interviews, surveys and so on. At least, that is the promise, and there are a set of challenges to be addressed to realise the potential of these data (also see this paper from Prof Halford).

Three key challenges: definition; methods and interdisciplinarity

Definition – what are these digital data? They are not naturally occurring and do not provide a telescope onto social reality; they are generated through mediation by technology. In the case of Twitter there is a huge amount of data, but it is mediated by a technological infrastructure that packages the data. The world is, therefore, presented according to the categories of the software – interesting, but not naturally occurring data. Also, social media generate particular behaviours and are not simply mirrors of independent social behaviour – gives the example of the retweet.

Also, there is the issue of the provenance and ownership of data. Survey data is often transparent in the methods used to generate it and, therefore, in the limits of the claims that can be made from it. But social media data is not transparent in how it is generated – the data is privately owned, and the construction of data categories and data streams is not transparent. We know that there is a difference between official and unofficial data. We do not know what Twitter is doing with its data, but it is part of an emerging data economy. So this data is not neutral; it is the product of a series of technological and social decisions that shape the data. We need to understand the socio-technical infrastructure that created them.

Method – the idea that in big data the numbers speak for themselves is wrong: numbers are interpreted. The methods we have are not good for the analysis of large data. Research tends towards small-scale content analysis or large-scale social network analysis, but neither is particularly effective at understanding the emergence of the social over time – at harnessing the dynamic nature of the data. A lot of big data research on Twitter is limited to mathematical structures and data mining (and is atheoretical) but is weak on the social aspects of social media data.

Built a tool at Southampton to dynamically map data flows through retweeting.

Interdisciplinarity: it is a challenge to operationalise interdisciplinarity.

Disciplines imagine their object of study in (very) different ways and with different forms of cultural capital (what is the knowledge that counts – ontological and epistemological differences). So the development of interdisciplinarity involves changes on both sides – researchers need to understand programming and computer scientists need to understand social theory. But also need to recognise that some areas cannot be reconciled.

Interdisciplinarity leads to questions of power-relations in academia that need to be addressed and challenged for inter-disciplinarity to work.

But this work is exciting and promising as a field in formation. It also raises responsibilities: ethical responsibilities involved in representing social groups and societies through data analytics; recognising that digital data excludes those who are not digitally connected; and recognising that data alone is inadequate, as social change involves politics and power.

Now Sian Bayne is responding to Prof Halford’s talk: welcomes the socio-technical perspective taken and points to a recent paper, “The moral character of cryptographic work”, as generating interest across technical and social scientists.

Welcomes the emphasis of interdisciplinarity while recognising the dangers of disciplinary imperialism.

Questions:

What actions can be taken to support interdisciplinarity?

A: shared resources and shared commitments are important. Academic structures also matter, and refers to REF structures discouraging people from submitting against multiple subjects (but it is pointed out that joint submissions are possible).

Time for a break ….

 

We’re back with Burkhard Schafer of the School of Law talking on the legal issues of automated databases. Partly this is drawn from a PG course on the legal issues of robotics.

The main reference on the regulation of robots is Terminator, but this is less worrying than Short Circuit, e.g., when the robot reads a book, does it create a copy of it, does the licence allow the mining of the data of the book, etc.? See the Qentis hoax. The UK is the only country to recognise copyright ownership of automatically generated works/outputs, but this can be problematic for research: can we use this data for research?

If information wants freedom, do current copyright and legal frameworks support and enable research, teaching, innovation, etc.? Similar issues arose from the industrial revolution.

Robotics replacing labour – initially manual labour, but now there are examples of the use of robots in teaching at all levels.

But can we automate the dull parts of academic jobs? This creates some interesting legal questions, e.g., in Germany giving a mark is an administrative act similar to a police caution and is subject to judicial review; can a robot undertake an administrative act in this way?

Lots of interesting examples of automated education and teaching digital services were shown [screenshot].

A good question for copyright law is what ‘creativity’ means in a world shared with automatons. For example, when does a computer shift from thinking to expressing an idea, which is fundamental to copyright law?

Final key question is: “Is our legal system ready for automated generation and re-use of research?”

Now it’s Peter Murray-Rust on academic publishing, demonstrating text/content mining of chemistry texts.

…And that’s me for the day as I’m being dragged off to other commitments.

weeknotes [20102014]

Over the last few weeks, I’ve been

further working through my research involving discourse analysis along with network and other sociomaterial methods for my PhD. I think I’m developing a stronger understanding of the method “in action” and of Technology Enhanced Learning.

I’m also continuing to enjoy the teaching on two courses: Digital Environments for Learning; and Course Design for Digital Environments.

I’m also continuing to contribute to the development of two initiatives which I’ll hopefully write about sometime soon.

Personal learning environments

[network diagram]
I’m currently writing up some ideas on open online professional learning that include considering personal learning networks. I came across this interesting post from Martin Weller on the apparent decline in interest in, or discussion of, personal learning networks. The reasons suggested include the mainstreaming of the practices associated with PLEs and a consolidation of the tools used into a fairly generic set of software, but also that the (research) agenda has shifted from personal learning to institutionally provided personalised learning, partly driven by learning analytics.

 

MOOCs automation, artificial intelligence and educational agents

George Veletsianos is speaking at a seminar hosted by the DiCE research group at the University of Edinburgh. The hashtag for the event is #edindice and the subject is MOOCs, automation and artificial intelligence.

[These notes were taken live and I’ve retained the typos, poor syntax and grammar etc… some may call that ‘authentic’!]
 
George began by stating that this is an opportune time for the discussion, with MOOCs in the media, developments on the Turing Test, the MIT Media Lab’s storytelling bots used for second-language skills in the early years, and Google’s self-driving cars – bringing together notions of AI, intelligent beings, etc.
Three main topics: (1) MOOCs as a sociocultural phenomenon; (2) the automation of teaching; and (3) pedagogical agents and the automation of teaching.

MOOCs: first experienced these in 2011 with Change11 as a facilitator, and uses them as an object of study in his PG teaching and research. Mainly participated as an observer/drop-out.

MOOCs may be understood as courses of learning but also as a sociocultural phenomenon responding to the perceived failure of higher education. In particular, MOOCs can be seen as a response to the rising costs of higher education in North America and as a symptom of the vocationalisation of higher education. Workplace training drives much of the discussion on MOOCs, as illustrated by Udacity changing from higher ed to training provider and introducing the notion of the nano-degree linked to employability. There are also changes in the political landscape and cuts to state funding of HEIs in the USA, and a discourse of public-sector inefficiencies with solutions based on competition and diversity of provision being preferred. MOOCs also represent the idea of technology as a solution to issues in education such as cost and student engagement, and MOOCs as indicative of scholarly failure: disciplinary knowledge of education, such as the learning sciences, is not available to many because it is locked into costly journals and couched in obscure language. MOOCs also represent the idea that education can be packaged and automated at scale. Technologies have long been seen as solutions to providing education at scale, including TV, radio and recorded lectures, so education is seen as content delivery.
Also highlighted that xMOOCs came out of computer science rather than education schools and are driven by rubrics of efficiency and automation.
Pressey (1933) called for an industrial revolution of education through the use of teaching machines that provide information, allow the learner to respond and provide feedback on that response. B.F. Skinner also created a teaching machine, in 1935, based on stimulus/response with lights indicating whether a response is correct or not.
MOOCs adopt similar discourses on machine learning around liberating teachers from administration and grading so they can spend more time teaching. So these arguments are part of a well-developed narrative of efficiency in education. But others have warned against the trend towards the commodification of education (Noble 1998), and this commodification can be seen in the adoption of LMSs and “shovelware” (information masquerading as a course).
Automation is increasingly encroaching into academia via, e.g., reference management software, Google Scholar alerts, TOC alerts from journals, social media automation, RSS feeds, content aggregators (Feedly, Netvibes) and programming of the web through, for example, If This Then That (IFTTT).
As a case, looks at the Mechanical MOOC, which is based on the assumptions that high-quality open learning resources can be assembled, that learners can automatically come together to learn, and that they can be assessed without human involvement – so the MOOC can be automated. An email scheduler coordinates the learning, OpenStudy is used for peer support and interactive coding is automatically assessed through Codecademy. So it attracts strongly self-directed and capable learners. But research indicates the place and visibility of teachers remains important (Ross & Bayne 2014).
Moving on to educational agents: avatars that present to, and possibly respond to, learners. These tend to be similar to virtual assistants. Such agents are claimed to assist in learning, motivation, engagement, play and fun, but the evidence to support these claims is ambiguous and often “strange”. In the research, gender, race, design and functions all interact, and learners often respond based on the stereotypes used in human interactions, with the most appealing agents tending to have a more positive effect on learning. Context also mediates perceptions and so how pedagogical agents are perceived and understood.
The relationship between agents and learners and their interactions is the subject of a number of studies on topics of discussion and social practices. Found that students and agents engage in small talk and playfulness even though students are aware they are interacting with an artificial agent. Also saw aggressive interactions from the learners, especially if the expert agent is unable to answer a query. Students also shared personal information with the agents. Agents were positioned into different roles: as a learner companion, as a mediator between academic staff and learner, and as a partner.
So social and psychological issues are as important as technology design issues. So do we need a Turing test for MOOC instruction? How we design technologies reflects as well as shapes our cultures.
//Ends with Q&A discussion

UFHRD 2014: 5 June, key note on HRD research and design science, Prof Eugene Sadler-Smith

Back at the UFHRD conference and the post-lunch key note address.

Changed the title to “(Quite) grumpy old men, Mars bars and epistemology”. Noted a slide from a talk yesterday listing critics of HRD: the grumpy old men, including “Sadler-Smith 2014”!

Looked at the issue of relevance and rigour in HRD, which is critiqued by academics as overly descriptive, needing to be evidence-based, and ambiguous over goals and how to achieve them. But these issues have been present since the foundation of HRD academic journals. It is time to find a solution to the double bind of relevance and rigour.

Could design science be a productive line of inquiry to resolve some of these issues in HRD research?

Design science is positioned in terms of explanatory sciences, field problems and artefacts, looking to Simon’s work on the ‘sciences of the artificial’. Simon distinguishes explanatory sciences, which describe, explain and predict the natural and social worlds, from design science, which is concerned with developing actionable knowledge for designing solutions in the real world (field problems). But these interact, e.g., Newton’s second law (explanatory) is used in air travel/engineering (design science), or in the Forth Bridge as a solution to the field problem of trains crossing the Firth of Forth. Engineering is a design science, as is medicine and, more recently, education and management. This leads to the question of what the field problems in education or management are, e.g., teaching complicated problem-solving, or how to plan for complexity/uncertainty.

In the case of a design science perspective for HRD, what might be the field problems of research and professional practice? Practice can be seen as a component distinct from research, where practice is applied knowledge.

Which brings us to the issue of the artefact. Artefacts have a purpose in addressing a field problem and so are moulded to the context, e.g., sunglasses moulded by sunshine. But what artefacts does HRD produce (e.g., learning materials, procedures, products…) as central to the process of design?

It is also worth noting that design science is not ‘mode 2’ research as design science is concerned with the product of research, nor applied science and not action research but can be related to all of these.

How to do design science:

1. design proposition; 2. design science logic; 3. testing the logic; and 4. applying the logic. The design proposition depends on a logic of prescription (in this context: use this intervention to generate this desired outcome by triggering a specific mechanism). This creates a logic of Context, Intervention, Mechanism, Outcome (CIMO). Management and HRD literature tends to focus on interventions and outcomes and so ignores the generative mechanisms, as these are grounded in explanatory science (as well as being decontextualised). It raises the question of what is meant by theory in a CIMO logic.

In the application of the CIMO logic, multiple interventions are often required. This fits well with HRM in terms of strategic HRM/D discussions of bundles of interventions/ practices. Eg, Hodgkinson and Healey (2008) used psychology theory to develop design propositions for scenario planning.

Simon: the essence of the design problems resides in assemblages of components.

Testing the CIMO logic depends on locating generalisations valid across different contexts and is pluralistic in terms of methods. In education, design-based research is used, e.g., using a VLE in science education to promote complex inquiry skills, generating generalised findings and falsifiability.

HRD research and design science – is it of any use? Since 2007 there have been some papers referring to design science in the HRD journals.

But management academia privileges the explanatory sciences over design sciences (Van Aken 2005). Design science may assist HRD in overcoming issues of relevance to practitioners through the production of actionable knowledge.

The epistemological implications of a design science lie in what knowledge is and how it might be created – is there a specific type of HRD knowledge and theory to be produced? Researchers should co-create design propositions with practitioners, and interventions should be tested in multiple contexts.

UFHRD Conference : 4 June 2014, opening key note

The conference welcome is from Dave McGuire of Edinburgh Napier University including a short welcome video prepared by one of his students with a good number of talking heads.

The opening key note address is by Prof Jonathon Passmore [JP] with the title: “Coaching Research: The Good, the Bad and the Ugly”.

The session looks at coaching research, especially coaching in organisations, with a critical review of the literature framed as the good, the bad and the ugly, covering:

1. why research coaching
2. what makes for good quality coaching research
3. key themes in coaching research and
4. suggested direction of research for the coming decade.

“Why research coaching?” is a question he asks in organisations, with the response that there’s no need as “we know it works”. But coaching involves risks and there is a need to demonstrate effectiveness and ROI with positive outcomes for individuals and organisations. This is difficult in terms of agreeing participation and the problems of measuring intangible benefits, which can make it difficult to publish.

The quality of research depends on the research question that is clearly defined and bounded; that the research method is appropriate, clearly described for purposes of replication and correctly executed; that results are compared with and positioned within earlier research and that conclusions are appropriate and not over-claiming as well as identifying new questions.

Critical questions to ask of the results from research are whether a placebo effect is occurring or whether other factors contaminated the research, i.e., other training going on, or the selection of high performers leading to positive outcomes. Also, can the research be replicated? But few studies meet these criteria.
Can look at phases of coaching studies: phase 1 involving case studies and surveys; phase 2 involving theory development through qualitative research, which is valuable in immature research areas like coaching – putting up a straw man to be challenged; phase 3 has seen initial randomised controlled trials (RCTs), small-scale (25–40 people) but providing important evidence on individual and psychological impacts; phase 4 sees larger RCTs (Passmore & Rehman 2012); and phase 5 sees an increased use of meta-analysis, including easier access to data sources and the impacts of the ‘computational turn’.

These studies have identified a number of popular themes of coach behaviour attracting lots of papers as did the coach-client relationship. But only limited research on client decision-making on coaching and an increase in research on the impact of coaching.

Coach behaviour research, e.g., Hall et al. (1999) involving interviews of coaches and clients, identified some tentative behaviours that have been validated by subsequent studies, especially around discursive and collaborative approaches and the power relations and dynamics of working collaboratively. Probing and challenge is an emerging area, as a distinction from the empathy focus of counselling. JP cites client work and that senior leaders relish challenge. Aspects of confidentiality are critical to effective coaching, including around risky behaviour as well as commercial confidentiality, and maintaining professional distance is also important in the evidence on effective coaching.

Literature on the coach–client relationship focuses on the development of an alliance between coach and client, but there is little evidence of what factors make a successful relationship, although these can be inferred from other studies, e.g., empathy.

Outcome studies: McGivern et al. (2001) was an ROI study based on Jack Phillips’ method of ROI, leading to an estimate-based approach and then a decision to cut the number in half – although this was not really justified. JP assessed this as twaddle and rubbish: we need different methods for HRD (the bad research).

Identified 156 outcome studies between 1998 and 2010. Of these, most are small-scale with 30 or so participants, and some are RCTs. Miller used a quasi-experimental study and found no statistically significant beneficial impact of coaching, but this may be because the coaching intervention was limited and didn’t lead to behavioural change, or because managers tended to revert to more directive styles. Also, a lot of RCT studies involve students rather than people in organisations, but these did show psychological benefits of coaching around resilience and mental health. Passmore & Rehman’s (2012) RCT of military drivers found that a coaching approach reduced training time and increased success rates.

Some outcome studies have involved longitudinal research evidencing a longer-term effect of coaching that may indicate that coaching is more effective, deeper learning and greater behavioural change than training interventions.

But coaching still has only a small number of studies, and these have small sample sizes compared to studies in health settings; e.g., conducting RCTs in organisations is difficult. Also, the isolation of variables and factors of interest can be difficult (Hawthorne effect), outcome study methods are often not fully described, and research is often undertaken by champions of coaching with inevitable biases.

Meta-analysis research, e.g., De Meuse, Dai and Lee (2009), but this was based on only four papers, so interesting in terms of being a meta-analysis but based on very little data (the ugly). Theeboom et al. (2013) and Jones (in press) are more robust papers. Theeboom found positive benefits around factors such as coping, goal-directedness and self-regulation, performance, attitudes and well-being, at about the same level as other L&D interventions. So coaching is one of a number of effective interventions available for L&D practice. The Jones study covers 24 RCT studies and looked at effect size by style of coaching, finding a larger effect size for internal coaches compared to external coaches. Jones found that coaching had a medium to strong positive impact, but the findings should be treated with caution given the small number of papers used.

The future of coaching research may be dominated by either (a) a business school approach using case studies; (b) an organisational psychology model that disconnects scholarship from practice; or (c) a medical-style approach with an emphasis on evidence-based practice that informs experts, including scholar-practitioners.

Research needs to aim for larger RCTs involving random allocations involving two or more interventions, a control group and placebo group. Research needs to identify factors for effective coaching. Need larger scale meta-analysis to identify impact effect sizes.

This will improve understanding of the efficacy and appropriateness of coaching relative to other interventions, and then which approaches to coaching are appropriate for different needs and which coaching behaviours are most effective. Also, identifying when a client is ready for coaching in terms of the individual and the organisation (i.e., managerial support and a supportive culture). Lastly, coach behaviour research underpins PG programmes and professional body competences.

Digital Scholarship day of ideas: data [2]

This is the second session of the day I wanted to note in detail (the first is here). The session is Robert Procter, Professor of Social Informatics at the University of Warwick, on Big Data and the sociological imagination. These notes are written live from the live stream. So here we go:

The title has changed to Big Data and the Co-Production of Social Scientific Knowledge. The talk will explain a bit more about social informatics as a hybrid of computer science and sociology; the meaning of ‘big data’ and how academic sociology can use such data, including the development of new tools and methods of inquiry – see COSMOS – and will conclude with remarks on how these elements may combine in an exciting understanding of how social science and technology may emerge through different stakeholders, including crowd-sourced approaches.

Social informatics is the interdisciplinary study of the factors that shape the adoption of ICT and the social shaping of technology. Processes of innovation involving distributed technologies are large in scale and involve a diverse range of publics, such as understanding social media as processes of large-scale social learning: asking how social media works and how people can use it to further their aims. Because it is public, social media makes it easier in many ways to see what is going on, as the technology makes much of the data available (although it’s not entirely straightforward).

Social media is Rob’s primary area of interest. Recent research includes the use of social media in scholarly communications to put research in the public domain. But the value of this is not entirely clear; the research identified positive and negative viewpoints. It also looked at how academic publishers were responding to such changes in scholarly communications, such as supporting the use of social media as well as developing tools to trace and aggregate the use of research data. This showed mixed results.

Another research project was on the use of Twitter during the 2011 riots in England, in conjunction with The Guardian. In particular, was social media important in spreading false information during such events? The research looked at particular rumours identified in the corpus of Tweets. So how do rumours emerge in social media, and how do people behave and respond to such rumours?

This leads to the question of how to analyse 2.5m Tweets which is qualitative data. Research needs to seek out structures and patterns to focus scarce human resources for closer analysis of the Tweets.

Savage and Burrows (2007) on empirical sociology argued that the best sociology is being done by the commercial sector as they have access to data, and that academic sociology is becoming irrelevant. However, newer sources of data provide for the enhanced relevance of academic sociology, and this is reinforced by the rise of open data initiatives. So we can feel more confident about the future of academic sociology.

But how this data is being used raises further issues, such as linking mood in social media with stock market movements, which confuses correlation and causation. Other analysis has focused on challenges to dictatorial regimes, the promotion of democracy and political change, and the capacity of social movements to self-organise. The methodological challenges are concerned with dealing with the volume of data, so combining computational tools with sociological sensitivity and understanding of the world. But many sociologists are wary of the ‘computational turn’.

Returning to the England riots, looking at the rumour of rioters attacking a children’s hospital in Birmingham. This involves an interpretive piece of work focused on data that may provide useful and interesting results. The rumour started with people reporting police congregating at the hospital, from which people inferred that the hospital was under threat. The computational component was to discover a useful structure in the data using sentiment and topic analysis – dividing Tweets into originals and retweets that combine into information flows, some of which are bigger than others. Taking the size of an information flow as an indicator of significance can provide an indication of where to focus the analysis. Coding frames were used to capture the relevant ways people were responding to the information, including accepting and challenging Tweets. This coding was used to visualise how information flows through Twitter. The rumour was initially framed as a possibility but mushroomed, and different threads of the rumour emerged. The rumour initially spread without challenge, but later people began to Tweet alternative explanations for the police being near the hospital, i.e., that a police station is next to the hospital. So rumours do not go unchallenged, and people apply common-sense reasoning to them. While rumours grow quickly in social media, the crowd-sourcing effects of social media help in establishing the likely truth. This could be further enhanced through engagement from trusted sources such as news organisations or the police, and could be augmented by computational work to help address such rumour flows (see Pheme).
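As a rough sketch of the flow-based step described above (my own illustration, not the project’s code: it assumes tweets are available as dicts using Twitter’s standard `retweeted_status` field):

```python
from collections import defaultdict

def build_information_flows(tweets):
    """Group a tweet corpus into information flows: each original tweet
    plus all retweets of it. Flow size then acts as a rough indicator of
    significance, i.e. where to focus closer qualitative coding."""
    flows = defaultdict(list)
    for tweet in tweets:
        original = tweet.get("retweeted_status")
        if original:
            # a retweet joins the flow of the tweet it propagates
            flows[original["id"]].append(tweet)
        else:
            # an original tweet starts (or extends) its own flow
            flows[tweet["id"]].append(tweet)
    # largest flows first
    return sorted(flows.items(), key=lambda kv: len(kv[1]), reverse=True)
```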

There is also the question of what the police were doing on Twitter at the time. In Manchester, accounts were created to disseminate what was happening and to draw the police’s attention to events, so acting to inform public services.

This research indicates innovation as co-production: people collectively experimenting and discovering the limitations and benefits of social media. Uses of social media are emergent and shaped through exploration.

On to the development of tools for sociologists to analyse ‘big’ social data, including COSMOS, to help interrogate large social media data. This also involves linking social media data with other data sets [and so links to open data]. So COSMOS assists in forging interdisciplinary working between sociologists and computing scientists, provides interoperable analysis tools and evolves capabilities for analysis. In particular, points to the issue of the black-boxing of computational analysis; COSMOS aims to make the computational processes as transparent as possible.
COSMOS tools include text analysis and social network analysis linked to other data sets. A couple of experimental tools are being developed on geolocation and on topic identification and clustering around related words. COSMOS research is looking at social media and civil society; hate speech and social media; citizen science; crime sensing; suicide clusters and social media; and the BBC and tweeting the Olympics. Points to an educational need for people to understand the public nature of social media, especially in relation to hate speech.

Social media as digital agora, on the role of social media in developing civil society and social resilience through sharing information, holding institutions to account, inter-subjective sense-making, cohesion and so forth.

Sociology beyond the academy and the co-production of scientific knowledge. Points to examples such as the Channel 4 fact checker as an example of wider data awareness and understanding, and citizen journalism mobilising people to document and disseminate what is going on in the world. Also gives the example of sousveillance of the police as a counter to the rise of the surveillance state, and The Guardian’s use of volunteers to analyse MPs’ expenses. So ‘the crowd’ is involved in social science through collecting and analysing data, sociology is spanning the academy, and the boundaries of the academy are becoming more porous. These developments create an opportunity to realise a ‘public sociology’ (Burawoy 2005), but this requires greater facilitation from the academy through engaging with diverse stakeholders, the provision of tools, new forms of scholarly communication, training and capacity building, and developing more open dialogues on research problems. Points to Public Lab and hackathons as means for people to engage with and do (social) science themselves.

A model of discussion events on Twitter

As previously discussed here & here, I am studying two Twitter discussion events as sites of professional identity formation and development. The structure of the two events is broadly similar to the research process of a Tweetstorm: “an online, open brainstorm-like session via Twitter” (Sie et al. 2013: 60). A Tweetstorm was described as a six-stage process involving: (i) the context established by, for example, a topic briefing; (ii) questions presented on Twitter by the event moderator, organised using the specified event hashtag; (iii) answers to the questions given as tweets by participants; (iv) these tweets aggregated, for example, using Tweet Archivist; (v) the aggregated tweets analysed into categories; and (vi) the categories then analysed. The outputs from a Tweetstorm are a series of core statements drawn from the knowledge of the participating experts. As such, a Tweetstorm has similarities to the processes of Delphi studies (Nworie 2011) or collaborative concept mapping (Simone, Schmid, and McEwen 2001).

The individual discussion events broadly followed the structure of a Tweetstorm. However, in these discussion events, the Tweets are not aggregated, categorised or systematically analysed. Rather, they conclude with a call for participants to identify the key points of the discussions and any actions they may take in response to the points made.

Based on the notion of the Tweetstorm, the chat events’ structure can be summarised as follows:

Figure 1: structure of the chat events (after Sie et al.)

 

References

Nworie, John. 2011. “Using the Delphi Technique in Educational Technology Research.” TechTrends 55 (5) (August 11): 24–30. doi:10.1007/s11528-011-0524-6.

Sie, Rory, Nino Pataraia, Eleni Boursinou, Kamakshi Rajagopal, Isobel Falconer, Marlies Bitter-Rijpkema, Allison Littlejohn, and Peter B. Sloep. 2013. “Goals, Motivation for, and Outcomes of Personal Learning through Networks: Results of a Tweetstorm.” Educational Technology & Society 16 (3): 59–75.

Simone, Christina De, Richard F. Schmid, and Laura A. McEwen. 2001. “Supporting the Learning Process with Collaborative Concept Mapping Using Computer-Based Communication Tools and Processes.” Educational Research and Evaluation 7 (2-3) (September 1): 263–283. doi:10.1076/edre.7.2.263.3870.

Social Network Analysis and Digital Data Analysis

Notes on a presentation by Pablo Paredes. The abstract for the seminar is:

This presentation will be about how to make social network analysis from social media services such as Facebook and Twitter. Although traditional SNA packages are able to analyse data from any source, the volume of data from these new services can make convenient the use of additional technologies. The case in the presentation will be about a study of the degrees of distance on Twitter, considering different steps as making use of streaming API, filtering and computing results.

The presentation is drawn from the paper: Fabrega, J. & Paredes, P. (2013) “Social Contagion and Cascade Behaviors on Twitter.” Information 4 (2): 171–181.

These are my brief and partial notes on the seminar taken live (so “typos ahead!”).

Looking at gathering data from social network sites and on a research project on contagion in digital data.

Data access requires knowledge of the APIs for each platform but Apigee details the APIs of most social networks (although as an intermediary, this may lead to further issues in interfacing different software tools, e.g., Python tool kits may assist in accessing APIs directly rather than through Apigee). In their research, Twitter data was extracted using Python tools such as Tweepy (calls to Twitter) and NetworkX (a Python library for SNA) along with additional libraries including Apigee. These tools allow the investigation of different forms of SNA beyond ego-centric analysis.
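A minimal sketch of this kind of pipeline, assuming the Tweepy 3.x streaming interface and NetworkX (the credentials and hashtag are placeholders, and this is not the authors’ actual code):

```python
import tweepy
import networkx as nx

# Placeholder credentials: supply your own Twitter API keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

retweet_graph = nx.DiGraph()

class RetweetListener(tweepy.StreamListener):
    """Collect retweet edges (original author -> retweeter) from the
    streaming API into a NetworkX directed graph for later SNA."""
    def on_status(self, status):
        if hasattr(status, "retweeted_status"):
            source = status.retweeted_status.user.screen_name
            target = status.user.screen_name
            retweet_graph.add_edge(source, target)

stream = tweepy.Stream(auth, RetweetListener())
stream.filter(track=["#examplehashtag"])  # blocks; track the hashtag(s) of interest
```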

Pablo presented a network diagram from Twitter using NodeXL as ego-networks, but direct access to the Twitter API would give more options for alternative network analysis. Analysing the diffusion of information on Twitter was not possible in NodeXL.

Used the three degrees of influence theory from Christakis & Fowler (2008). Social influence diffuses to three degrees but not beyond, due to noisy communication and technology/time issues leading to information decay. For example, most RTs take place within 48 hours, so diffusion tends not to extend beyond a friend’s friend’s friend. This relates to network instability and loss of interest from users beyond three degrees, alongside competition between pieces of information that becomes too intense beyond three degrees, so diffusion decomposes.

The direct research found a 3–5% RT rate in the diffusion of a single Tweet. RT rates were higher with the use of a hashtag and correlated with the number of followers of the originator, but negatively correlated with @-mentions in the original Tweet, possibly as a result of @-mentions being seen as private conversations. Overall, less than 1% of RTs went beyond three degrees.
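A rough way to estimate how far a single tweet travelled (my own sketch, not the study’s code: because the API attributes every retweet to the original tweet, degrees here are inferred from distance on the follower graph, which is assumed to be available):

```python
import networkx as nx

def diffusion_degrees(follower_graph, originator, retweeters):
    """Estimate the degree at which each retweeter sits from the original
    poster, using shortest-path distance on the follower graph."""
    distances = nx.single_source_shortest_path_length(follower_graph, originator)
    return {user: distances.get(user) for user in retweeters}

# Toy example: B follows A, C follows B, D follows C.
g = nx.Graph()
g.add_edges_from([("A", "B"), ("B", "C"), ("C", "D")])
print(diffusion_degrees(g, "A", ["B", "C", "D"]))  # {'B': 1, 'C': 2, 'D': 3}
```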

Conclusion is that diffusion in digital networks is similar to that found in physical networks which implies that there are human barriers to communication in online spaces. But the research is limited due to the limits on access to Twitter API as well as privacy policies on Twitter API. Replicability becomes very difficult as a result and this issue is compounded as API versions change and so software libraries and tools no longer work or no longer work in the same way. Worth noting that there is no way of knowing how Twitter samples the 1% of Tweets provided through the API. Therefore, there is a need to access 100% of the Twitter data to provide a clear baseline for understanding Twitter samples and justify the network boundaries.

Points to the importance, when writing code, of preferring R or Python, as they are easier to learn and have larger support communities.

Sociability and networking

I’m currently analysing a couple of Twitter chat events aimed at learning professionals. The analysis will mainly be qualitative but, to make sense of what are often messy and chaotic events, I’m currently doing some pattern searches on the nature and functions of participants’ interactions. This image is of a social network analysis of both communities over a three-month period. Now I really wasn’t expecting something like this – more that two distinct groups would emerge with a few boundary spanners. What I’m seeing is a densely networked professional community with few distinct clusters and stronger ties between the two event communities than a casual read through the event archives would have suggested.
More analysis is needed on the types of exchanges being seen, but it’s an interesting image nonetheless.
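For anyone curious how such an image is produced, a minimal sketch with NetworkX (hypothetical variable names and a simple @-mention network; the real analysis and layout were done with other tools) might look like:

```python
import re
import networkx as nx

MENTION = re.compile(r"@(\w+)")

def build_event_network(tweets):
    """Build an undirected interaction network from a chat-event archive,
    linking each tweet's author to every account they @-mention.
    `tweets` is assumed to be an iterable of (author, text) pairs."""
    g = nx.Graph()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():
                g.add_edge(author, mentioned)
    return g

# Combine both event archives and check how clustered the result is
# (event_a and event_b are hypothetical lists of (author, text) pairs):
# combined = nx.compose(build_event_network(event_a), build_event_network(event_b))
# communities = nx.algorithms.community.greedy_modularity_communities(combined)
```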