Research on context and retention

Non-Constitutional

Abstract

TogetherCrew is a project, incubated in RnDAO, that aims to empower communities with their data. With a team of network scientists, TogetherCrew is researching how Web3 social networks behave, with a special emphasis on retaining new community members and community health.

TogetherCrew is requesting read access to Arbitrum’s Discourse and Discord API to run an analysis with our partners (researchers at the University of California Santa Barbara) and present it to the community for research purposes.

For clarity, this analysis ONLY uses data that is already publicly available. No private data will be collected. No data will be sold to third parties nor used without consent.

This proposal also allows us to test Hivemind, a bot that reads public data from Discourse and Discord and allows community members to ask questions: e.g. “has anyone discussed the topic of Treasury management before?” or “What were the most common objections to proposals related to incentives?”.

No budget is required as this research has already been funded through other sources (academic grants and previous Web3 grants from Aave, Aragon, Celo, Pocket Network and MetaCartel to develop the data pipelines, infrastructure, and previous research on community health).

Motivation

Our understanding of Web3 social networks is still in its infancy. Studying the configuration of a social network and its impact on member retention could yield valuable insights to improve the resilience of Web3 communities.

Additionally, the social network mapping analysis gives us a potential tool to identify high-context community members (e.g. a recent analysis of Optimism’s community enabled their community team to identify potential community ambassadors who had been missed). This analysis could be used to suggest members to invite to the offsite (non-binding, just as an exploratory analysis).

Specifications

We care about data privacy. We only collect data that is already in the public domain (e.g. public Discourse posts, but not the email addresses associated with the accounts that wrote them). The Arbitrum Foundation provided feedback on our privacy policy and terms of service.

The solution is based on an open-source bot that will get read access to selected community platforms (for now, Discourse and public channels in Discord). This bot has already been deployed in 100+ communities, including Optimism, Celo, Aave, Aragon, Shardeum, etc. The permissions are configured to be low-risk (no admin permission, no managing channels, etc.), so even if the bot were hacked, the Discord and Discourse servers wouldn’t be compromised. We have also battle-tested the bot for scalability and reliability.
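To illustrate what "low-risk" permissions could mean on the Discord side, here is a minimal sketch. It assumes the bot is granted only the view-channel and read-message-history permissions; the bot's actual permission set is not stated in this proposal, so `READ_ONLY` and `is_low_risk` are hypothetical names used for illustration. The bit values are the ones listed in Discord's developer documentation.

```python
# Discord permission bit flags, as listed in Discord's developer documentation.
ADMINISTRATOR = 1 << 3
MANAGE_CHANNELS = 1 << 4
MANAGE_GUILD = 1 << 5
VIEW_CHANNEL = 1 << 10
READ_MESSAGE_HISTORY = 1 << 16

# Hypothetical read-only grant (an assumption, not the bot's documented configuration).
READ_ONLY = VIEW_CHANNEL | READ_MESSAGE_HISTORY

# Permissions that would let a compromised bot damage the server.
HIGH_RISK = ADMINISTRATOR | MANAGE_CHANNELS | MANAGE_GUILD


def is_low_risk(permissions: int) -> bool:
    """Return True if none of the high-risk permission bits are set."""
    return (permissions & HIGH_RISK) == 0


if __name__ == "__main__":
    print(is_low_risk(READ_ONLY))                  # read-only grant
    print(is_low_risk(READ_ONLY | ADMINISTRATOR))  # same grant plus admin
```

A server admin could run a check like this against the permission integer shown in the bot's invite link before approving it.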

The analyses of the data are performed by the TogetherCrew team of scientists and our research partners at the University of California Santa Barbara. Those handling the data have undergone ethical data management training and could lose their academic credentials for unethical conduct.

The architecture has been developed by our tech lead who’s ex-Accenture and built a data startup in the medical industry. So although the data we manage is low risk (no private data, only public handles, etc.) we’re still investing in data security progressively.

We don’t use the data to train any AI models. Some analyses might use AI, in which case we favour open-source models that don’t feed on user data.

Changes to the Terms of Service or Privacy Policy:
TogetherCrew currently has no changes planned, but should a change be planned, TogetherCrew will give a minimum of 2 weeks’ notice to the DAO, via a new proposal post, before such changes take effect. This allows any delegate to submit a snapshot vote to suspend the service if desired.

Steps to Implement

A favourable snapshot vote will give the foundation the permission to give us read access to the Discourse API and public Discord channels. No further actions are required. We’ll present our findings to the community (progressively over 1-3 months).
Additionally, we’ll run a pilot of Hivemind, which uses this same data to answer community questions.

Budget

No budget requested

We do hope in the future to develop additional functionalities communities would pay for. E.g.:

In partnership with SingularityNET, we’ve been developing Hivemind, an AI-powered Q&A bot that acts as a research assistant, enabling community members to ask questions such as:

  • “How do I participate in governance?”
  • “Has anyone proposed something on topic XYZ?”
  • “What was discussed last month while I was away?”
  • etc.

And, funded by the Arbitrum Questbook program, we’re also developing Dynamic Reputation NFTs, where members who desire to do so can mint their context score onchain. Those who don’t desire it don’t need to take any action; the NFTs are opt-in only.

For now, the solution is made available free of charge.

Had there been a previous proposal?
We had posted about this some time ago but didn’t move forward with it. Now with the discussion around context we’re curious to explore that and see if we can provide a solution to the DAO.

6 Likes

To clarify, the AF provided feedback on the wider proposal as requested, and asked questions pertaining to the privacy policy and terms of service

5 Likes

Could you please identify the researchers at UCSB and provide a copy of their IRB submission and IRB board approval? Assuming this research is sanctioned by the University, they should be able to provide the DAO with these documents which will, among other things, identify the research questions, responsible parties, the risks, and how they intend to utilize the data (along with their privacy policy, etc.).

If not sanctioned, or they are unable to provide the IRB approval, please indicate why not.

1 Like

I really appreciate the open-source nature of this proposal and the fact that all the data being analyzed is already public, so there are no privacy concerns. Plus, the fact that they’re not even asking for any budget makes this an easy decision.

Looking forward to seeing the results!

1 Like

Good proposal.
Regarding additional future functions:

I think it would be convenient for the authors of the proposals to have a final message with all the pros and cons of the proposal before the vote (on Wednesday).

1 Like

I think this is a no-brainer assuming that all the necessary precautions, ethical considerations around data usage, privacy and so on are dealt with and deemed fine. I really like what Danielo and the RnDAO team are doing around improving social organisation and coordination across DAOs, and believe it is much more important than the attention it is currently receiving would suggest.

Going over the terms of service / privacy policy, one thing that caught my eye was the paragraphs saying that the ToS and PP may be modified without notice (wouldn’t this be non-compliant with GDPR?) - here I think it’d be reasonable to include a requirement to notify (in this case) the DAO of any changes involving the use of this data.

Kudos to you and team @danielo

2 Likes

That’s a good point. I’ll edit the proposal so TogetherCrew notifies the DAO of any privacy policy or ToS changes at least 2 weeks before they take effect to allow for a snapshot vote to suspend the service if desired.

1 Like

We agree with many of the previous comments. Just a few questions: Is it sufficient to include in the proposal that TogetherCrew notify the DAO? I ask because their privacy policies are very clear.
We understand that the findings will be presented to the community over a period of 1 to 3 months. Is that the duration of this proposal, or is the timeframe indefinite?

Here’s what I got from the team

“The research will be conducted in collaboration with the META lab at UCSB, led by Distinguished Professor Jonathan Schooler (META Lab | Psychological & Brain Sciences | UC Santa Barbara). The first stage of the project consisted of an exploratory analysis of community dynamics at Optimism, which yielded various interesting preliminary findings. This led to our decision to turn these explorations into an official research project where multiple online communities are studied to see which findings are consistent among communities. The IRB submission process for this research has been initiated and we will follow up with the relevant documents as soon as this is completed. From the start, only fully anonymized data has been used and all research practices match the GDPR compliance as described in TogetherCrew’s privacy policy (Privacy and Terms).”

2 Likes

Appreciate the reply and sharing the lab and PI. I look forward to reviewing the IRB and the final research. I just wanted to note for you, usually IRB should be approved before data collection or analysis in human subjects research, even if the risks are de minimis.

I hear you. Do note that the data in that case had already been collected because TogetherCrew was doing analytics for them. So the extra step was anonymizing it and giving the researchers access (with consent from the client of course).

  1. Given that consent is based on the terms of the proposal, it would be a breach of consent to keep taking data if the terms are not respected. Please do note that the data is publicly available already, anyone could scrape it. We’re choosing to get community consent as a general principle. And we also hope the relationship could evolve over time and TogetherCrew can offer significant value to Arbitrum to unlock some funding. So we’re trying to be a responsible stakeholder :slight_smile:
  2. The 2-3 month deadline is indicative and likely to be extended as academic research can take a lot longer. We do not know whether that’d be the case, and we’re hoping to present something to the community within 2-3 months, but depending on what’s found, it could take a lot longer or not. Hard to say, and this is in good part outside our control as it mostly depends on our academic partners’ prioritisation of activities. We hope that early findings are interesting enough to encourage them to keep researching. And we hope to grow the dataset with other communities, making this one of the largest studies ever on community health and network behaviour outside of major social platforms. But it’s still very early. Also, TogetherCrew is aiming to provide different solutions to empower communities with their data, so we might make available other functionalities to Arbitrum, always respecting individual consent. E.g. the dynamic reputation NFTs that were funded by ArbitrumDAO via Questbook will be available soon and individuals could choose to mint them as proof of community participation. Again, please do note that we’re not collecting any private data unless individuals actively go to TogetherCrew’s interface and submit said data.

2 Likes

Maybe share the findings, since all study was consensual and public. Would make it easier for all on the forum to gauge what they are signing up for !

Wouldn’t it be illegal (or at the very least highly unethical) to scrape such data without consent? To seek consent is not a general principle, it’s a legal prerequisite…

1 Like

It’s public data, so by law it’s not illegal to scrape it.

Now, I’m precisely here asking for consent…

The study is not published yet. Quite a bit of work left to do still

Q1: Is there a single report on the study (conducted in over 100 communities I read) available to publish as a sample ?

Q2: I went through the link embedded in “research on community health”. The document defines community health as the following-

" Our definition of Community Health includes five aspects:

  1. Community members
  2. Relationships between two members
  3. Cliques (informal groups) and subgroups (formal groups)
  4. The community
  5. The larger ecosystems of communities "

Does this mean that these are the things that would be measured as part of this study? If so, is there even a minute possibility that this bot would enable someone to use that publicly available data to map -

  • a complete list of all users in this discourse forum
  • a network graph that tells who messaged who / (or what? if that’s even possible)
  • an activity chart of negative/positive proposal feedback patterns of a user ??

Q3: If any of these possibilities were true, wouldn’t it be akin to having a constituency list, with each voting block mapped and their interactions visible? Wouldn’t any politician pay an arm and a leg to access this goldmine of a database?

Disclaimer:
I highlight these concerns with no privileged information other than what is provided in the discussion thread. I claim no knowledge of the process involved in the suggested meta-study. All these questions I ask, I seek to learn.

As cool as this sounds, it would be a highly incomplete map, as a lot of the governance occurs offchain and across other communication platforms.

Q1: The TogetherCrew bot is deployed in 100+ communities. That does not mean 100+ communities are part of the study. To date, Optimism is part of the Study and we haven’t yet asked the others.

Q2: The article you’re referencing on community health is not connected to the current study.

Q3: We haven’t done research on the willingness to pay by politicians for said dataset. But as I explained already, the data is already public. A simple bot could scrape it. There are already open-source versions of these bots, so anyone with basic development knowledge can extract it already. And doing so would be GDPR compliant because it’s public data.

The following reflects the views of the Lampros Labs DAO governance team.

We have been actively gathering data from different public forums and really appreciate this proposal for improving transparency. The proposal addresses concerns from all angles and ensures access to valuable research at no cost to the DAO. Given the organization’s previous record of delivering results to other DAOs, this makes sense and could be beneficial for the Arbitrum community.

2 Likes