Notice of Application Pause
All application submissions on the InfoReady platform are paused until late January or early February as SOMAR takes time to evaluate and develop a new system and process aimed at improving your user experience. We will announce when applications are open again and provide details on where to go and how to begin a new application.
Have questions about SOMAR's data? See the following resources:
- Meta Content Library FAQs
Frequently Asked Questions
What is SOMAR?
The Social Media Archive (SOMAR) at the Inter-university Consortium for Political and Social Research (ICPSR) is a collection of public and restricted data from various social media platforms, organized and stored for research and analysis purposes. By making these data available to the community, SOMAR aims to help researchers and community members better understand social media behavior and trends. In addition, the data can inform the development of new technologies and services.
What is the goal of the Social Media Archive (SOMAR)?
SOMAR will provide access to social media data and develop a robust set of wraparound services, including training in social media data use and learning opportunities for the community.
What is public data?
Public data is information available to anyone without any restrictions on access.
What is restricted data?
Restricted data is information that is not available to the general public and that may be accessed only by authorized users. To gain access to this type of data, a data user must complete a restricted data application.
Who will be able to access the data, and how?
The SOMAR project will democratize access to some of the most consequential information in contemporary society. By providing a reliable, unbiased resource to data users everywhere, ICPSR and SOMAR foster clarity and transparency during a time in which these qualities seem ever scarcer. Much of SOMAR's data will be available through approved restricted data applications, and the data will be accessed through a virtual data enclave.
How is the privacy of users protected in the archive?
The privacy of users is a top priority for SOMAR. Therefore, all data in the archive are de-identified, meaning they have been stripped of any information that could be used to identify individual users. In addition, access to restricted data is tightly controlled, and only authorized users who have agreed to strict confidentiality terms are granted access. Please visit our privacy policy for more information.
How is data confidentiality being handled in SOMAR?
ICPSR and SOMAR are experienced in handling data with the utmost confidentiality and privacy. Stringent protections are in place for securing and accessing sensitive data and for ensuring that any analyses of SOMAR data do not reveal sensitive information about individuals. This attention to ethical data use is essential when it comes to the data of millions of social media users.
The dataset I wish to analyze is restricted. What do I have to do to get the data?
Information on obtaining a restricted dataset can be found on the home page of the study it belongs to. When you click the "Apply for Restricted Data" button, you will find instructions for accessing and preparing the restricted data application specific to that dataset.
Before you begin filling out the paperwork, you should know the following:
- Restricted data are generally only made available to PhD-level researchers and their staff; these data are typically not appropriate for undergraduate-level projects or class projects.
- The review process typically takes 2-4 weeks, but it may take more or less time, depending on the number of revisions needed and the responsiveness of institutional representatives.
- You'll need to fill out documentation describing your project and how you'll protect the data. This may include providing technical specifications on the security of your work environment.
- You will need the Institutional Review Board at your institution to review your project and agree that it can go forward as described.
SOMAR does not discourage the use of restricted data. In fact, we've put a lot of effort into building systems that make these data available to data users. We are, however, very serious about protecting respondent confidentiality and ensuring that sensitive data are used appropriately.
What is typically included in a restricted data application?
- Principal Investigator information: Includes name, contact information, and institutional affiliation of the principal investigator. In most cases, the principal investigator must have a Ph.D., J.D., or M.D. degree and be affiliated with an academic or research institution.
- Research staff information: Includes name, contact information, and institutional affiliation of all research staff members who will access the data. All research staff members must be affiliated with the same institution as the principal investigator.
- Research description: Applicants must describe the research project for which the data will be used and explain clearly why these specific data are being requested. If multiple restricted studies or series are being requested, applicants should explain why each study or series is required.
- Confidential Data Security Plan: Applicants must agree to and adhere to the security terms in the Restricted Data Use Agreement.
- Restricted Data Use Agreement: Applicants must submit a Restricted Data Use Agreement. This is an agreement between the University of Michigan and the investigator’s institution, signed by both the investigator and a legal representative of the investigator’s institution, which specifies the terms of use of the restricted data.
- IRB approval or exemption: Applicants must provide IRB approval or exemption documentation for the proposed research project.
- Other requirements: Some restricted data have additional requirements, such as obtaining special certifications or filling out additional forms.
Once my restricted application has been approved, how do I gain access to the data?
Data users are approved to access the data via a remote desktop connection called the Virtual Data Enclave (VDE). Data users cannot move files from the remote desktop to their desktop or the Internet. To receive output from the VDE, data users must request that ICPSR conduct a disclosure review on the desired files.
Why and how should I cite data?
Proper citation ensures that research data can be discovered, reused, replicated for verification, credited for recognition, and tracked to measure usage and impact.
Citing data is straightforward. Each citation must include the essential elements that allow a unique dataset to be identified over time:
- Author
- Title
- Distributor
- Date
- Version
- Persistent identifier (such as a Digital Object Identifier (DOI), Uniform Resource Name (URN), or Handle System identifier)
Here are some examples of ICPSR data citations:
Barnes, Samuel H. Italian Mass Election Survey, 1968. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1992-02-16. https://doi.org/10.3886/ICPSR07953.v1
Schneider, Barbara, and Waite, Linda J. The 500 Family Study [1998-2000: United States]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-06-03. https://doi.org/10.3886/ICPSR04549.v1
For more information, see ICPSR's Citing Data web page.
Are there any publications yet from SOMAR data analysis or any early findings?
While SOMAR is still in its early stages, there are already social media data held at ICPSR, which will be cross-listed when SOMAR is up and running. The datasets include:
How is SOMAR funded?
SOMAR has been made possible by a $41,000 Propelling Original Data Science grant from the Michigan Institute for Data Science, titled "Ensuring FAIRness in Social Media Archives". In 2022, Meta provided a $1.3 million gift to support SOMAR's vision and help build the archive so that it continues to exist and support research for years to come. Other funders have an opportunity to get involved, and potential supporters are encouraged to reach out to the ISR Development team.
I'm a journalist. Who do I contact for SOMAR interview requests?
SOMAR project lead Libby Hemphill directs the Resource Center for Minority Data at ICPSR and holds a joint appointment as an associate professor at the U-M School of Information. You may also contact the SOMAR team at somar-help@umich.edu.
I have social media data to share. How can I get involved?
SOMAR accepts data deposits from researchers and will build data-sharing partnerships with social media companies. Institutions are encouraged to contact ISR Development Director Henry Jewell to join the movement to democratize social media data. In addition, individual PIs are encouraged to email the SOMAR team at somar-help@umich.edu.