Adoption of online streaming services: moderating role of personality traits

International Journal of Retail & Distribution Management

ISSN: 0959-0552

Article publication date: 16 July 2021

Issue publication date: 27 April 2022

The purpose of this paper is to study the adoption of online streaming services from the technology acceptance perspective. A conceptual model incorporating personality traits with the technology acceptance model (TAM) is proposed and tested to predict users' intention to use online streaming services. Apart from the direct effects of personality traits on TAM variables, the study also examines the moderating effect of personality traits on TAM relationships.


To test the proposed model, a structured questionnaire was developed by adapting existing scales for the constructs to suit the online streaming services context. The data for the study were collected from users of online streaming services in India. The model was tested with structural equation modeling in AMOS 18. Moderation analysis was performed using the PROCESS macro.
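As a rough illustration of what a moderation analysis estimates, simple moderation is commonly written as a regression with an interaction term. The exact model specification used in the paper is not given in this abstract, so the symbols below are assumptions: X is a predictor (for example, perceived ease of use), W is a moderator (for example, technology anxiety) and Y is the outcome (intention to use).

```latex
Y = b_0 + b_1 X + b_2 W + b_3 (X \cdot W) + \varepsilon,
\qquad
\frac{\partial Y}{\partial X} = b_1 + b_3 W
```

Under this form, the effect of X on Y depends on the level of W, and moderation is supported when the interaction coefficient b_3 differs significantly from zero.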

The findings suggest that perceived ease of use, subjective norms and technology anxiety affect intention to use online streaming services. Self-efficacy was found to affect perceived ease of use positively, and technology anxiety was found to have a negative effect on perceived usefulness. The results also provide evidence of the moderating roles of self-efficacy and technology anxiety.


The paper explores the adoption of online streaming services from the technology acceptance perspective. Further, very few studies have examined the moderating role of personality traits in technology adoption. This paper attempts to fill this gap. It expands the understanding of technology adoption literature by assessing the direct as well as moderating effect of personality traits.

  • Technology acceptance model (TAM)
  • Personality traits
  • Technology anxiety
  • Self-efficacy
  • Moderating effect
  • Online streaming service retailer


The author would like to thank anonymous reviewers for their comments and suggestions to improve the manuscript.

Bhatt, K. (2022), "Adoption of online streaming services: moderating role of personality traits", International Journal of Retail & Distribution Management, Vol. 50 No. 4, pp. 437-457.

Emerald Publishing Limited

Copyright © 2021, Emerald Publishing Limited


International Conference on Human-Computer Interaction

HCII 2020: Social Computing and Social Media. Design, Ethics, User Behavior, and Social Network Analysis, pp 227–242

The Law of Live Streaming: A Systematic Literature Review and Analysis of German Legal Framework

  • Kaja J. Fietkiewicz
  • Conference paper
  • First Online: 10 July 2020


Part of the Lecture Notes in Computer Science book series (LNISA, volume 12194)

With evolved streaming technologies and faster mobile broadband, more and more live streaming platforms are emerging online and becoming very popular among users. The range of services has become very wide, from general platforms for streaming everyday life (e.g., YouNow) or reporting on news events (e.g., Periscope) to platforms for streaming video games (e.g., Twitch) or certain artistic performances (e.g., Picarto). As in most social media domains and with new developments on the digital market, the question arises whether the new trends also bring new challenges and issues of a legal or ethical nature. To pursue this question, this study presents a systematic literature review of international scientific research on live streaming and potential legal problems (N = 22). It also includes a short review of legal issues with live streaming in Germany, a country with relatively strict consumer laws (e.g., data privacy) as well as the first laws aimed at gaining better control over social media companies and users (e.g., the Network Enforcement Act). The most prevalent legal domains within research on live streaming are copyright and sports broadcasting laws. The still understudied areas appear to be privacy, personality rights, and youth protection regulations. The most prominent issue within German legal discourse is the classification of live streaming as a telemedia offer or a broadcast, the latter entailing more restrictions and requirements (e.g., a broadcasting license).

  • Live streaming
  • Personality rights
  • Sport broadcasting



Author information

Authors and affiliations.

Heinrich Heine University, Universitätsstr. 1, 40225, Düsseldorf, Germany

Kaja J. Fietkiewicz


Corresponding author

Correspondence to Kaja J. Fietkiewicz.

Editor information

Editors and affiliations.

Towson University, Towson, MD, USA

Dr. Gabriele Meiselwitz


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper.

Fietkiewicz, K.J. (2020). The Law of Live Streaming: A Systematic Literature Review and Analysis of German Legal Framework. In: Meiselwitz, G. (eds) Social Computing and Social Media. Design, Ethics, User Behavior, and Social Network Analysis. HCII 2020. Lecture Notes in Computer Science, vol 12194. Springer, Cham.



Published: 10 July 2020

Publisher Name: Springer, Cham

Print ISBN: 978-3-030-49569-5

Online ISBN: 978-3-030-49570-1

eBook Packages: Computer Science, Computer Science (R0)

  • Open access
  • Published: 29 November 2023

Scaling deep learning for materials discovery

  • Amil Merchant,
  • Simon Batzner,
  • Samuel S. Schoenholz,
  • Muratahan Aykol,
  • Gowoon Cheon &
  • Ekin Dogus Cubuk

Nature volume 624, pages 80–85 (2023)


  • Computer science
  • Scaling laws

Novel functional materials enable fundamental breakthroughs across technological applications from clean energy to information processing (refs. 1–11). From microchips to batteries and photovoltaics, discovery of inorganic crystals has been bottlenecked by expensive trial-and-error approaches. Concurrently, deep-learning models for language, vision and biology have showcased emergent predictive capabilities with increasing data and computation (refs. 12–14). Here we show that graph networks trained at scale can reach unprecedented levels of generalization, improving the efficiency of materials discovery by an order of magnitude. Building on 48,000 stable crystals identified in continuing studies (refs. 15–17), improved efficiency enables the discovery of 2.2 million structures below the current convex hull, many of which escaped previous human chemical intuition. Our work represents an order-of-magnitude expansion in stable materials known to humanity. Stable discoveries that are on the final convex hull will be made available to screen for technological applications, as we demonstrate for layered materials and solid-electrolyte candidates. Of the stable structures, 736 have already been independently experimentally realized. The scale and diversity of hundreds of millions of first-principles calculations also unlock modelling capabilities for downstream applications, leading in particular to highly accurate and robust learned interatomic potentials that can be used in condensed-phase molecular-dynamics simulations and high-fidelity zero-shot prediction of ionic conductivity.

The discovery of energetically favourable inorganic crystals is of fundamental scientific and technological interest in solid-state chemistry. Experimental approaches over the decades have catalogued 20,000 computationally stable structures (out of a total of 200,000 entries) in the Inorganic Crystal Structure Database (ICSD; refs. 15, 18). However, this strategy is impractical to scale owing to costs, throughput and synthesis complications (ref. 19). Instead, computational approaches championed by the Materials Project (MP; ref. 16), the Open Quantum Materials Database (OQMD; ref. 17), AFLOWLIB (ref. 20) and NOMAD (ref. 21) have used first-principles calculations based on density functional theory (DFT) as approximations of physical energies. Combining ab initio calculations with simple substitutions has allowed researchers to improve to 48,000 computationally stable materials according to our own recalculations (refs. 22–24; see Methods). Although data-driven methods that aid in further materials discovery have been pursued, thus far, machine-learning techniques have been ineffective in estimating stability (decomposition energy) with respect to the convex hull of energies from competing phases (ref. 25).
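To make the notion of decomposition energy with respect to the convex hull concrete, the following is a minimal sketch for a two-element (binary) system, where the hull reduces to the lower convex envelope of energy versus composition. The function names and the toy energies are assumptions made for illustration; real phase diagrams span more components and are normally built with dedicated materials-analysis tooling.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (fraction of element B, formation energy in eV/atom)


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex envelope of (composition, energy) points (monotone chain)."""
    best: Dict[float, float] = {}
    for x, e in points:  # keep only the lowest-energy phase at each composition
        best[x] = min(e, best.get(x, e))
    hull: List[Point] = []
    for p in sorted(best.items()):
        # Drop the previous vertex while it lies on or above the chord to p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(x: float, hull: List[Point]) -> float:
    """Energy of the hull at composition x, by linear interpolation."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            t = (x - x1) / (x2 - x1)
            return e1 + t * (e2 - e1)
    raise ValueError("composition lies outside the hull range")


def decomposition_energy(candidate: Point, known: List[Point]) -> float:
    """Candidate energy minus hull energy; negative means below the current hull."""
    return candidate[1] - hull_energy(candidate[0], lower_hull(known))


# Toy data (hypothetical): two elemental end members and two known compounds.
known_phases = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.4), (0.25, -0.1)]
below = decomposition_energy((0.25, -0.3), known_phases)  # lies below the hull
above = decomposition_energy((0.75, -0.1), known_phases)  # lies above the hull
```

A candidate with a negative value would displace the existing hull at its composition, which is the sense in which the text speaks of structures "below the current convex hull".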

In this paper, we scale up machine learning for materials exploration through large-scale active learning, yielding the first models that accurately predict stability and, therefore, can guide materials discovery. Our approach relies on two pillars: first, we establish methods for generating diverse candidate structures, including new symmetry-aware partial substitutions (SAPS) and random structure search (ref. 26). Second, we use state-of-the-art graph neural networks (GNNs) that improve modelling of material properties given structure or composition. In a series of rounds, these graph networks for materials exploration (GNoME) are trained on available data and used to filter candidate structures. The energy of the filtered candidates is computed using DFT, both verifying model predictions and serving as a data flywheel to train more robust models on larger datasets in the next round of active learning.

Through this iterative procedure, GNoME models have discovered more than 2.2 million structures stable with respect to previous work, in particular agglomerated datasets encompassing computational and experimental structures (refs. 15–17, 27). Given that discovered materials compete for stability, the updated convex hull consists of 381,000 new entries for a total of 421,000 stable crystals, representing an order-of-magnitude expansion from all previous discoveries. Consistent with observations in other domains of machine learning (ref. 28), we observe that our neural network predictions improve as a power law with the amount of data. Final GNoME models accurately predict energies to 11 meV atom⁻¹ and improve the precision of stable predictions (hit rate) to above 80% with structure and 33% per 100 trials with composition only, compared with 1% in previous work (ref. 17). Moreover, these networks develop emergent out-of-distribution generalization. For example, GNoME enables accurate predictions of structures with 5+ unique elements (despite omission from training), providing one of the first strategies to efficiently explore this chemical space. We validate findings by comparing predictions with experiments and higher-fidelity r²SCAN (ref. 29) computations.

Finally, we demonstrate that the dataset produced in GNoME discovery unlocks new modelling capabilities for downstream applications. The structures and relaxation trajectories present a large and diverse dataset to enable training of learned, equivariant interatomic potentials (refs. 30, 31) with unprecedented accuracy and zero-shot generalization. We demonstrate the promise of these potentials for materials property prediction through the estimation of ionic conductivity from molecular-dynamics simulations.
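One common route from a molecular-dynamics trajectory to an ionic conductivity estimate is to fit a diffusion coefficient to the mean squared displacement (MSD) of the mobile ion and then apply the Nernst–Einstein relation. The sketch below illustrates that route only; it is not stated here that the authors used exactly this procedure, and the function names and input numbers are hypothetical.

```python
from typing import Sequence

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C


def diffusivity_from_msd(times_s: Sequence[float], msd_m2: Sequence[float]) -> float:
    """Diffusion coefficient D (m^2/s) from the least-squares slope of MSD(t) = 6 D t + c."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_m = sum(msd_m2) / n
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_s, msd_m2))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    return (sxy / sxx) / 6.0  # factor 6 applies to diffusion in three dimensions


def nernst_einstein_conductivity(
    number_density_m3: float, charge_number: int, diffusivity_m2_s: float, temperature_k: float
) -> float:
    """Ionic conductivity in S/m, assuming uncorrelated ion motion (Nernst-Einstein)."""
    q = charge_number * E_CHARGE
    return number_density_m3 * q * q * diffusivity_m2_s / (K_B * temperature_k)


# Hypothetical inputs: an ideal linear MSD and a Li-like carrier density.
times = [0.0, 1e-12, 2e-12, 3e-12]
msd = [6.0 * 1e-10 * t for t in times]
d_coeff = diffusivity_from_msd(times, msd)
sigma = nernst_einstein_conductivity(1e28, 1, d_coeff, 1000.0)
```

The Nernst–Einstein form neglects correlations between ions, so it is best read as a first estimate rather than a precise prediction.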

Overview of generation and filtration

The space of possible materials is far too large to sample in an unbiased manner. Without a reliable model to cheaply approximate the energy of candidates, researchers guided searches by restricting generation with chemical intuition, accomplished by substituting similar ions or enumerating prototypes (ref. 22). Although improving search efficiency (refs. 17, 27), this strategy fundamentally limited how diverse candidates could be. By guiding searches with neural networks, we are able to use diversified methods for generating candidates and perform a broader exploration of crystal space without sacrificing efficiency.

To generate and filter candidates, we use two frameworks, which are visualized in Fig. 1a. First, structural candidates are generated by modifications of available crystals. However, we strongly augment the set of substitutions by adjusting ionic substitution probabilities to give priority to discovery and use newly proposed symmetry-aware partial substitutions (SAPS) to efficiently enable incomplete replacements (ref. 32). This expansion results in more than 10⁹ candidates over the course of active learning; the resulting structures are filtered by means of GNoME using volume-based test-time augmentation and uncertainty quantification through deep ensembles (ref. 33). Finally, structures are clustered and polymorphs are ranked for evaluation with DFT (see Methods). In the second framework, compositional models predict stability without structural information. Inputs are reduced chemical formulas. Generation by means of oxidation-state balancing is often too strict (for example, neglecting Li₁₅Si₄). Using relaxed constraints (see Methods), we filter compositions using GNoME and initialize 100 random structures for evaluation through ab initio random structure searching (AIRSS; ref. 26). In both frameworks, models provide a prediction of energy and a threshold is chosen on the basis of the relative stability (decomposition energy) with respect to competing phases. Evaluation is performed through DFT computations in the Vienna Ab initio Simulation Package (VASP; ref. 34) and we measure both the number of stable materials discovered as well as the precision of predicted stable materials (hit rate) in comparison with the Materials Project (ref. 16).
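The filtering and scoring steps described above can be sketched as follows. This is a toy illustration, assuming that each candidate has one predicted decomposition energy per ensemble member; the rule for combining the ensemble mean with its spread, the threshold values and all names are assumptions, since the text only states that a threshold on decomposition energy is chosen and that deep ensembles provide uncertainty estimates.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set


def filter_candidates(
    ensemble_predictions: Dict[str, List[float]],
    energy_threshold: float,
    max_std: float,
) -> List[str]:
    """Keep candidates whose ensemble-mean decomposition energy (eV/atom) is at or
    below the threshold and whose ensemble disagreement (std) is small enough."""
    kept = []
    for name, values in ensemble_predictions.items():
        if mean(values) <= energy_threshold and pstdev(values) <= max_std:
            kept.append(name)
    return kept


def hit_rate(selected: List[str], verified_stable: Set[str]) -> float:
    """Precision of the selection: fraction of selected candidates that the
    reference calculation (DFT in the text) confirms as stable."""
    if not selected:
        return 0.0
    return sum(1 for name in selected if name in verified_stable) / len(selected)


# Hypothetical candidates, each with three ensemble predictions.
predictions = {
    "A": [-0.05, -0.04, -0.06],  # confidently below threshold
    "B": [0.10, 0.12, 0.08],     # confidently above threshold
    "C": [-0.10, 0.20, -0.30],   # mean below threshold, but members disagree
}
shortlist = filter_candidates(predictions, energy_threshold=0.0, max_std=0.05)
```

Only the shortlisted candidates would be passed on to the expensive DFT evaluation, and the hit rate is then measured on that shortlist.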

Figure 1. a, A summary of the GNoME-based discovery shows how model-based filtration and DFT serve as a data flywheel to improve predictions. b, Exploration enabled by GNoME has led to 381,000 new stable materials, almost an order of magnitude larger than previous work. c, 736 structures have been independently experimentally verified, with six examples shown (refs. 50–55). d, Improvements from graph network predictions enable efficient discovery in combinatorial regions of materials, for example, with six unique elements, even though the training set stopped at four unique elements. e, GNoME showcases emergent generalization when tested on out-of-domain inputs from random structure search, indicating progress towards a universal energy model.

All GNoME models are GNNs that predict the total energy of a crystal. Inputs are converted to a graph through a one-hot embedding of the elements. We follow the message-passing formulation (refs. 35, 36), in which aggregate projections are shallow multilayer perceptrons (MLPs) with swish nonlinearities. For structural models, we find it important to normalize messages from edges to nodes by the average adjacency of atoms across the entire dataset. Initial models are trained on a snapshot of the Materials Project from 2018 of approximately 69,000 materials. Previous work benchmarked this task at a mean absolute error (MAE) of 28 meV atom⁻¹ (ref. 37); however, we find that the improved networks achieve an MAE of 21 meV atom⁻¹. We fix this promising architecture (see Methods) and focus on scaling in the rest of this paper.
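A single message-passing update with a swish nonlinearity and normalization by a dataset-wide average adjacency can be sketched in a few lines. This is a generic toy, not the GNoME architecture: the single dense layer standing in for the MLP, the residual update, the absence of edge features and all names are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def swish(x: float) -> float:
    """swish(x) = x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def dense_swish(weights: List[Vector], bias: Vector, x: Vector) -> Vector:
    """One dense layer followed by swish (a stand-in for a shallow MLP)."""
    return [swish(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(weights, bias)]


def message_passing_step(
    nodes: List[Vector],
    edges: List[Tuple[int, int]],  # (sender, receiver) index pairs
    weights: List[Vector],         # shape: dim x (2 * dim)
    bias: Vector,                  # shape: dim
    avg_adjacency: float,          # average neighbour count over the whole dataset
) -> List[Vector]:
    dim = len(bias)
    aggregated = [[0.0] * dim for _ in nodes]
    for sender, receiver in edges:
        # Message computed from the concatenated sender and receiver features.
        message = dense_swish(weights, bias, nodes[sender] + nodes[receiver])
        for k in range(dim):
            aggregated[receiver][k] += message[k]
    # Divide summed messages by one global constant rather than by each node's
    # own degree, then apply a residual update.
    return [
        [h + a / avg_adjacency for h, a in zip(node, agg)]
        for node, agg in zip(nodes, aggregated)
    ]


# Two atoms with one-hot element embeddings, bonded in both directions.
atoms = [[1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 0)]
copy_sender = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # hypothetical weights
updated = message_passing_step(atoms, bonds, copy_sender, [0.0, 0.0], avg_adjacency=2.0)
```

Using a single global constant keeps the scale of aggregated messages comparable across structures while still letting atoms with more neighbours receive a larger total signal.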

Active learning

A core step in our framework for accelerating materials discovery is active learning. In both structural and compositional frameworks, candidate structures filtered using GNoME are evaluated using DFT calculations with standardized settings from the Materials Project. Resulting energies of relaxed structures not only verify the stability of crystal structures but are also incorporated into the iterative active-learning workflow as further training data and structures for candidate generation. Whereas the hit rates for the structural and compositional frameworks start at less than 6% and 3%, respectively, performance improves steadily through six rounds of active learning. Final ensembles of GNoME models improve to a prediction error of 11 meV atom⁻¹ on relaxed structures and hit rates of greater than 80% and 33%, respectively, clearly showing the benefits of scale. An analysis of final GNoME hit rates is provided in Fig. 1d.
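The train, filter, evaluate and retrain cycle described above can be sketched as a short loop. Everything concrete in this example is a stand-in chosen so that it runs deterministically: candidates are integers, the "DFT" oracle is a simple quadratic, and the model is a nearest-neighbour lookup. None of these are the paper's components, and the function names are hypothetical.

```python
from typing import Callable, List, Tuple

Labelled = Tuple[int, float]  # (candidate id, reference energy)


def oracle(x: int) -> float:
    """Stand-in for an expensive reference calculation (DFT in the text)."""
    return ((x - 6) ** 2 - 20) / 100.0


def train_nearest(data: List[Labelled]) -> Callable[[int], float]:
    """Stand-in model: predict the energy of the nearest labelled candidate."""
    snapshot = list(data)
    return lambda x: min(snapshot, key=lambda d: abs(d[0] - x))[1]


def active_learning(
    data: List[Labelled], pool: List[int], rounds: int, threshold: float
) -> Tuple[List[Labelled], List[float]]:
    data = list(data)
    hit_rates: List[float] = []
    for _ in range(rounds):
        model = train_nearest(data)                  # 1. train on all labels so far
        seen = {x for x, _ in data}
        candidates = [x for x in pool if x not in seen]
        selected = [x for x in candidates if model(x) < threshold]  # 2. filter
        labelled = [(x, oracle(x)) for x in selected]               # 3. evaluate
        hits = sum(1 for _, e in labelled if e < threshold)
        hit_rates.append(hits / len(labelled) if labelled else 0.0)
        data.extend(labelled)                        # 4. flywheel: reuse new labels
    return data, hit_rates


seed = [(x, oracle(x)) for x in (0, 6, 20)]
final_data, rates = active_learning(seed, pool=list(range(21)), rounds=3, threshold=0.0)
```

In this toy run the hit rate rises after the first round because every evaluated candidate, stable or not, sharpens the model's picture of where the stable region ends.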

Scaling laws and generalization

The test loss performance of GNoME models exhibits improvement as a power law with further data. These results are in line with neural scaling laws in deep learning (refs. 28, 38) and suggest that further discovery efforts could continue to improve generalization. Emphatically, unlike the case of language or vision, in materials science, we can continue to generate data and discover stable crystals, which can be reused to continue scaling up the model. We also demonstrate emergent generalization to out-of-distribution tasks by testing structural models trained on data originating from substitutions on crystals arising from random search (ref. 26) in Fig. 1e. These examples are often high-energy local minima and out of distribution compared with data generated by our structural pipeline (which, by virtue of substitutions, contains structures near their minima). Nonetheless, we observe clear improvement with scale. These results indicate that final GNoME models are a substantial step towards providing the community with a universal energy predictor, capable of handling diverse materials structures through deep learning.
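A power-law relationship between dataset size and error, error ≈ a · N^b, appears as a straight line on log-log axes, so its exponent can be estimated by ordinary least squares on the logarithms. The sketch below shows this on synthetic numbers; the data and the function name are illustrative assumptions, not values from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[float], errors: Sequence[float]) -> Tuple[float, float]:
    """Fit error = a * size**b by least squares on log(error) = log(a) + b * log(size).

    Returns (a, b); b is negative when error falls as the dataset grows.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic example: errors generated from an exact power law with exponent -0.5.
dataset_sizes = [1e3, 1e4, 1e5, 1e6]
test_errors = [2.0 * n ** -0.5 for n in dataset_sizes]
prefactor, exponent = fit_power_law(dataset_sizes, test_errors)
```

With real measurements the points scatter around the line, and the fitted exponent indicates how much extra data is needed for a given reduction in error.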

Discovered stable crystals

Using the described process of scaling deep learning for materials exploration, we increase the number of known stable crystals by almost an order of magnitude. In particular, GNoME models found 2.2 million crystal structures stable with respect to the Materials Project. Of these, 381,000 entries live on the updated convex hull as newly discovered materials.

Consistent with other literature on structure prediction, the GNoME materials could be bumped off the convex hull by future discoveries, similar to how GNoME displaces at least 5,000 ‘stable’ materials from the Materials Project and the OQMD. See Supplementary Note 1 for discussion on improving structures of already-discovered compositions. Nevertheless, Figs. 1 and 2 provide a summary of the stable materials, with Fig. 1b focusing on the growth over time. We see substantial gains in the number of structures with more than four unique elements in Fig. 2a. This is particularly promising because these materials have proved difficult for previous discovery efforts (ref. 27). Our scaled GNoME models overcome this obstacle and enable efficient discovery in combinatorially large regions.

Figure 2. a, GNoME enables efficient discovery in the combinatorial spaces of 4+ unique elements that can be difficult for human experts. b, Phase-separation energies (energy to the convex hull) for discovered quaternaries showcase similar patterns but larger absolute numbers than previous catalogues. c, Discovered stable crystals correspond to 45,500 novel prototypes as measured by XtalFinder (ref. 39). d, Validation by r²SCAN shows that 84% of discovered binary and ternary crystals retain negative phase separations with more accurate functionals.

Clustering by means of prototype analysis (ref. 39) supports the diversity of discovered crystals with GNoME, leading to more than 45,500 novel prototypes in Fig. 2c (a 5.6 times increase from 8,000 of the Materials Project), which could not have arisen from full substitutions or prototype enumeration. Finally, in Fig. 2b, we compare the phase-separation energy (also referred to as the decomposition enthalpy) of discovered quaternaries with those from the Materials Project to measure the relative distance to the convex hull of all other competing phases. The similarities in distribution suggest that the found materials are meaningfully stable with respect to competing phases and not just ‘filling in the convex hull’. Further analyses of materials near to (but not on) the updated convex hull are given in Supplementary Note 3.

Validation through experimental matching and r²SCAN

All candidates for GNoME are derived from snapshots of databases made in March 2021, including the Materials Project and the OQMD. Concurrent to our discovery efforts, researchers have continued to experimentally create new crystals, providing a way to validate GNoME findings. Of the experimental structures aggregated in the ICSD, 736 match structures that were independently obtained through GNoME. Six of the experimentally matched structures are presented in Fig. 1c and further details of the experimental matches are provided in Supplementary Note 1. Similarly, of the 3,182 compositions added to the Materials Project since the snapshot, 2,202 are available in the GNoME database and 91% match on structure. A manual check of ‘newly’ discovered crystals supported the findings, with details in Supplementary Note 4.

We also validate predictions to ensure that model-based exploration did not overfit simulation parameters. We focus on the choice of functional. Standard projector augmented wave (PAW)-Perdew–Burke–Ernzerhof (PBE) potentials provided a speed–accuracy trade-off suited for large-scale discovery (refs. 40, 41), but the r²SCAN functional provides a more accurate meta-generalized gradient approximation (refs. 29, 42, 43). Of the discovered binary and ternary materials, 84% also present negative phase-separation energies under r²SCAN (as visualized in Fig. 2d; comparable with a 90% ratio in the Materials Project but operating at a larger scale). Of the tested quaternaries, 86.8% also remain stable on the r²SCAN convex hull. The discrepancies between PBE and r²SCAN energies are further analysed in Supplementary Note 2.

Composition families of interest

We highlight the benefits of a catalogue of stable materials an order of magnitude larger than previous work. When searching for a material with certain desirable properties, researchers often filter such catalogues, as computational stability is often linked with experimental realizability. We perform similar analyses for three applications. First, layered materials are promising systems for electronics and energy storage (ref. 44). Methods from previous studies (ref. 45) suggest that approximately 1,000 layered materials are stable compared with the Materials Project, whereas this number increases to about 52,000 with GNoME-based discoveries. Similarly, following a holistic screening approach with filters such as exclusion of transition metals or by lithium fraction, we find 528 promising Li-ion conductors among GNoME discoveries, a 25 times increase compared with the original study (ref. 46). Finally, Li/Mn transition-metal oxides are a promising family to replace LiCoO₂ in rechargeable batteries (ref. 25) and GNoME has discovered an extra 15 candidates stable relative to the Materials Project compared with the original nine.

Scaling up learned interatomic potentials

The process of discovery of stable crystals also provides a data source beyond stable materials. In particular, the ionic relaxations involve computation of first-principles energies and forces for a diverse set of materials structures. This generates a dataset of unprecedented diversity and scale, which we explore to pretrain a general-purpose machine-learning interatomic potential (MLIP) for bulk solids. MLIPs have become a promising tool to accelerate the simulation of materials by learning the energies and forces of reference structures computed at first-principles accuracy 30 , 47 , 48 , 49 . Existing efforts typically train models per material, with data often sampled from ab initio molecular dynamics (AIMD). This markedly limits their general applicability and adoption, requiring expensive data collection and training a new potential from scratch for each system. By making use of the GNoME dataset of first-principles calculations from diverse structural relaxations, we demonstrate that large-scale pretraining of MLIPs enables models that show unprecedented zero-shot accuracy and can be used to discover superionic conductors, without training on any material-specific data.

Zero-shot scaling and generalization

We scale pretraining of a NequIP potential 30 on data sampled from ionic relaxations. Increasing the pretraining dataset, we observe consistent power-law improvements in accuracy (see Fig. 3a,b). Despite only being trained on ionic relaxations and not on molecular-dynamics data, the pretrained GNoME potential shows remarkable accuracy when evaluated on downstream data sampled from the new distribution of AIMD in a zero-shot manner, that is, in which no training data originate from AIMD simulations (see Fig. 3). Notably, this includes unseen compositions, melted structures and structures including vacancies, none of which are included in our training set (see Supplementary Note 6.4). In particular, we find that the scale of the GNoME dataset allows it to outperform existing general-purpose potentials (see Fig. 3d) and makes the pretrained potential competitive with models trained explicitly on hundreds of samples from the target data distributions (see Supplementary Note 6.4). We observe particularly pronounced improvements in transferability, one of the most pressing shortcomings of MLIPs. To assess the transferability of the potentials, we test their performance under distribution shift: we train two types of NequIP potential on structures sampled from AIMD at T = 400 K, one in which the network is trained from randomly initialized weights and the other in which we fine-tune from a pretrained GNoME checkpoint. We then measure the performance of both potentials on data sampled from AIMD at T = 1,000 K (see Fig. 3c), out of distribution with respect to the 400-K data. The potential pretrained on GNoME data shows systematic and strong improvements in transferability over the potential trained from scratch, even when training is performed on more than 1,000 structures. The zero-shot GNoME potential, not fine-tuned on any data from this composition, outperforms even a state-of-the-art NequIP model trained on hundreds of structures.

Figure 3

a, Classification of whether a material is a superionic conductor as predicted by GNoME-driven simulations in comparison with AIMD, tested on 623 unseen compositions. The classification error improves as a power law with training set size. b, Zero-shot force error as a function of training set size for the unseen material K₂₄Li₁₆P₂₄Sn₈. c, Robustness under distribution shift, showing the MAE in forces on the example material Ba₈Li₁₆Se₃₂Si₈. A GNoME-pretrained and a randomly initialized potential are trained on data of various sizes sampled at T = 400 K and evaluated on data sampled at T = 1,000 K. The zero-shot GNoME potential outperforms state-of-the-art models trained from scratch on hundreds of structures. d, Comparison of zero-shot force errors of three different pretrained, general-purpose potentials for bulk systems on the test set of ref. 56. Note that the composition Ni is not present in the GNoME pretraining data. RMSE, root-mean-square error.

Screening solid-state ionic conductors

Solid electrolytes are a core component of solid-state batteries, promising higher energy density and safety than liquid electrolytes, but suffer from lower ionic conductivities at present. In the search for novel electrolyte materials, AIMD allows for the prediction of ionic conductivities from first principles. However, owing to the poor scaling of DFT with the number of electrons, routine simulations are limited to hundreds of picoseconds, hundreds of atoms and, most importantly, small compositional search spaces. Here we show that the GNoME potentials are highly robust in this out-of-distribution, zero-shot setting and generalize to high temperatures, which allows them to serve as a tool for high-throughput discovery of novel solid-state electrolytes. We use GNoME potentials pretrained on datasets of increasing size in molecular-dynamics simulations on 623 never-before-seen compositions. Figure 3a shows the ability of the pretrained GNoME potentials to classify unseen compositions as superionic conductors in comparison with AIMD.

When scaled to the GNoME dataset—much larger than existing approaches—we find that deep learning unlocks previously impossible capabilities for building transferable interatomic potentials for inorganic bulk crystals and allows for high-accuracy, zero-shot prediction of materials properties at scale.

We show that GNNs trained on a large and diverse set of first-principles calculations can enable the efficient discovery of inorganic materials, increasing the number of stable crystals by more than an order of magnitude. Associated datasets empower machine-learned interatomic potentials, giving accurate and robust molecular-dynamics simulations out of the box on unseen bulk materials. Our findings raise interesting questions about the capabilities of deep-learning systems in the natural sciences: the application of machine-learning methods for scientific discovery has traditionally suffered from the fundamental challenge that learning algorithms work under the assumption of identically distributed data at train and test times, but discovery is inherently an out-of-distribution effort. Our results on large-scale learning provide a potential step to move past this dilemma, by demonstrating that GNoME models exhibit emergent out-of-distribution capabilities at scale. This includes discovery in unseen chemical spaces (for example, with more than four different elements), as well as on new downstream tasks (for example, predicting kinetic properties).

GNoME models have already found 2.2 million stable crystals with respect to previous work and enabled previously impossible modelling capabilities for materials scientists. Some open problems remain for the transition of findings in applications, including a greater understanding of phase transitions through competing polymorphs, dynamic stability arising from vibrational profiles and configurational entropies and, ultimately, synthesizability. Nevertheless, we see pretrained, general-purpose GNoME models being used as powerful tools across a diverse range of applications to fundamentally accelerate materials discovery.

Datasets and candidate generation

Snapshots of available datasets

GNoME discoveries aim to extend the catalogues of known stable crystals. In particular, we build off previous work by the Materials Project 16 , the OQMD 17 , Wang, Botti and Marques (WBM) 27 and the ICSD 15 . For reproducibility, GNoME-based discoveries use snapshots of the two datasets saved at a fixed point in time. We use the data from the Materials Project as of March 2021 and the OQMD as of June 2021. These structures are used as the basis for all discovery including via SAPS, yielding the catalogue of stable crystals as a result of GNoME. Further updates and incorporation of discoveries by these two groups could yield an even greater number of crystal discoveries.

For a revised comparison, another snapshot of the Materials Project, the OQMD and WBM was taken in July 2023. Approximately 216,000 DFT calculations were performed at consistent settings and used to compare the rate of GNoME discoveries versus the rate of discoveries by concurrent research efforts. From 2021 to 2023, the number of stable crystals external to GNoME expanded from 35,000 to 48,000, relatively small in comparison with the 381,000 new stable crystal structures available on the convex hull presented in this paper.

Substitution patterns

Structural substitution patterns are based on data-mined probabilities from ref. 22. That work introduced a probabilistic model for assessing the likelihood for ionic species substitution within a single crystal structure. In particular, the probability of substitution is calculated as a binary feature model such that \(p(X, X') \approx \frac{\exp \sum_{i} \lambda_{i} f_{i}^{(n)}(X, X')}{Z}\), in which \(X\) and \(X'\) are n-component vectors of n different ions. The model is simplified so that \(f_i\) is 0 or 1 if a specific substitution pair occurs and \(\lambda_i\) provides a weighting for the likelihood of a given substitution. The resulting probabilities have been helpful, for example, in discovering new quaternary ionic compounds with limited computation budgets.
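As a rough illustration of the binary feature model, the following Python sketch evaluates the unnormalized score exp(Σ λ_i f_i) for a candidate substitution and normalizes it over an explicit candidate list. The λ values, the ion labels and the choice of normalizing over a supplied list (standing in for Z) are all assumptions made for this example; the actual weights are data-mined in ref. 22.

```python
import math

# Hypothetical pair weights (lambda_i); real values are data-mined in ref. 22.
LAMBDAS = {
    frozenset(["Na+", "K+"]): 1.2,
    frozenset(["Mg2+", "Ca2+"]): 0.8,
    frozenset(["O2-", "S2-"]): 0.5,
}


def score(x, x_prime, lambdas=LAMBDAS):
    """Unnormalized exp(sum_i lambda_i * f_i) for two equal-length ion vectors."""
    total = 0.0
    for a, b in zip(x, x_prime):
        # f_i is 1 when this substitution pair occurs and 0 otherwise;
        # pairs without a weight (including unchanged ions) contribute 0.
        total += lambdas.get(frozenset([a, b]), 0.0)
    return math.exp(total)


def probabilities(x, candidates, lambdas=LAMBDAS):
    """Normalize scores over an explicit candidate list (assumed stand-in for Z)."""
    scores = [score(x, c, lambdas) for c in candidates]
    z = sum(scores)
    return [s / z for s in scores]


x = ("Na+", "Mg2+", "O2-")
candidates = [
    ("K+", "Mg2+", "O2-"),   # one substitution, weight 1.2
    ("K+", "Ca2+", "O2-"),   # two substitutions, weights 1.2 + 0.8
    ("Na+", "Mg2+", "S2-"),  # one substitution, weight 0.5
]
probs = probabilities(x, candidates)
```

With these made-up weights, the double substitution receives the highest probability because its features sum to the largest exponent.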

In our work, we adjust the probabilistic model so as to increase the number of candidates and give priority to discovery. In particular, the conditional probability computation in the original substitution patterns prefers examples that are more likely to be found in the original dataset. For example, any uncommon element is assigned a smaller probability in the original model. To give priority to novel discovery and move further away from the known sets of stable crystals, we modify the implementation so that probabilities are only computed when two compositions differ. This minor modification has substantial benefits across our pipeline, especially when scaling up to six unique elements.

We also introduce changes to the model parameters to promote novel discovery. In the original probabilistic model, positive lambda values correspond to more likely substitutions, whereas ‘unseen’ or uncommon substitutions result in negative lambda values. We increase the number of generations by setting the minimum value of any substitution pair to be 0. We then threshold high-probability substitutions to a value of 0.001, enabling efficient exploration in composition space through branch-and-bound algorithms available from pymatgen. Overall, these settings allow for many one-ion or two-ion substitutions to be considered by the graph networks that otherwise would not have been considered. We find this to be a good intermediate between the original model and using all possible ionic substitutions, in which we encounter combinatorial blow-ups in the number of candidates.
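The two parameter changes can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's code: the weight table is invented, and reading the 0.001 threshold as "keep candidates whose probability is at least 0.001" is an assumption about how the cut is applied.

```python
def floor_lambdas(lambdas, minimum=0.0):
    """Raise every pair weight to at least `minimum` (0 in the text)."""
    return {pair: max(value, minimum) for pair, value in lambdas.items()}


def keep_above_threshold(candidate_probs, threshold=1e-3):
    """Keep candidates whose probability reaches the threshold (assumed reading)."""
    return {c: p for c, p in candidate_probs.items() if p >= threshold}


# Hypothetical weights: a common pair and an 'unseen' pair with negative weight.
raw = {("Na+", "K+"): 1.2, ("Li+", "Cs+"): -3.0}
floored = floor_lambdas(raw)

# Hypothetical candidate probabilities after normalization.
kept = keep_above_threshold({"cand_a": 0.2, "cand_b": 0.0005, "cand_c": 0.001})
```

Flooring removes the penalty on rare pairs, so more candidates survive the probability cut and reach the graph networks.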

For the main part of this paper, substitutions are only allowed into compositions that do not match any available compositions in the Materials Project or in the OQMD, rather than comparing structures using heuristic structure matchers. This ensures that we introduce novel compositions in the dataset instead of similar structures that may be missed by structure matchers.

To further increase the diversity of generated structures, we introduce a framework that we refer to as symmetry aware partial substitutions (SAPS), which generalizes common substitution frameworks. For a motivating example, consider the case of (double) perovskites. Ionic substitutions on crystals of composition A₂B₂X₆ do not lead to discovering double perovskites A₂BB′O₆, although the two only differ by a partial replacement on the B site.

SAPS enable efficient discovery of such structures. Starting with an original composition, we obtain candidate ion replacements using the probabilities as defined in the ‘Substitution patterns’ section. We then obtain Wyckoff positions of the input structures by means of symmetry analysers available through pymatgen. We enable partial replacements from 1 to all atoms of the candidate ion, for which at each level we only consider unique symmetry groupings to control the combinatorial growth. Early experiments limited the partial substitutions to materials that would charge-balance after partial substitutions when considering common oxidation states; however, greater expansion of candidates was achieved by removing such charge-balancing from the later experiments. This partial-substitution framework enables greater use of common crystal structures while allowing for the discovery of new prototypical structures, as discussed in the main part of this paper. Candidates from SAPS are from a different distribution to the candidates from full substitutions, which increases the diversity of our discoveries and our dataset.
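One way to picture the enumeration is sketched below in Python. It assumes that the sites of the ion being replaced have already been grouped by Wyckoff position (for example, by a symmetry analyser) and treats each group as an indivisible unit, which is our reading of "unique symmetry groupings" rather than a confirmed detail. The labels and site indices are hypothetical.

```python
from itertools import combinations


def partial_substitutions(orbits):
    """Enumerate partial replacements, grouped by number of atoms replaced.

    `orbits` maps a Wyckoff label to the site indices of the candidate ion
    on that position. Each symmetry-equivalent group is replaced as a whole
    (an assumption made for this sketch).
    """
    levels = {}
    labels = sorted(orbits)
    for k in range(1, len(labels) + 1):
        for combo in combinations(labels, k):
            sites = frozenset(i for label in combo for i in orbits[label])
            # The 'level' is the number of atoms replaced by this choice.
            levels.setdefault(len(sites), []).append((combo, sites))
    return levels


# Hypothetical structure: the candidate ion occupies three Wyckoff positions.
orbits = {
    "4a": [0, 1, 2, 3],
    "4b": [4, 5, 6, 7],
    "8c": list(range(8, 16)),
}
levels = partial_substitutions(orbits)
```

Working with whole symmetry groups keeps the count at 2^k − 1 choices for k groups, instead of one choice per subset of individual atoms.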

To validate the impact of the SAPS, we traced reference structures from substitutions of all 381,000 novel stable structures back to a structure in the Materials Project or the OQMD by means of a topological sort (necessary as discovered materials were recycled for candidate generation). A total of 232,477 out of the 381,000 stable structures can be attributed to a SAPS substitution, suggesting notable benefit from this diverse candidate-generation procedure.

Oxidation-state relaxations

For the compositional pipeline, inputs for evaluation by machine-learning models must be unique stoichiometric ratios between elements. Enumerating the combinatorial number of reduced formulas was found to be too inefficient, but common strategies to reduce this number, such as oxidation-state balancing, were also too restrictive, for example, not allowing for the discovery of Li₁₅Si₄. In this paper, we introduce a relaxed constraint on oxidation-state balancing. We start with the common oxidation states from the Semiconducting Materials by Analogy and Chemical Theory (SMACT) 57 , with the inclusion of 0 for metallic forms. We allow for up to two elements to exist between two ordered oxidation states. Although this is a heuristic approach, it substantially improves the flexibility of composition generation around oxidation-state-balanced ratios.
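The relaxed constraint admits more than one reading. The Python sketch below implements one plausible interpretation, stated here as an assumption: each element takes either one of its listed oxidation states or, for at most two elements, any value between two adjacent listed states, and a composition passes if zero net charge is reachable. The small oxidation-state table is illustrative and is not taken from SMACT.

```python
from itertools import product

# Hypothetical, abbreviated oxidation-state table (0 included for metallic Li).
STATES = {"Li": [0, 1], "Si": [-4, 4], "O": [-2]}


def options(states):
    """Fixed states (lo == hi) plus intervals between adjacent ordered states."""
    ordered = sorted(states)
    fixed = [(s, s, False) for s in ordered]
    ranges = [(lo, hi, True) for lo, hi in zip(ordered, ordered[1:])]
    return fixed + ranges


def is_balanced(composition, max_relaxed=2, table=STATES):
    """True if zero net charge is reachable with at most `max_relaxed` intervals."""
    elements = list(composition)
    for choice in product(*(options(table[e]) for e in elements)):
        if sum(relaxed for _, _, relaxed in choice) > max_relaxed:
            continue
        lo = sum(composition[e] * c[0] for e, c in zip(elements, choice))
        hi = sum(composition[e] * c[1] for e, c in zip(elements, choice))
        if lo <= 0 <= hi:
            return True
    return False


strict = is_balanced({"Li": 15, "Si": 4}, max_relaxed=0)
relaxed = is_balanced({"Li": 15, "Si": 4})
```

Under this reading, Li₁₅Si₄ fails strict balancing (no integer assignment sums to zero) but passes once Si may sit between −4 and +4.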

AIRSS structure generation

Random structures are generated through AIRSS when needed for composition models 26 . Random structures are initialized as ‘sensible’ structures (obeying certain symmetry requirements) to a target volume and then relaxed through soft-sphere potentials. A substantial number of initializations and relaxations are needed to discover new materials, as different initial structures lead to different minima on the structure–energy landscape. For this paper, we always generate 100 AIRSS structures for every composition that is otherwise predicted to be within 50 meV of stable through composition-only model prediction.

As we describe in Supplementary Note  5 , not all DFT relaxations converge for the 100 initializations per composition. In fact, for certain compositions, only a few initializations converge. One of the main difficulties arises from not knowing a good initial volume guess for the composition. We try a range of initial volumes ranging from 0.4 to 1.2 times a volume estimated by considering relevant atomic radii, finding that the DFT relaxation fails or does not converge for the whole range for each composition. Prospective analysis was not able to uncover why most AIRSS initializations fail for certain compositions, and future work is needed in this direction.

Model training and evaluation

Graph networks

For structural models, edges are drawn in the graph when two atoms are closer than an interatomic distance cutoff (4.0 Å for structural models, 5.0 Å for interatomic potentials). Compositional models default to forming edges between all pairs of nodes in the graph. The models update latent node features through stages of message passing, in which neighbour information is collected through normalized sums over edges and representations are updated through shallow MLPs 36 . After several steps of message passing, a linear readout layer is applied to the global state to compute a prediction of the energy.

Training structural and composition models

Following Roost (representation learning from stoichiometry) 58 , we find GNNs to be effective at predicting the formation energy of a composition and structure.

For the structural models, the input is a crystal definition, which encodes the lattice, structure and atom definitions. Each atom is represented as a single node in the graph. Edges are defined when the interatomic distance is less than a user-defined threshold. Nodes are embedded by atom type and edges are embedded on the basis of the interatomic distance. We also include a global feature that is connected in the graph representation to all nodes. At every step of the GNN, neighbouring nodes and edge features are aggregated and used to update the corresponding representations of nodes, edges or globals individually. After 3–6 layers of message passing, an output layer projects the global vector to get an estimate of the energy. All data for training are shifted and scaled to approximately standardize the datasets. This structural model trained on the Materials Project data obtains state-of-the-art results of a mean absolute error of 21 meV atom⁻¹. Training during the active-learning procedure leads to a model with a final mean absolute error of 11 meV atom⁻¹. Training for structural models is performed with 1,000 epochs, with a learning rate of 5.55 × 10⁻⁴ and a linear decay learning rate schedule. By default, we train with a batch size of 256 and use swish nonlinearities in the MLP. To embed the edges, we use a Gaussian featurizer. The embedding dimension for all nodes and edges is 256 and, unless otherwise stated, the number of message-passing iterations is 3.
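The graph construction described above can be sketched in plain Python. This is a simplified illustration only: periodic images are ignored, and the number of Gaussian centres and their width are hypothetical values, as the text does not specify them.

```python
import math


def build_edges(positions, cutoff=4.0):
    """Directed edges (sender, receiver, distance) for atom pairs within cutoff.

    Periodic images are ignored here for brevity; a crystal graph would
    also need neighbours across cell boundaries.
    """
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j:
                d = math.dist(p, q)
                if d < cutoff:
                    edges.append((i, j, d))
    return edges


def gaussian_features(distance, cutoff=4.0, num_centers=8, width=0.5):
    """Expand a distance on evenly spaced Gaussians (centres/width are assumed)."""
    centers = [cutoff * k / (num_centers - 1) for k in range(num_centers)]
    return [math.exp(-((distance - c) ** 2) / (2 * width ** 2)) for c in centers]


# Three atoms on a line (coordinates in Å); the third is beyond the cutoff.
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
edges = build_edges(positions)
features = gaussian_features(edges[0][2])
```

The Gaussian expansion turns a single distance into a smooth vector, which gives the edge embedding a richer input than the raw scalar.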

For the compositional models, the input composition to the GNN is encoded as a set of nodes, for which each element type in the composition is represented by a node. The ratio of the specific element is multiplied with the one-hot vector. For example, SiO₂ would be represented with two nodes, in which one node feature is a vector of zeros and a 1/3 on the 14th row to represent silicon and the other node is a vector of zeros with a 2/3 on the 8th row to represent oxygen. Although this simplified GNN architecture is able to achieve state-of-the-art generalization on the Materials Project (MAE of 60 meV atom⁻¹ (ref. 25)), it does not offer useful predictions for materials discovery, which was also observed by Bartel et al. 25 . One of the issues with compositional models is that they assume that the training label refers to the ground-state phase of a composition, which is not guaranteed for any dataset. Thus, the formation-energy labels in the training and test sets are inherently noisy, and reducing the test error does not necessarily imply that one is learning a better formation-energy predictor. To explore this, we created our own training set of compositional energies, by running AIRSS simulations on novel compositions. As described in Supplementary Note 5, we find that compositions for which there are only a few completed AIRSS runs tend to have large formation energies, often larger than predicted by the compositional GNN. We find that, if we limit ourselves to compositions for which at least ten AIRSS runs are completed, then the compositional GNN error is reduced to 40 meV atom⁻¹. We then use the GNN trained on such a dataset (for which labels come from the minimum formation energy phase for compositions with at least ten completed AIRSS runs and ignoring the Materials Project data) and are able to increase the precision of stable prediction to 33%.
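The SiO₂ example can be reproduced with a short Python sketch. The element lookup is abbreviated and the vector length of 94 is an assumption, as the text does not state the dimension used by the compositional model.

```python
# Abbreviated atomic-number lookup for the example.
ATOMIC_NUMBER = {"H": 1, "Li": 3, "O": 8, "Si": 14}
NUM_ELEMENTS = 94  # assumed vector length; not specified for this model


def encode_composition(composition):
    """One node per element: a one-hot vector scaled by the element's fraction."""
    total = sum(composition.values())
    nodes = []
    for element, count in composition.items():
        vec = [0.0] * NUM_ELEMENTS
        # The 14th row corresponds to index 13, so subtract 1 from Z.
        vec[ATOMIC_NUMBER[element] - 1] = count / total
        nodes.append(vec)
    return nodes


nodes = encode_composition({"Si": 1, "O": 2})
```

Here `nodes[0]` carries 1/3 on the 14th row for silicon and `nodes[1]` carries 2/3 on the 8th row for oxygen, matching the description in the text.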

Model-based evaluation

Discovering new datasets aided by neural networks requires a careful balance between ensuring that the neural networks trained on the dataset are stable and promoting new discoveries. New structures and prototypes will be inherently out of distribution for models; however, we hope that the models are still capable of extrapolating and yielding reasonable predictions. This out-of-distribution problem is further exacerbated by the implicit domain shift, in which models are trained on relaxed structures but evaluated on substitutions before relaxation. To counteract these effects, we make several adjustments to stabilize test-time predictions.

Test-time augmentations

Augmentations at test time are a common strategy for correcting instabilities in machine-learning predictions. Specific to structural models, we especially consider isotropic scaling of the lattice vectors, which both shrinks and stretches bonds. At 20 values ranging from 80% to 120% of the reference lattice scaling volume, we aggregate by means of minimum reduction. This has the added benefit of potentially correcting for predicting on nonrelaxed structures, as isotropic scaling may yield a more appropriate final structure.
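A minimal Python sketch of this augmentation follows. It assumes that the 80–120% range refers to cell volume, so each lattice vector is scaled by the cube root of the volume fraction; `predict_energy` stands in for a trained structural model and `toy_energy` is a made-up surrogate used only to make the example runnable.

```python
def scaled_volume_fractions(n=20, low=0.8, high=1.2):
    """Evenly spaced volume fractions between `low` and `high`."""
    step = (high - low) / (n - 1)
    return [low + k * step for k in range(n)]


def predict_with_tta(predict_energy, lattice, n=20):
    """Minimum predicted energy over isotropically scaled copies of the lattice."""
    energies = []
    for fraction in scaled_volume_fractions(n):
        s = fraction ** (1.0 / 3.0)  # volume scales with s**3
        scaled = [[s * x for x in row] for row in lattice]
        energies.append(predict_energy(scaled))
    return min(energies)  # minimum reduction


def toy_energy(lattice):
    """Made-up surrogate: energy is lowest at 90% of the reference volume."""
    volume = lattice[0][0] * lattice[1][1] * lattice[2][2]
    return (volume - 0.9) ** 2 - 1.0


cubic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
best = predict_with_tta(toy_energy, cubic)
```

Taking the minimum rather than the mean favours the scaled copy closest to the relaxed geometry, which is the intended correction for unrelaxed inputs.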

Deep ensembles and uncertainty quantification

Although neural network models offer flexibility that allows them to achieve state-of-the-art performance on a wide range of problems, they may not generalize to data outside the training distribution. Using an ensemble of models is a simple, popular choice for providing predictive uncertainty and improving generalization of machine-learning predictions 33 . This technique simply requires training n models rather than one. The prediction corresponds to the mean over the outputs of all n models; the uncertainty can be measured by the spread of the n outputs. In our application of training machine-learning models for stability prediction, we use n  = 10 graph networks. Moreover, owing to the instability of graph-network predictions, we find the median to be a more reliable predictor of performance and use the interquartile range to bound uncertainty.
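The aggregation step is simple enough to sketch with the Python standard library. The ten prediction values below are invented for illustration, and the "inclusive" quartile method is an assumption, as the text does not state which convention is used.

```python
import statistics


def ensemble_summary(predictions):
    """Median prediction and interquartile range across ensemble members."""
    median = statistics.median(predictions)
    q1, _, q3 = statistics.quantiles(predictions, n=4, method="inclusive")
    return median, q3 - q1


# Hypothetical outputs (eV per atom) from n = 10 models, with one unstable member.
preds = [-1.02, -1.00, -0.99, -1.01, -0.98, -1.03, -1.00, -0.97, -1.01, 0.50]
median, iqr = ensemble_summary(preds)
mean = statistics.mean(preds)
```

A single unstable member shifts the mean noticeably but leaves the median and interquartile range almost unchanged, which is the motivation given for preferring them.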

Model-based filtration

We use test-time augmentation and deep-ensemble approaches discussed above to filter candidate materials based on energy. Materials are then compared with the available GNoME database to estimate the decomposition energy. Note that the structures provided for model-based filtration are unlikely to be completely related, so a threshold of 50 meV atom⁻¹ was used for active learning to improve the recall of stable crystal discovery.

Clustering-based reduction

For active-learning setups, only the structure predicted to have the minimum energy within a composition is used for DFT verification. However, for an in-depth evaluation of a specific composition family of interest, we design clustering-based reduction strategies. In particular, we take the top 100 structures for any given composition and perform pairwise comparisons with pymatgen’s built-in structure matcher. We cluster the connected components on the graph of pairwise similarities and take the minimum energy structure as the cluster representation. This provides a scalable strategy for discovering polymorphs when applicable.
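The reduction can be sketched as a connected-components computation in Python. The `is_match` callable is a placeholder for a pairwise structure comparison (in practice it could wrap a structure matcher's fit test); the energies and matching pairs below are invented.

```python
def cluster_and_reduce(energies, is_match):
    """Indices of the minimum-energy structure in each connected component.

    `is_match(i, j)` is a hypothetical predicate reporting whether
    structures i and j are considered the same by a structure matcher.
    """
    n = len(energies)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if is_match(i, j):
                parent[find(i)] = find(j)

    best = {}
    for i in range(n):
        root = find(i)
        if root not in best or energies[i] < energies[best[root]]:
            best[root] = i
    return sorted(best.values())


# Five candidate structures; 0-1-2 are chained matches, 3-4 match each other.
energies = [-1.0, -1.2, -0.8, -1.1, -0.9]
pairs = {(0, 1), (1, 2), (3, 4)}
representatives = cluster_and_reduce(energies, lambda i, j: (i, j) in pairs)
```

Using connected components means structures 0 and 2 share a cluster even though they were never matched directly, because both match structure 1.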

Active learning was performed in stages of generation and later evaluation of filtered materials through DFT. In the first stage, materials from the snapshots of the Materials Project and the OQMD are used to generate candidates with an initial model trained on the Materials Project data, with a mean absolute error of 21 meV atom⁻¹ in formation energy. Filtration and subsequent evaluation with DFT led to discovery rates between 3% and 10%, depending on the threshold used for discovery. After each round of active learning, new structural GNNs are trained to improve the predictive performance. Furthermore, stable crystal structures are added to the set of materials that can be substituted into, yielding a greater number of candidates to be filtered by the improved models. This procedure of retraining and evaluation was completed six times, yielding the total of 381,000 stable crystal discoveries. Continued exploration with active learning may continue to drive the number of stable crystals higher.

Composition-based hashing

Previous efforts to learn machine-learning models of energies often use a random split over different crystal structures to create the test set on which energy predictions are evaluated. However, as the GNoME dataset contains several crystal structures with the same composition, this metric is less trustworthy over GNoME. Having several structures within the same composition in both the training and the test sets markedly reduces test error, although the test error does not provide a measure of how well the model generalizes to new compositions. In this paper, we use a deterministic hash for the reduced formula of each composition and assign examples to the training (85%) and test (15%) sets. This ensures that there are no overlapping compositions in the training and test sets. We take a standard MD5 hash of the reduced formula, convert the hexadecimal output to an integer and take modulo 100 and threshold at 85.
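The split described above can be written in a few lines of Python. Whether the comparison at 85 is strict is not stated, so the `bucket < 85` rule below is an assumption; the formulas are arbitrary examples.

```python
import hashlib


def split_for(reduced_formula, train_percent=85):
    """Assign a composition to 'train' or 'test' from an MD5 hash of its formula."""
    digest = hashlib.md5(reduced_formula.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # hexadecimal digest -> integer -> modulo 100
    return "train" if bucket < train_percent else "test"


# Every structure with the same reduced formula lands in the same split.
assignments = {formula: split_for(formula) for formula in ["SiO2", "Li15Si4", "LiCoO2"]}
```

Because the assignment depends only on the reduced formula, it is reproducible across runs and machines and cannot place two polymorphs of one composition on opposite sides of the split.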

DFT evaluation

VASP calculations

We use VASP (refs. 34, 59) with the PBE 41 functional and PAW 40 , 60 potentials in all DFT calculations. Our DFT settings are consistent with the Materials Project workflows as encoded in pymatgen 23 and atomate 61 , including the Hubbard U parameter applied to a subset of transition metals in DFT+U, 520 eV plane-wave-basis cutoff, magnetization settings and the choice of PBE pseudopotentials, except for Li, Na, Mg, Ge and Ga. For Li, Na, Mg, Ge and Ga, we use more recent versions of the respective potentials with the same number of valence electrons. For all structures, we use the standard protocol of two-stage relaxation of all geometric degrees of freedom, followed by a final static calculation, along with the custodian package 23 to handle any VASP-related errors that arise and adjust appropriate simulations. For the choice of KPOINTS, we also force gamma-centred k-point generation for hexagonal cells rather than the more traditional Monkhorst–Pack. We assume ferromagnetic spin initialization with finite magnetic moments, as preliminary attempts to incorporate different spin orderings showed computational costs that were prohibitive to sustain at the scale presented. In AIMD simulations, we turn off spin polarization and use the NVT ensemble with a 2-fs time step.

Bandgap calculations

For validation purposes (such as the filtration of Li-ion conductors), bandgaps are calculated for most of the stable materials discovered. We automate bandgap jobs in our computation pipelines by first copying all outputs from static calculations and using the pymatgen-based MPNonSCFSet in line mode to compute the bandgap and density of states of all materials. A full analysis of patterns in bandgaps of the novel discoveries is a promising avenue for future work.

r²SCAN is an accurate and numerically efficient functional that has seen increasing adoption from the community for increasing the fidelity of computational DFT calculations. This functional is provided in the upgraded version of VASP6 and, for all corresponding calculations, we use the settings as detailed by MPScanRelaxSet and MPScanStaticSet in pymatgen. Notably, r²SCAN functionals require the use of PBE52 or PBE54 potentials, which can differ slightly from the PBE equivalents used elsewhere in this paper. To speed up computation, we perform three jobs for every SCAN-based computation. First, we precondition by means of the updated PBE54 potentials by running a standard relaxation job under MPRelaxSet settings. This preconditioning step greatly speeds up SCAN computations, which, on average, are five times slower and can otherwise crash on our infrastructure owing to elongated trajectories. Then, we relax with the r²SCAN functional, followed by a static computation.

Metrics and analysis methodology

Decomposition energies

To compute decomposition energies and count the total number of stable crystals relative to previous work 16 , 17 in a consistent fashion, we recalculated energies of all stable materials in the Materials Project and the OQMD with identical, updated DFT settings as enabled by pymatgen. Furthermore, to ensure fair comparison and that our discoveries are not affected by optimization failures in these high-throughput recalculations, we use the minimum energy of the Materials Project calculation and our recalculation when both are available.

Prototype analysis

We validate the novel discoveries using XtalFinder (ref. 39), using the compare_structures function available from the command line. This process was parallelized over 96 cores for improved performance. We also note that the symmetry calculations in the built-in library fail on fewer than ten of the stable materials discovered. We disable these filters but note that the low number of failures suggests minimal impact on the number of stable prototypes.

Families of interest

Layered materials

To count the number of layered materials, we use the methodology developed in ref.  45 , which is made available through the pymatgen.analysis.dimensionality package with a default tolerance of 0.45 Å.

Li-ion conductors

The estimated number of viable Li-ion conductors reported in the main part of this paper is derived using the methodology in ref.  46 in a high-throughput fashion. This methodology involves applying filters based on bandgaps and stabilities against the cathode Li-metal anode to identify the most viable Li-ion conductors.

Li/Mn transition-metal oxide family

The Li/Mn transition-metal oxide family is discussed in ref.  25 to analyse the capabilities of machine-learning models for use in discovery. In the main text, we compare against the findings in the cited work suggesting limited discovery within this family through previous machine-learning methods.

Definition of experimental match

In the main part of this paper, we refer to experimentally validated crystal structures with the ICSD. More specifically, we queried the ICSD in January 2023 after many of the crystal discoveries had been completed. We then extracted relevant journal (year) and chemical (structure) information from the provided files. By rounding to nearest integer formulas, we found 4,235 composition matches with materials discovered by GNoME. Of these, 4,180 are successfully parsed for structure. Then, we turn to the structural information provided by the ICSD. We used the CIF parser module of pymatgen to load the experimental ICSD structures into pymatgen and then compared those to the GNoME dataset using its structure matcher module. For both modules, we tried using the default settings as well as more tolerant settings that improve structure parsing and matching (higher occupancy tolerance in CIF parsing to fix cases with >1.0 total occupancy and allowing supercell and subset comparison in matching). The latter resulted in a slight increase (about 100) in the number of matched structures with respect to the default settings. Given that we are enforcing a strict compositional match, our matching process is still relatively conservative and is likely to yield a lower bound. Overall, we found 736 matches, providing experimental confirmation for the GNoME structures. 184 of these structures correspond to novel discoveries since the start of the project.

Methods for creating figures of GNoME model scaling

Figures 1e and 3a,b show how the generalization abilities of GNoME models scale with training set size. In Fig. 1e, the training sets are sampled uniformly from the materials from the Materials Project and from our structural pipeline, which only includes elemental and partial substitutions into stable materials in the Materials Project and the OQMD. The training labels are the final formation energy at the end of relaxation. The test set is constructed by running AIRSS on 10,000 random compositions filtered by SMACT. Test labels are the final formation energy at the end of the AIRSS relaxation, for crystals for which AIRSS and DFT (both electronically and ionically) converged. Because we apply the same composition-based hash filtering (see ‘Composition-based hashing’ section) on all of our datasets, there is no risk of label leakage between the training set from the structural pipeline and the test set from AIRSS.

In Fig. 3a, we present the classification error for predicting the outcome of DFT-based molecular dynamics using GNN molecular dynamics. ‘GNoME: unique structures’ refers to the first step in the relaxation of crystals in the structural pipeline. We train on the forces on each atom on the first DFT step of relaxation. The different training subsets are created by randomly sampling compositions in the structural pipeline uniformly. ‘GNoME: intermediate structures’ includes all the same compositions as ‘GNoME: unique structures’, but has all steps of DFT relaxation instead of just the first step. The red diamond refers to the same GNN interatomic potential trained on the data from M3GNet, which includes three relaxation steps per composition (first, middle and last), as described in the M3GNet paper (ref. 62).

Coding frameworks

For machine learning, GNoME models make use of JAX and its capability to just-in-time compile programs onto devices such as graphics processing units (GPUs) and tensor processing units (TPUs). Graph network implementations are based on the framework developed in Jraph, which makes use of a fundamental GraphsTuple object (encoding nodes and edges, along with sender and receiver information for message-passing steps). We also make extensive use of functionality written in JAX MD for processing crystal structures (ref. 63), as well as TensorFlow for parallelized data input (ref. 64).
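The role of sender and receiver indices in a message-passing step can be shown with a minimal pure-Python sketch. `SimpleGraph` is a hypothetical stand-in written for this example; it is not Jraph's GraphsTuple and it carries only scalar features, but it uses the same idea of storing each directed edge as a (sender, receiver) pair.

```python
from typing import List, NamedTuple


class SimpleGraph(NamedTuple):
    """Stand-in graph container: node features plus directed edges."""
    nodes: List[float]      # one scalar feature per node (atom)
    edges: List[float]      # one scalar feature per edge (for example, a weight)
    senders: List[int]      # index of the source node of each edge
    receivers: List[int]    # index of the destination node of each edge


def message_passing_step(graph: SimpleGraph) -> SimpleGraph:
    """Sum edge-weighted sender features into each receiver and add them to the node."""
    aggregated = [0.0] * len(graph.nodes)
    for edge, s, r in zip(graph.edges, graph.senders, graph.receivers):
        aggregated[r] += edge * graph.nodes[s]
    new_nodes = [h + m for h, m in zip(graph.nodes, aggregated)]
    return graph._replace(nodes=new_nodes)


# Three atoms in a chain, with each bond stored as two directed edges.
g = SimpleGraph(
    nodes=[1.0, 2.0, 3.0],
    edges=[0.5, 0.5, 1.0, 1.0],
    senders=[0, 1, 1, 2],
    receivers=[1, 0, 2, 1],
)
updated = message_passing_step(g)
```

In a learned model the multiplication and the sum would be replaced by trainable message and update functions, but the indexing pattern stays the same.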

Large-scale generation, evaluation and summarization pipelines make use of Apache Beam to distribute processing across a large number of workers and scale to the sizes described in the main part of this paper (see ‘Overview of generation and filtration’ section). For example, billions of proposal structures, even efficiently encoded, require terabytes of storage, a volume that would fail on a single node.

Crystal visualizations are created using tooling from VESTA (ref. 65).

Pretrained GNoME potential

We train a NequIP potential (ref. 30), implemented in JAX using the e3nn-jax library (ref. 66), with five layers, hidden features of 128 ℓ = 0 scalars, 64 ℓ = 1 vectors and 32 ℓ = 2 tensors (even irreducible representations only, 128x0e + 64x1e + 32x2e), as well as an edge irreducible representation of 0e + 1e + 2e. We use a radial cutoff of 5 Å and embed interatomic distances \(r_{ij}\) in a basis of eight Bessel functions, which is multiplied by the XPLOR cutoff function, as defined in HOOMD-blue (ref. 67), using an inner cutoff of 4.5 Å. We use a radial MLP \(R(r)\) with two hidden layers of 64 neurons and a SiLU nonlinearity. We also use SiLU for the gated, equivariant nonlinearities (ref. 68). We embed the chemical species using a 94-element one-hot encoding and use a self-connection, as proposed in ref. 30. For internal normalization, we divide by 26 after each convolution. Models are trained with the Adam optimizer using a learning rate of \(2\times 10^{-3}\) and a batch size of 32. High-energy structures at the beginning of a trajectory are expected to be more diverse than later, low-energy structures, which are similar to one another and often come with small forces. Each batch is therefore made up of 16 structures sampled from the full set of all frames across all relaxations and 16 structures sampled only from the first step of each relaxation. We found this oversampling of first-step structures to substantially improve performance on downstream tasks. The learning rate was decreased to \(2\times 10^{-4}\) after approximately 23 million steps and to \(5\times 10^{-5}\) after a further approximately 11 million steps; training then continued for a final 2.43 million steps. Training was performed on four TPU v3 chips.
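The training schedule and batch composition described above can be summarized in a small sketch. It is a simplified illustration, not the training code: the function names are hypothetical, the step boundaries use the approximate values quoted in the text, and sampling without replacement within each half of the batch is an assumption.

```python
import random


def learning_rate(step):
    """Piecewise-constant schedule using the approximate boundaries quoted above."""
    first_drop = 23_000_000                 # approximately 23 million steps
    second_drop = first_drop + 11_000_000   # a further approximately 11 million steps
    if step < first_drop:
        return 2e-3
    if step < second_drop:
        return 2e-4
    return 5e-5


def sample_batch(all_frames, first_frames, rng, per_group=16):
    """Draw half of a batch from all relaxation frames and half from first steps only."""
    return rng.sample(all_frames, per_group) + rng.sample(first_frames, per_group)


# Toy data: 100 relaxations of 10 frames each, labelled (relaxation_id, step).
all_frames = [(r, s) for r in range(100) for s in range(10)]
first_frames = [frame for frame in all_frames if frame[1] == 0]
batch = sample_batch(all_frames, first_frames, random.Random(0))
```

In this toy dataset first-step frames make up 10% of all frames but at least half of every batch, which is the oversampling effect the paragraph describes.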

We train on formation energies instead of total energies. Formation energies and forces are not normalized for training; instead, we predict the energy as a sum over scaled and shifted atomic energies, such that \(\widehat{E}={\sum }_{i\in {N}_{{\rm{atoms}}}}\left({\widehat{{\epsilon }}}_{i}\sigma +\mu \right)\), in which \({\widehat{{\epsilon }}}_{i}\) is the final, scalar node feature on atom \(i\) and \(\sigma\) and \(\mu\) are the standard deviation and mean of the per-atom energy computed over a single pass of the full dataset. The network was trained on a joint loss function consisting of a weighted sum of a Huber loss on energies and forces:

in which \(N_{\rm{a}}\) and \(N_{\rm{b}}\) denote the number of atoms in a structure and the number of samples in a batch, respectively, \({\widehat{E}}_{{\rm{b}}}\) and \(E_{\rm{b}}\) are the predicted and true energy for a given sample in a batch, respectively, and \(F_{a,\alpha}\) is the true force component on atom \(a\), for which \(\alpha \in \{x, y, z\}\) is the spatial component. \({{\mathcal{L}}}_{{\rm{Huber}}}(\delta ,\widehat{a},a)\) denotes a Huber loss on quantity \(a\), for which we use \(\delta_E = \delta_F = 0.01\). The pretrained potential has 16.24 million parameters. Inference on an A100 GPU on a 50-atom system takes approximately 14 ms, enabling a throughput of approximately 12 ns per day at a 2-fs time step, making inference times highly competitive with other implementations of GNN interatomic potentials. Exploring new approaches with even further improved computational efficiency is the focus of future work.
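As a rough illustration of the quantities defined above, the sketch below computes an energy as a sum of scaled and shifted per-atom outputs and combines Huber terms on energies and forces. It should be read as one plausible form under stated assumptions, not as the loss used for the pretrained potential: the per-atom energy normalization and the averaging over atoms, force components and batch are assumptions of this sketch, the data layout is hypothetical, and the default weights of 1.0 and 0.05 are the values quoted later in the text for \(\lambda_E\) and \(\lambda_F\).

```python
def huber(delta, predicted, target):
    """Standard Huber loss: quadratic for small errors, linear beyond `delta`."""
    error = abs(predicted - target)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)


def predicted_energy(node_outputs, sigma, mu):
    """Sum of scaled and shifted per-atom outputs, E = sum_i (eps_i * sigma + mu)."""
    return sum(eps * sigma + mu for eps in node_outputs)


def joint_loss(batch, lambda_e=1.0, lambda_f=0.05, delta_e=0.01, delta_f=0.01):
    """Weighted sum of Huber terms on per-atom energies and force components.

    Each sample is a dict with keys 'e_pred', 'e_true' (floats) and
    'f_pred', 'f_true' (lists of [x, y, z] per atom). The normalization
    by atoms, components and batch size is an assumption of this sketch.
    """
    total = 0.0
    for sample in batch:
        n_atoms = len(sample["f_true"])
        energy_term = huber(delta_e, sample["e_pred"] / n_atoms, sample["e_true"] / n_atoms)
        force_term = sum(
            huber(delta_f, fp, ft)
            for atom_pred, atom_true in zip(sample["f_pred"], sample["f_true"])
            for fp, ft in zip(atom_pred, atom_true)
        ) / (3 * n_atoms)
        total += lambda_e * energy_term + lambda_f * force_term
    return total / len(batch)


# A two-atom sample whose prediction is exact, so every term vanishes.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
perfect = {"e_pred": -2.0, "e_true": -2.0, "f_pred": zeros, "f_true": zeros}
loss_value = joint_loss([perfect])
```

With a small \(\delta\) such as 0.01, most errors fall in the linear regime, so the loss behaves much like an absolute error and is less sensitive to the rare high-energy, high-force configurations than a squared error would be.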

Training on M3GNet data

To allow a fair comparison with the smaller M3GNet dataset used in ref. 62, a NequIP model was trained on the M3GNet dataset. We chose the hyperparameters in a way that balances accuracy and computational efficiency, resulting in a potential with efficient inference. We train in two setups, one splitting the training and testing sets based on unique materials and the other over all structures. In both cases, we found the NequIP potential to perform better than the M3GNet models trained with energies and forces (M3GNet-EF) reported in ref. 62. Given this improved performance, to enable a fair comparison of datasets and dataset sizes, we use the NequIP model trained on the structure-split M3GNet data in the scaling tests (the pretrained M3GNet model is used for zero-shot comparisons). We expect our scaling and zero-shot results to be applicable to a wide variety of modern deep-learning interatomic potentials.

The structural model used for downstream evaluation was trained using the Adam optimizer with a learning rate of \(2\times 10^{-3}\) and a batch size of 16 for a total of 801 epochs. The learning rate was decreased to \(2\times 10^{-4}\) after 601 epochs, after which we trained for another 200 epochs. We use the same joint loss function as in the GNoME pretraining, again with \(\lambda_E = 1.0\), \(\lambda_F = 0.05\) and \(\delta_E = \delta_F = 0.01\). The network hyperparameters are identical to the NequIP model used in GNoME pretraining. To enable a comparison with ref. 62, we also subtract a linear compositional fit based on the training energies from the reference energies before training. Training was performed on a set of four V100 GPUs.
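Subtracting a linear compositional fit can be sketched as an ordinary least-squares problem in which each element is assigned one reference energy. The code below is a minimal illustration and does not reproduce the exact procedure of ref. 62: the function names and the toy data are hypothetical, and the solver assumes the element-count matrix has full column rank.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_element_energies(compositions, energies, elements):
    """Least-squares fit E ~ sum_el n_el * e_el, solved through the normal equations."""
    rows = [[comp.get(el, 0) for el in elements] for comp in compositions]
    size = len(elements)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    atb = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(size)]
    return dict(zip(elements, solve_linear_system(ata, atb)))


def subtract_fit(composition, energy, element_energies):
    """Residual energy left after removing the per-element linear contribution."""
    return energy - sum(n * element_energies[el] for el, n in composition.items())


# Toy training set generated with e_Li = -2 and e_O = -5 (arbitrary units).
train_comps = [{"Li": 2, "O": 1}, {"Li": 1, "O": 1}, {"Li": 2, "O": 2}, {"O": 2}]
train_energies = [-9.0, -7.0, -14.0, -10.0]
fitted = fit_element_energies(train_comps, train_energies, ["Li", "O"])
residual = subtract_fit(train_comps[0], train_energies[0], fitted)
```

The model is then trained on the residuals, which are much smaller in magnitude than the raw energies; the fit must use training energies only so that no test information enters the reference.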

AIMD conductivity experiments

Following ref. 69, we classify a material as having superionic behaviour if the conductivity \(\sigma\) at a temperature of 1,000 K, as measured by AIMD, satisfies \(\sigma_{1,000\,\mathrm{K}} > 10^{1.18}\,\mathrm{mS\,cm^{-1}}\). Refer to the original paper for applicable calculations. See Supplementary Information for further details.

Robustness experiments

For the materials selected for testing the robustness of our models (As24Ca24Li24, Ba8Li16Se32Si8, K24Li16P24Sn8 and Li32S24Si4), a series of models is trained on increasing training set sizes sampled from the T = 400 K AIMD trajectory. We then evaluate these models on AIMD data sampled at both T = 400 K (to measure the effect of fine-tuning on data from the target distribution) and T = 1,000 K (to measure the robustness of the learned potentials). We trained two types of model: (1) a NequIP model from scratch and (2) a fine-tuned model that was pretrained on the GNoME dataset, starting from the checkpoint before the learning rate was reduced for the first time. The network architecture is identical to that used in pretraining. Because the AIMD data contain fewer high-force/high-energy configurations, we use an L2 loss in the joint loss function instead of a Huber loss, again with \(\lambda_E = 1.0\) and \(\lambda_F = 0.05\). For all training set sizes and all materials, we scan learning rates of \(1\times 10^{-2}\) and \(2\times 10^{-3}\) and batch sizes of 1 and 16. Models are trained for a maximum of 1,000 epochs. The learning rate is reduced by a factor of 0.8 if the test error on a hold-out set does not improve for 50 epochs. We choose the best of these hyperparameters based on the performance of the final checkpoint on the 400-K test set. The 400-K test set is created using the final part of the AIMD trajectory. The training sets are created by sampling varying training set sizes from the initial part of the AIMD trajectory. The out-of-distribution robustness test is generated from the AIMD trajectory at 1,000 K. Training is performed on a single V100 GPU.
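The learning-rate rule and the hyperparameter scan described above can be sketched as follows. This is an illustrative reading of the text rather than the training code: the class name `PlateauScheduler` is hypothetical, and resetting the patience counter after each reduction is an assumption.

```python
from itertools import product


class PlateauScheduler:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, learning_rate, factor=0.8, patience=50):
        self.learning_rate = learning_rate
        self.factor = factor
        self.patience = patience
        self.best_error = float("inf")
        self.stale_epochs = 0

    def step(self, holdout_error):
        """Record one epoch's hold-out error and return the learning rate to use next."""
        if holdout_error < self.best_error:
            self.best_error = holdout_error
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.learning_rate *= self.factor
                self.stale_epochs = 0  # assumption: restart the count after a reduction
        return self.learning_rate


# The scan covers every pairing of the two learning rates and two batch sizes.
grid = list(product((1e-2, 2e-3), (1, 16)))

scheduler = PlateauScheduler(learning_rate=2e-3)
scheduler.step(1.0)          # the first epoch sets the best error
for _ in range(50):          # 50 further epochs with no improvement
    current = scheduler.step(1.0)
```

Each of the four grid entries would be trained for up to 1,000 epochs with its own scheduler, and the configuration with the best final-checkpoint error on the 400-K test set would be kept.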

Molecular dynamics simulations

The materials for AIMD simulation are chosen on the basis of the following criteria: we select all materials in the GNoME database that are stable, contain one of the conducting species under consideration (Li, Mg, Ca, K, Na) and have a computationally predicted band gap >1 eV. The last criterion is chosen to exclude materials with notable electronic conductivity, a desirable criterion in the search for electrolytes. Materials are run in their pristine structure, that is, without vacancies or stuffing. The AIMD simulations were performed using VASP. The temperature is initialized at T = 300 K and ramped up to the target temperature over a time span of 5 ps using velocity rescaling. This is followed by a 45-ps simulation equilibration using a Nosé–Hoover thermostat in the NVT ensemble. Simulations are performed at a 2-fs time step.
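The three selection criteria amount to a simple filter over database records. The sketch below assumes a hypothetical record layout (keys `stable`, `elements` and `band_gap_ev`) and made-up example entries; it does not reflect the actual schema of the GNoME database.

```python
CONDUCTING_SPECIES = {"Li", "Mg", "Ca", "K", "Na"}


def is_aimd_candidate(material):
    """Apply the three selection criteria to one database record.

    `material` is a hypothetical record with keys 'stable' (bool),
    'elements' (iterable of element symbols) and 'band_gap_ev' (float).
    """
    return (
        material["stable"]
        and bool(CONDUCTING_SPECIES & set(material["elements"]))
        and material["band_gap_ev"] > 1.0
    )


# Made-up records used only to exercise each criterion.
records = [
    {"name": "A", "stable": True, "elements": ["Li", "P", "S"], "band_gap_ev": 2.4},
    {"name": "B", "stable": True, "elements": ["Li", "Cu"], "band_gap_ev": 0.0},
    {"name": "C", "stable": False, "elements": ["Na", "Cl"], "band_gap_ev": 5.0},
    {"name": "D", "stable": True, "elements": ["Si", "O"], "band_gap_ev": 6.0},
]
selected = [r["name"] for r in records if is_aimd_candidate(r)]
```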

Machine-learning-driven molecular dynamics simulations using JAX MD (ref. 63) are run on a subset of materials for which AIMD data were available and for which the composition was in the test set of the pretraining data (that is, previously unseen compositions), containing Li, Na, K, Mg and Ca as potentially conducting species. This results in 623 materials for which GNoME-driven molecular dynamics simulations are run. Simulations are performed at T = 1,000 K using a Nosé–Hoover thermostat, a temperature equilibration constant of 40 time steps, a 2-fs time step and a total simulation length of 50 ps. Molecular dynamics simulations are performed on a single P100 GPU.

For analysis of both the AIMD and the machine learning molecular dynamics simulations, the first 10 ps of the simulation are discarded for equilibration. From the final 40 ps, we compute the diffusivity using the DiffusionAnalyzer class of pymatgen with the default smoothed=max setting (refs. 23, 70, 71).
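The analysis window and the link between mean squared displacement (MSD) and diffusivity can be illustrated with a short sketch. It is not pymatgen's DiffusionAnalyzer and does not implement its smoothed=max option: it assumes that every time step is stored as a frame and fits a plain least-squares line to a synthetic MSD curve using the Einstein relation.

```python
def frames_to_discard(discard_ps=10.0, timestep_fs=2.0):
    """Number of leading MD frames covering the discarded equilibration window."""
    return round(discard_ps * 1000.0 / timestep_fs)


def diffusivity_from_msd(times_ps, msd_a2, dimensions=3):
    """Einstein relation: D = slope / (2 * dimensions), slope from a least-squares line.

    Times are in ps and mean squared displacements in A^2, so D is in A^2/ps.
    """
    n = len(times_ps)
    mean_t = sum(times_ps) / n
    mean_m = sum(msd_a2) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_ps, msd_a2))
    variance = sum((t - mean_t) ** 2 for t in times_ps)
    return covariance / variance / (2 * dimensions)


# Synthetic MSD growing as 6 * D * t with D = 0.5 A^2/ps over a 40-ps window.
times = [float(t) for t in range(41)]
msd = [6 * 0.5 * t for t in times]
d_estimate = diffusivity_from_msd(times, msd)
```

At a 2-fs time step, the 10-ps equilibration window corresponds to the first 5,000 steps of the 25,000-step, 50-ps trajectory.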

Data availability

Crystal structures corresponding to stable discoveries discussed throughout the paper will be made available at . In particular, we provide results for all stable structures, as well as any material that has been recomputed from previous datasets to ensure consistent settings. Associated data from the r2SCAN functional will be provided, which we expect to serve as a foundation for analysing discrepancies between functional choices. Data will also be available via the Materials Project at with permanent link: .

Code availability

Software to analyse stable crystals and associated phase diagrams, as well as the software implementation of the static GNN and the interatomic potentials, will be made available at .

References

1. Green, M. A., Ho-Baillie, A. & Snaith, H. J. The emergence of perovskite solar cells. Nat. Photon. 8, 506–514 (2014).

2. Mizushima, K., Jones, P., Wiseman, P. & Goodenough, J. B. LixCoO2 (0 < x ≤ 1): a new cathode material for batteries of high energy density. Mater. Res. Bull. 15, 783–789 (1980).

3. Bednorz, J. G. & Müller, K. A. Possible high Tc superconductivity in the Ba–La–Cu–O system. Z. Phys. B Condens. Matter 64, 189–193 (1986).

4. Ceder, G. et al. Identification of cathode materials for lithium batteries guided by first-principles calculations. Nature 392, 694–696 (1998).

5. Tabor, D. P. et al. Accelerating the discovery of materials for clean energy in the era of smart automation. Nat. Rev. Mater. 3, 5–20 (2018).

6. Liu, C. et al. Two-dimensional materials for next-generation computing technologies. Nat. Nanotechnol. 15, 545–557 (2020).

7. Nørskov, J. K., Bligaard, T., Rossmeisl, J. & Christensen, C. H. Towards the computational design of solid catalysts. Nat. Chem. 1, 37–46 (2009).

8. Greeley, J., Jaramillo, T. F., Bonde, J., Chorkendorff, I. & Nørskov, J. K. Computational high-throughput screening of electrocatalytic materials for hydrogen evolution. Nat. Mater. 5, 909–913 (2006).

9. Gómez-Bombarelli, R. et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat. Mater. 15, 1120–1127 (2016).

10. de Leon, N. P. et al. Materials challenges and opportunities for quantum computing hardware. Science 372, eabb2823 (2021).

11. Wedig, A. et al. Nanoscale cation motion in TaOx, HfOx and TiOx memristive systems. Nat. Nanotechnol. 11, 67–74 (2016).

12. Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).

13. Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR, 2021).

14. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

15. Hellenbrandt, M. The Inorganic Crystal Structure Database (ICSD)—present and future. Crystallogr. Rev. 10, 17–22 (2004).

16. Jain, A. et al. Commentary: The Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).

17. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the Open Quantum Materials Database (OQMD). JOM 65, 1501–1509 (2013).

18. Belsky, A., Hellenbrandt, M., Karen, V. L. & Luksch, P. New developments in the Inorganic Crystal Structure Database (ICSD): accessibility in support of materials research and design. Acta Crystallogr. B Struct. Sci. 58, 364–369 (2002).

19. Aykol, M., Montoya, J. H. & Hummelshøj, J. Rational solid-state synthesis routes for inorganic materials. J. Am. Chem. Soc. 143, 9244–9259 (2021).

20. Curtarolo, S. et al. AFLOWLIB.ORG: a distributed materials properties repository from high-throughput ab initio calculations. Comput. Mater. Sci. 58, 227–235 (2012).

21. Draxl, C. & Scheffler, M. The NOMAD laboratory: from data sharing to artificial intelligence. J. Phys. Mater. 2, 036001 (2019).

22. Hautier, G., Fischer, C., Ehrlacher, V., Jain, A. & Ceder, G. Data mined ionic substitutions for the discovery of new compounds. Inorg. Chem. 50, 656–663 (2011).

23. Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source Python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).

24. Aykol, M. et al. Network analysis of synthesizable materials discovery. Nat. Commun. 10, 2018 (2019).

25. Bartel, C. J. et al. A critical examination of compound stability predictions from machine-learned formation energies. npj Comput. Mater. 6, 97 (2020).

26. Pickard, C. J. & Needs, R. Ab initio random structure searching. J. Phys. Condens. Matter 23, 053201 (2011).

27. Wang, H.-C., Botti, S. & Marques, M. A. Predicting stable crystalline compounds using chemical similarity. npj Comput. Mater. 7, 12 (2021).

28. Hestness, J. et al. Deep learning scaling is predictable, empirically. Preprint at (2017).

29. Furness, J. W., Kaplan, A. D., Ning, J., Perdew, J. P. & Sun, J. Accurate and numerically efficient r2SCAN meta-generalized gradient approximation. J. Phys. Chem. Lett. 11, 8208–8215 (2020).

30. Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).

31. Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at (2018).

32. Togo, A. & Tanaka, I. Spglib: a software library for crystal symmetry search. Preprint at (2018).

33. Behler, J. Constructing high-dimensional neural network potentials: a tutorial review. Int. J. Quantum Chem. 115, 1032–1050 (2015).

34. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).

35. Battaglia, P. W. et al. Relational inductive biases, deep learning, and graph networks. Preprint at (2018).

36. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. Proc. Mach. Learn. Res. 70, 1263–1272 (2017).

37. Chen, C., Ye, W., Zuo, Y., Zheng, C. & Ong, S. P. Graph networks as a universal machine learning framework for molecules and crystals. Chem. Mater. 31, 3564–3572 (2019).

38. Kaplan, J. et al. Scaling laws for neural language models. Preprint at (2020).

39. Hicks, D. et al. AFLOW-XtalFinder: a reliable choice to identify crystalline prototypes. npj Comput. Mater. 7, 30 (2021).

40. Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).

41. Perdew, J. P., Ernzerhof, M. & Burke, K. Rationale for mixing exact exchange with density functional approximations. J. Chem. Phys. 105, 9982–9985 (1996).

42. Kitchaev, D. A. et al. Energetics of MnO2 polymorphs in density functional theory. Phys. Rev. B 93, 045132 (2016).

43. Kingsbury, R. et al. Performance comparison of r2SCAN and SCAN metaGGA density functionals for solid materials via an automated, high-throughput computational workflow. Phys. Rev. Mater. 6, 013801 (2022).

44. Bassman Oftelie, L. et al. Active learning for accelerated design of layered materials. npj Comput. Mater. 4, 74 (2018).

45. Cheon, G. et al. Data mining for new two- and one-dimensional weakly bonded solids and lattice-commensurate heterostructures. Nano Lett. 17, 1915–1923 (2017).

46. Sendek, A. D. et al. Holistic computational structure screening of more than 12000 candidates for solid lithium-ion conductor materials. Energy Environ. Sci. 10, 306–320 (2017).

47. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).

48. Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).

49. Lot, R., Pellegrini, F., Shaidu, Y. & Küçükbenli, E. PANNA: properties from artificial neural network architectures. Comput. Phys. Commun. 256, 107402 (2020).

50. Zhou, Y., Qiu, Y., Mishra, V. & Mar, A. Lost horses on the frontier: K2BiCl5 and K3Bi2Br9. J. Solid State Chem. 304, 122621 (2021).

51. Abudurusuli, A. et al. Li4MgGe2S7: the first alkali and alkaline-earth diamond-like infrared nonlinear optical material with exceptional large band gap. Angew. Chem. Int. Ed. 60, 24131–24136 (2021).

52. Ruan, B.-B., Yang, Q.-S., Zhou, M.-H., Chen, G.-F. & Ren, Z.-A. Superconductivity in a new T2-phase Mo5GeB2. J. Alloys Compd. 868, 159230 (2021).

53. Guo, Z. et al. Local distortions and metal–semiconductor–metal transition in quasi-one-dimensional nanowire compounds AV3Q3Oδ (A = K, Rb, Cs and Q = Se, Te). Chem. Mater. 33, 2611–2623 (2021).

54. Deng, A. et al. Novel narrow-band blue light-emitting phosphor of Eu2+-activated silicate used for WLEDs. Dalton Trans. 50, 16377–16385 (2021).

55. Zhak, O., Köhler, J., Karychort, O. & Babizhetskyy, V. New ternary phosphides RE5Pd9P7 (RE = Tm, Lu): synthesis, crystal and electronic structure. Z. Anorg. Allg. Chem. 648, e202200024 (2022).

56. Zuo, Y. et al. Performance and cost assessment of machine learning interatomic potentials. J. Phys. Chem. A 124, 731–745 (2020).

57. Davies, D. W. et al. SMACT: semiconducting materials by analogy and chemical theory. J. Open Source Softw. 4, 1361 (2019).

58. Goodall, R. E. & Lee, A. A. Predicting materials properties without crystal structure: deep representation learning from stoichiometry. Nat. Commun. 11, 6280 (2020).

59. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).

60. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).

61. Mathew, K. et al. atomate: a high-level interface to generate, execute, and analyze computational materials science workflows. Comput. Mater. Sci. 139, 140–152 (2017).

62. Chen, C. & Ong, S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2, 718–728 (2022).

63. Schoenholz, S. & Cubuk, E. D. JAX MD: a framework for differentiable physics. Adv. Neural Inf. Process. Syst. 33, 11428–11441 (2020).

64. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. (2015).

65. Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).

66. Geiger, M. & Smidt, T. e3nn: Euclidean neural networks. Preprint at (2022).

67. Anderson, J. A., Glaser, J. & Glotzer, S. C. HOOMD-blue: a Python package for high-performance molecular dynamics and hard particle Monte Carlo simulations. Comput. Mater. Sci. 173, 109363 (2020).

68. Hendrycks, D. & Gimpel, K. Gaussian Error Linear Units (GELUs). Preprint at (2016).

69. Jun, K. et al. Lithium superionic conductors with corner-sharing frameworks. Nat. Mater. 21, 924–931 (2022).

70. Ong, S. P. et al. Phase stability, electrochemical stability and ionic conductivity of the Li10±1MP2X12 (M = Ge, Si, Sn, Al or P, and X = O, S or Se) family of superionic conductors. Energy Environ. Sci. 6, 148–156 (2013).

71. Mo, Y., Ong, S. P. & Ceder, G. First principles study of the Li10GeP2S12 lithium super ionic conductor material. Chem. Mater. 24, 15–17 (2012).


We would like to acknowledge D. Eck, J. Sohl-Dickstein, J. Dean, J. Barral, J. Shlens, P. Kohli and Z. Ghahramani for sponsoring the project; L. Dorfman for product management support; A. Pierson for programme management support; O. Loum for help with computing resources; L. Metz for help with infrastructure; E. Ocampo for help with early work on the AIRSS pipeline; A. Sendek, B. Yildiz, C. Chen, C. Bartel, G. Ceder, J. Sun, J. P. Holt, K. Persson, L. Yang, M. Horton and M. Brenner for insightful discussions; and the Google DeepMind team for continuing support.

Author information

These authors contributed equally: Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Ekin Dogus Cubuk

Authors and Affiliations

Google DeepMind, Mountain View, CA, USA

Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol & Ekin Dogus Cubuk

Google Research, Mountain View, CA, USA

Gowoon Cheon



A.M. led the code development, experiments and analysis in most parts of the project, including the proposal of the data flywheel through active learning, candidate generation (for example, invention of SAPS), large-scale training and evaluation workflows, DFT calculations, convex-hull analysis and materials screening. S.B. led the code development, training and experiments of the force fields and the zero-shot evaluations, fine-tuning, robustness and the GNN molecular dynamics experiments, and contributed to overall code development, as well as training infrastructure. S.S.S. led the scaling of GNN training and JAX MD infrastructure and contributed to force-field experiments. M.A. contributed to data analyses, validation and benchmarking efforts, ran experiments and provided guidance. G.C. contributed to analysis, zero-shot evaluations and provided guidance. E.D.C. conceived and led the direction of the project, wrote software for data generation, model implementations and training, and led the scaling experiments. All authors contributed to discussion and writing.

Corresponding authors

Correspondence to Amil Merchant or Ekin Dogus Cubuk.

Ethics declarations

Competing interests.

Google LLC owns intellectual property rights related to this work, including, potentially, patent rights.

Peer review

Peer review information.

Nature thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

The supplementary information contains six sections, providing further context to the computational experiments performed.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .


About this article

Cite this article.

Merchant, A., Batzner, S., Schoenholz, S. S. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).

Received: 08 May 2023

Accepted: 10 October 2023

Published: 29 November 2023

Issue Date: 07 December 2023




Computer Science > Computer Vision and Pattern Recognition

Title: Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

Abstract: Character animation aims to generate character videos from still images through driving signals. Currently, diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities. However, challenges persist in image-to-video synthesis, especially in character animation, where maintaining temporal consistency with the detailed appearance of the character remains a formidable problem. In this paper, we leverage the power of diffusion models and propose a novel framework tailored for character animation. To preserve the consistency of intricate appearance features from the reference image, we design ReferenceNet to merge detail features via spatial attention. To ensure controllability and continuity, we introduce an efficient pose guider to direct the character's movements and employ an effective temporal modeling approach to ensure smooth transitions between video frames. By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods. Furthermore, we evaluate our method on benchmarks for fashion video and human dance synthesis, achieving state-of-the-art results.



Published on 6.12.2023 in Vol 25 (2023)

Online Forums as a Tool for Broader Inclusion of Voices on Health Care Communication Experiences and Serious Illness Care: Mixed Methods Study


Original Paper

  • Carine Davila 1,2, MPH, MD
  • Stephanie H Chan 3,4, MPH
  • Anna Gosline 3,4, MS
  • Zamawa Arenas 5, MSc
  • Jane Kavanagh 3,6, BA
  • Brian Feltz 5,7, MBA
  • Elizabeth McCarthy 5,8, BA
  • Tyrone Pitts 9, DMin
  • Christine Ritchie 1,2,10, MPH, MD

1 Division of Palliative Care and Geriatric Medicine, Massachusetts General Hospital, Boston, MA, United States

2 Department of Medicine, Harvard Medical School, Boston, MA, United States

3 Massachusetts Coalition for Serious Illness Care, Boston, MA, United States

4 Blue Cross Blue Shield of Massachusetts, Boston, MA, United States

5 Flowetik, Boston, MA, United States

6 Ariadne Labs, Boston, MA, United States

7 3D Research Partners LLC, Harvard, MA, United States

8 Elizabeth M McCarthy Consulting, Boston, MA, United States

9 The Coalition to Transform Advanced Care, Washington, DC, United States

10 Center for Aging in Serious Illness, Mongan Institute, Boston, MA, United States

Corresponding Author:

Carine Davila, MPH, MD

Division of Palliative Care and Geriatric Medicine

Massachusetts General Hospital

55 Fruit St

Boston, MA, 02114

United States

Phone: 1 617 724 9197

Email: [email protected]

Background: Existing health care research, including serious illness research, often underrepresents individuals from historically marginalized communities. Capturing the nuanced perspectives of individuals around their health care communication experiences is difficult. New research strategies are needed that increase engagement of individuals from diverse backgrounds.

Objective: The aim of this study was to develop a mixed methods approach with qualitative online forums to better understand health communication experiences of individuals, including people from groups historically marginalized such as Black and Latino individuals; older adults; and people with low income, disability, or serious illness.

Methods: We used a multiphase mixed methods, community-informed research approach to design study instruments and engage participants. We engaged a diverse group of collaborators with lived experience of navigating the health care system who provided feedback on instruments, added concepts for testing, and offered guidance on creating a safe experience for participants (phase 1). We conducted a national quantitative survey between April and May 2021 across intrapersonal, interpersonal, and systems-level domains, with particular focus on interpersonal communication between patients and clinicians (phase 2). We conducted two asynchronous, qualitative online forums, a technique used in market research, between June and August 2021, which allowed us to contextualize the learnings and test concepts and messages (phase 3). Using online forums allowed us to probe more deeply into results and hypotheses from the survey to better understand the “whys” and “whats” that surfaced and to test public messages to encourage action around health.

Results: We engaged 46 community partners, including patients and clinicians from a Federally Qualified Health Center, to inform study instrument design. In the quantitative survey, 1854 adults responded, including 50.5% women, 25.2% individuals over 65 years old, and 51.9% individuals with low income. Nearly two-thirds identified as non-Hispanic white (65.7%), 10.4% identified as non-Hispanic Black, and 15.5% identified as Hispanic/Latino. An additional 580 individuals participated in online forums, including 60.7% women, 17.4% individuals over 65 years old, and 49.0% individuals with low income. Among the participants, 70.3% identified as non-Hispanic white, 16.0% as non-Hispanic Black, and 9.5% as Hispanic/Latino. We received rich, diverse input from our online forum participants, and they highlighted satisfaction and increased knowledge with engagement in the forums.

Conclusions: We achieved modest overrepresentation of people who were over 65 years old, identified as non-Hispanic Black, and had low income in our online forums. The size of the online forums (N=580) reflected the voices of 93 Black and 55 Hispanic/Latino participants. Individuals who identify as Hispanic/Latino remained underrepresented, likely because the online forums were offered only in English. Overall, our findings demonstrate the feasibility of using the online forum qualitative approach in a mixed methods study to contextualize, clarify, and expound on quantitative findings when designing public health and clinical communications interventions.


The COVID-19 pandemic raised the US health care community’s acknowledgement of both the historic and current inequities in health care access, treatment, and outcomes. The pandemic therefore highlighted the need for increased engagement of diverse communities to increase the validity of research studies [ 1 - 3 ]. This is true for all of health care, and serious illness research is no exception [ 3 - 7 ]. Serious illness communication (SIC) describes conversations that occur between patients with serious illness and clinicians to understand the patient’s goals, values, preferences, and priorities so that health care can be aligned with those priorities [ 8 ]. SIC is a type of shared decision-making and part of the broad set of activities known as advance care planning (ACP). SIC represents an important tool for the creation of therapeutic alliance and has the potential to align goals and clinical decisions. Such conversations are enhanced when clinicians within care systems have the cultural skills, attitudes, behaviors, and interactional styles to promote effective SIC [ 9 , 10 ]. However, there are documented disparities by race, ethnicity, and income in SIC and ACP-related activities, including health care proxy completion rates, conversations with clinicians and family about wishes for care, and the kinds of language that worked best when it comes to encouraging these activities [ 11 - 18 ].

Earlier work from the Massachusetts Coalition for Serious Illness Care (MCSIC) focused on how to best promote SIC and ACP to the public, especially to historically marginalized communities most likely to experience poor health outcomes [ 19 ]. Some of the open-ended qualitative research found confusion and misunderstanding about the language used to describe the many activities that are collectively referred to as ACP. It also became clear that many individuals associate the entire field of SIC and ACP with the very end of life and death. No matter how items were phrased, people continuously assumed that the topic was related specifically to do-not-resuscitate orders, “pull the plug” decisions, and what has come to be referred to as “true” end-of-life planning, such as estate or funeral planning [ 20 ]. Our goal for this research was to better contextualize people’s beliefs and attitudes about serious illness, SIC, and ACP within the larger canvas of their overall health care experiences. Unlike our prior work, we were not aiming to directly encourage SIC or ACP, but instead to further understand how SIC and ACP might align with the challenges and needs as seen by patients and to establish an approach to generate new insights into how and when ACP and SIC should be introduced and encouraged. Toward this end, we sought new approaches to obtain broader and more nuanced input from historically marginalized communities and ask new questions that solicited a more holistic focus on the health care journey, rather than exclusively on the serious illness journey or end of life journey. Accordingly, our specific question was as follows: How do we engage a wide range of insights on these issues to ensure that the perspectives of a small number of individuals are not extrapolated to reflect entire communities?

Our research had four key aims. The first aim was to understand individuals’ experiences with the health care system that shape care expectations and attitudes toward the system, specifically with regard to medical decision-making, recognition of and support for social determinants of health, and trust in and respect by clinicians and health care systems. Second, we sought to understand the greatest perceived medical, social, and financial needs when it comes to improving serious illness care in the United States. Third, we wanted to understand how individuals perceive sample language that clinicians may use to engage and support individuals under their care and understand what authentically resonates with them. Finally, we wanted to obtain input on how best to contextualize and frame different public messages to encourage action around health, including SIC and ACP. We sought to explicitly understand these perspectives of individuals from historically marginalized communities, who are often underrepresented in research.

Here, we describe our overarching approach to optimize representation in the research, demonstrate the breadth of engagement in our quantitative survey and qualitative online forums, and highlight participants’ satisfaction with the online forum engagement tool.

We designed a mixed methods community-informed research study. The study took place in three phases from August 2020 to August 2021 (see Figure 1 ). Guided by our aims, we sought to understand: What are people’s lived experiences in health care settings? What are the challenges faced by people with serious illness and caregivers? How does this impact what we should prioritize saying, doing, and asking people to do?


Ethical Approval

This study received approval through the Harvard Longwood Campus Institutional Review Board (phase 2, IRB21-0398) and Massachusetts General Brigham’s Institutional Review Board (phase 3, 2021P003549).

Phase 1: Engage Community Partners in Study Design

Our goal was to gather input from community partners to ensure that our research objectives, approach, and framing were aligned with the needs of people from historically marginalized communities. We wanted to understand what matters to people around serious illness care, trying to discard our biases and assumptions. We engaged people through various channels, including focus groups with clinicians at a Federally Qualified Health Center (FQHC), in-depth telephone interviews with Black and Latino low-income older adults who participate in a Program for All-Inclusive Care for the Elderly at the FQHC, and an online survey with input from leading serious illness care organizations across the country. All participants were compensated for their time and engagement. We asked FQHC participants what situations and circumstances impact the type of health care Black and Latino patients receive, what they are most concerned about regarding their health and health care, and what they thought needed to change to improve people’s health care experiences. We asked clinicians and serious illness care leaders what and how to ask about people’s health care experiences, including SIC and ACP, and what needs to change to improve these health care experiences. These insights informed the development of study instruments for both our quantitative survey (phase 2) and qualitative online forum guide (phase 3). By the end of phase 1, we had a small diverse group of collaborators with lived experience of the challenges people face navigating the health care system, who provided more detailed and focused feedback on each instrument, added new concepts for testing, and offered guidance on creating a safe and caring experience for our participants. This group included additional MCSIC members and other national leaders in serious illness care, representing a diverse group of voices as listed in the Acknowledgments.

Phase 2: Quantitative Survey

Participant recruitment.

Survey respondents were included and invited to participate via the National Opinion Research Center (NORC) at the University of Chicago through their national AmeriSpeak panel. The AmeriSpeak panel is a probability-based household panel that uses a multistaged probability-based sampling method through which NORC achieved an estimated sample frame coverage of 97% of the residential United States, including a supplemental list of rural households not recorded on the US Postal Service Computerized Delivery Sequence file but identified through NORC in-person fieldwork. Households are sampled with a known, nonzero probability of selection from the NORC National Frame and recruited through a rigorous process that uses mail, telephone, and in-person recruitment by field interviewers to ensure that even hard-to-reach populations are represented in the panel [ 21 - 23 ]. Enrollment targets were set at a minimum of 100 respondents for specific groups: people with low income (less than US $50,000/year), Black respondents with low income, Latino respondents with low income, age greater than 65 years, people with disability (self-identified and/or answering “yes” to any of six questions about function from the American Community Survey, hereafter described as ACS-6 [ 24 , 25 ]), and people with serious illness (as per the “Identifying People With Serious Illness” subsection below). No specific exclusion criteria were designated. In addition to the national sample described here, a nonprobability-based sample of Massachusetts-based residents was administered through Lucid for separate state-specific analyses [ 26 ].

Quantitative Survey Development

The design of the quantitative survey was informed by extensive review of the literature, engagement with community partners in phase 1, and the social ecological model [ 27 ]. The survey instrument covered topics across intrapersonal, interpersonal, and systems-level domains ( Figure 2 ). We particularly focused on the interpersonal domain between patients and clinicians, covering topics such as whether patients feel engaged in shared decision-making, trust clinicians to do what is right, feel treated with dignity and respect, feel afraid to speak up and ask questions, or feel talked down to or made to feel inferior. Complementing these domains, we iteratively engaged our ongoing collaborators on question topics and language, including how we identified people with serious illness (see below).


NORC conducted eight cognitive interviews with AmeriSpeak panelists by video to qualitatively understand how the survey questions were interpreted and to make recommendations on alternative word choices. These interviews yielded meaningful recommendations to simplify language, add clarifiers, provide additional answer choices that respondents thought were missing, and add appropriate prompts to facilitate the flow of the survey. The full quantitative survey instrument is included in Multimedia Appendix 1 .

Identifying People With Serious Illness

We iteratively engaged with our research and community partners and determined a need to distinguish people with serious illness from people with chronic disability. We identified people with serious illness based on whether participants self-reported “yes” to two questions: (1) have you ever been diagnosed with any of the following (diabetes; asthma, lung disease, emphysema, or chronic obstructive pulmonary disease; heart disease or had a stroke; cancer; Alzheimer disease, dementia, or memory loss; depression, anxiety, or other serious mental health problems; or chronic kidney disease or kidney failure)?, and (2) over the last 12 months, would you say that you have been feeling sicker and that it’s been getting harder to do your normal levels of work and activity?
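The two-question rule above can be expressed as a small classification function. This is a minimal sketch for illustration only: the `Respondent` class and its field names are hypothetical and do not come from the survey instrument.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical field names; the instrument's variable names are not given in the text.
    diagnosed_with_listed_condition: bool  # "yes" to question 1
    feeling_sicker_past_12_months: bool    # "yes" to question 2


def has_serious_illness(r: Respondent) -> bool:
    """A respondent is classified as having serious illness only if BOTH answers are yes."""
    return r.diagnosed_with_listed_condition and r.feeling_sicker_past_12_months


# A diagnosis alone (for example, a stable chronic condition) does not qualify.
stable_chronic = Respondent(diagnosed_with_listed_condition=True,
                            feeling_sicker_past_12_months=False)
declining = Respondent(diagnosed_with_listed_condition=True,
                       feeling_sicker_past_12_months=True)

print(has_serious_illness(stable_chronic))
print(has_serious_illness(declining))
```

Requiring both answers is what separates serious illness from a diagnosis that is currently stable, which reflects the distinction from chronic disability described above.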

Survey Fielding

NORC AmeriSpeak panelists were invited to participate in the survey between April 20, 2021, and May 17, 2021. The survey was offered in English and Spanish and via both the web and telephone to adults aged 18 years and older. Web-mode panelists were sent up to five email reminders to encourage participation and phone-mode panelists were called throughout the field period. Panelists were offered the cash equivalent of US $3 for completing the survey.

Phase 3: Qualitative Online Forums


To supplement a traditional quantitative survey, we elected to use a series of asynchronous, qualitative online forums, a technique commonly used in market research to gain deeper understanding of why individuals think, believe, and feel how they do. We engaged participants who were part of online forums to delve deeper into understanding people’s health care experiences, testing sample public messages with participants, and framing different public messages that encourage action around SIC and ACP in the context of people’s lived experiences (see Figure 3 ). Online forums allow engagement of a wide variety of individuals with sample sizes large enough to ensure varied perspectives even within groups that are historically underrepresented and marginalized.


Participants were recruited by Full Circle Research Co, an independent online participant sample provider [ 28 ]. Individuals participating with the sample provider completed a screening questionnaire including demographics to determine eligibility. Inclusion criteria included availability to participate during scheduled online forum dates, aged 18 years and older, and having convenient access to a computer or smartphone with a high-speed or broadband internet connection. Exclusion criteria included lack of appropriate technology (such as convenient access to a device and the internet and an up-to-date internet browser such as Microsoft Internet Explorer Version 7.0 or higher, Mozilla Firefox Version 3.0 or higher, Safari Version 2.0.4 or higher, or any version of Google Chrome), inability to participate in the online forum in English (requiring ability to speak English and ability to type free-text responses), and less than high school education.

Participants were recruited to represent a mix of individuals across age, gender/gender identity, geography/state, education, marital status, employment status, race/ethnicity, income, and self-described health status. Individuals from the following groups were intentionally oversampled: people with low income (less than US $50,000/year), Black respondents with low income, Latino respondents with low income, people older than 65 years, people living with disability (as per phase 2 and ACS-6 [ 24 , 25 ]), people with serious illness (as per the identification criteria outlined in phase 2), and caregivers (defined as helping someone close to them who has a lot of medical or health needs or conditions). We sought to oversample these groups to ensure diversity of perspectives informing our findings.

Online Forum Activity Guide Design

Structured activity guides (see Multimedia Appendix 2 and Multimedia Appendix 3 ) were designed to engage participants in the online forum. Activity guide design was deeply informed by input from community partner engagement in phase 1 and coincided with receiving preliminary results from the phase 2 quantitative survey. Therefore, some activity prompts were created to probe deeper into specific questions, results, or hypotheses generated from the quantitative survey, allowing us to better understand the “whys” in addition to the “whats” that surfaced in the quantitative survey. Each online forum began with “getting to know you” activities to quickly establish trust and inspire participants to see their responses as more than answering research questions. Most activities were open-ended conversation prompts. To encourage continued engagement and reduce response fatigue, other activity formats were used, including answering multiple-choice questions; asking participants to tell us why they chose certain responses; reacting to definitions of quality care; rating the perceived impact of specific language clinicians may use to engage individuals in health care decision-making; and reading through sample public messages on actions one can take to address health and well-being with a highlighter tool to indicate words and phrases that stood out as being positive, negative, or neutral.

Online Forum Fielding

Qualified individuals were invited to participate in the online forums via itracksBoard, an online qualitative research platform [ 29 ]. Each week’s activities were designed to take around 2 hours to complete over the course of a 5-day week, with new activities available twice per day for a total of 7 or 8 activities per week (see Figure 4 for a sample activity). Participants responded to prompts asynchronously. A moderator (EM) published these activity prompts and then remained present in the forum, asking probing follow-up questions to encourage more conversation.


The online forums were administered in two waves. Wave 1 was conducted for 2 weeks from June 21, 2021, to July 1, 2021, targeting 250 participants in two communities (targeting no more than 125-150 individuals in each community). The primary objective of the first wave was to explore consumers’ health care experiences, both good and bad; the drivers that contributed to these experiences; and reactions to sample messaging clinicians could use to engage patients in their care. Wave 2 lasted 1 week, from August 9, 2021, to August 13, 2021, targeting 250 participants in three communities (to allow testing with three different messaging frames). The second wave explored how well the participants feel known by their clinicians, their concerns about future care if they became ill, and their reactions to messaging encouraging them to act, presented in one of three formats: a public health campaign, a simple doctor’s letter, or a doctor’s office letter with additional framing language. This sample messaging was developed and informed by the quantitative survey findings. The forums were purposefully smaller in the second wave to test different messages with distinct populations. Participants who completed all activities in wave 1 received a US $125 gift card and those who completed the activities in wave 2 received a US $100 gift card.

To account for differences in nonresponse, NORC applied statistical weighting to adjust to Current Population Survey totals associated with age, gender, education, race/Hispanic ethnicity, housing tenure, telephone status, and census division. Additional sampling weights were applied to account for the interactions of age and gender, age and race/ethnicity, and race/ethnicity and gender. The weighted data, which reflect the US population of adults aged over 18 years, were used in subsequent analyses (not shown here). Descriptive statistics were calculated using SPSS version 29 (IBM Corp, Armonk, NY, USA).
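Adjusting a sample to known population totals on several variables is commonly done by raking (iterative proportional fitting). The sketch below illustrates that general idea only. The text does not specify NORC's actual weighting procedure, so the algorithm choice, the six toy respondents, and the target shares are all assumptions made for illustration, and the sketch covers main-effect margins only, not the interaction adjustments described above.

```python
def weighted_share(rows, weights, var, category):
    """Weighted share of respondents whose `var` equals `category`."""
    total = sum(weights)
    return sum(w for w, r in zip(weights, rows) if r[var] == category) / total


def rake(rows, targets, n_iter=50):
    """Iteratively rescale weights so each variable's weighted shares match `targets`.

    rows:    list of dicts, one per respondent
    targets: {variable: {category: population_share}}, shares summing to 1 per variable
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {c: 0.0 for c in shares}
            for w, r in zip(weights, rows):
                current[r[var]] += w
            # Scale each respondent so the category total hits its target.
            weights = [
                w * shares[r[var]] * total / current[r[var]]
                for w, r in zip(weights, rows)
            ]
    return weights


# Hypothetical toy sample: older adults are overrepresented relative to the targets.
rows = [
    {"age": "18-64", "gender": "woman"},
    {"age": "18-64", "gender": "woman"},
    {"age": "18-64", "gender": "man"},
    {"age": "65+", "gender": "woman"},
    {"age": "65+", "gender": "man"},
    {"age": "65+", "gender": "man"},
]
# Hypothetical population shares, not Current Population Survey values.
targets = {
    "age": {"18-64": 0.8, "65+": 0.2},
    "gender": {"woman": 0.5, "man": 0.5},
}

weights = rake(rows, targets)
print(weighted_share(rows, weights, "age", "65+"))
print(weighted_share(rows, weights, "gender", "woman"))
```

After raking, respondents from overrepresented groups carry weights below 1 and those from underrepresented groups carry weights above 1, while the sum of the weights stays equal to the sample size.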

Two authors (EM and ZA) conducted qualitative content analysis of online forum data [ 30 ]. We conducted the qualitative analysis in a hybrid inductive-deductive manner based on a combination of the domains from the activity guides and the information that emerged from the online responses. This involved separately combing through daily participant responses verbatim, which comprised responses to research questions intentionally fielded from the activity guide and the organic conversation participants engaged in, all within the “walls” of the online forum, and identifying observations deemed consistent, insightful, and worthy of more exploration. We then compared high-level observations and grouped these into key themes that addressed the overall research questions. We created summaries of participants’ responses by activity, identifying important points, and highlighting key themes for review.

Phase 1: Engage Community Partners

We engaged with 46 community partners through various channels. We conducted two focus groups with seven clinicians and care navigators from an FQHC community health center, six in-depth telephone interviews with Black and Latino low-income older adults who receive care at an FQHC, and an online survey with 12 respondents representing leading serious illness care organizations across the country. We also received additional input from 21 MCSIC members and other national leaders in serious illness care.

Our conversations with care providers and people with lived serious illness experience yielded many common experiences and concerns with health care systems that impact how historically marginalized groups experience serious illness care (see Textbox 1 ). People shared challenges in navigating the health care system; the unaffordability of care; experiences with bias and discrimination; the geographic inaccessibility of care; and the limited ability clinicians have to address more pressing social challenges impacting their health, including housing, food insecurity, immigration status, and loss of income. These findings led us to focus our study instruments on people’s prior health care experiences and how their current care fits into other life priorities.

  • Difficulty navigating the health care system, such as managing referrals to multiple specialists, accessing home care, and getting prescriptions refilled.
  • Unaffordability of care (including the cost of health insurance and medications) as having a real impact on well-being.
  • Experiencing bias and discrimination based on a variety of factors, including race, immigration status, age, disability, insurance coverage, and more.
  • Lack of high-quality care geographically close by and transportation challenges getting to appointments far from where they live.
  • Clinicians are not trained to help beyond immediate medical needs and those other needs are often much more pressing, such as housing, food insecurity, immigration status, and loss of income.

The survey was fielded to 6126 households with a survey completion rate of 30.3%. The margin of error was ±3.08 percentage points and the median duration to complete the survey was 12 minutes. The population consisted of 1854 adults (955 women, 51.5%; mean age 48.4, SD 17.5 years). Table 1 shows the demographics of the survey respondents, with both unweighted and weighted values. Of 1854 surveys, 94.5% occurred online (not by phone) and 2% occurred in Spanish. The respondents were 65.7% white (non-Hispanic), 10.4% Black (non-Hispanic), and 15.5% Hispanic. Approximately 1 in 5 (19.8%) respondents were categorized as having a serious illness and 18.9% self-identified as having a disability.
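The completion rate and margin of error reported above can be checked with a few lines of arithmetic. This is a rough sketch, not the authors' calculation: it assumes a 95% confidence level and a worst-case proportion of 0.5, and it treats the gap between the simple-random-sample margin and the reported ±3.08 points as an implied design effect from weighting, which is an inference rather than something stated in the text.

```python
import math

fielded = 6126           # households the survey was fielded to
completed = 1854         # completed surveys
reported_moe = 0.0308    # reported margin of error (±3.08 percentage points)

completion_rate = completed / fielded

# Margin of error for a simple random sample at 95% confidence, p = 0.5 (assumed).
z = 1.96
naive_moe = z * math.sqrt(0.5 * 0.5 / completed)

# Assumption: the reported margin equals the naive margin inflated by sqrt(design effect).
implied_design_effect = (reported_moe / naive_moe) ** 2

print(f"completion rate: {completion_rate:.1%}")
print(f"simple-random-sample margin: ±{naive_moe * 100:.2f} points")
print(f"implied design effect: {implied_design_effect:.2f}")
```

The reported margin is wider than the simple-random-sample figure, which is the expected direction when weights are applied to correct for unequal selection and nonresponse.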

Phase 3: Online Forums

A total of 5191 individuals received the screening questionnaire, with 1512 qualifying individuals invited to participate. Of the 1512 individuals invited to participate in the forums, 644 (42.6%) joined and 90.0% (n=580) completed every activity and comprised the participant pool. Online forum participants’ demographics are outlined in Table 2 . Approximately 60.7% of participants identified as women, compared to 50.5% of the US population [ 31 ]. An estimated 70.3% of participants were white (non-Hispanic), 16.0% Black (non-Hispanic), 9.5% Hispanic, and 1.9% Asian, compared to census estimates of 59.3%, 13.6%, 18.9%, and 6.1%, respectively [ 31 ]. Approximately 15.9% of the participants were categorized as having a serious illness, 15.2% self-identified as having a disability, and 18.4% self-identified as a caregiver.
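The recruitment funnel described above can be tabulated step by step. The sketch below only restates the counts from the text and derives the stage-to-stage rates; the variable names are illustrative.

```python
# Counts taken from the text, in order of the recruitment stages.
funnel = [
    ("received screening questionnaire", 5191),
    ("qualified and invited", 1512),
    ("joined a forum", 644),
    ("completed every activity", 580),
]

# Conversion from each stage to the next, as a fraction of the previous stage.
stage_rates = {}
for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
    stage_rates[name] = n / prev_n
    print(f"{name}: {n} ({n / prev_n:.1%} of {prev_name})")

# Overall yield from screening to the final participant pool.
overall_yield = funnel[-1][1] / funnel[0][1]
print(f"overall yield: {overall_yield:.1%}")
```

The derived rates agree with the reported figures to within rounding, and the overall yield shows that roughly one in nine screened individuals ended up in the participant pool.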

a Percentages may not all add to 100% due to rounding.

Furthermore, participants highlighted that they enjoyed the opportunity to engage through online forums, that they learned from one another and their experiences, and that some even felt motivated to take action in their own health journey (see Textbox 2 ).

  • “I enjoyed participating in this study. Thank you for the opportunity to voice my opinions.” [middle-aged multiracial woman with serious illness and disability]
  • “Thank you for this fun group. I learned a lot from the other members and was glad I was able to contribute my thoughts. I really felt heard.” [young Latina woman with comorbid illness]
  • “This was a great group to be in. I appreciated everyone’s thoughts/comments. Some of them were some real eye openers. Thank you for allowing me to be a part of this group. I appreciated it.” [older white woman with disability]
  • “It has been not only a pleasure but also a learning experience for me as well. It made me think of health issues that could and will arise as time goes by and it’s good to have plans and invest time in surrounding yourself and family with a health care environment that will be proactive about your care and well being. Again my thanks to you Beth and the Coalition.” [older Black man]

Behaviors (engagement/activation)

  • “It has inspired me to not be afraid to talk up and give my opinions regarding whatever situation but especially health care. Only because I used to be so nervous walking into a doctor’s office. Now, I’ve learned to put my fears aside and ask questions and express myself because that is the only way my doctor is able to know me better and prescribe my treatments.” [older woman with serious illness]
  • “For a while I have been procrastinating about designating a proxy to speak for me regarding health care decisions…and getting a will. Why? Because like most human beings I want to concentrate on life in the present instead of worrying too much about sickness and death in the future. It is a mechanism of defense we human beings have. The discussions here addressing these topics made me realize that I have to take action now, grab the bull by the horns as they say. These decisions and actions are not easy but necessary because even if we are healthy and alive today we never know what can happen tomorrow. I have to take action. For starters, I have to think carefully about who will be my health care proxy and I have to work on getting a will.” [younger man with comorbid illness]

Understanding people’s health care communication experiences in the past is important as these experiences will influence their willingness to seek health care services [ 32 ] and in turn engage in future serious illness conversations. SIC strategies improve quality of life, enhance communication quality, reduce psychological distress, and promote positive patient and clinician experiences [ 33 - 38 ]. This community-informed mixed methods research study utilized online forums to better understand how SIC and ACP fit in the context of people’s prior experiences and life priorities. This study presents a unique approach with online engagement that is feasible and offers benefits over other traditional research approaches. We engaged 46 individual community partners, including patients, clinicians, and serious illness national experts, in our first phase of research, obtaining rich and compelling feedback that we incorporated in the design of study instruments. In our quantitative second phase, we used an intentionally recruited census-representative cohort of US adults through the NORC AmeriSpeak panel to reach over 1850 adults, which is in line with the recent large Kaiser Family Foundation National Serious Illness Care Survey with 2000 respondents [ 15 ]. Our survey population included nearly 20% respondents who have a serious illness, which is a greater proportion than included in the Kaiser Family Foundation’s survey, and 19% who have a disability, which is below the 25.7% estimated prevalence of US adults who live with at least one disability [ 39 ]. Our survey population also included 1386 (74.7%) younger and middle-aged adults (less than 65 years), who are often not captured relative to health care lived experience, beliefs, and concerns related to serious illness care. 
In our qualitative phase, we used asynchronous online forums as a method of engaging 580 US adults, a much larger number of people than traditional qualitative methods can reach, to ensure that the voices of a few are not used to represent entire historically marginalized populations. Recruitment for the online forums focused on engaging individuals from historically marginalized communities that are often underincluded in research efforts. In recruitment of diverse populations, it is important to assure compensation [ 40 ]. In all phases of our work, including online forums, we provided reimbursement for participation. With intentional efforts to oversample people from historically marginalized communities, we achieved overrepresentation of individuals who are over 65 years old, identify as non-Hispanic Black, and who have an annual income below the median US household income in the online forum population. Individuals who identify as Hispanic/Latino and non-Hispanic Asian remained underrepresented in our online forum population.

Our study achieved mixed results in terms of diverse racial/ethnic participant engagement in the quantitative and qualitative phases. In the quantitative phase, where we relied on the prerecruited panel of potential participants, we achieved lower than census levels of representation from non-Hispanic Black (10.4% vs 13.6% in the census), Hispanic/Latino (15.5% vs 18.9% in the census), and non-Hispanic Asian (2.7% vs 6.1% in the census) individuals [ 31 ]. These findings reflect that even when utilizing a census-representative survey sampling population, additional efforts may be needed to achieve census-level representation of historically marginalized racial/ethnic communities. In the qualitative phase, where we conducted intentional oversampling of Black and Latino individuals, we achieved higher than census levels of representation of non-Hispanic Black individuals (16.0% vs 13.6% in the census) and lower than census levels of representation of Hispanic/Latino (9.5% vs 18.9% in the census) and Asian (1.9% vs 6.1% in the census) individuals [ 31 ]. Importantly, the large size of the online forum population (N=580) reflected 93 Black and 55 Hispanic/Latino participants who were included in the study, ensuring that the perspectives of only a small handful of individuals are not extrapolated to reflect an entire historically marginalized community. This online forum recruitment highlights that intentional oversampling can achieve overrepresentation of some populations, including non-Hispanic Black individuals, but additional effort to offer participation in Spanish may improve engagement of Hispanic/Latino individuals, as two out of three Latino individuals report Spanish as their language preference [ 41 ], and Asian individuals, who have also been historically underincluded and often grouped even though they reflect a wide variation of languages, cultures, and backgrounds [ 42 ].
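The comparisons with census estimates in this paragraph can be summarized as representation ratios (sample share divided by census share), where a value above 1 indicates overrepresentation. The sketch below uses only the percentages quoted in the text; the ratio itself is an illustrative summary and not a measure the authors report, and the head counts are approximations recovered from rounded percentages.

```python
# Percentages quoted in the text (census estimates from reference 31).
census = {"non-Hispanic Black": 13.6, "Hispanic/Latino": 18.9, "Asian": 6.1}
survey = {"non-Hispanic Black": 10.4, "Hispanic/Latino": 15.5, "Asian": 2.7}
forum = {"non-Hispanic Black": 16.0, "Hispanic/Latino": 9.5, "Asian": 1.9}
forum_n = 580


def representation_ratio(sample, reference):
    """Sample share divided by reference share, per group."""
    return {group: sample[group] / reference[group] for group in reference}


survey_ratio = representation_ratio(survey, census)
forum_ratio = representation_ratio(forum, census)

# Approximate head counts in the forums, recovered from the rounded percentages.
forum_counts = {group: round(pct / 100 * forum_n) for group, pct in forum.items()}

for group in census:
    print(f"{group}: survey ratio {survey_ratio[group]:.2f}, "
          f"forum ratio {forum_ratio[group]:.2f}, "
          f"forum count about {forum_counts[group]}")
```

Viewed this way, oversampling moved non-Hispanic Black participants from below to above census parity, while Hispanic/Latino representation in the forums fell to roughly half of the census share.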

Prior studies have qualitatively examined patients’ health care experience and others have utilized publicly available online forum comments for qualitative analysis [ 43 , 44 ]. There has been some use of online forums to qualitatively collect information from patients [ 43 - 47 ], although here we describe the benefit of using online forums that allowed for tailored recruitment and greater representation from groups that are historically marginalized. This mixed methods approach, especially with such a flexible design for how to engage participants in online forums, allowed for rich, people-centered results in a scalable format. The recruitment approach engaged individuals from specific subgroups whose voices are not always elevated in research, including individuals from some historically marginalized communities (eg, 93 Black individuals and 55 Hispanic/Latino individuals) although notably not all (eg, only 11 Asian individuals participated). This helped to ensure that we heard varied perspectives within groups as well as between groups. The size and interactive nature of the forums allowed participants to surface themes that we did not know to ask about, such as weight bias, and probe concurrently through the moderator as they arose. We were also able to explore survey questions we needed further insights on to interpret the quantitative survey responses.

Finally, participants shared that they were engaged in and satisfied with the online forum process. Some even shared that they felt inspired to do new or different things (eg, finding a new doctor or speaking up more at a visit) because of the content they were exposed to during the forums, both through the structured exercises and through interacting with each other. This feedback informally suggests that participation in this study increased their patient activation, although we did not formally assess it with the Patient Activation Measure. Prior research by Hibbard and Greene [ 48 ] outlined that higher patient activation is linked with improved health outcomes. Furthermore, the same research group reported that increases in patient activation are frequently associated with improved health outcomes and lower costs [ 49 ]. Participants’ positive reflections on their online forum experience suggest that this is a feasible way to engage a wide range of perspectives in qualitative research.

This study has limitations. The quantitative survey was fielded in English and Spanish, both online and by telephone, but respondents were limited to those participating in NORC’s AmeriSpeak panel, which may inherently represent a population more willing to engage in survey research. The survey completion rate was 30%, which is in line with completion rates for probability-based surveys using the NORC AmeriSpeak panel [ 50 ]. Statistical weighting was used to address sampling bias by adjusting to the Current Population Survey and for interactions. In our qualitative online forums, participation was likely limited because the forums were conducted only in English, were online only, and required typing for many of the activities. This likely contributed to underrepresentation of Hispanic/Latino and non-Hispanic Asian participants and possibly to less engagement from people with disabilities, who may experience challenges with the eyesight and typing ability required to engage in this forum. Future studies could consider offering a parallel forum in Spanish and ensuring browser/forum compatibility with text-to-voice and voice-to-text software for individuals with eyesight or typing challenges.
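To illustrate the general idea behind statistical weighting, the sketch below applies simple one-variable post-stratification, in which each group's weight is its target share divided by its sample share. This is not the weighting procedure used in the study, which adjusted to the Current Population Survey and is not detailed here. The sketch rests on several assumptions: it treats the quantitative-phase percentages quoted in this article as unweighted sample shares, uses the quoted census percentages as targets, and adds an "All other" residual category so that each set of shares sums to 100%.

```python
# Illustrative one-variable post-stratification; NOT the study's weighting procedure.
# Assumption: the quoted quantitative-phase percentages are unweighted sample shares.
# Assumption: the quoted census percentages serve as population targets.

sample_pct = {"Non-Hispanic Black": 10.4, "Hispanic/Latino": 15.5, "Non-Hispanic Asian": 2.7}
target_pct = {"Non-Hispanic Black": 13.6, "Hispanic/Latino": 18.9, "Non-Hispanic Asian": 6.1}


def add_residual(shares, label="All other"):
    """Return a copy of `shares` with a residual category so the total is 100%."""
    completed = dict(shares)
    completed[label] = 100.0 - sum(shares.values())
    return completed


def poststratification_weights(sample, target):
    """Weight for each group = target share / sample share."""
    return {group: target[group] / sample[group] for group in sample}


sample_full = add_residual(sample_pct)
target_full = add_residual(target_pct)
weights = poststratification_weights(sample_full, target_full)

# After weighting, each group's share equals its target share.
weighted_pct = {group: sample_full[group] * weights[group] for group in sample_full}

if __name__ == "__main__":
    for group, w in weights.items():
        print(f"{group}: weight {w:.2f}, weighted share {weighted_pct[group]:.1f}%")
```

Under these assumptions, underrepresented groups receive weights above 1 and the residual category receives a weight below 1. Survey panels typically adjust on several variables at once, so the single-variable version here is only a simplified illustration.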

There are important opportunities for future research. In addition to highlighting the recruitment of individuals from historically marginalized communities in this mixed methods study, further research highlighting the differences in people’s prior health care experiences and engagement in SIC and ACP across individual characteristics and identities (eg, race/ethnicity, income, serious illness, and disability) is important. Future qualitative research could consider using a parallel online forum offered in Spanish to engage perspectives from Hispanic/Latino individuals who prefer Spanish. It would also be helpful to better understand the barriers and facilitators for people from historically marginalized communities to engage in serious illness conversations, building on prior work exploring barriers and facilitators to SIC among Black individuals and increasing understanding for Hispanic/Latino and Asian individuals [ 51 ].

Here, we outline a community-informed mixed methods approach to understanding people’s prior health care experiences and testing the framing of public messages to encourage engagement with clinicians and action around health, in service of finding new strategies to improve serious illness care. Our approach of engaging the community throughout the study design and execution focused our study, enriched our study instruments, and yielded important findings. This study can serve as a feasible model of choosing methodologies that allow for engagement with target communities and in-depth exploration of research aims, which in our case were to better understand how to improve serious illness care and communication.

Acknowledgments


This work was funded by sponsors of Massachusetts Coalition for Serious Illness Care, notably Blue Cross Blue Shield of Massachusetts, and with dedicated grant support from The John A Hartford Foundation and the Cambia Health Foundation, who together made this research possible as part of their innovative MessageLab program. We are indebted to Tony Back and Marian Grant, who lead MessageLab. We are grateful to the organizations forming part of MessageLab who provided guidance and support throughout the study. These organizations include: the American Academy of Hospice and Palliative Medicine (AAHPM), Ariadne Labs, the Center to Advance Palliative Care (CAPC), the Coalition to Transform Advanced Care (C-TAC), the National Coalition for Hospice and Palliative Care, the National Hospice and Palliative Care Organization (NHPCO), National POLST (Portable Medical Orders), Respecting Choices, The Conversation Project, and Vital Talk. CD also received funding and support from the Commonwealth Fund Fellowship in Minority Health Policy at Harvard University. We also want to thank the voices from our community collaborators who took a deep dive with us and helped ensure that we were asking the right questions, in the right way, with the right emphasis to capture what really matters to people. Thank you to Manny Lopes and the Federally Qualified Health Center (FQHC) East Boston Neighborhood Health Center for its staff participation in focus groups and connecting us to older adults for in-depth interviews. Thank you to those that provided feedback through our online partner survey, to Kate DeBartolo and her team at The Conversation Project, and to Jon Broyles and Adriana Krasinsky at C-TAC for their supportive outreach in connecting us to members in their communities. Thank you especially to Shirley Roberson (C-TAC) for the many hours spent working on the questions, approaches, and ideas. 
We benefited greatly from the expertise, experience, and previous research done by colleagues and are so grateful for the time they spent commenting, brainstorming, and supporting this work. Our thanks to Rebekah Angove (Patient Advocate Foundation), Rebecca Kirch (National Patient Advocate Foundation), Diane Meier and Lisa Morgan (CAPC), Liz Hamel (Kaiser Family Foundation), Rebecca Sudore (University of California, San Francisco), Glynn Elwyn (The Dartmouth Institute), and Cynthia Carter Perrilliat (Alameda County Care Alliance). We also benefited greatly from the voices, networks, and expertise of our Massachusetts Coalition collaborators: Rachel Broudy (Ariadne Labs), Ellen DiPaola (Honoring Choices Massachusetts), Lachlan Forrow (retired, Beth Israel Lahey Health), Arlene Germain (Massachusetts Advocates for Nursing Home Reform), Joan Reede and Emorcia Hill (Harvard Medical School Office for Diversity Inclusion & Community Partnerships), Lisa Iezzoni (The Mongan Institute), Vicki Jackson (Massachusetts General Hospital), Paul Lanzikos (Dignity Alliance Massachusetts), Alexis Levitt (Massachusetts Chapter of the National Academy of Elder Law Attorneys), James Lomastro (Dignity Alliance Massachusetts), Nathalie McIntosh (Massachusetts Health Quality Partners), and Sandy Novack (Dignity Alliance Massachusetts). We also thank Zhimeng Jia and Joanna Paladino for their critical review of and feedback on this manuscript. Generative artificial intelligence was not used in the preparation of this manuscript.

Data Availability

The data sets generated during and/or analyzed during this study are not publicly available due to terms of consent for study participation but are available from the corresponding author on reasonable request.

Conflicts of Interest

None declared.

Phase 2: quantitative survey instrument, March 2021.

Phase 3: online forum activity guide, June 2021.

Phase 3: online forum activity guide, August 2021.

References

  • Armstrong K, Ritchie C. Research participation in marginalized communities - overcoming barriers. N Engl J Med. 2022 Jan 20;386(3):203-205
  • George S, Duran N, Norris K. A systematic review of barriers and facilitators to minority research participation among African Americans, Latinos, Asian Americans, and Pacific Islanders. Am J Public Health. 2014 Feb;104(2):e16-e31
  • Barrett NJ, Hasan M, Bethea K, Johnson KS. The fierce urgency of now: addressing racial and ethnic disparities in serious illness care. N C Med J. 2020 Jul 08;81(4):254-256
  • Mack JW, Paulk ME, Viswanath K, Prigerson HG. Racial disparities in the outcomes of communication on medical care received near death. Arch Intern Med. 2010 Sep 27;170(17):1533-1540
  • Johnson KS. Racial and ethnic disparities in palliative care. J Palliat Med. 2013 Nov;16(11):1329-1334
  • Frydman JL, Gelfman LP, Morillo J, Allen OS, Bickell NA, Kwon D, et al. Racial/ethnic disparities in serious illness communication for patients with cancer. J Clin Oncol. 2022 Jun 01;40(16_suppl):6540-6540
  • Rhodes RL, Barrett NJ, Ejem DB, Sloan DH, Bullock K, Bethea K, et al. A review of race and ethnicity in hospice and palliative medicine research: representation matters. J Pain Symptom Manage. 2022 Nov;64(5):e289-e299
  • Sanders JJ, Paladino J, Reaves E, Luetke-Stahlman H, Anhang Price R, Lorenz K, et al. Quality measurement of serious illness communication: recommendations for health systems based on findings from a symposium of national experts. J Palliat Med. 2020 Jan 01;23(1):13-21
  • Shim JK. Cultural health capital: a theoretical approach to understanding health care interactions and the dynamics of unequal treatment. J Health Soc Behav. 2010 Mar 24;51(1):1-15
  • Epstein RM, Street Jr RL. Patient-centered communication in cancer care: promoting healing and reducing suffering. Patient-centered communication in cancer care. National Cancer Institute. 2007. URL: [accessed 2020-10-19]
  • Clark MA, Person SD, Gosline A, Gawande AA, Block SD. Racial and ethnic differences in advance care planning: results of a statewide population-based survey. J Palliat Med. 2018 Aug;21(8):1078-1085
  • Huang IA, Neuhaus JM, Chiong W. Racial and ethnic differences in advance directive possession: role of demographic factors, religious affiliation, and personal health values in a national survey of older adults. J Palliat Med. 2016 Feb;19(2):149-156
  • Smith AK, McCarthy EP, Paulk E, Balboni TA, Maciejewski PK, Block SD, et al. Racial and ethnic differences in advance care planning among patients with cancer: impact of terminal illness acknowledgment, religiousness, and treatment preferences. J Clin Oncol. 2008 Sep 01;26(25):4131-4137
  • Harrison KL, Adrion ER, Ritchie CS, Sudore RL, Smith AK. Low completion and disparities in advance care planning activities among older Medicare beneficiaries. JAMA Intern Med. 2016 Dec 01;176(12):1872-1875
  • DiJulio B, Hamel L, Wu B, Brodie M. Serious illness in late life: the public's views and experiences. Kaiser Family Foundation. 2017 Nov. URL: [accessed 2020-10-19]
  • Nouri S, Lyles CR, Rubinsky AD, Patel K, Desai R, Fields J, et al. Evaluation of neighborhood socioeconomic characteristics and advance care planning among older adults. JAMA Netw Open. 2020 Dec 01;3(12):e2029063
  • Nouri SS, Barnes DE, Volow AM, McMahan RD, Kushel M, Jin C, et al. Health literacy matters more than experience for advance care planning knowledge among older adults. J Am Geriatr Soc. 2019 Oct 19;67(10):2151-2156
  • Massachusetts Survey on Advance Care Planning and Serious Illness Care, Spring 2018 Survey of Massachusetts Residents. Massachusetts Coalition for Serious Illness Care. 2018. URL: [accessed 2020-11-15]
  • Advancing the language of advance care planning: a messaging research project. Massachusetts Coalition for Serious Illness Care. 2019 Nov. URL: [accessed 2020-10-10]
  • 2017 consumer research: deep dive on conversations. Massachusetts Coalition for Serious Illness Care. URL: [accessed 2020-11-15]
  • AmeriSpeak. URL: [accessed 2023-04-26]
  • NORC at the University of Chicago. URL: [accessed 2023-04-26]
  • Technical overview of the AmeriSpeak® panel NORC's probability-based household panel. AmeriSpeak and NORC and the University of Chicago. URL: [accessed 2022-11-15]
  • HHS implementation guidance on data collection standards for race, ethnicity, sex, primary language, and disability status. Assistant Secretary for Planning and Evaluation (ASPE). 2011 Oct. URL: https:/​/aspe.​​reports/​hhs-implementation-guidance-data-collection-standards-race-ethnicity-sex-primary-language-disability [accessed 2023-04-12]
  • Population surveys that include the Standard Disability Questions. Centers for Disease Control and Prevention. 2020 Sep 16. URL: [accessed 2023-04-12]
  • Lucid, a Cint Group company. URL: [accessed 2023-04-26]
  • Crabtree BF, Miller WL. Doing qualitative research. Newbury Park, CA. SAGE Publications, Inc; 2012.
  • Full Circle Research. 2021. URL: [accessed 2023-04-26]
  • itracks: Insight for growth. URL: [accessed 2023-04-26]
  • Neergaard MA, Olesen F, Andersen RS, Sondergaard J. Qualitative description - the poor cousin of health research? BMC Med Res Methodol. 2009 Jul 16;9:52
  • QuickFacts United States. US Census Bureau. URL: [accessed 2023-04-19]
  • Butler SM, Sheriff N. How poor communication exacerbates health inequities – and what to do about it. Brookings Institution. 2021 Feb 22. URL: [accessed 2023-09-05]
  • Curtis JR, Downey L, Back AL, Nielsen EL, Paul S, Lahdya AZ, et al. Effect of a patient and clinician communication-priming intervention on patient-reported goals-of-care discussions between patients with serious illness and clinicians: a randomized clinical trial. JAMA Intern Med. 2018 Jul 01;178(7):930-940
  • Bernacki R, Paladino J, Neville BA, Hutchings M, Kavanagh J, Geerse OP, et al. Effect of the serious illness care program in outpatient oncology: a cluster randomized clinical trial. JAMA Intern Med. 2019 Jun 01;179(6):751-759
  • Kumar P, Wixon-Genack J, Kavanagh J, Sanders JJ, Paladino J, O’Connor NR. Serious illness conversations with outpatient oncology clinicians: understanding the patient experience. JCO Oncol Pract. 2020 Dec;16(12):e1507-e1515
  • Paladino J, Koritsanszky L, Nisotel L, Neville BA, Miller K, Sanders J, et al. Patient and clinician experience of a serious illness conversation guide in oncology: a descriptive analysis. Cancer Med. 2020 Jul 04;9(13):4550-4560
  • You J, Singh J, Simon J, Ma IW, Paladino J, Swinton M, et al. A quality improvement initiative to implement the Serious Illness Care Program on hospital medical wards. Can J Gen Int Med. 2022 Feb 08;17(1):29-51
  • Wright AA, Zhang B, Ray A, Mack JW, Trice E, Balboni T, et al. Associations between end-of-life discussions, patient mental health, medical care near death, and caregiver bereavement adjustment. JAMA. 2008 Oct 08;300(14):1665-1673
  • Prevalence of disability and disability types by urban-rural county classification - United States, 2016. Centers for Disease Control and Prevention. 2019 Oct 27. URL: [accessed 2023-04-19]
  • Bierer BE, White SA, Gelinas L, Strauss DH. Fair payment and just benefits to enhance diversity in clinical research. J Clin Transl Sci. 2021 Jul 14;5(1):e159
  • American Community Survey B16001: Language Spoken at Home 2019. United States Census Bureau. URL: [accessed 2023-04-12]
  • Trinh-Shevrin C, Islam NS, Rey MJ. Asian American communities and health: context, research, policy, and action. 1st edition. San Francisco, CA. Jossey-Bass; 2009.
  • Pearson SE, Taylor J, Hoare DJ, Patel P, Baguley DM. Exploring the experiences of cancer patients with chemotherapy-induced ototoxicity: qualitative study using online health care forums. JMIR Cancer. 2019 Mar 14;5(1):e10883
  • Yi EG, Adamek ME. Alzheimer's caregivers' experience with and perceptions of the Affordable Care Act: thematic analysis of online discussion forums. J Appl Gerontol. 2021 Dec 14;40(12):1796-1806
  • Horgan A, McCarthy G, Sweeney J. An evaluation of an online peer support forum for university students with depressive symptoms. Arch Psychiatr Nurs. 2013 Apr;27(2):84-89
  • Im EO, Lee B, Chee W, Dormire S, Brown A. A national multiethnic online forum study on menopausal symptom experience. Nurs Res. 2010;59(1):26-33
  • Ferrante JM, Friedman A, Shaw EK, Howard J, Cohen DJ, Shahidi L. Lessons learned designing and using an online discussion forum for care coordinators in primary care. Qual Health Res. 2016 Nov 11;26(13):1851-1861
  • Hibbard JH, Greene J. What the evidence shows about patient activation: better health outcomes and care experiences; fewer data on costs. Health Aff. 2013 Feb;32(2):207-214
  • Greene J, Hibbard JH, Sacks R, Overton V, Parrotta CD. When patient activation levels change, health outcomes and costs change, too. Health Aff. 2015 Mar;34(3):431-437
  • Technical Overview of the AmeriSpeak® Panel NORC'S Probability-Based Household Panel. AmeriSpeak. 2022 Feb 08. URL: [accessed 2023-04-12]
  • Sanders JJ, Johnson KS, Cannady K, Paladino J, Ford DW, Block SD, et al. From barriers to assets: rethinking factors impacting advance care planning for African Americans. Palliat Support Care. 2019 Jun;17(3):306-313


Edited by A Mavragani; submitted 28.04.23; peer-reviewed by D Chao, N Finn; comments to author 20.08.23; revised version received 10.09.23; accepted 28.09.23; published 06.12.23

©Carine Davila, Stephanie H Chan, Anna Gosline, Zamawa Arenas, Jane Kavanagh, Brian Feltz, Elizabeth McCarthy, Tyrone Pitts, Christine Ritchie. Originally published in the Journal of Medical Internet Research, 06.12.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.

Field Operations

Take the power of location anywhere

Optimize efficiency in field activities with the power of location intelligence

The power of location improves coordination and efficiency in field operations. Use field apps to reduce or even replace reliance on paper. Improve the data accuracy of field assets through purpose-built data collection apps. Ensure that field and office staff use the same authoritative data to reduce errors, boost productivity, and save money.

Digitally transform field operations

Discover how the ArcGIS suite of field apps transforms disparate field activities and processes into a unified workflow.

Location-enabled field activities

Purpose-built apps working together.

Organizations of all types find numerous advantages from a suite of mobile apps that support their field activities in connected or disconnected environments. These easy-to-use apps can be deployed as a software-as-a-service (SaaS) solution or behind your firewall.

Get the office and the field working in unison. Leverage the power of location to understand where work needs to be done and effectively coordinate and dispatch resources. Your existing authoritative data is the backbone on which field activity planning is done. 

Your data, your roads—even while offline. Consistently meet deadlines by using the most efficient routes so you can arrive on time and get the job done. Drivers keep their eyes on the road by using voice-guided routing that even considers the type of vehicle being driven and road restrictions along the route.

Take your organization's digital maps with you, anywhere and anytime. Use a current map to find assets and areas of interest or to see what is in the surrounding area. Promote your spatial awareness and understanding when performing inspections, responding to natural disasters, or engaging in other activities that benefit from spatial context.

Replace outdated paper-based workflows. Enable workers to easily perform accurate data collection and asset inspections in any environment with data collection apps for mobile workers of all technical levels. Field-captured data immediately feeds into your system of record, streamlining life cycle management and supporting better decision-making.

Make decisions at a glance. Easy-to-understand dashboards and maps support informed decision-making. Communicate the status of field operations to managers by monitoring, tracking, and reporting real-time data feeds, location tracks, and activities that focus on what matters most. Present maps and dashboards to apprise constituents of activities and events that impact them.

Location sharing

Know what happens in the field. Enable those in the field to share their location tracks so you can know where everyone is and where they have been. Location sharing is a capability of ArcGIS that is available in multiple solutions. For field personnel, track sharing can be accomplished through a field app that is controlled entirely by the user. Authorized managers and supervisors can use a web app to visualize and analyze track data to better allocate field personnel to areas of need, supporting more effective management of field activities.

Sharing information can be critical to the success of a project. Whether collaborating as a single team or with multiple organizations, the ability to scale to the specific needs of your situation is easily accomplished within a single system. Easily control data access for internal and external stakeholders, so everyone has only the information they need.

Emergency response

Managing emergency response

The Environmental Protection Agency used apps to coordinate emergency response among local, state, and federal agencies following a severe hurricane.

Digital transformation of operations

Taylor Shellfish Farms used mobile apps to digitally transform operations to sustainably raise shellfish on more than 30 farms.

Electric utility

Efficient field inspections

A rural utility cooperative realized greater inspection efficiency by using apps to create an authoritative map of its electrical system.

Local government

Digital workflows in public works

The City of Nacogdoches, Texas, saved time and money by replacing paper processes with digital workflows.

Improving collaboration

Los Angeles County used mobile GIS for improved collaboration to better serve homeless populations.

Oil and gas

Streamlined data collection

An oil and gas organization achieved a $2 million first-year savings through the use of field data collection apps.

  1. (PDF) Factors Affecting Online Streaming Subscriptions

    Numerous studies have examined the connection between the adoption of cable and online media, and key variables such as cost, ease of use, and social trends. In this study, we explore a number of ...

  2. Consumption of OTT Media Streaming in COVID-19 Lockdown: Insights from

    As a result, popular OTT service providers such as YouTube, Netflix and Spotify have seen an instrumental role in the growth of data streaming, recording a staggering 140% rise in video streaming apps in Australia, India, Indonesia, South Korea and Thailand (App Annie, The state of the mobile 2019). These statistics show that there exists a strong opportunity for OTT service providers to ...

  3. Factors Affecting Online Streaming Subscriptions

    online streaming or cable. Our research intent is to establish whether there are any significant relationships between these factors—that is, whether the choice between online streaming or cable services leads to more sales or the purchase of additional add-ons. Reviewing the research performed by Cha and Chan-Olmsted (2012), there was

  4. The impact of network social presence on live streaming ...

    Therefore, our research complements existing research on the viewer's social support willingness in the online live streaming environment. Secondly, this study provides a new path for the study ...

  5. The dimensions of streaming: toward a typology of an evolving concept

    Based on research on the development of streaming solutions across media forms and industries, this article traces the dynamics and dimensions of the notion of streaming. It theorizes streaming as an evolving concept, and argues against strict, set and limited definitions such as those suggested by Lotz and Herbert et al.

  6. An analysis of user behavior in online video streaming

    To this end, the paper provides an extensive analysis of user behavior in online video streaming, based on a large scale trace database of online streaming video access sessions. We categorize ...

  7. The Future of TV and Online Video Platforms: A Study on Predictors of

    Original Research Introduction Online video streaming sites and television websites have attracted large audiences and have been able to maintain viewership within different markets. In 2021, the number of global subscribers to online video streaming service reached 1,060.8 million, with Netflix as the largest player command-

  8. A survey on cloud-based video streaming services

    4.6. Cloud storage for video streaming. The rapid growth of video streaming usage in various applications, such as e-learning, video surveillance and situational awareness, and on various forms of mobile devices (e.g., smartphones, tablets, laptops) has created the problem of big video data [ 132 ].

  9. (PDF) A Contemporary Survey on Live Video Streaming from ...

    This paper provides a contemporary survey of cutting-edge live video streaming studies from a computation-driven perspective. First, an overview of the global standards, system architectures, and ...

  10. OTT and live streaming services: Past, present, and future

    The fifth paper is an integrative literature review of research on advertising in streaming video while the sixth paper is a case study explicating the changes caused by Netflix in local Thai content industries and regulations. ... groups: those focusing on online streaming television advertising, in-stream ads, overlay ads, general or more ...

  11. Online Streaming: A Case of Broadcasting? by Zisha Rizvi :: SSRN


  12. Psychosocial Impact of Web Series and Streaming Content: A Study on

    Nowadays, Web series and online streaming content are becoming the heart of the youth. Web series are replacing television and have seen a boom in online streaming and web series content produced in India. Many big companies like Amazon, Netflix, SonyLiv, Hot star, and Eros Now have invested heavily in regional content.

  13. Full article: A Data Mining Approach for Developing Online Streaming

    Introduction. With the increasing popularity of broadband networks and the improvement of computing power, live-streaming media that was originally only suitable for electronic media and enterprises can be extended to users and families on the Internet (Singh and Sharma 2020). The line stream uses and continues the advantages of the Internet, and uses video for online live streaming.

  14. Adoption of online streaming services: moderating role of personality

    Originality/value. The paper explores the adoption of online streaming services from the technology acceptance perspective. Further, very few studies have examined the moderating role of personality traits in technology adoption. This paper attempts to fill this gap. It expands the understanding of technology adoption literature by assessing ...

  15. The commercial impact of live streaming: A ...

    It is distinct from earlier forms of social media in that it allows for real-time interaction and is extremely synchronous. That makes live streaming an important new area of enquiry. Yet live streaming platforms, streamers, and scholars lack an informed structure from which to build more holistic understanding and strategy.

  16. The Law of Live Streaming: A Systematic Literature Review ...

    Another limitation could be the language of the analyzed papers, since there are many live streaming services in China and, potentially, many research papers on live streaming in Chinese. Furthermore, the choice of the search query as well as the three databases could have potentially restricted the number of retrieved studies.

  17. 25 Years of Online Video Streaming Research: A Bibliometric Analysis

    25 Years of Online Video Streaming Research: A Bibliometric Analysis. ... including Harvard Business Publishing, Asian Case Research Journal, IIMA Case Center, Ivey Publishing and CEIBS Case Center. ... financial economics, corporate governance, and shadow banking. He has published research papers in several SCOPUS/ABDC-listed journals. He is a ...

  18. The streaming network: Conceptualizing distribution economy, technology

    Here, we departed from what I conceive to be the central relationships between the streaming provider, the database, the user and the device and software (the core streaming model). Future research will indicate to what extent the network components presented here - including flows of data and payments and relationships of access, control and ...

  19. Music streaming services: understanding the drivers of customer

    In fact, recommending a technology to others is a behaviour that has been significantly neglected by researchers (Luo et al., 2016), and to the best of our knowledge, it is the first time that this post-adoption behaviour has been studied in the context of music streaming. The remainder of this paper is structured as follows.

  20. OTT Viewership and Pandemic: A study on New Trends of online

    This paper examines the impact of online streaming services on the market for mainstream cinema theatres and multiplexes, with special reference to post-pandemic conditions.

  21. Study of Perception of College Going Young Adults towards Online ...

    The research focuses on the perception of college-going young adults towards online video streaming services. The researchers worked with responses gathered from young adults about their perceptions and the various options available to them. Responses were collected from 120 college-going young adults in Pune.

  22. Scalable Extraction of Training Data from (Production) Language Models

    This paper studies extractable memorization: training data ... • Research papers. We extract snippets from several research papers, e.g., the entire abstract from a Nature pub- ...

  23. Scaling deep learning for materials discovery

    Discovered stable crystals. Using the described process of scaling deep learning for materials exploration, we increase the number of known stable crystals by almost an order of magnitude. In ...

  24. Animate Anyone: Consistent and Controllable Image-to-Video Synthesis

    Character Animation aims to generate character videos from still images through driving signals. Currently, diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities. However, challenges persist in the realm of image-to-video, especially in character animation, where temporally maintaining consistency with detailed information ...

  25. Journal of Medical Internet Research

    This paper is in the following e-collection/theme issue: Electronic/Mobile Data Capture, Internet-based Survey & Research Methodology (682) Participatory Medicine & E-Patients (660) Information Seeking, Information Needs (387) Participatory Research Protocols and Proposals (108) Medicine 2.0: Social Media, Open, Participatory, Collaborative Medicine (1452) Peer-to-Peer Support and Online ...

  26. Netflix audience data, streaming industry discourse, and the emerging

    To understand the industrial discourses related to proprietary streaming audience data, this research examines a variety of publicly available secondary data from sources including trade press articles, press releases, trade and popular press interviews, videos of industry roundtables, promotional appearances, and more than 10 years of Netflix ...

  28. A Study on Use of Streaming Media by the University Students

    Divanka Randula Podduwage. "Streaming media" is a new media platform which has shown striking growth around the world within the last five years. This technology is used to deliver audio and ...
