Journal Article PUBDB-2024-05800

MLstructureMining: a machine learning tool for structure identification from X-ray pair distribution functions


2024
Royal Society of Chemistry Washington DC

Digital Discovery 3(5), 908-918 (2024) [10.1039/D4DD00001C]

Please use a persistent id in citations: doi:10.1039/D4DD00001C

Abstract: Synchrotron X-ray techniques are essential for studies of the intrinsic relationship between synthesis, structure, and properties of materials. Modern synchrotrons can produce up to 1 petabyte of data per day. Such amounts of data can speed up materials development, but they also come with a staggering growth in workload, as the data generated must be stored and analyzed. We present an approach for quickly identifying an atomic structure model from pair distribution function (PDF) data from (nano)crystalline materials. Our model, MLstructureMining, uses a tree-based machine learning (ML) classifier. MLstructureMining has been trained to classify chemical structures from a PDF and gives a top-3 accuracy of 99% on simulated PDFs not seen during training, with a total of 6062 possible classes. We also demonstrate that MLstructureMining can identify the chemical structure from experimental PDFs from nanoparticles of CoFe$_2$O$_4$ and CeO$_2$, and we show how it can be used to treat an in situ PDF series collected during Bi$_2$Fe$_4$O$_9$ formation. Additionally, we show how MLstructureMining can be used in combination with the well-known methods principal component analysis (PCA) and non-negative matrix factorization (NMF) to analyze data from in situ experiments. MLstructureMining thus allows for real-time structure characterization by screening vast quantities of crystallographic information files in seconds.

Contributing Institute(s):
  1. DOOR-User (DOOR ; HAS-User)
Research Program(s):
  1. 6G3 - PETRA III (DESY) (POF4-6G3)
  2. DFG project G:(GEPRIS)429360100 - Studies on in situ total scattering data: formation mechanisms of ternary multiferroic bismuth ferrites (429360100)
Experiment(s):
  1. PETRA Beamline P02.1 (PETRA III)
  2. PETRA Beamline P21.1 (PETRA III)

Appears in the scientific report 2024
Database coverage:
Medline ; Creative Commons Attribution-NonCommercial CC BY-NC 4.0 ; DOAJ ; OpenAccess ; Clarivate Analytics Master Journal List ; DOAJ Seal ; Emerging Sources Citation Index ; SCOPUS ; Web of Science Core Collection


 Record created 2024-09-04, last modified 2025-07-23

